Citizen Science on Hive: Report #3 - Analysing simulated particle collisions using MadAnalysis5

This is my new report on the participatory physics project on Hive started by @lemouth. You can follow all the entries by the group of hivers taking part under the tag #citizenscience.

In the previous entries, we installed software capable of simulating particle collisions in the Large Hadron Collider at CERN (Report #1), and used it to simulate on our computers 10000 proton collisions producing a top-antitop quark pair (Report #2). In this third report, we take the outputs of those collisions and analyse them. After a particle "crash", new particles are created and evolve, and they are then detected by the detectors at the LHC.

In case you want to join or follow this adventure, I leave below a table with all the original entries and my own reports. There are also very good entries by the other people participating in this activity (#citizenscience). Both their reports and their comments have been very useful for writing mine.

# | Original Post | My report
Intro | Towards a citizen science particle physics research project on Hive | -
1 | Citizen science particle physics project on Hive - Let’s get started! | Report #1
2 | Citizen science on Hive - simulating top quark production at CERN’s Large Hadron Collider | Report #2
3 | Citizen science on Hive - detector effects and event reconstruction | This entry

Task 1 - Installing and configuring MadAnalysis5

MadAnalysis5 is available from GitHub. In my case, since I am working with a new virtual machine, the version control software git was not installed on my computer. That has an easy solution: I opened a new terminal and typed:
sudo apt install git

Note: as a recommendation, I always do sudo apt update and sudo apt upgrade to check for possible updates in my system before installing anything new. But that is probably just a not-so-necessary habit.

Then we can "clone" (copy/download) the repository to our computer by navigating to the folder where we want it and doing:
git clone


The report by @travelingmercies gave me the hint that the Python version matters: I had python 3.7 installed in my virtual machine, and that was preventing the software from working since it was looking for python 2. Her report has a solution for OSX. To make my Ubuntu-based system use python 3 by default, I entered the following in my terminal:

sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 10
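The command above registers /usr/bin/python3 as a candidate for the plain python command. A minimal sketch for checking what each command name resolves to afterwards; the helper name command_version is made up for this illustration and is not part of MadAnalysis5:

```python
import shutil
import subprocess


def command_version(command="python"):
    """Return the version string reported by `command`, or None if it is not on the PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Python 2 prints its version to stderr, Python 3 prints it to stdout.
    return (result.stdout or result.stderr).strip()


# Show what the two usual command names point to on this machine.
for name in ("python", "python3"):
    print(name, "->", command_version(name))
```

If python shows None while python3 shows a version, the plain python command is not available yet.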


Before launching madanalysis5, we are going to install some required dependencies:

  1. Matplotlib:
    sudo apt install python3-matplotlib

  2. Latex and pdflatex. Note that @isnochys (report) and @gentleshaid (report) did not install them and were still able to finish the activity successfully. In case you have not heard of it, LaTeX is a very common system for preparing scientific documents and graphics, in which you "program" what you want to see (documents, plots, formulas…). I installed the basics, some fonts and an extra package containing pdflatex:
    sudo apt install texlive-latex-base texlive-fonts-recommended texlive-fonts-extra texlive-latex-extra
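Before launching the program, the three dependencies above can be checked from Python. This is only a rough sketch of my own (the function name check_dependencies is hypothetical); MadAnalysis5 performs its own checks at start-up, and they may differ from this one:

```python
import importlib.util
import shutil


def check_dependencies():
    """Return a dict mapping each dependency name to True (found) or False (missing)."""
    return {
        # Matplotlib is a Python module, so we look for it in the import system.
        "matplotlib": importlib.util.find_spec("matplotlib") is not None,
        # latex and pdflatex are executables, so we look for them on the PATH.
        "latex": shutil.which("latex") is not None,
        "pdflatex": shutil.which("pdflatex") is not None,
    }


status = check_dependencies()
for name, found in status.items():
    print(f"{name:12s} {'found' if found else 'missing'}")
```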


And now, we can launch madanalysis5. It is as simple as doing:

This command may vary depending on which folder you are in right now; just note that madanalysis5 is the folder that is downloaded from GitHub.

The software only detects 1 available core on my virtual machine and asks for the maximum number I want to dedicate to this analysis. I had no alternative but to select 1.
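To see how many cores the virtual machine exposes to programs in general, a quick check is possible from Python. This is just an illustration; I am not claiming MadAnalysis5 counts the cores in the same way:

```python
import os


def available_cores():
    """Number of CPU cores the operating system reports (at least 1)."""
    # os.cpu_count() can return None when the count cannot be determined.
    return os.cpu_count() or 1


print("Cores visible to Python:", available_cores())
```

If this also reports 1, the limit comes from the virtual machine settings and not from the analysis software.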


When MadAnalysis starts, it lists the packages that are active. Following the guidelines for this report, the required ones are: Matplotlib, pdflatex (optional) and latex (optional).


Task 2 - Installing zlib and fastjet

The next step is to install two additional packages: fastjet, for event reconstruction, and zlib, to read the compressed event files.

For that, remaining inside madanalysis5, we do:
install zlib
The process will install the package, restart MadAnalysis 5 and test itself again. Then we can continue with fastjet:
install fastjet

This took a while, but now I can see that these packages are active in my installation of MadAnalysis5:


Task 3 - Top-antitop simulations

And here comes the moment when we start using the file that we generated in the last report. The file is called tag_1_pythia8_events.hepmc.gz and it is inside the folder that was created with our outputs, in my case


The first step is to use a simulation of the detector to reconstruct the higher-level objects that one of the LHC experiments (detectors) would have found.

In order to run the detector simulator, we run from the madanalysis5 folder:

./bin/ma5 -R madanalysis/input/ATLAS_default.ma5


I think one of those names sounds familiar to me... By the way, ATLAS stands for A Toroidal LHC ApparatuS, and it is dedicated to studying the Standard Model and Beyond the Standard Model physics with proton collisions, which is also our objective, but it is not the only experiment at the LHC. And as you can see in the photo below, the whole assembly is huge!

ID: CERN-PHOTO-202201-006-18. Source CERN, use allowed for educational purposes.

Then we tell the program where to find the file, set the output name and launch the simulation:

import <Folder-where-I-installed-MG5_aMC>/report2/Events/run_01_decayed_1/tag_1_pythia8_events.hepmc.gz
set main.outputfile = myevents.lhe.gz


After this process, a new file of less than 7 MB was created as output, located in ANALYSIS_0/Output/SAF/_defaultset/lheEvents0_0
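A quick way to check that the reconstructed file contains what we expect is to count the events in it. A minimal sketch, assuming the output is a standard gzip-compressed LHE text file in which every event starts with an <event> tag; the function name count_lhe_events is my own and not part of MadAnalysis5:

```python
import gzip


def count_lhe_events(path):
    """Count the <event> blocks in a gzip-compressed LHE file."""
    n_events = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            stripped = line.strip()
            # An event opens with "<event>" or "<event ...attributes...>".
            if stripped == "<event>" or stripped.startswith("<event "):
                n_events += 1
    return n_events


# Example call (path as used in this report, adapt it to your installation):
# count_lhe_events("ANALYSIS_0/Output/SAF/_defaultset/lheEvents0_0/myevents.lhe.gz")
```

For the simulation from Report #2, I would expect a number close to the 10000 generated collisions.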

Task 4 - Output analysis

Finally, it is time to analyse the file we have created and check the number of events created in our simulations.

Important: we close (exit) the detector simulator and run madanalysis5 normally:


I highlight this because I got stuck here, trying to process the file again with MA5 while the detector simulator was still active. In that mode, the program tried to reconstruct the file again, repeating the processing we had already done on a file that was already reconstructed, and returned an error.

Now we tell the program which file to use and label it ttbar:
import ANALYSIS_0/Output/SAF/_defaultset/lheEvents0_0/myevents.lhe.gz as ttbar

We need to set our production cross section; this value was given to us during the processing that we did in Report #2. In case you do not remember the value, the information about our previous event generation can be retrieved from some HTML files that are created as part of the process and stored in our output folder. My output folder was called report2, and its location will vary in your installation, but you can see in the screenshot that the value should be available in a file called crossx.html that you can open with your web browser.


Then we do:
set ttbar.xsection = 505.491
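The cross section matters because it fixes how the simulated events are scaled to a number of real collisions: expected events = cross section × integrated luminosity. A rough sketch of that arithmetic, under two assumptions of mine that are not stated in this report: the cross section is in picobarns, and the histograms are normalised to 10 fb^-1 (which, as far as I know, is the MadAnalysis5 default):

```python
def expected_events(xsection_pb, luminosity_fb):
    """Expected number of events for a cross section in pb and a luminosity in fb^-1."""
    # 1 fb^-1 = 1000 pb^-1, so convert the luminosity before multiplying.
    return xsection_pb * luminosity_fb * 1000.0


def weight_per_event(xsection_pb, luminosity_fb, n_generated):
    """How many expected events each simulated event stands for."""
    return expected_events(xsection_pb, luminosity_fb) / n_generated


xsection_pb = 505.491   # value set above (ASSUMPTION: unit is pb)
luminosity_fb = 10.0    # ASSUMPTION: normalisation of 10 fb^-1
n_generated = 10000     # collisions simulated in Report #2

print("Expected events:", expected_events(xsection_pb, luminosity_fb))
print("Weight per simulated event:", weight_per_event(xsection_pb, luminosity_fb, n_generated))
```

Under these assumptions, our 10000 simulated collisions stand for roughly five million expected top-antitop events.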

And finally, we choose the type of plot (a histogram) and execute:
plot NAPID


If we type open, a new HTML page opens with our report and the plots we are looking for:


In his post, @lemouth explains that each top or antitop quark decays into a b-jet (b/b~) and a W boson. The W boson can then decay into an electron (e+/e-) and missing energy (ve/ve~), a muon (mu+/mu-) and missing energy, a tau (ta+/ta-) and missing energy, or two jets (g). And apparently, judging by the graph, these possibilities are not equally likely. Additionally, jets and photons (a) may come from radiation.
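Part of the reason why leptons show up less often than jets can be estimated with simple counting. The sketch below is my own illustration and is not taken from @lemouth's post. It assumes the usual leading-order picture in which a W boson has 9 equally likely decay channels: 3 lepton families, plus 2 quark pairs that each come in 3 colours. Measured values are close to this estimate but not identical.

```python
from fractions import Fraction

LEPTON_CHANNELS = 3        # e, mu, tau (each with its neutrino)
QUARK_CHANNELS = 2 * 3     # two light quark pairs, three colours each
TOTAL = LEPTON_CHANNELS + QUARK_CHANNELS

# Probabilities for the decay of a single W boson.
p_one_lepton_flavour = Fraction(1, TOTAL)
p_leptonic = Fraction(LEPTON_CHANNELS, TOTAL)
p_hadronic = Fraction(QUARK_CHANNELS, TOTAL)

# A top-antitop event contains two W bosons that decay independently.
ttbar_modes = {
    "both W -> jets": p_hadronic ** 2,
    "one W -> lepton, one W -> jets": 2 * p_leptonic * p_hadronic,
    "both W -> leptons": p_leptonic ** 2,
}

print(f"W -> one given lepton flavour: {float(p_one_lepton_flavour):.1%}")
print(f"W -> two jets:                 {float(p_hadronic):.1%}")
for mode, prob in ttbar_modes.items():
    print(f"{mode:32s} {float(prob):.1%}")
```

In this simple picture, a W boson gives two jets twice as often as it gives any lepton at all, which goes in the same direction as the non-uniform plot.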

Finished for today

This is the end of this episode of the #citizenscience project. It has been interesting to run our file and see some results. I hope that with time I will understand the concepts better, and I will keep digging during the coming weeks.

Technically this part has been quite easy. It is fun to notice how naked a new computer is, and how we need to keep installing even the simplest of dependencies.

And it is so rewarding now that we can start to see plots and stuff coming! Thanks to everyone for the hints and tips.
