Citizen science on Hive - top quark production at the Large Hadron Collider - solutions

It has been almost a month since I released the fourth episode of our citizen science project on Hive. Time flies, I know…

That last episode included some assignments, for which I promised to deliver the answers shortly afterwards. Unfortunately, offline family time, a conference, and now COVID delayed this. However, patience is a great virtue (and a bad excuse ;) ).

In the present post, I provide the solutions to the various exercises that I proposed. A new episode of our project should hopefully be released very soon (I have started to write it)… It will be dedicated to the simulation of a potential new physics signal at CERN’s Large Hadron Collider.

Let’s first start with a small recap of the previous episodes, which could be useful for anyone who would like to embark on this project. There is still time! It suffices to start with the four episodes below and publish the associated reports on chain (with the #citizenscience tag). They will be reviewed and commented on, and I am always available on chain and on the STEMsocial Discord server to help.

Just a special mention to @isnochys, @metabs and @mengene. Are you still interested in this project? Are you stuck somewhere? Don’t hesitate to reach out to me if needed.

[Credits: geralt (Pixabay)]

As usual, before moving on to the main material for the exercises of this week, I acknowledge all participants in this project and supporters from our community: @agmoore, @agreste, @aiovo, @alexanderalexis, @amestyj, @darlingtonoperez, @eniolw, @firstborn.pob, @gentleshaid, @gtg, @isnochys, @ivarbjorn, @linlove, @mengene, @mintrawa, @robotics101, @servelle, @travelingmercies and @yaziris. Please let me know if you want to be added or removed from this list.

1 - Jet multiplicity

In order to obtain the distribution of the number of jets associated with our simulated events, we mimic the syntax introduced for leptons and photons. The only difference is the use of the label j (for jets) instead of l (for leptons) or a (for photons).

We draw the plots twice, once without any restriction on the jets, and once after focusing only on jets with a transverse momentum larger than 25 GeV. This is achieved through the code:

ma5> plot N(j) 10 0 10
ma5> select (j) PT > 25
ma5> plot N(j) 10 0 10
ma5> submit
ma5> open

The important feature to notice here is that I requested histograms containing 10 bins (ranging from 0 to 10). This keeps the number of events in the overflow bin small enough (less than 0.1% in my case).

Checking the overflow and underflow bins is always good practice, and this can easily be done through the HTML page generated by the code. We get (with the statistics available from the generated web page):

[Credits: @lemouth]
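For readers who prefer to see the overflow check spelled out, here is a minimal Python sketch of what "fraction of events in the overflow bin" means. It is not part of MadAnalysis5: the function name `overflow_fraction` and the toy jet multiplicities are made up for illustration only.

```python
# Toy illustration: fraction of entries falling outside a histogram range.
# The jet multiplicities below are invented; they are NOT taken from the simulation.

def overflow_fraction(values, xmin, xmax):
    """Return the (underflow, overflow) fractions for a histogram covering [xmin, xmax)."""
    n = len(values)
    if n == 0:
        return 0.0, 0.0
    under = sum(1 for v in values if v < xmin)
    over = sum(1 for v in values if v >= xmax)
    return under / n, over / n

# Hypothetical number of jets in a handful of events
toy_njets = [3, 4, 5, 4, 6, 2, 7, 4, 11, 5]

# Same range as in 'plot N(j) 10 0 10'
under, over = overflow_fraction(toy_njets, 0, 10)
print(f"underflow: {under:.1%}, overflow: {over:.1%}")
```

If the overflow fraction turns out to be large, the upper bound of the histogram should simply be increased.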

The same exercise can now be done with b-jets (making use of the symbol b in the code).

ma5> plot N(b) 5 0 5
ma5> select (b) PT > 25
ma5> plot N(b) 5 0 5
ma5> submit
ma5> open

Here, a histogram containing 5 bins is enough and there is no need to worry about the overflow bin (it is empty in my case). We obtain as a result:

[Credits: @lemouth]

As can be seen, there is little variation between the two figures. The reason is simple. In the considered signal, the b-jets originate from the decay of massive top quarks. They should therefore carry a significant fraction of the available (mass and kinetic) energy of the decaying top quarks. Consequently, the restriction imposed on the b-jets (25 GeV is only about 1/7 of the top quark mass) almost automatically selects all available b-jets. The two figures are therefore almost identical.
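The orders of magnitude behind this argument can be checked with a few lines of Python. The sketch below uses the standard two-body decay formula for the energy of the b quark in the rest frame of a top quark decaying into a W boson and a b quark. The mass values are approximate and assumed here for illustration, and the result is only indicative: the transverse momentum measured in the detector also depends on the emission angle and on the motion of the top quark.

```python
# Rough two-body kinematics for t -> W b in the top quark rest frame.
# Masses in GeV; approximate values assumed for this illustration.
M_TOP = 172.5
M_W = 80.4
M_B = 4.7

PT_CUT = 25.0  # GeV, the threshold used in 'select (b) PT > 25'

def b_quark_energy(m_top=M_TOP, m_w=M_W, m_b=M_B):
    """Energy of the b quark in the top rest frame for the decay t -> W b."""
    return (m_top**2 + m_b**2 - m_w**2) / (2.0 * m_top)

print(f"cut / top mass      : {PT_CUT / M_TOP:.3f}")
print(f"b energy (top frame): {b_quark_energy():.1f} GeV")
```

The typical energy of the b quark is thus well above the 25 GeV threshold, which is consistent with the cut removing only a few b-jets.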

2 - Lepton multiplicity after selection

In the course of the analysis, we required the number of leptons to be equal to 1. Therefore, if we plot the lepton multiplicity before and after the cut, we should see the distribution reduce to a histogram with a single populated bin, centred on 1. Any event featuring 0 or at least 2 leptons should indeed get rejected by the selection.

This is achieved through the commands

ma5> plot N(l) 5 0 5
ma5> select N(l)==1
ma5> plot N(l) 5 0 5

which lead to the figure:

[Credits: @lemouth]

The observed behaviour is the expected one. Everything is thus fine!
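The logic of this check can also be reproduced outside MadAnalysis5 with a toy example. The Python sketch below is only an illustration: the helper `multiplicity_histogram` and the lepton counts are hypothetical, and the list comprehension plays the role of the selection on N(l).

```python
from collections import Counter

def multiplicity_histogram(counts, nbins=5):
    """Histogram of integer multiplicities with unit-width bins [0,1), [1,2), ..."""
    tally = Counter(counts)
    return [tally.get(i, 0) for i in range(nbins)]

# Hypothetical number of leptons per event (invented values)
toy_nleptons = [0, 1, 1, 2, 0, 1, 1, 0, 2, 1]

before = multiplicity_histogram(toy_nleptons)

# Analogue of the selection N(l) == 1: keep only events with exactly one lepton
selected = [n for n in toy_nleptons if n == 1]
after = multiplicity_histogram(selected)

print("before:", before)
print("after :", after)
```

After the selection, only the bin associated with exactly one lepton remains populated, exactly as in the figure above.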

3 - Missing transverse energy spectrum

The final question asked in the previous episode of our project concerned the missing transverse energy distribution. I mentioned in that blog that the MET keyword is associated with this quantity. It corresponds to the total amount of energy carried away by the invisible particles produced in a (simulated) collision.

Here, we considered the production of two top quarks, one of them decaying into 1 b-jet and two lighter jets, and the other decaying into 1 b-jet, 1 lepton and a neutrino. The neutrino is an invisible particle and therefore leaves the detector… undetected. We should thus have a significant amount of missing energy in the events.
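Since invisible particles cannot be measured directly, the missing transverse energy is usually reconstructed from what is seen: it is the magnitude of the imbalance in the transverse momenta of all visible objects. The Python sketch below illustrates this usual definition with invented numbers. It is an assumption-laden toy, not the internal MadAnalysis5 implementation, and the function name is hypothetical.

```python
import math

def missing_transverse_energy(visible):
    """MET from a list of visible (px, py) transverse momenta, in GeV.

    The missing transverse momentum is minus the vector sum of the visible
    transverse momenta; the MET is its magnitude.
    """
    sum_px = sum(px for px, _ in visible)
    sum_py = sum(py for _, py in visible)
    return math.hypot(sum_px, sum_py)

# Invented visible objects: two b-jets, two light jets and one lepton
toy_visible = [
    (60.0, 10.0),
    (-45.0, 30.0),
    (20.0, -50.0),
    (-30.0, -15.0),
    (25.0, 5.0),
]

print(f"MET = {missing_transverse_energy(toy_visible):.1f} GeV")
```

If all visible momenta balanced perfectly in the transverse plane, the function would return zero, which corresponds to an event without any energetic invisible particle.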

This can be tested by typing in the command line interface of the code:

ma5> plot MET 50 0 350 [logY]

I requested a histogram of 50 bins ranging from 0 GeV to 350 GeV. These bounds keep the number of events in the overflow bin small (0.5%), and the resulting bin size is reasonable given the number of generated events (so that we are not too sensitive to statistical fluctuations). The results read:

[Credits: @lemouth]

As expected, we observe a peak at about 50 GeV, which is what can be expected from neutrinos typically produced in LHC collisions.
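The binning choice made for this histogram (50 bins between 0 and 350 GeV) can be quantified with a short Python sketch. It computes the bin width and the approximate relative size of statistical fluctuations in a bin, assuming Poisson statistics. The bin contents used in the loop are hypothetical and only serve to show the trend.

```python
import math

def bin_width(nbins, xmin, xmax):
    """Width of each bin of a uniform histogram."""
    return (xmax - xmin) / nbins

def relative_stat_uncertainty(n_entries):
    """Approximate relative Poisson fluctuation, 1/sqrt(N), for a bin with N entries."""
    if n_entries <= 0:
        return float("inf")
    return 1.0 / math.sqrt(n_entries)

# Same binning as in 'plot MET 50 0 350'
print(f"bin width: {bin_width(50, 0.0, 350.0)} GeV")

# Hypothetical bin contents, to show how fluctuations shrink with statistics
for n in (10, 100, 1000):
    print(f"{n:5d} entries -> about {relative_stat_uncertainty(n):.0%} fluctuation")
```

Narrower bins would resolve the shape of the distribution better, but each bin would then contain fewer events and fluctuate more.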

Summary: deciphering top pair production at the LHC

As promised, I have finally released the solutions to the assignments proposed in the fourth episode of our citizen science project on Hive. This post hence explains how to solve the three questions that I raised 3 weeks ago.

I apologise again for the delay in releasing this. I just underestimated how busy life could keep us… Please be ready for episode 5, which I hope to release very soon (I first need to catch up with a few French adaptations of the previous physics blogs before doing so).

In the meantime, feel free to come back to me if needed, or join the effort! It is never too late!
