As the particle physics community pursues ever higher precision in testing our current model of the subatomic world, ever larger datasets are required. With upgrades scheduled for the detectors of colliding-beam experiments around the world, and specifically at the Large Hadron Collider (LHC) at CERN, more collisions and more complex interactions are expected. This directly implies an increase in the data produced and, consequently, in the computational resources needed to process them.
In a world where the climate crisis is an ever more pressing concern, and with the ballooning electricity needs of artificial intelligence, developing new methods and algorithms that minimize the energy cost of computation becomes a priority. Alongside the adoption of new architectures and hardware, algorithms must be adapted to reduce wasted compute.
At CERN, the amount of data produced is gargantuan: so big, in fact, that a year's worth of raw LHC data would roughly match the digital storage capacity available in the entire world. This is why the data have to be heavily filtered and selected in real time before being permanently stored. These data can then be used to perform physics analyses, in order to expand our current understanding of the universe and improve the Standard Model of particle physics.
This real-time filtering, known as triggering, often involves complex processing at rates as high as 40~MHz. This thesis contributes to understanding how machine learning models can be efficiently deployed in such environments, in order to maximize throughput and minimize energy consumption. Inevitably, modern hardware designed for such tasks, together with contemporary algorithms, is needed to meet the challenges posed by these stringent, high-frequency data rates.
In this work, I present our graph neural network-based pipeline, developed for charged-particle track reconstruction at the LHCb experiment at CERN. The pipeline was implemented end to end inside LHCb's first-level trigger, entirely on GPUs. Its performance was compared against the classical tracking algorithms currently in production at LHCb. The pipeline was also accelerated on FPGAs, and its power consumption and processing speed were compared against those of the GPU implementation.
All in all, this work provides a thorough study of the nuances of deploying complex machine learning models in demanding, high-frequency data environments on heterogeneous computing architectures. Nonetheless, the field still has considerable progress to make in order to meet the challenges posed by future accelerator experiments.