SymbolFit: Automatic Parametric Modeling with Symbolic Regression

Author: Tsoi, Ho Fung; Rankin, Dylan; Caillol, Cecile; Cranmer, Miles; Dasu, Sridhara; Duarte, Javier; Harris, Philip; Lipeles, Elliot; Loncar, Vladimir
Year of publication: 2024
Subject:
Document type: Working Paper
Description: We introduce SymbolFit, a framework that automates parametric modeling by using symbolic regression to perform a machine search for functions that fit the data, while simultaneously providing uncertainty estimates in a single run. Traditionally, constructing a parametric model to accurately describe binned data has been a manual and iterative process, requiring an adequate functional form to be determined before the fit can be performed. The main challenge arises when the appropriate functional forms cannot be derived from first principles, especially when there is no underlying true closed-form function for the distribution. In this work, we address this problem by utilizing symbolic regression, a machine learning technique that explores a vast space of candidate functions without needing a predefined functional form, treating the functional form itself as a trainable parameter. Our approach is demonstrated in data analysis applications in high-energy physics experiments at the CERN Large Hadron Collider (LHC). We demonstrate its effectiveness and efficiency using five real proton-proton collision datasets from new physics searches at the LHC, namely the background modeling in resonance searches for high-mass dijet, trijet, paired-dijet, diphoton, and dimuon events. We also validate the framework using several toy datasets with one or more variables.
Comment: 53 pages, 35 figures. Under review
Database: arXiv