FastRealBoostBins: An ensemble classifier for fast predictions implemented in Python via numba.jit and numba.cuda

Author: Przemysław Klęsk
Language: English
Year of publication: 2024
Subject:
Source: SoftwareX, Vol 26, Iss , Pp 101644- (2024)
Document type: article
ISSN: 2352-7110
DOI: 10.1016/j.softx.2024.101644
Description: Taking advantage of Numba (a high-performance just-in-time Python compiler), we provide a fast implementation of a boosting algorithm in which bins with logit transform values play the role of “weak learners”. The software comes as a Python class compliant with the scikit-learn library. It allows the user to choose between CPU and GPU computations for each of the two stages: fit and predict (decision function). The efficiency of the implementation has been confirmed on large data sets where the total number of array entries (sample size × feature count) was of the order of 10^10 at the fit stage and 10^8 at the predict stage. In the case of GPU-based fit, the main boosting loop is designed as five CUDA kernels responsible for: binning weights, computing logits, computing exponential errors, finding the error minimizer, and reweighting examples. The GPU-based predict is computed by a single CUDA kernel. We apply suitable reduction patterns and mutexes to carry out summations and ‘argmin’ operations. To test the performance of the predict stage, we compare FastRealBoostBins against state-of-the-art classifiers from sklearn.ensemble using large data sets and focusing on response times. In an additional experiment, we make our classifiers operate as object detectors under heavy computational load (over 60k queries per video frame using ensembles of size 2048).
Database: Directory of Open Access Journals
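
Illustrative sketch: the five steps of the boosting loop named in the description (binning weights, computing logits, computing exponential errors, finding the error minimizer, reweighting examples) can be outlined on the CPU in plain NumPy. This is not the package's code or API. The function names, the parameters T (ensemble size), B (bins per feature) and clip (logit bound), the equal-width binning and the smoothing constant are all assumptions made for illustration, and labels are assumed to be in {-1, +1}.

```python
import numpy as np


def fit_real_boost_bins(X, y, T=8, B=8, clip=2.0):
    """Hypothetical sketch of real boosting with binned logits.

    X: (m, n) float array; y: (m,) array with values in {-1, +1}.
    Returns (features, logits, lo, span) describing the ensemble.
    """
    m, n = X.shape
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    # Equal-width binning of every feature into B bins (an assumption).
    Xb = np.clip(((X - lo) / span * B).astype(np.int64), 0, B - 1)
    w = np.full(m, 1.0 / m)  # example weights
    features = np.empty(T, dtype=np.int64)
    logits = np.zeros((T, B))
    eps = 1e-12  # smoothing so that empty bins get a logit of 0
    for t in range(T):
        best_err = np.inf
        for j in range(n):
            # Step 1: bin the weights of positives and negatives.
            w_pos = np.bincount(Xb[:, j], weights=w * (y == 1), minlength=B)
            w_neg = np.bincount(Xb[:, j], weights=w * (y == -1), minlength=B)
            # Step 2: logit transform per bin, clipped to [-clip, clip].
            lg = np.clip(0.5 * np.log((w_pos + eps) / (w_neg + eps)), -clip, clip)
            # Step 3: exponential error of this candidate weak learner.
            err = np.sum(w * np.exp(-y * lg[Xb[:, j]]))
            # Step 4: keep the error minimizer (sequential argmin).
            if err < best_err:
                best_err, features[t], logits[t] = err, j, lg
        # Step 5: reweight examples and renormalize.
        w *= np.exp(-y * logits[t, Xb[:, features[t]]])
        w /= w.sum()
    return features, logits, lo, span


def decision_function(X, model):
    """Sum the binned logits of all weak learners for each row of X."""
    features, logits, lo, span = model
    T, B = logits.shape
    Xb = ((X[:, features] - lo[features]) / span[features] * B).astype(np.int64)
    Xb = np.clip(Xb, 0, B - 1)  # out-of-range values fall into edge bins
    return logits[np.arange(T), Xb].sum(axis=1)
```

The sign of `decision_function(X, model)` gives the predicted class. In the GPU variant described above, the per-feature loop and the summations are where separate CUDA kernels with reductions would take over; the sketch keeps them as ordinary loops for readability.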