Tree-based learning on amperometric time series data demonstrates high accuracy for classification

Authors: Krishnan, Jeyashree; Lian, Zeyu; Oomen, Pieter E.; Amir-Aref, Mohaddeseh; He, Xiulan; Majdi, Soodabeh; Schuppert, Andreas; Ewing, Andrew
Source: International Journal of Data Science and Analytics; 20240101, Issue: Preprints p1-16, 16p
Abstract: Elucidating exocytosis processes provides insights into cellular neurotransmission mechanisms and may have potential in research on neurodegenerative diseases. Amperometry is an established electrochemical method for detection of neurotransmitters released from and stored inside cells. An important aspect of the amperometry method is the sub-millisecond temporal resolution of the current recordings, which usually leads to several hundreds of gigabytes of high-quality data. In this study, we present a universal method for the classification of diverse amperometric datasets using well-established data-driven approaches in computational science. We demonstrate a very high prediction accuracy (≥95%). The method comprises an end-to-end, systematic machine learning workflow for amperometric time series datasets, consisting of pre-processing, feature extraction, model identification, training and testing, followed by feature importance evaluation, with every step implemented. We tested the method on heterogeneous amperometric time series datasets generated using different experimental approaches, chemical stimulations, electrode types, and varying recording times. We identified an overarching set of common features across these datasets that enables accurate predictions. Further, we showed that information relevant for the classification of amperometric traces is neither in the spiky segments alone, nor can it be retrieved from just the temporal structure of spikes. In fact, the transients between spikes and the trace baselines carry essential information for a successful classification, thereby strongly demonstrating that an effective feature representation of amperometric time series requires the full time series. To our knowledge, this is one of the first studies to propose a scheme for machine learning, and in particular supervised learning, on full amperometry time series data.
Database: Supplemental Index
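
Illustrative sketch (not part of the record, and not the authors' implementation): the workflow named in the abstract, feature extraction over the full trace followed by training, testing, and a look at which feature the model relies on, can be mimicked on toy data. Everything below is an assumption made for illustration: the traces are synthetic, the function names (`make_trace`, `extract_features`, `fit_stump`, `predict`) are hypothetical, and a depth-1 decision tree (a "stump") stands in for whatever tree-based models the study used. The two synthetic classes have the same spike statistics and differ only in baseline noise, so a classifier can only separate them if its features cover the baseline.

```python
import random
import statistics

def make_trace(label, rng, n=400):
    """Synthetic stand-in for an amperometric trace (NOT real data).

    Both classes get five spikes of similar height; only the baseline
    noise level differs between class 0 and class 1.
    """
    noise = 0.05 if label == 0 else 0.15
    trace = [rng.gauss(0.0, noise) for _ in range(n)]
    for _ in range(5):
        trace[rng.randrange(n)] += rng.uniform(1.0, 2.0)
    return trace

def extract_features(trace):
    """Features computed over the full trace, baseline included."""
    med = statistics.median(trace)
    mad = statistics.median([abs(x - med) for x in trace])  # robust baseline noise
    n_spikes = sum(1 for x in trace if x - med > 0.8)       # crude spike count
    mean = sum(trace) / len(trace)
    return [mean, mad, n_spikes, max(trace)]

def fit_stump(X, y):
    """Exhaustively search for the single feature/threshold split with fewest errors."""
    best = None  # (errors, feature index, threshold, label predicted above threshold)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for above in (0, 1):
                errors = sum(
                    (above if row[j] > thr else 1 - above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, thr, above)
    return best[1:]

def predict(stump, row):
    j, thr, above = stump
    return above if row[j] > thr else 1 - above

rng = random.Random(0)
labels = [i % 2 for i in range(200)]
X = [extract_features(make_trace(label, rng)) for label in labels]

# Hold out the last 50 traces for testing.
X_train, y_train = X[:150], labels[:150]
X_test, y_test = X[150:], labels[150:]

stump = fit_stump(X_train, y_train)
accuracy = sum(predict(stump, r) == t for r, t in zip(X_test, y_test)) / len(y_test)

feature_names = ["mean", "baseline_mad", "n_spikes", "max"]
print("split on:", feature_names[stump[0]], "| test accuracy:", accuracy)
```

In this toy setting the split lands on the baseline noise feature rather than on the spike features, which loosely mirrors the abstract's point that baselines carry classification-relevant information. It says nothing about the accuracy reported in the study itself.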