Popis: |
Heart rate is a crucial metric in health monitoring. Traditional computer vision solutions estimate cardiac signals by detecting physical manifestations of heartbeats, such as facial discoloration caused by changes in blood oxygenation, from videos of a subject using regression methods. Because continuous signals are complex and expensive to de-noise, this study introduces an alternative approach that employs end-to-end classification models to remotely derive a discrete representation of cardiac signals from face videos. These visual cardiac signal classifiers are trained on discretized cardiac signals, a novel pre-processing method with limited precedent in the health monitoring literature. Accordingly, several methods for converting continuous cardiac signals into binary form are presented, and their impact on training is evaluated. An implementation of this approach, the temporal shift convolutional attention binary classifier, is built on the regression-based convolutional attention network architecture. The classifier and a baseline regression model are trained and tested on publicly available and locally collected datasets designed for heart signal detection from face video, and model performance is assessed by the heart rate error of the extracted cardiac signals. Results show the proposed method outperforms the baseline on the UBFC-rPPG dataset, reducing cross-dataset root mean square error from 2.33 to 1.63 beats per minute. However, both models struggled to generalize to the PURE dataset, with root mean square errors of 12.40 and 16.29 beats per minute, respectively. Additionally, the proposed approach reduces the computational complexity of model output post-processing, improving its suitability for real-time applications and deployment on resource-constrained systems.
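The sketch below is not taken from the thesis; it only illustrates, under stated assumptions, one plausible way to binarize a continuous cardiac signal (thresholding at the signal mean) and to read a heart rate back from the binary waveform by counting rising edges. The function names, the threshold choice, and the synthetic 1.2 Hz pulse are all hypothetical; the actual discretization methods compared in the work may differ.

```python
import numpy as np

def binarize_ppg(signal: np.ndarray) -> np.ndarray:
    """Threshold a continuous cardiac signal at its mean to obtain a binary waveform."""
    return (signal > signal.mean()).astype(np.float32)

def heart_rate_bpm(binary: np.ndarray, fs: float) -> float:
    """Estimate heart rate from the number of rising edges (0 -> 1 transitions)."""
    edges = np.count_nonzero(np.diff(binary) > 0)
    duration_s = len(binary) / fs
    return 60.0 * edges / duration_s

# Example: a clean 1.2 Hz (72 bpm) synthetic pulse sampled at 30 fps for 10 s.
fs = 30.0
t = np.arange(0, 10, 1 / fs)
ppg = np.sin(2 * np.pi * 1.2 * t)
labels = binarize_ppg(ppg)                 # binary training targets for a classifier
print(round(heart_rate_bpm(labels, fs)))   # ~72
```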