Temporal Dynamics of Workplace Acoustic Scenes: Egocentric Analysis and Prediction
Author: | Shrikanth S. Narayanan, Karel Mundnich, Tiantian Feng, Arindam Jati, Amrutha Nadarajan, Benjamin Girault, Raghuveer Peri |
Year of publication: | 2021 |
Subject: | Audio signal; Acoustics and Ultrasonics; Artificial neural network; Computer science; Speech recognition; Deep learning; Perspective (graphical); Sound recording and reproduction; Computational Mathematics; Identification (information); Recurrent neural network; Computer Science (miscellaneous); Artificial intelligence; Electrical and Electronic Engineering; Association (psychology) |
Source: | IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:756-769 |
ISSN: | 2329-9304; 2329-9290 |
DOI: | 10.1109/taslp.2021.3050265 |
Description: | Identification of the acoustic environment from an audio recording, also known as acoustic scene classification, is an active area of research. In this paper, we study dynamically changing background acoustic scenes from the egocentric perspective of an individual in a workplace. In a novel data collection setup, wearable sensors were deployed on individuals to collect audio signals within a built environment, while Bluetooth-based hubs continuously tracked each individual's location, which represents the acoustic scene at a given time. The data in this paper were gathered continuously from 170 hospital workers during their work shifts over a 10-week period. In the first part of our study, we investigate temporal patterns in the egocentric sequence of acoustic scenes encountered by an employee, and the association of those patterns with factors such as the individual's job role and daily routine. Motivated by evidence of multifaceted effects of ambient sounds on human psychology, we also analyze the association of the temporal dynamics of the perceived acoustic scenes with particular behavioral traits of the individual. Experiments reveal rich temporal patterns in the acoustic scenes experienced by the individuals during their work shifts, and a strong association of those patterns with various constructs related to the job roles and behavior of the employees. In the second part of our study, we employ deep learning models to predict the temporal sequence of acoustic scenes from the egocentric audio signal. We propose a two-stage framework in which a recurrent neural network is trained on top of the latent acoustic representations learned by a segment-level neural network (a minimal illustrative sketch follows this record). The experimental results show the efficacy of the proposed system in predicting the sequence of acoustic scenes, highlighting the existence of underlying temporal patterns in the acoustic scenes experienced in the workplace. |
Database: | OpenAIRE |
External link: |
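
As a concrete illustration of the two-stage framework described above, here is a minimal PyTorch sketch: a segment-level network embeds each short audio segment into a latent vector, and a recurrent network over the sequence of embeddings predicts the acoustic scene at every time step. PyTorch itself, the mean pooling over frames, and all feature dimensions, layer sizes, and class counts are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a two-stage acoustic scene sequence predictor:
# stage 1 embeds each audio segment; stage 2 runs an RNN over the
# sequence of embeddings. All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATURES = 40   # assumed per-frame feature dimension (e.g., log-mel bands)
N_SCENES = 8      # assumed number of acoustic scene classes (locations)
EMBED_DIM = 128   # assumed latent embedding size


class SegmentEncoder(nn.Module):
    """Stage 1: map one fixed-length audio segment to a latent vector."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES, 256),
            nn.ReLU(),
            nn.Linear(256, EMBED_DIM),
            nn.ReLU(),
        )

    def forward(self, segments):
        # segments: (batch, seq_len, n_frames, N_FEATURES)
        # Average frames within each segment, then embed the segment.
        pooled = segments.mean(dim=2)   # (batch, seq_len, N_FEATURES)
        return self.net(pooled)         # (batch, seq_len, EMBED_DIM)


class SceneSequencePredictor(nn.Module):
    """Stage 2: a recurrent network over segment embeddings predicts the
    acoustic scene label at every position in the sequence."""

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.rnn = nn.GRU(EMBED_DIM, 64, batch_first=True)
        self.classifier = nn.Linear(64, N_SCENES)

    def forward(self, segments):
        embeddings = self.encoder(segments)   # (batch, seq_len, EMBED_DIM)
        hidden, _ = self.rnn(embeddings)      # (batch, seq_len, 64)
        return self.classifier(hidden)        # (batch, seq_len, N_SCENES)


if __name__ == "__main__":
    # Toy forward pass: 2 shifts, 20 segments each, 100 frames per segment.
    model = SceneSequencePredictor(SegmentEncoder())
    x = torch.randn(2, 20, 100, N_FEATURES)
    logits = model(x)
    print(logits.shape)  # torch.Size([2, 20, 8])
```

In the paper's two-stage setup, the segment-level network is trained first and the recurrent network is then trained on top of its latent representations; whether the encoder is frozen or fine-tuned during the second stage, and the actual architectures used, are details specified in the article itself.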