Temporal Dynamics of Workplace Acoustic Scenes: Egocentric Analysis and Prediction
Author: | Shrikanth S. Narayanan, Karel Mundnich, Tiantian Feng, Arindam Jati, Amrutha Nadarajan, Benjamin Girault, Raghuveer Peri |
Year of publication: | 2021 |
Subject: | Audio signal; Acoustics and Ultrasonics; Artificial neural network; Computer science; Speech recognition; Deep learning; Perspective (graphical); Sound recording and reproduction; Computational Mathematics; Identification (information); Recurrent neural network; Computer Science (miscellaneous); Artificial intelligence; Electrical and Electronic Engineering; Association (psychology) |
Source: | IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:756-769 |
ISSN: | 2329-9304; 2329-9290 |
DOI: | 10.1109/taslp.2021.3050265 |
Description: | Identification of the acoustic environment from an audio recording, also known as acoustic scene classification, is an active area of research. In this paper, we study dynamically changing background acoustic scenes from the egocentric perspective of an individual in a workplace. In a novel data collection setup, wearable sensors were deployed on individuals to collect audio signals within a built environment, while Bluetooth-based hubs continuously tracked each individual's location, which represents the acoustic scene at a given time. The data in this paper were gathered continuously from 170 hospital workers during their work shifts over a 10-week period. In the first part of our study, we investigate temporal patterns in the egocentric sequence of acoustic scenes encountered by an employee, and the association of those patterns with factors such as the individual's job role and daily routine. Motivated by evidence of multifaceted effects of ambient sounds on human psychology, we also analyze the association of the temporal dynamics of the perceived acoustic scenes with particular behavioral traits of the individual. Experiments reveal rich temporal patterns in the acoustic scenes experienced by the individuals during their work shifts, and a strong association of those patterns with various constructs related to the job roles and behavior of the employees. In the second part of our study, we employ deep learning models to predict the temporal sequence of acoustic scenes from the egocentric audio signal. We propose a two-stage framework in which a recurrent neural network is trained on top of the latent acoustic representations learned by a segment-level neural network (a minimal illustrative sketch follows this record). The experimental results show the efficacy of the proposed system in predicting the sequence of acoustic scenes, highlighting the existence of underlying temporal patterns in the acoustic scenes experienced in the workplace. |
Database: | OpenAIRE |
External link: |
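
As a concrete illustration of the two-stage framework described above, here is a minimal PyTorch sketch: a segment-level network embeds each short audio segment into a latent vector, and a recurrent network over the sequence of embeddings predicts the acoustic scene at every time step. PyTorch itself, the mean pooling over frames, and all feature dimensions, layer sizes, and class counts are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a two-stage acoustic scene sequence predictor:
# stage 1 embeds each audio segment; stage 2 runs an RNN over the
# sequence of embeddings. All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATURES = 40   # assumed per-frame feature dimension (e.g., log-mel bands)
N_SCENES = 8      # assumed number of acoustic scene classes (locations)
EMBED_DIM = 128   # assumed latent embedding size


class SegmentEncoder(nn.Module):
    """Stage 1: map one fixed-length audio segment to a latent vector."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES, 256),
            nn.ReLU(),
            nn.Linear(256, EMBED_DIM),
            nn.ReLU(),
        )

    def forward(self, segments):
        # segments: (batch, seq_len, n_frames, N_FEATURES)
        # Average frames within each segment, then embed the segment.
        pooled = segments.mean(dim=2)   # (batch, seq_len, N_FEATURES)
        return self.net(pooled)         # (batch, seq_len, EMBED_DIM)


class SceneSequencePredictor(nn.Module):
    """Stage 2: a recurrent network over segment embeddings predicts the
    acoustic scene label at every position in the sequence."""

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.rnn = nn.GRU(EMBED_DIM, 64, batch_first=True)
        self.classifier = nn.Linear(64, N_SCENES)

    def forward(self, segments):
        embeddings = self.encoder(segments)   # (batch, seq_len, EMBED_DIM)
        hidden, _ = self.rnn(embeddings)      # (batch, seq_len, 64)
        return self.classifier(hidden)        # (batch, seq_len, N_SCENES)


if __name__ == "__main__":
    # Toy forward pass: 2 shifts, 20 segments each, 100 frames per segment.
    model = SceneSequencePredictor(SegmentEncoder())
    x = torch.randn(2, 20, 100, N_FEATURES)
    logits = model(x)
    print(logits.shape)  # torch.Size([2, 20, 8])
```

In the paper's two-stage setup, the segment-level network is trained first and the recurrent network is then trained on top of its latent representations; whether the encoder is frozen or fine-tuned during the second stage, and the actual architectures used, are details specified in the article itself.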