Musical instrument classifier for early childhood percussion instruments.

Authors: Rufino B; Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada.; Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada., Khan A; Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada., Dutta T; Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada.; KITE, Toronto Rehabilitation Institute, University Health Network, Toronto, Ontario, Canada., Biddiss E; Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada.; Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada.; Rehabilitation Sciences Institute, University of Toronto, Toronto, Ontario, Canada.
Language: English
Source: PloS one [PLoS One] 2024 Apr 02; Vol. 19 (4), pp. e0299888. Date of Electronic Publication: 2024 Apr 02 (Print Publication: 2024).
DOI: 10.1371/journal.pone.0299888
Abstract: While the musical instrument classification task is well studied, there remains a gap in identifying non-pitched percussion instruments, which have greater overlap in frequency bands and more variation in sound quality and play style than pitched instruments. In this paper, we present a musical instrument classifier for detecting tambourines, maracas and castanets, instruments that are often used in early childhood music education. We generated a dataset with diverse instruments (e.g., brand, materials, construction) played in different locations with varying background noise and play styles. We conducted sensitivity analyses to optimize feature selection, windowing time, and model selection. We deployed and evaluated our best model in a mixed reality music application with 12 families in a home setting. Our dataset comprised over 369,000 samples recorded in-lab and 35,361 samples recorded with families in a home setting. We observed the Light Gradient Boosting Machine (LGBM) model to perform best using a window of approximately 93 ms with only 12 mel-frequency cepstral coefficients (MFCCs) and signal entropy. Our best LGBM model was observed to perform with over 84% accuracy across all three instrument families in-lab and over 73% accuracy when deployed to the home. To our knowledge, this dataset of over 369,000 samples of non-pitched instruments is the first of its kind. This work also suggests that a low-dimensional feature space is sufficient for the recognition of non-pitched instruments. Lastly, real-world deployment and testing of the algorithms with participants of diverse physical and cognitive abilities was also an important contribution towards more inclusive design practices. This paper lays the technological groundwork for a mixed reality music application that can detect children's use of non-pitched percussion instruments to support early childhood music education and play.
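The windowing and entropy feature mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The 44.1 kHz sample rate, the power-of-two frame length, the non-overlapping framing, and the histogram-based Shannon entropy are all assumptions made here for illustration; the abstract states only an approximate 93 ms window, 12 MFCCs, and signal entropy. All function names are hypothetical. MFCC extraction and the LGBM model are omitted because they need third-party libraries.

```python
import math

SAMPLE_RATE_HZ = 44_100  # assumed; the abstract does not state a sample rate


def window_length_samples(sample_rate_hz, window_ms):
    """Nearest power-of-two frame length for a target window duration."""
    target = sample_rate_hz * window_ms / 1000.0
    return 2 ** round(math.log2(target))


def frames(signal, frame_len):
    """Split a signal into non-overlapping frames, dropping any remainder."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def shannon_entropy(frame, n_bins=32):
    """Shannon entropy in bits of a frame's amplitude histogram.

    This is one possible reading of "signal entropy" (an assumption).
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0  # a constant frame carries no amplitude variation
    counts = [0] * n_bins
    for x in frame:
        idx = min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


frame_len = window_length_samples(SAMPLE_RATE_HZ, 93)
window_ms = frame_len / SAMPLE_RATE_HZ * 1000.0
print(f"{frame_len} samples per frame, about {window_ms:.1f} ms")
```

Under these assumptions, each frame would contribute one entropy value which, appended to 12 MFCCs, gives a 13-dimensional feature vector per window for the classifier.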
Competing Interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: Holland Bloorview is supporting the creation of a company called Pearl Interactives to commercialize products like Bootle Band so that it can be made widely available to those who can benefit from it. Elaine Biddiss and Ajmal Khan are shareholders in Pearl Interactives and may financially benefit from this interest if Pearl Interactives is successful in marketing products related to this research including Bootle Band. The terms of this arrangement have been reviewed and approved by Holland Bloorview Kids Rehabilitation Hospital and the University of Toronto in accordance with its policy on objectivity in research. We will continue to actively monitor, mitigate and manage any conflicts of interest. Our goal is to remain transparent and committed to the best interests of study participants, patients and families. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
(Copyright: © 2024 Rufino et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Database: MEDLINE