A conditional random field system for beat tracking

Authors: Thomas Fillon, Simon Durand, Cyril Joder, Slim Essid
Contributors: HAL, TelecomParis, Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Département Traitement du Signal et des Images (TSI), Télécom ParisTech-Centre National de la Recherche Scientifique (CNRS)
Language: English
Publication year: 2015
Subject:
Conditional random field
Computer science
[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing
Speech recognition
Feature extraction
[SCCO.NEUR] Cognitive science/Neuroscience
Probabilistic logic
Statistical model
Pattern recognition
[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]
[INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]
Beat detection
Sound recording and reproduction
Artificial intelligence
Hidden Markov model
Beat (music)
[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing
Source: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia
Description: We introduce a new probabilistic model for estimating beat positions in a musical audio recording, instantiating the Conditional Random Field (CRF) framework. The approach draws its strength from a sophisticated temporal modeling of the audio observations that accounts for local tempo variations, which the proposed CRF model represents readily through well-chosen potentials. The system is evaluated experimentally on 3 datasets of 1394 music excerpts covering various Western music styles, and compared with 4 reference systems using 6 reference evaluation metrics. The results show that the proposed system tracks perceptually coherent pulses and is very effective at estimating beat positions, while further work is needed to find the correct salient tempo.
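The abstract does not specify the model's states or potentials, so the following is only a minimal sketch of the general idea: max-sum (Viterbi) decoding over a linear chain whose transition potentials tolerate small local changes in the beat period. Every name and parameter here (`viterbi`, `beat_potentials`, `decode_beats`, `period`, `tol`, `sigma`, the quadratic tempo penalty, and the use of a raw onset-strength value as the observation potential) is an assumption made for illustration, not the potentials used by the authors.

```python
import numpy as np


def viterbi(unary, pairwise):
    """Max-sum decoding of a linear chain.

    unary:    (T, K) log observation potentials.
    pairwise: (K, K) log transition potentials, pairwise[i, j] for i -> j.
    Returns the highest-scoring state sequence as an int array of length T.
    """
    T, K = unary.shape
    score = unary[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + pairwise        # cand[i, j]: best score ending in i, then i -> j
        back[t] = np.argmax(cand, axis=0)       # best predecessor for each state j
        score = cand.max(axis=0) + unary[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


def beat_potentials(onset, period, tol=2, sigma=1.0):
    """Build toy potentials (hypothetical, for illustration only).

    State k means "k frames have elapsed since the last beat"; state 0 is a beat.
    A beat may follow the previous one after period - tol .. period + tol frames,
    with a quadratic log-penalty on the deviation from the nominal period.
    """
    K = period + tol
    pairwise = np.full((K, K), -np.inf)
    for k in range(K):
        if k + 1 < K:
            pairwise[k, k + 1] = 0.0            # keep counting frames
        interval = k + 1
        if abs(interval - period) <= tol:
            pairwise[k, 0] = -((interval - period) ** 2) / (2.0 * sigma ** 2)
    unary = np.zeros((len(onset), K))
    unary[:, 0] = onset                          # reward beats that coincide with onsets
    return unary, pairwise


def decode_beats(onset, period, tol=2, sigma=1.0):
    """Return the frame indices decoded as beats."""
    unary, pairwise = beat_potentials(np.asarray(onset, dtype=float), period, tol, sigma)
    path = viterbi(unary, pairwise)
    return [int(t) for t in np.flatnonzero(path == 0)]


if __name__ == "__main__":
    # Synthetic onset-strength curve with one slightly longer beat interval (11 frames).
    onset = np.zeros(40)
    onset[[5, 15, 26, 36]] = 1.0
    print(decode_beats(onset, period=10))
```

In this toy setting the decoder follows the one longer interval because the penalty for deviating by a frame is smaller than the reward for landing on the onset; a real system would learn or tune such potentials rather than fix them by hand.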
Database: OpenAIRE