Audio Indexing Including Frequency Tracking of Simultaneous Multiple Sources in Speech and Music

Author:	Régine André-Obrecht, Julie Mauclair, M. Le Coz, Julien Pinquier
Contributors:	Institut de recherche en informatique de Toulouse (IRIT); Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA); Institut National Polytechnique de Toulouse (Toulouse INP); Université Toulouse 1 Capitole (UT1); Université Toulouse - Jean Jaurès (UT2J); Université Toulouse III - Paul Sabatier (UT3); Université Fédérale Toulouse Midi-Pyrénées; Centre National de la Recherche Scientifique (CNRS); Université Paris Descartes - Paris 5 (UPD5)
Language:	English
Publication year:	2013
Subject:	[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV]; [INFO.INFO-GR] Computer Science [cs]/Graphics [cs.GR]; Computer science; Speech recognition; Audio signal processing; Segmentation; Polyphony; Multipitch; Music indexing; Search engine indexing; Database index; Tracking (particle physics); Field (computer science); Whole systems; Signal and image processing; Image processing; Image synthesis and virtual reality; Computer vision and pattern recognition; Artificial intelligence; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020208 electrical & electronic engineering; 03 medical and health sciences; 0305 other medical science; 030507 speech-language pathology & audiology
Source:	Proceedings of CBMI 2013: 11th International Workshop on Content-Based Multimedia Indexing (CBMI 2013), Jun 2013, Veszprem, Hungary. pp. 23-25
Description:	National audience; In this paper, we present a complete system for audio indexing. The system is based on state-of-the-art methods for Speech-Music-Noise segmentation and Monophonic/Polyphonic estimation. Building on these methods, we propose an original system for detecting superposed sources. This approach is based on analyzing the evolution of the predominant frequencies. To validate the whole system, we used different corpora: radio broadcasts, studio music, and degraded field recordings. The first results are encouraging and show the potential of our approach, which is generic and can be used on both music and speech content.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dbbb3e6b0695f0108567a07257800af6 https://hal.archives-ouvertes.fr/hal-01228711/file/lecoz_12639.pdf