Audio Indexing Including Frequency Tracking of Simultaneous Multiple Sources in Speech and Music

Authors: Régine André-Obrecht, Julie Mauclair, M. Le Coz, Julien Pinquier
Contributors: Institut de recherche en informatique de Toulouse (IRIT), Institut National Polytechnique [Toulouse] (INP)-Université Toulouse 1 Capitole (UT1)-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UPS), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS), Université Paris Descartes - Paris 5 (UPD5), Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Université Toulouse III - Paul Sabatier (UT3), Université Paris Descartes - Paris 5 (UPD5), Centre National de la Recherche Scientifique - CNRS (FRANCE), Institut National Polytechnique de Toulouse - INPT (FRANCE), Université Toulouse III - Paul Sabatier - UT3 (FRANCE), Université Toulouse - Jean Jaurès - UT2J (FRANCE), Université Toulouse 1 Capitole - UT1 (FRANCE), Université Paris Descartes - Paris V (FRANCE), Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE), Université Toulouse 1 Capitole (UT1)-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP)
Language: English
Year of publication: 2013
Subject:
Computer science
Speech recognition
02 engineering and technology
Tracking (particle physics)
Field (computer science)
Whole systems
Database index
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
030507 speech-language pathology & audiology
03 medical and health sciences
Image processing
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
Music indexing
0202 electrical engineering, electronic engineering, information engineering
Signal and image processing
Polyphony
Segmentation
Audio signal processing
Image synthesis and virtual reality
Multipitch
020208 electrical & electronic engineering
Search engine indexing
[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
Computer vision and pattern recognition
Artificial intelligence
[INFO.INFO-GR]Computer Science [cs]/Graphics [cs.GR]
[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]
0305 other medical science
Source: Proceedings of CBMI 2013
11th International Workshop on Content-Based Multimedia Indexing (CBMI 2013)
11th International Workshop on Content-Based Multimedia Indexing (CBMI 2013), Jun 2013, Veszprem, Hungary. pp. 23-25, 2013
CBMI
Description: National audience; In this paper, we present a complete system for audio indexing. The system is based on state-of-the-art methods for Speech-Music-Noise segmentation and Monophonic/Polyphonic estimation. Following these stages, we propose an original system for detecting superposed sources, based on an analysis of how the predominant frequencies evolve over time. To validate the whole system, we used different corpora: radio broadcasts, studio music, and degraded field recordings. The first results are encouraging and show the potential of our approach, which is generic and can be applied to both music and speech content.
Database: OpenAIRE