Showing 1 - 10 of 19 for search: '"Neil Zeghidour"'
Author:
Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Joshua Gardner, Ethan Manilow, Jesse Engel
An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes. Recent neural synthesizers have exhibited a tradeoff between domain-specific models…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::019a9d97d82f0aa02fdba08e33ec2ae2
Author:
Sarthak Yadav, Neil Zeghidour
Published in:
Interspeech 2022.
Deep audio classification, traditionally cast as training a deep neural network on top of mel-filterbanks in a supervised fashion, has recently benefited from two independent lines of work. The first one explores "learnable frontends", i.e., neural m…
Author:
Ahmed Omran, Neil Zeghidour, Zalán Borsos, Félix de Chaumont Quitry, Malcolm Slaney, Marco Tagliasacchi
We present a method to separate speech signals from noisy environments in the embedding space of a neural audio codec. We introduce a new training procedure that allows our model to produce structured encodings of audio waveforms given by embedding v…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6473357c939ed63b0190c9ad50003c7d
Published in:
ICASSP
We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio. Our approach is based on contrastive learning: it learns a representation which assigns high similarity to audio segments extracted fro…
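The COLA snippet above summarizes contrastive pre-training in a sentence: segments from the same clip should be embedded close together, segments from different clips far apart. A minimal, generic InfoNCE-style sketch of that idea follows; it is illustrative only (COLA's actual objective uses a bilinear similarity and its own frontend, not this cosine/temperature form):

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.2):
    """Generic InfoNCE-style contrastive loss (illustrative sketch, not
    COLA's exact objective). The positive for each anchor is the embedding
    at the same row index; all other rows act as negatives."""
    # L2-normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature      # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (matching pairs) as the target class.
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: pretend embeddings of two segments from each of 4 audio clips.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
# Positives drawn from the same clip (small perturbation) vs. unrelated clips.
loss_matched = contrastive_loss(emb, emb + 0.01 * rng.normal(size=(4, 8)))
loss_random = contrastive_loss(emb, rng.normal(size=(4, 8)))
```

With matched pairs the diagonal similarities dominate and the loss is low; with unrelated positives the model has no signal and the loss approaches the log of the batch size.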
Self-supervised pre-training using so-called "pretext" tasks has recently shown impressive performance across a wide range of modalities. In this work, we advance self-supervised learning from permutations, by pre-training a model to reorder shuffled…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8298cbfad3acfded0c47287643ada4a8
We present SoundStream, a novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs. SoundStream relies on a model architecture composed by a fully convolutional enc…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5d13b3c7d1b57cc65bca8078c8aaebc4
Published in:
ICASSP
We propose CHARM, a method for training a single neural network across inconsistent input channels. Our work is motivated by Electroencephalography (EEG), where data collection protocols from different headsets result in varying channel ordering and…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2bc885ea513b6c02e9655f4517a40ce3
http://arxiv.org/abs/2010.13694
Author:
David Grangier, Neil Zeghidour
We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the model infers a representation for each source and then estimates each source signal given the inferred representations. The model is trained to jointly perform…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fb36dbad375e9a93cbc914f2c8bd7182
Author:
Neil Zeghidour, Juliette Millet
Published in:
ICASSP
ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom
Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-level features, by selecting the relevant information for the task at hand. We explore an alternative to this selection, by learning jointly the classifier,…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::152814676bf1c3d8df1680ab7b6d010e
https://hal.archives-ouvertes.fr/hal-02274504
Author:
Gabriel Synnaeve, Ronan Collobert, Yossi Adi, Neil Zeghidour, Nicolas Usunier, Vitaliy Liptchinsky
Published in:
ICASSP
Transcribed datasets typically contain speaker identity for each instance in the data. We investigate two ways to incorporate this information during training: Multi-Task Learning and Adversarial Learning. In multi-task learning, the goal is speaker…