Showing 1 - 10 of 4,990 results for search: '"Carbonneau"'
Discovering a lexicon from unlabeled audio is a longstanding challenge for zero-resource speech processing. One approach is to search for frequently occurring patterns in speech. We revisit this idea with DUSTED: Discrete Unit Spoken-TErm Discovery.
External link:
http://arxiv.org/abs/2408.14390
Real-world deployments of word alignment are almost certain to cover both high- and low-resource languages. However, the state of the art for this task recommends a different model class depending on the availability of gold alignment training data fo
External link:
http://arxiv.org/abs/2407.12881
Grammatical Error Detection (GED) methods rely heavily on human-annotated error corpora. However, these annotations are unavailable in many low-resource languages. In this paper, we investigate GED in this context. Leveraging the zero-shot cross-ling
External link:
http://arxiv.org/abs/2407.11854
Author:
Carbonneau, Nathaniel
Title from the title screen (viewed February 8, 2024)
Behind the school lies the question of educational aims. And these aims, across the ages, invariably reflect axiological ideals specific to a geopol
External link:
https://hdl.handle.net/20.500.11794/135025
We introduce UPose3D, a novel approach for multi-view 3D human pose estimation, addressing challenges in accuracy and scalability. Our method advances existing pose estimation frameworks by improving robustness and flexibility without requiring direc
External link:
http://arxiv.org/abs/2404.14634
Authors:
Dib, Abdallah; Hafemann, Luiz Gustavo; Got, Emeline; Anderson, Trevor; Fadaeinejad, Amin; Cruz, Rafael M. O.; Carbonneau, Marc-Andre
Reconstructing an avatar from a portrait image has many applications in multimedia, but remains a challenging research problem. Extracting reflectance maps and geometry from one image is ill-posed: recovering geometry is a one-to-many mapping problem
External link:
http://arxiv.org/abs/2312.13091
Audio diffusion models can synthesize a wide variety of sounds. Existing models often operate in the latent domain with cascaded phase recovery modules to reconstruct the waveform. This poses challenges when generating high-fidelity audio. In this paper,
External link:
http://arxiv.org/abs/2311.08667
Voice conversion aims to transform source speech into a different target voice. However, typical voice conversion systems do not account for rhythm, which is an important factor in the perception of speaker identity. To bridge this gap, we introduce
External link:
http://arxiv.org/abs/2307.06040