Search results

Academic article

Recent progress in the synthesis and applications of vertically aligned carbon nanotube materials

Authors: Huang Shan, Du Xianfeng, Ma Mingbo, Xiong Lilong

Published in: Nanotechnology Reviews, Vol 10, Iss 1, Pp 1592-1623 (2021)

Vertically aligned carbon nanotube (VACNT) materials is a promising candidate in different fields. The intrinsic performance of VACNTs, such as a large specific surface area, high conductivity, and especially its vertical conductive channel, stands o

External link: https://doaj.org/article/324d5cb1e5f14c56babcf04b630e1f1f

Report

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing

Authors: Anastassiou, Philip, Tang, Zhenyu, Peng, Kainan, Jia, Dongya, Li, Jiaxin, Tu, Ming, Wang, Yuping, Wang, Yuxuan, Ma, Mingbo

We present VoiceShop, a novel speech-to-speech framework that can modify multiple attributes of speech, such as age, gender, accent, and speech style, in a single forward pass while preserving the input speaker's timbre. Previous works have been cons

External link: http://arxiv.org/abs/2404.06674

Report

Efficient Neural Music Generation

Authors: Lam, Max W. Y., Tian, Qiao, Li, Tang, Yin, Zongyu, Feng, Siyuan, Tu, Ming, Ji, Yuliang, Xia, Rui, Ma, Mingbo, Song, Xuchen, Chen, Jitong, Wang, Yuping, Wang, Yuxuan

Recent progress in music generation has been remarkably advanced by the state-of-the-art MusicLM, which comprises a hierarchy of three LMs, respectively, for semantic, coarse acoustic, and fine acoustic modelings. Yet, sampling with the MusicLM requi

External link: http://arxiv.org/abs/2305.15719

Report

Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network

Authors: Jia, Dongya, Tian, Qiao, Peng, Kainan, Li, Jiaxin, Chen, Yuanzhe, Ma, Mingbo, Wang, Yuping, Wang, Yuxuan

The goal of accent conversion (AC) is to convert the accent of speech into the target accent while preserving the content and speaker identity. AC enables a variety of applications, such as language learning, speech content creation, and data augment

External link: http://arxiv.org/abs/2212.05751

Report

Data-Driven Adaptive Simultaneous Machine Translation

Authors: Xun, Guangxu, Ma, Mingbo, Bian, Yuchen, Cai, Xingyu, Huang, Jiaji, Zheng, Renjie, Chen, Junkun, Yuan, Jiahong, Church, Kenneth, Huang, Liang

In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency. However, wait-k suffers from two major limitations: (a) it is a fixed p

External link: http://arxiv.org/abs/2204.12672

Report

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

Authors: Bai, He, Zheng, Renjie, Chen, Junkun, Li, Xintong, Ma, Mingbo, Huang, Liang

Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation. However, all the above tasks are in the direction of speech understanding, but for the

External link: http://arxiv.org/abs/2203.09690

Academic article

Stable aluminum metal anodes with high ionic conductivity and high aluminophilic site

Authors: Wang, Shixin, Guo, Yuan, Du, Xianfeng, Liang, Zhongshuai, Ma, Mingbo, Xie, Yuehong, You, Wenzhi, Meng, Yi, Li, Dong, Liu, Mingxia, Liu, Yifan

Published in: Chemical Engineering Journal, 15 August 2024, 494

Report

Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR

Authors: Chen, Junkun, Ma, Mingbo, Zheng, Renjie, Huang, Liang

Simultaneous speech-to-text translation is widely useful in many scenarios. The conventional cascaded approach uses a pipeline of streaming ASR followed by simultaneous MT, but suffers from error propagation and extra latency. To alleviate these issu

External link: http://arxiv.org/abs/2106.06636

Report

Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation

Authors: Zheng, Renjie, Chen, Junkun, Ma, Mingbo, Huang, Liang

Recently, representation learning for text and speech has successfully improved many language related tasks. However, all existing methods suffer from two limitations: (a) they only learn from one input modality, while a unified representation for bo

External link: http://arxiv.org/abs/2102.05766

Report

MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation

Authors: Chen, Junkun, Ma, Mingbo, Zheng, Renjie, Huang, Liang

End-to-end Speech-to-text Translation (E2E-ST), which directly translates source language speech to target language text, is widely useful in practice, but traditional cascaded approaches (ASR+MT) often suffer from error propagation in the pipeline.

External link: http://arxiv.org/abs/2010.11445
