Search results - "Yao-Hung Hubert Tsai"

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Authors: Ruslan Salakhutdinov, Wei-Ning Hsu, Yao-Hung Hubert Tsai, Benjamin Bolte, Abdelrahman Mohamed, Kushal Lakhotia

Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:3451-3460

Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sou

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0957d60add1b42cc248fbc0a8ed9ba6a
https://doi.org/10.1109/taslp.2021.3122291

Transfer Neural Trees: Semi-Supervised Heterogeneous Domain Adaptation and Beyond

Authors: Tzu-Ming Harry Hsu, Ming-Syan Chen, Wei-Yu Chen, Yao-Hung Hubert Tsai, Yu-Chiang Frank Wang

Published in: IEEE Transactions on Image Processing. 28:4620-4633

Heterogeneous domain adaptation (HDA) addresses the task of associating data not only across dissimilar domains but also described by different types of features. Inspired by the recent advances of neural networks and deep learning, we propose a deep

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2dd8ac9f6c5aa766f29c26c453244a7e
https://doi.org/10.1109/tip.2019.2912126

HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training?

Authors: Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Abdelrahman Mohamed, Wei-Ning Hsu, Benjamin Bolte

Published in: ICASSP

Compared to vision and language applications, self-supervised pre-training approaches for ASR are challenged by three unique problems: (1) There are multiple sound units in each input utterance, (2) With audio-only pre-training, there is no lexicon o

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::b71b32a2818ca834301ec82a6601d39d
https://doi.org/10.1109/icassp39728.2021.9414460

LSMI-Sinkhorn: Semi-supervised Mutual Information Estimation with Optimal Transport

Authors: Yanbin Liu, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Yi Yang, Tam Le, Makoto Yamada

Published in: Machine Learning and Knowledge Discovery in Databases. Research Track. ISBN: 9783030864859
ECML/PKDD (1)

Estimating mutual information is an important statistics and machine learning problem. To estimate the mutual information from data, a common practice is preparing a set of paired samples \(\{({\boldsymbol{x}}_i,{\boldsymbol{y}}_i)\}_{i = 1}^n\) \({\

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9d15d3f499bf04a7b76f39549552ed06
https://hdl.handle.net/10453/158670

Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis

Authors: Martin Q. Ma, Ruslan Salakhutdinov, Yao-Hung Hubert Tsai, Louis-Philippe Morency, Muqiao Yang

Published in: Proc Conf Empir Methods Nat Lang Process
EMNLP (1)

The human language can be expressed through multiple sources of information known as modalities, including tones of voice, facial gestures, and spoken language. Recent multimodal learning with strong performances on human-centric tasks such as sentim

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2a316325d5586a57757f5b5dfd0df090
https://europepmc.org/articles/PMC8106385/

Complex Transformer: A Framework for Modeling Complex-Valued Sequence

Authors: Muqiao Yang, Dongyu Li, Martin Q. Ma, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov

Published in: ICASSP

While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers. However, speech, signal and audio data are naturally complex-valued after Fourier Transform, and studi

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::093680f0002dd007b01f929e8bcdd023
http://arxiv.org/abs/1910.10202

Multimodal Transformer for Unaligned Multimodal Language Sequences

Authors: J. Zico Kolter, Shaojie Bai, Yao-Hung Hubert Tsai, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Published in: Proc Conf Assoc Comput Linguist Meet
ACL (1)

Human language is often multimodal, which comprehends a mixture of natural language, facial gestures, and acoustic behaviors. However, two major challenges in modeling such multimodal human language time-series data exist: 1) inherent data non-alignm

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::607eaba393c26061ace81cf0ea2e4630
http://arxiv.org/abs/1906.00295

Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph

Authors: Louis-Philippe Morency, Yao-Hung Hubert Tsai, Ali Farhadi, Santosh K. Divvala, Ruslan Salakhutdinov

Published in: CVPR

Visual relationship reasoning is a crucial yet challenging task for understanding rich interactions across visual concepts. For example, a relationship 'man, open, door' involves a complex relation 'open' between concrete entities 'man, door'. While

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::98eba210a0f209d21b839f22a6bc762e
https://doi.org/10.1109/cvpr.2019.01067

Strong and Simple Baselines for Multimodal Utterance Embeddings

Authors: Paul Pu Liang, Louis-Philippe Morency, Yao Chong Lim, Ruslan Salakhutdinov, Yao-Hung Hubert Tsai

Published in: NAACL-HLT (1)

Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations. Learning representations for these spoken utterances is a complex research problem due to the presence of multiple hetero

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::75ecab574548aaf8210d66309bc84259
http://arxiv.org/abs/1906.02125
