Showing 1 - 10 of 19 for search: '"Yao-Hung Hubert Tsai"'
Author:
Mathis Petrovich, Chao Liang, Ryoma Sato, Yanbin Liu, Yao-Hung Hubert Tsai, Linchao Zhu, Yi Yang, Ruslan Salakhutdinov, Makoto Yamada
Published in:
Machine Learning and Knowledge Discovery in Databases ISBN: 9783031264184
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::cdfca78cdb271e8736438d301bda7572
https://doi.org/10.1007/978-3-031-26419-1_18
Author:
Ruslan Salakhutdinov, Wei-Ning Hsu, Yao-Hung Hubert Tsai, Benjamin Bolte, Abdelrahman Mohamed, Kushal Lakhotia
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:3451-3460
Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sou
Published in:
IEEE Transactions on Image Processing. 28:4620-4633
Heterogeneous domain adaptation (HDA) addresses the task of associating data not only across dissimilar domains but also described by different types of features. Inspired by the recent advances of neural networks and deep learning, we propose a deep
Author:
Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Abdelrahman Mohamed, Wei-Ning Hsu, Benjamin Bolte
Published in:
ICASSP
Compared to vision and language applications, self-supervised pre-training approaches for ASR are challenged by three unique problems: (1) There are multiple sound units in each input utterance, (2) With audio-only pre-training, there is no lexicon o
Published in:
Machine Learning and Knowledge Discovery in Databases. Research Track ISBN: 9783030864859
ECML/PKDD (1)
Estimating mutual information is an important statistics and machine learning problem. To estimate the mutual information from data, a common practice is preparing a set of paired samples \(\{({\boldsymbol{x}}_i,{\boldsymbol{y}}_i)\}_{i = 1}^n\) \({\
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9d15d3f499bf04a7b76f39549552ed06
https://hdl.handle.net/10453/158670
Author:
Martin Q. Ma, Ruslan Salakhutdinov, Yao-Hung Hubert Tsai, Louis-Philippe Morency, Muqiao Yang
Published in:
Proc Conf Empir Methods Nat Lang Process
EMNLP (1)
The human language can be expressed through multiple sources of information known as modalities, including tones of voice, facial gestures, and spoken language. Recent multimodal learning with strong performances on human-centric tasks such as sentim
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2a316325d5586a57757f5b5dfd0df090
https://europepmc.org/articles/PMC8106385/
Published in:
ICASSP
While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers. However, speech, signal and audio data are naturally complex-valued after Fourier Transform, and studi
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::093680f0002dd007b01f929e8bcdd023
http://arxiv.org/abs/1910.10202
Author:
J. Zico Kolter, Shaojie Bai, Yao-Hung Hubert Tsai, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
Published in:
Proc Conf Assoc Comput Linguist Meet
ACL (1)
Human language is often multimodal, which comprehends a mixture of natural language, facial gestures, and acoustic behaviors. However, two major challenges in modeling such multimodal human language time-series data exist: 1) inherent data non-alignm
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::607eaba393c26061ace81cf0ea2e4630
http://arxiv.org/abs/1906.00295
Author:
Louis-Philippe Morency, Yao-Hung Hubert Tsai, Ali Farhadi, Santosh K. Divvala, Ruslan Salakhutdinov
Published in:
CVPR
Visual relationship reasoning is a crucial yet challenging task for understanding rich interactions across visual concepts. For example, a relationship 'man, open, door' involves a complex relation 'open' between concrete entities 'man, door'. While
Author:
Paul Pu Liang, Louis-Philippe Morency, Yao Chong Lim, Ruslan Salakhutdinov, Yao-Hung Hubert Tsai
Published in:
NAACL-HLT (1)
Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations. Learning representations for these spoken utterances is a complex research problem due to the presence of multiple hetero
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::75ecab574548aaf8210d66309bc84259
http://arxiv.org/abs/1906.02125