Showing 1 - 10 of 96 for search: '"Brian Roark"'
Published in:
Computational Linguistics, Vol 47, Iss 2, Pp 221-254 (2021)
Weighted finite automata (WFAs) are often used to represent probabilistic models, such as n-gram language models, because, among other things, they are efficient for recognition tasks in time and space. The probabilistic source to be represent
External link:
https://doaj.org/article/8e316cf7fcdd4ed29b9d0947f464f413
Published in:
Transactions of the Association for Computational Linguistics, Vol 8, Pp 1-18 (2020)
We present methods for calculating a measure of phonotactic complexity (bits per phoneme) that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the
External link:
https://doaj.org/article/947852ef831b483fa5eb29a65f0d4dce
Author:
Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, Brian Roark
Published in:
Computational Linguistics, Vol 45, Iss 2, Pp 293-337 (2019)
Machine learning, including neural network techniques, has been applied to virtually every domain in natural language processing. One problem that has been somewhat resistant to effective machine learning solutions is text normalization for speech a
External link:
https://doaj.org/article/90dc08a28df744cfa0e164d9471b7751
Author:
Brian Roark, Richard Sproat
The book will appeal to scholars and advanced students of morphology, syntax, computational linguistics and natural language processing (NLP). It provides a critical and practical guide to computational techniques for handling morphological and synta
Published in:
Computational Linguistics. :1-34
Weighted finite automata (WFAs) are often used to represent probabilistic models, such as n-gram language models, because, among other things, they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a W
Published in:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
NAACL-HLT
This work presents an information-theoretic operationalisation of cross-linguistic non-arbitrariness. It is not a new idea that there are small, cross-linguistic associations between the forms and meanings of words. For instance, it has been claimed
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::87f2ae863ba8bd73c29b53ca164ee684
https://hdl.handle.net/20.500.11850/518985
Ad hoc abbreviations are commonly found in informal communication channels that favor shorter messages. We consider the task of reversing these abbreviations in context to recover normalized, expanded versions of abbreviated messages. The problem is
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::954d664dfcd5b184aba34b91fd572f81
Published in:
EACL (System Demonstrations)
This paper presents an open-source library for efficient low-level processing of ten major South Asian Brahmic scripts. The library provides a flexible and extensible framework for supporting crucial operations on Brahmic scripts, such as NFC, visual
Published in:
ICASSP
Multilingual Automated Speech Recognition (ASR) systems allow for the joint training of data-rich and data-scarce languages in a single model. This enables data and parameter sharing across languages, which is especially beneficial for the data-scarc
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6a151204da3e0d461f9892fc4c305984
http://arxiv.org/abs/2004.09571
Published in:
Transactions of the Association for Computational Linguistics, 8
Transactions of the Association for Computational Linguistics, Vol 8, Pp 1-18 (2020)
We present methods for calculating a measure of phonotactic complexity (bits per phoneme) that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the internat
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6f8b92f552ab34c7fbabbe6c0d63df38
https://hdl.handle.net/20.500.11850/462324