Showing 1 - 10 of 12 results for search: '"Thomas Lippincott"'
Author:
Ben Van Durme, Thomas Lippincott
Published in:
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances.
Language identification (LID), the task of determining the natural language of a given text, is an essential first step in most NLP pipelines. While generally a solved problem for documents of sufficient length and languages with ample training data,
Published in:
WANLP@ACL 2019
Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings an
Author:
Thomas Lippincott
Published in:
LaTeCH@NAACL-HLT
This work considers a task from traditional literary criticism: annotating a structured, composite document with information about its sources. We take the Documentary Hypothesis, a prominent theory regarding the composition of the first five books o
Author:
Annabelle Carrell, Thomas Lippincott
Published in:
PEOPLES@NAACL-HLT
Twitter is a ubiquitous source of micro-blog social media data, providing the academic, industrial, and public sectors real-time access to actionable information. A particularly attractive property of some tweets is *geo-tagging*, where a user accoun
Author:
Thomas Lippincott
Published in:
BlackboxNLP@EMNLP
There is a long-standing interest in understanding the internal behavior of neural networks. Deep neural architectures for natural language processing (NLP) are often accompanied by explanations for their effectiveness, from general observations (e.g
Published in:
Journal of Biomedical Informatics. 46(2):228-237
Background: Biomedical natural language processing (NLP) applications that have access to detailed resources about the linguistic characteristics of biomedical language demonstrate improved performance on tasks such as relation extraction and syntact
Published in:
ACL (1)
We present a novel way of generating unseen words, which is useful for certain applications such as automatic speech recognition or optical character recognition in low-resource languages. We test our vocabulary generator on seven low-resource langua
Published in:
BMC Bioinformatics, Vol 12, Iss 1, p 212 (2011)
Background: Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the bio
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f4ea7dd0acbba9c54f07218dd2113b1f
http://www.dspace.cam.ac.uk/handle/1810/238287
Published in:
Computational Linguistics and Intelligent Text Processing ISBN: 9783642003813
CICLing
We describe a semantic clustering method designed to address shortcomings in the common bag-of-words document representation for functional semantic classification tasks. The method uses WordNet-based distance metrics to construct a similarity matrix
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::759bde60012255df8af66c3424a6c593
https://doi.org/10.1007/978-3-642-00382-0_41