Search results - "Thomas Lippincott"

Active learning and negative evidence for language identification

Published in: Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances.

Language identification (LID), the task of determining the natural language of a given text, is an essential first step in most NLP pipelines. While generally a solved problem for documents of sufficient length and languages with ample training data,

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::60456eaac5d4db00a8b762c7182a28fd
https://doi.org/10.18653/v1/2021.dash-1.8

JHU System Description for the MADAR Arabic Dialect Identification Shared Task

Authors: Paul McNamee, Kevin Duh, Thomas Lippincott, Pamela Shapiro

Published in: WANLP@ACL 2019

Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings an

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::91f376086dc3894b5a8e70bea18e4b0a
https://doi.org/10.18653/v1/w19-4634

Graph convolutional networks for exploring authorship hypotheses

Author: Thomas Lippincott

Published in: LaTeCH@NAACL-HLT

This work considers a task from traditional literary criticism: annotating a structured, composite document with information about its sources. We take the Documentary Hypothesis, a prominent theory regarding the composition of the first five books o

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::e69a4b27b5ae48438a68dca18e2544cc
https://doi.org/10.18653/v1/w19-2510

Observational Comparison of Geo-tagged and Randomly-drawn Tweets

Authors: Annabelle Carrell, Thomas Lippincott

Published in: PEOPLES@NAACL-HLT

Twitter is a ubiquitous source of micro-blog social media data, providing the academic, industrial, and public sectors real-time access to actionable information. A particularly attractive property of some tweets is *geo-tagging*, where a user accoun

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::8e44ec42186f8f61299b52ce65c66169
https://doi.org/10.18653/v1/w18-1107

Portable, layer-wise task performance monitoring for NLP models

Author: Thomas Lippincott

Published in: BlackboxNLP@EMNLP

There is a long-standing interest in understanding the internal behavior of neural networks. Deep neural architectures for natural language processing (NLP) are often accompanied by explanations for their effectiveness, from general observations (e.g

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::496d2301c89dd2e30fe65b30e59bc9af
https://doi.org/10.18653/v1/w18-5445

Acquisition and evaluation of verb subcategorization resources for biomedicine

Authors: Anna Korhonen, Karin Verspoor, Thomas Lippincott, Laura Rimell, Helen L. Johnson

Published in: Journal of Biomedical Informatics. 46(2):228-237

Background: Biomedical natural language processing (NLP) applications that have access to detailed resources about the linguistic characteristics of biomedical language demonstrate improved performance on tasks such as relation extraction and syntact

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d3db868a04cdf33be3bfcf22e33f3dcf

Fluency detection on communication networks

Authors: Benjamin Van Durme, Thomas Lippincott

Published in: EMNLP

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::ab734ab3564b92e3449a72e54288cf48
https://doi.org/10.18653/v1/d16-1107

Unsupervised Morphology-Based Vocabulary Expansion

Authors: Owen Rambow, Thomas Lippincott, Nizar Habash, Mohammad Sadegh Rasooli

Published in: ACL (1)

We present a novel way of generating unseen words, which is useful for certain applications such as automatic speech recognition or optical character recognition in low-resource languages. We test our vocabulary generator on seven low-resource langua

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::759900adfb922cd4f7ad630d4d9a440c
https://doi.org/10.3115/v1/p14-1127

Exploring subdomain variation in biomedical language

Authors: Anna Korhonen, Diarmuid Ó Séaghdha, Thomas Lippincott

Published in: BMC Bioinformatics, Vol 12, Iss 1, p 212 (2011)

Background: Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the bio

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f4ea7dd0acbba9c54f07218dd2113b1f
http://www.dspace.cam.ac.uk/handle/1810/238287

Semantic Clustering for a Functional Text Classification Task

Authors: Thomas Lippincott, Rebecca J. Passonneau

Published in: Computational Linguistics and Intelligent Text Processing ISBN: 9783642003813
CICLing

We describe a semantic clustering method designed to address shortcomings in the common bag-of-words document representation for functional semantic classification tasks. The method uses WordNet-based distance metrics to construct a similarity matrix

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::759bde60012255df8af66c3424a6c593
https://doi.org/10.1007/978-3-642-00382-0_41
