Search results - "Frank Wm. Tompa"

Computer-Assisted Cohort Identification in Practice

Authors: Besat Kassaie, Elizabeth L. Irving, Frank Wm. Tompa

Published in: ACM Transactions on Computing for Healthcare. 3:1-28

The standard approach to expert-in-the-loop machine learning is active learning, where, repeatedly, an expert is asked to annotate one or more records and the machine finds a classifier that respects all annotations made until that point. We propose

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::7e7f2920bbd18484f05b2aa2847686cd
https://doi.org/10.1145/3483411

Dowsing for Math Answers

Authors: Yin Ki Ng, Frank Wm. Tompa, Besat Kassaie, Dallas J. Fraser

Published in: Lecture Notes in Computer Science (ISBN: 9783030852504); CLEF

Mathematical Information Retrieval (MathIR) focuses on using mathematical formulas and terminology to search and retrieve documents that include mathematical content. To index mathematical documents, we convert each formula into a token list that is

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::69bdedd44bed37a9c55a2b2bd029a64f
https://doi.org/10.1007/978-3-030-85251-1_16

A Framework for Extracted View Maintenance

Authors: Frank Wm. Tompa, Besat Kassaie

Published in: DocEng

When information extraction programs (extractors) are applied to documents, they create relations that store facts found in the documents. In this work, we formalize and address the problem of keeping such extracted relations consistent with source d

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::825aa2badb2e03c6b02712a9b96124b7
https://doi.org/10.1145/3395027.3419592

Cross-lingual text alignment for fine-grained plagiarism detection

Authors: Nava Ehsan, Azadeh Shakery, Frank Wm. Tompa

Published in: Journal of Information Science. 45:443-459

Fast and easy access to a wide range of documents in various languages, in conjunction with the wide availability of translation and editing tools, has led to the need to develop effective tools for detecting cross-lingual plagiarism. Given a suspici

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::fd02fde2068dfff755280356cd500cee
https://doi.org/10.1177/0165551518787696

Predictable and Consistent Information Extraction

Authors: Besat Kassaie, Frank Wm. Tompa

Published in: DocEng

Information extraction programs (extractors) can be applied to documents to isolate structured versions of some content, that is, to create tabular records corresponding to facts found in the documents. If the data in an extracted table needs to be u

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::9bbb5055fc8cb560c0fe4ee923e2c2e3
https://doi.org/10.1145/3342558.3345391

Fashioning a Search Engine to Support Humanities Research

Author: Frank Wm. Tompa

Published in: DocEng

Scholarship in the humanities often requires the ability to search curated electronic corpora and to display search results in a variety of formats. Challenges that need to be addressed include transforming the texts into a suitable form, typically X

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::f86e69e2f3c23cc97596c5042805ad3b
https://doi.org/10.1145/3209280.3209520

Choosing Math Features for BM25 Ranking with Tangent-L

Authors: Andrew Kane, Dallas J. Fraser, Frank Wm. Tompa

Published in: DocEng

Combining text and mathematics when searching in a corpus with extensive mathematical notation remains an open problem. Recent results for Tangent-3 on the math and text retrieval task at NTCIR-12, for example, have room for improvement, even though

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::bd2aff4369d6d5a52b1bac7dc75bc6dc
https://doi.org/10.1145/3209280.3209527

Split-Lists and Initial Thresholds for WAND-based Search

Authors: Andrew Kane, Frank Wm. Tompa

Published in: SIGIR

We examine search engine performance for rank-safe query execution using the WAND and state-of-the-art BMW algorithms. Supported by extensive experiments, we suggest two approaches to improve query performance: initial list thresholds should be used

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::872d505c5833fba6deb55538e86f2866
https://doi.org/10.1145/3209978.3210066

Partial materialization for online analytical processing over multi-tagged document collections

Authors: Grzegorz Drzadzewski, Frank Wm. Tompa

Published in: Knowledge and Information Systems. 47:697-732

The New York Times Annotated Corpus, the ACM Digital Library, and PubMed are three prototypical examples of document collections in which each document is tagged with keywords or phrases. Such collections can be viewed as high-dimensional document cu

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a4fd072b720b498fea91cdc152e372af
https://doi.org/10.1007/s10115-015-0871-2

Small-Term Distribution for Disk-Based Search

Authors: Frank Wm. Tompa, Andrew Kane

Published in: DocEng

A disk-based search system distributes a large index across multiple disks on one or more machines, where documents are typically assigned to disks at random in order to achieve load balancing. However, random distribution degrades clustering, which

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::ec38a202590cc158db635af13e842d47
https://doi.org/10.1145/3103010.3103022
