Showing 1 - 10 of 11 for search: '"Hinrich Schuetze"'
We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the tokenization of pretrained language models (PLMs). FLOTA uses the vocabulary of a standard tokenizer but tries to preserve the morphological structure
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b79b8113940276390bf1db3658d8e88a
https://ora.ox.ac.uk/objects/uuid:d4089dea-f88f-4695-a6ce-2a8cc101ee25
In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages normalizing flows to explicitly model the distributions of sentence-level latent representations, which are subsequently used in conjunction with the attention me
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b51a8e09b234864780d1b591bfd918cc
http://arxiv.org/abs/2204.12225
Published in:
Findings of the Association for Computational Linguistics: NAACL 2022.
Comprising sixteen independent chapters, this book covers recent advancements and emerging pathways within human-friendly robotics on physical and cognitive levels. Each chapter presents a novel work presented at HFR 2023 by researchers from various
Published in:
IIiX
Interactive Information Retrieval refers to the branch of Information Retrieval that considers the retrieval process with respect to a wide range of contexts, which may affect the user's information seeking experience. The identification and represen
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::580abd416e371690cd59ce7493a7d2c3
http://arxiv.org/abs/1704.01610
Author:
Thomas Müller, Hinrich Schuetze
Published in:
HLT-NAACL
We present a comparative investigation of word representations for part-of-speech (POS) and morphological tagging, focusing on scenarios with considerable differences between training and test data where a robust approach is necessary. Instead of ada
Published in:
SIGIR
Information Retrieval in technical domains like physics is characterised by long and precise queries, whose meaning is strongly influenced by term context and domain. We treat this as a disambiguation problem, and present initial findings of a retrie
Published in:
Document Recognition and Retrieval
In this paper, we describe a system for performing browsing and retrieval on a collection of web images and associated text on an HTML page. Browsing is combined with retrieval to help a user locate interesting portions of the corpus, without the nee
Author:
Hinrich Schuetze, Yoram Singer
Published in:
ACL
We present a new approach to disambiguating syntactically ambiguous words in context, based on Variable Memory Markov (VMM) models. In contrast to fixed-length Markov models, which predict based on fixed-length histories, variable memory Markov models
Author:
Christopher Manning, Hinrich Schütze
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theo