Showing 1 - 6 of 6 for search: '"Dopson, Dave"'
Published in:
Advances in Neural Information Processing Systems 36 (2023) 3189-3204
This paper introduces SOAR: Spilling with Orthogonality-Amplified Residuals, a novel data indexing technique for approximate nearest neighbor (ANN) search. SOAR extends upon previous approaches to ANN search, such as spill trees, that utilize multipl
External link:
http://arxiv.org/abs/2404.00774
Tokenization is a fundamental preprocessing step for almost all NLP tasks. In this paper, we propose efficient algorithms for the WordPiece tokenization used in BERT, from single-word tokenization to general text (e.g., sentence) tokenization. When t
External link:
http://arxiv.org/abs/2012.15524
Many emerging use cases of data mining and machine learning operate on large datasets with data from heterogeneous sources, specifically with both sparse and dense components. For example, dense deep neural network embedding vectors are often used in
External link:
http://arxiv.org/abs/1903.08690
Author:
Dopson, Dave
Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.
Includes bibliographical references (leaves 51-52).
SoftECC is a software memory integrity checking agent. SoftECC r
External link:
http://hdl.handle.net/1721.1/36769
Published in:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.
Tokenization is a fundamental preprocessing step for almost all NLP tasks. In this paper, we propose efficient algorithms for the WordPiece tokenization used in BERT, from single-word tokenization to general text (e.g., sentence) tokenization. When t