Showing 1 - 10 of 10
for search: '"Kitaev, Nikita"'
We propose a novel type of balanced clustering algorithm to approximate attention. Attention complexity is reduced from $O(N^2)$ to $O(N \log N)$, where $N$ is the sequence length. Our algorithm, SMYRF, uses Locality Sensitive Hashing (LSH) in a novel…
External link:
http://arxiv.org/abs/2010.05315
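The snippet above describes hashing-based attention in general terms. As a minimal illustration of the idea — hash positions into buckets and attend only within each bucket — here is a toy numpy sketch. Note this uses plain (unbalanced) random-hyperplane LSH buckets; SMYRF's actual contribution is a *balanced* clustering scheme, which this sketch does not implement:

```python
import numpy as np

def lsh_bucket_attention(q, k, v, n_hashes=4, seed=0):
    """Toy LSH-bucketed attention (illustrative, not SMYRF's algorithm).

    q, k, v: (n, d) arrays. Random hyperplanes hash each position into one
    of 2**n_hashes buckets; softmax attention runs only within a bucket,
    so cost scales with bucket size instead of the full sequence length.
    """
    n, d = q.shape
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((d, n_hashes))
    # Sign pattern of the projections -> integer bucket id per position.
    buckets = (q @ planes > 0) @ (2 ** np.arange(n_hashes))
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        # Dense attention restricted to this bucket's members.
        scores = q[idx] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[idx] = w @ v[idx]
    return out

rng = np.random.default_rng(1)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
out = lsh_bucket_attention(q, k, v)
```

Because similar queries tend to share hyperplane signs, nearby vectors usually land in the same bucket, which is what makes the within-bucket restriction a reasonable approximation.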
We propose a method for unsupervised parsing based on the linguistic notion of a constituency test. One type of constituency test involves modifying the sentence via some transformation (e.g. replacing the span with a pronoun) and then judging the result…
External link:
http://arxiv.org/abs/2010.03146
We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT. In particular, after our proposed alignment procedure, BERT exhibits significantly improved…
External link:
http://arxiv.org/abs/2002.03518
Large Transformer models routinely achieve state-of-the-art results on a number of tasks, but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of Transformers. For one…
External link:
http://arxiv.org/abs/2001.04451
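To make the $O(N^2) \to O(N \log N)$ claim from the entries above concrete, a back-of-the-envelope comparison at a 64K-token sequence length (the numbers here are illustrative, not from the paper):

```python
import math

# Full dot-product attention touches N^2 query-key pairs;
# an O(N log N) scheme scales near-linearly in comparison.
N = 65_536
quadratic = N ** 2           # 4,294,967,296 pairwise scores
n_log_n = N * math.log2(N)   # 65,536 * 16 = 1,048,576
print(quadratic / n_log_n)   # 4096.0: a ~4000x reduction at this length
```

The gap widens with sequence length, which is why these methods target long-sequence training in particular.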
Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing -- but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training…
External link:
http://arxiv.org/abs/1907.04347
We present KERMIT, a simple insertion-based approach to generative modeling for sequences and sequence pairs. KERMIT models the joint distribution and its decompositions (i.e., marginals and conditionals) using a single neural network and, unlike much…
External link:
http://arxiv.org/abs/1906.01604
Author:
Kitaev, Nikita, Klein, Dan
We present a constituency parsing algorithm that, like a supertagger, works by assigning labels to each word in a sentence. In order to maximally leverage current neural architectures, the model scores each word's tags in parallel, with minimal task-specific…
External link:
http://arxiv.org/abs/1904.09745
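The key point in the snippet above — scoring every word's tags in parallel rather than decoding sequentially — can be sketched as a single matrix multiply. The shapes and names here are hypothetical stand-ins (the paper's actual model uses a neural encoder to produce the word representations):

```python
import numpy as np

def score_tags(word_reprs, tag_embeddings):
    # One matrix multiply scores all (word, tag) pairs at once,
    # so every position is labeled in parallel.
    # word_reprs: (n_words, d); tag_embeddings: (n_tags, d).
    logits = word_reprs @ tag_embeddings.T   # (n_words, n_tags)
    return logits.argmax(axis=1)             # one tag index per word

# Tiny example: 3 words, 2 candidate tags.
words = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tags = np.array([[2.0, 0.0], [0.0, 2.0]])
print(score_tags(words, tags))  # [0 1 0]
```

Because the per-word scores are independent given the encoder output, the whole sentence is labeled in one pass, which maps well onto GPU-friendly batched matrix operations.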
We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions. We first compare the benefits of no pre-training, fastText, ELMo, and BERT for English and find that BERT…
External link:
http://arxiv.org/abs/1812.11760
Author:
Kitaev, Nikita, Klein, Dan
We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. The use of attention makes explicit the manner in which information is propagated between…
External link:
http://arxiv.org/abs/1805.01052
Author:
Kim, Jin-Hwa, Kitaev, Nikita, Chen, Xinlei, Rohrbach, Marcus, Zhang, Byoung-Tak, Tian, Yuandong, Batra, Dhruv, Parikh, Devi
In this work, we propose a goal-driven collaborative task that combines language, perception, and action. Specifically, we develop a Collaborative image-Drawing game between two agents, called CoDraw. Our game is grounded in a virtual world that contains…
External link:
http://arxiv.org/abs/1712.05558