Showing 1 - 10 of 33 for search: '"Vania, Clara"'
Published in:
Computational Linguistics, Vol 46, Iss 2, Pp 335-385 (2020)
Despite an ever-growing number of word representation models introduced for a large number of languages, there is a lack of a standardized technique to provide insights into what is captured by these models. Such insights would help the community to
External link:
https://doaj.org/article/59f9f3ff91d3454eb0a319a7567c3d11
Author:
Whitehouse, Chenxi, Vania, Clara, Aji, Alham Fikri, Christodoulopoulos, Christos, Pierleoni, Andrea
Extracting structured and grounded fact triples from raw text is a fundamental task in Information Extraction (IE). Existing IE datasets are typically collected from Wikipedia articles, using hyperlinks to link entities to the Wikidata knowledge base
External link:
http://arxiv.org/abs/2305.14293
Author:
Huang, Ningyuan, Deshpande, Yash R., Liu, Yibo, Alberts, Houda, Cho, Kyunghyun, Vania, Clara, Calixto, Iacer
We propose a method to make natural language understanding models more parameter efficient by storing knowledge in an external knowledge graph (KG) and retrieving from this KG using a dense index. Given (possibly multilingual) downstream task data, e
External link:
http://arxiv.org/abs/2206.13163
Author:
Batsuren, Khuyagbaatar, Goldman, Omer, Khalifa, Salam, Habash, Nizar, Kieraś, Witold, Bella, Gábor, Leonard, Brian, Nicolai, Garrett, Gorman, Kyle, Ate, Yustinus Ghanggo, Ryskina, Maria, Mielke, Sabrina J., Budianskaya, Elena, El-Khaissi, Charbel, Pimentel, Tiago, Gasser, Michael, Lane, William, Raj, Mohit, Coler, Matt, Samame, Jaime Rafael Montoya, Camaiteri, Delio Siticonatzi, Sagot, Benoît, Rojas, Esaú Zumaeta, Francis, Didier López, Oncevay, Arturo, Bautista, Juan López, Villegas, Gema Celeste Silva, Hennigen, Lucas Torroba, Ek, Adam, Guriel, David, Dirix, Peter, Bernardy, Jean-Philippe, Scherbakov, Andrey, Bayyr-ool, Aziyana, Anastasopoulos, Antonios, Zariquiey, Roberto, Sheifer, Karina, Ganieva, Sofya, Cruz, Hilaria, Karahóǧa, Ritván, Markantonatou, Stella, Pavlidis, George, Plugaryov, Matvey, Klyachko, Elena, Salehi, Ali, Angulo, Candy, Baxi, Jatayu, Krizhanovsky, Andrew, Krizhanovskaya, Natalia, Salesky, Elizabeth, Vania, Clara, Ivanova, Sardana, White, Jennifer, Maudslay, Rowan Hall, Valvoda, Josef, Zmigrod, Ran, Czarnowska, Paula, Nikkarinen, Irene, Salchak, Aelita, Bhatt, Brijesh, Straughn, Christopher, Liu, Zoey, Washington, Jonathan North, Pinter, Yuval, Ataman, Duygu, Wolinski, Marcin, Suhardijanto, Totok, Yablonskaya, Anna, Stoehr, Niklas, Dolatian, Hossep, Nuriah, Zahroh, Ratan, Shyam, Tyers, Francis M., Ponti, Edoardo M., Aiton, Grant, Arora, Aryaman, Hatcher, Richard J., Kumar, Ritesh, Young, Jeremiah, Rodionova, Daria, Yemelina, Anastasia, Andrushko, Taras, Marchenko, Igor, Mashkovtseva, Polina, Serova, Alexandra, Prud'hommeaux, Emily, Nepomniashchaya, Maria, Giunchiglia, Fausto, Chodroff, Eleanor, Hulden, Mans, Silfverberg, Miikka, McCarthy, Arya D., Yarowsky, David, Cotterell, Ryan, Tsarfaty, Reut, Vylomova, Ekaterina
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-indepe
External link:
http://arxiv.org/abs/2205.03608
We present IndoNLI, the first human-elicited NLI dataset for Indonesian. We adapt the data collection protocol for MNLI and collect nearly 18K sentence pairs annotated by crowd workers and experts. The expert-annotated data is used exclusively as a t
External link:
http://arxiv.org/abs/2110.14566
Author:
Vania, Clara, Htut, Phu Mon, Huang, William, Mungra, Dhara, Pang, Richard Yuanzhe, Phang, Jason, Liu, Haokun, Cho, Kyunghyun, Bowman, Samuel R.
Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks. Recent results from large pretrained models, though, show that many of these datasets are largely satura
External link:
http://arxiv.org/abs/2106.00840
Author:
Nangia, Nikita, Sugawara, Saku, Trivedi, Harsh, Warstadt, Alex, Vania, Clara, Bowman, Samuel R.
Crowdsourcing is widely used to create data for common natural language understanding tasks. Despite the importance of these datasets for measuring and refining model understanding of language, there has been little focus on the crowdsourcing methods
External link:
http://arxiv.org/abs/2106.00794
Large-scale natural language inference (NLI) datasets such as SNLI or MNLI have been created by asking crowdworkers to read a premise and write three new hypotheses, one for each possible semantic relationship (entailment, contradiction, and neutral
External link:
http://arxiv.org/abs/2010.06122
Pretrained language models, especially masked language models (MLMs), have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicit
External link:
http://arxiv.org/abs/2010.00133
Author:
Alberts, Houda, Huang, Teresa, Deshpande, Yash, Liu, Yibo, Cho, Kyunghyun, Vania, Clara, Calixto, Iacer
An exciting frontier in natural language understanding (NLU) and generation (NLG) calls for (vision-and-) language models that can efficiently access external structured knowledge repositories. However, many existing knowledge bases only cover limite
External link:
http://arxiv.org/abs/2008.09150