Výsledky vyhledávání - "Papadimitriou, Isabel"

Report

Autor: Kallini, Julie, Papadimitriou, Isabel, Futrell, Richard, Mahowald, Kyle, Potts, Christopher

Chomsky and others have very directly claimed that large language models (LLMs) are equally capable of learning languages that are possible and impossible for humans to learn. However, there is very little published experimental evidence to support s

Externí odkaz: http://arxiv.org/abs/2401.06416

Zobrazit plný text záznamu

Report

Separating the Wheat from the Chaff with BREAD: An open-source benchmark and metrics to detect redundancy in text

Autor: Caswell, Isaac, Wang, Lisa, Papadimitriou, Isabel

Data quality is a problem that perpetually resurfaces throughout the field of NLP, regardless of task, domain, or architecture, and remains especially severe for lower-resource languages. A typical and insidious issue, affecting both training data an

Externí odkaz: http://arxiv.org/abs/2311.06440

Zobrazit plný text záznamu

Report

Injecting structural hints: Using language models to study inductive biases in language learning

Autor: Papadimitriou, Isabel, Jurafsky, Dan

Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we

Externí odkaz: http://arxiv.org/abs/2304.13060

Zobrazit plný text záznamu

Report

Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models

Autor: Papadimitriou, Isabel, Lopez, Kezia, Jurafsky, Dan

While multilingual language models can improve NLP performance on low-resource languages by leveraging higher-resource languages, they also reduce average performance on all languages (the 'curse of multilinguality'). Here we show another problem wit

Externí odkaz: http://arxiv.org/abs/2210.05619

Zobrazit plný text záznamu

Report

When classifying grammatical role, BERT doesn't care about word order... except when it matters

Autor: Papadimitriou, Isabel, Futrell, Richard, Mahowald, Kyle

Because meaning can often be inferred from lexical semantics alone, word order is often a redundant cue in natural language. For example, the words chopped, chef, and onion are more likely used to convey "The chef chopped the onion," not "The onion c

Externí odkaz: http://arxiv.org/abs/2203.06204

Zobrazit plný text záznamu

Report

Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies

Autor: Wu, Zhengxuan, Tamkin, Alex, Papadimitriou, Isabel

When we transfer a pretrained language model to a new language, there are many axes of variation that change at once. To disentangle the impact of different factors like syntactic similarity and vocabulary similarity, we propose a set of controlled t

Externí odkaz: http://arxiv.org/abs/2202.12312

Zobrazit plný text záznamu

Report

On the Opportunities and Risks of Foundation Models

Autor: Bommasani, Rishi, Hudson, Drew A., Adeli, Ehsan, Altman, Russ, Arora, Simran, von Arx, Sydney, Bernstein, Michael S., Bohg, Jeannette, Bosselut, Antoine, Brunskill, Emma, Brynjolfsson, Erik, Buch, Shyamal, Card, Dallas, Castellon, Rodrigo, Chatterji, Niladri, Chen, Annie, Creel, Kathleen, Davis, Jared Quincy, Demszky, Dora, Donahue, Chris, Doumbouya, Moussa, Durmus, Esin, Ermon, Stefano, Etchemendy, John, Ethayarajh, Kawin, Fei-Fei, Li, Finn, Chelsea, Gale, Trevor, Gillespie, Lauren, Goel, Karan, Goodman, Noah, Grossman, Shelby, Guha, Neel, Hashimoto, Tatsunori, Henderson, Peter, Hewitt, John, Ho, Daniel E., Hong, Jenny, Hsu, Kyle, Huang, Jing, Icard, Thomas, Jain, Saahil, Jurafsky, Dan, Kalluri, Pratyusha, Karamcheti, Siddharth, Keeling, Geoff, Khani, Fereshte, Khattab, Omar, Koh, Pang Wei, Krass, Mark, Krishna, Ranjay, Kuditipudi, Rohith, Kumar, Ananya, Ladhak, Faisal, Lee, Mina, Lee, Tony, Leskovec, Jure, Levent, Isabelle, Li, Xiang Lisa, Li, Xuechen, Ma, Tengyu, Malik, Ali, Manning, Christopher D., Mirchandani, Suvir, Mitchell, Eric, Munyikwa, Zanele, Nair, Suraj, Narayan, Avanika, Narayanan, Deepak, Newman, Ben, Nie, Allen, Niebles, Juan Carlos, Nilforoshan, Hamed, Nyarko, Julian, Ogut, Giray, Orr, Laurel, Papadimitriou, Isabel, Park, Joon Sung, Piech, Chris, Portelance, Eva, Potts, Christopher, Raghunathan, Aditi, Reich, Rob, Ren, Hongyu, Rong, Frieda, Roohani, Yusuf, Ruiz, Camilo, Ryan, Jack, Ré, Christopher, Sadigh, Dorsa, Sagawa, Shiori, Santhanam, Keshav, Shih, Andy, Srinivasan, Krishnan, Tamkin, Alex, Taori, Rohan, Thomas, Armin W., Tramèr, Florian, Wang, Rose E., Wang, William, Wu, Bohan, Wu, Jiajun, Wu, Yuhuai, Xie, Sang Michael, Yasunaga, Michihiro, You, Jiaxuan, Zaharia, Matei, Zhang, Michael, Zhang, Tianyi, Zhang, Xikun, Zhang, Yuhui, Zheng, Lucia, Zhou, Kaitlyn, Liang, Percy

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically

Externí odkaz: http://arxiv.org/abs/2108.07258

Zobrazit plný text záznamu

Report

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Autor: Kreutzer, Julia, Caswell, Isaac, Wang, Lisa, Wahab, Ahsan, van Esch, Daan, Ulzii-Orshikh, Nasanbayar, Tapo, Allahsera, Subramani, Nishant, Sokolov, Artem, Sikasote, Claytone, Setyawan, Monang, Sarin, Supheakmungkol, Samb, Sokhar, Sagot, Benoît, Rivera, Clara, Rios, Annette, Papadimitriou, Isabel, Osei, Salomey, Suarez, Pedro Ortiz, Orife, Iroro, Ogueji, Kelechi, Rubungo, Andre Niyongabo, Nguyen, Toan Q., Müller, Mathias, Müller, André, Muhammad, Shamsuddeen Hassan, Muhammad, Nanda, Mnyakeni, Ayanda, Mirzakhalov, Jamshidbek, Matangira, Tapiwanashe, Leong, Colin, Lawson, Nze, Kudugunta, Sneha, Jernite, Yacine, Jenny, Mathias, Firat, Orhan, Dossou, Bonaventure F. P., Dlamini, Sakhile, de Silva, Nisansa, Ballı, Sakine Çabuk, Biderman, Stella, Battisti, Alessia, Baruwa, Ahmed, Bapna, Ankur, Baljekar, Pallavi, Azime, Israel Abebe, Awokoya, Ayodele, Ataman, Duygu, Ahia, Orevaoghene, Ahia, Oghenefego, Agrawal, Sweta, Adeyemi, Mofetoluwa

Publikováno v: Transactions of the Association for Computational Linguistics (2022) 10: 50-72

With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering hundreds of languages. We manually audit the quality of 205

Externí odkaz: http://arxiv.org/abs/2103.12028

Zobrazit plný text záznamu

Report

Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT

Autor: Papadimitriou, Isabel, Chi, Ethan A., Futrell, Richard, Mahowald, Kyle

We investigate how Multilingual BERT (mBERT) encodes grammar by examining how the high-order grammatical feature of morphosyntactic alignment (how different languages define what counts as a "subject") is manifested across the embedding spaces of dif

Externí odkaz: http://arxiv.org/abs/2101.11043

Zobrazit plný text záznamu

Report

Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models

Autor: Papadimitriou, Isabel, Jurafsky, Dan

We propose transfer learning as a method for analyzing the encoding of grammatical structure in neural language models. We train LSTMs on non-linguistic data and evaluate their performance on natural language to assess which kinds of data induce gene

Externí odkaz: http://arxiv.org/abs/2004.14601

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání