Showing 1 - 10 of 28 for search: '"SAINZ, OSCAR"'
Author:
Sainz, Oscar, García-Ferrero, Iker, Jacovi, Alon, Campos, Jon Ander, Elazar, Yanai, Agirre, Eneko, Goldberg, Yoav, Chen, Wei-Lin, Chim, Jenny, Choshen, Leshem, D'Amico-Wong, Luca, Dell, Melissa, Fan, Run-Ze, Golchin, Shahriar, Li, Yucheng, Liu, Pengfei, Pahwa, Bhavish, Prabhu, Ameya, Sharma, Suryansh, Silcock, Emily, Solonko, Kateryna, Stap, David, Surdeanu, Mihai, Tseng, Yu-Min, Udandarao, Vishaal, Wang, Zengzhi, Xu, Ruijie, Yang, Jinglin
The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of data contamination in natural language processing, where data contamination is understood as situations where evaluation data is included in pre-training corpora used…
External link:
http://arxiv.org/abs/2407.21530
Cross-lingual transfer learning is widely used in Event Extraction for low-resource languages and involves a multilingual language model that is trained on a source language and applied to the target language. This paper studies whether the typological…
External link:
http://arxiv.org/abs/2404.06392
Author:
Etxaniz, Julen, Sainz, Oscar, Perez, Naiara, Aldabe, Itziar, Rigau, German, Agirre, Eneko, Ormazabal, Aitor, Artetxe, Mikel, Soroa, Aitor
Published in:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14952–14972, 2024.
We introduce Latxa, a family of large language models for Basque ranging from 7 to 70 billion parameters. Latxa is based on Llama 2, which we continue pretraining on a new Basque corpus comprising 4.3M documents and 4.2B tokens. Addressing the scarcity…
External link:
http://arxiv.org/abs/2403.20266
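The abstract above describes continued pretraining of Llama 2 on a new corpus. Below is a minimal sketch of that general recipe with Hugging Face Transformers; the checkpoint, data file, and hyperparameters are hypothetical placeholders, not the actual Latxa training setup.

```python
# A minimal continued-pretraining sketch with Hugging Face Transformers.
# Checkpoint, data file, and hyperparameters are hypothetical placeholders,
# NOT the actual Latxa training configuration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires Hub access
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical plain-text corpus, one document per line.
corpus = load_dataset("text", data_files={"train": "basque_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="continued-pretraining-sketch",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1),
    train_dataset=train,
    # mlm=False selects the causal (next-token) language modeling objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```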
Author:
Sainz, Oscar, Campos, Jon Ander, García-Ferrero, Iker, Etxaniz, Julen, de Lacalle, Oier Lopez, Agirre, Eneko
In this position paper, we argue that the classical evaluation of Natural Language Processing (NLP) tasks using annotated benchmarks is in trouble. The worst kind of data contamination happens when a Large Language Model (LLM) is trained on the test…
External link:
http://arxiv.org/abs/2310.18018
Author:
Sainz, Oscar, García-Ferrero, Iker, Agerri, Rodrigo, de Lacalle, Oier Lopez, Rigau, German, Agirre, Eneko
Large Language Models (LLMs) combined with instruction tuning have made significant progress when generalizing to unseen tasks. However, they have been less successful in Information Extraction (IE), lagging behind task-specific models. Typically, IE…
External link:
http://arxiv.org/abs/2310.03668
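The paper at the link above (GoLLIE) improves IE generalization by conditioning the model on annotation guidelines written as code. A minimal sketch of what such a guideline-as-code schema could look like; the class names and guideline texts are hypothetical illustrations, and the exact prompt format is defined by the paper and its accompanying repository.

```python
# A hypothetical sketch of "guidelines as code" for IE: entity types are
# Python dataclasses whose docstrings carry the annotation guidelines, and
# the model is prompted to emit instances of these classes. Class names and
# guideline texts are illustrative, not taken from the paper.
from dataclasses import dataclass

@dataclass
class Person:
    """People, including fictional characters. Do not include titles
    such as 'Dr.' or 'President' in the span."""
    span: str

@dataclass
class Location:
    """Geopolitical and physical locations such as cities, countries,
    and mountains. Do not annotate organizations located there."""
    span: str

text = "Ada Lovelace was born in London."

# The prompt shown to the model would contain the class definitions plus
# the input text; the expected completion is a list of typed instances:
expected = [Person(span="Ada Lovelace"), Location(span="London")]
```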
Named Entity Recognition (NER) is a core natural language processing task in which pre-trained language models have shown remarkable performance. However, standard benchmarks like CoNLL 2003 do not address many of the challenges that deployed NER systems…
External link:
http://arxiv.org/abs/2304.10637
Language models are at the core of almost any natural language processing system nowadays. One of their particularities is their contextualized representations, a game-changing feature when disambiguation between word senses is necessary. In this paper…
External link:
http://arxiv.org/abs/2302.03353
Recent work has shown that NLP tasks such as Relation Extraction (RE) can be recast as Textual Entailment tasks using verbalizations, with strong performance in zero-shot and few-shot settings thanks to pre-trained entailment models. The fact that…
External link:
http://arxiv.org/abs/2205.01376
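The entailment-based recasting described above can be tried with an off-the-shelf NLI model: verbalize each candidate relation as a hypothesis and score it against the sentence as premise. A minimal sketch, assuming the public roberta-large-mnli checkpoint; the relation templates are hypothetical illustrations, not the paper's verbalizations.

```python
# A minimal sketch of entailment-based relation extraction, assuming the
# public "roberta-large-mnli" checkpoint. The relation templates below are
# hypothetical illustrations, not the paper's verbalizations.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

# Each candidate relation is verbalized as a natural-language hypothesis.
TEMPLATES = {
    "per:city_of_birth": "{subj} was born in {obj}.",
    "per:city_of_death": "{subj} died in {obj}.",
}

def score_relations(sentence: str, subj: str, obj: str) -> dict:
    """Score each relation by the entailment probability of its hypothesis."""
    scores = {}
    for relation, template in TEMPLATES.items():
        hypothesis = template.format(subj=subj, obj=obj)
        inputs = tokenizer(sentence, hypothesis, return_tensors="pt")
        with torch.no_grad():
            probs = model(**inputs).logits.softmax(dim=-1)[0]
        scores[relation] = probs[2].item()  # index 2 = ENTAILMENT for this model
    return scores

print(score_relations("John Smith was born in Boston.", "John Smith", "Boston"))
```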
The current workflow for Information Extraction (IE) analysts involves the definition of the entities/relations of interest and a training corpus with annotated examples. In this demonstration we introduce a new workflow where the analyst directly verbalizes…
External link:
http://arxiv.org/abs/2203.13602
Author:
Min, Bonan, Ross, Hayley, Sulem, Elior, Veyseh, Amir Pouran Ben, Nguyen, Thien Huu, Sainz, Oscar, Agirre, Eneko, Heinz, Ilana, Roth, Dan
Large, pre-trained transformer-based language models such as BERT have drastically changed the Natural Language Processing (NLP) field. We present a survey of recent work that uses these large language models to solve NLP tasks via pre-training then fine-tuning…
External link:
http://arxiv.org/abs/2111.01243