Showing 1 - 10
of 1,017
for search: '"Etxaniz A"'
Large Language Models (LLMs) exhibit extensive knowledge about the world, but most evaluations have been limited to global or anglocentric subjects. This raises the question of how well these models perform on topics relevant to other cultures, whose
External link:
http://arxiv.org/abs/2406.07302
Author:
Biderman, Stella, Schoelkopf, Hailey, Sutawika, Lintang, Gao, Leo, Tow, Jonathan, Abbasi, Baber, Aji, Alham Fikri, Ammanamanchi, Pawan Sasanka, Black, Sidney, Clive, Jordan, DiPofi, Anthony, Etxaniz, Julen, Fattori, Benjamin, Forde, Jessica Zosa, Foster, Charles, Hsu, Jeffrey, Jaiswal, Mimansa, Lee, Wilson Y., Li, Haonan, Lovering, Charles, Muennighoff, Niklas, Pavlick, Ellie, Phang, Jason, Skowron, Aviya, Tan, Samson, Tang, Xiangru, Wang, Kevin A., Winata, Genta Indra, Yvon, François, Zou, Andy
Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the lack of rep
External link:
http://arxiv.org/abs/2405.14782
Author:
Heredia, Maite, Etxaniz, Julen, Zulaika, Muitze, Saralegi, Xabier, Barnes, Jeremy, Soroa, Aitor
XNLI is a popular Natural Language Inference (NLI) benchmark widely used to evaluate cross-lingual Natural Language Understanding (NLU) capabilities across languages. In this paper, we expand XNLI to include Basque, a low-resource language that can g
External link:
http://arxiv.org/abs/2404.06996
Author:
Etxaniz, Julen, Sainz, Oscar, Perez, Naiara, Aldabe, Itziar, Rigau, German, Agirre, Eneko, Ormazabal, Aitor, Artetxe, Mikel, Soroa, Aitor
Published in:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14952--14972. 2024
We introduce Latxa, a family of large language models for Basque ranging from 7 to 70 billion parameters. Latxa is based on Llama 2, which we continue pretraining on a new Basque corpus comprising 4.3M documents and 4.2B tokens. Addressing the scarci
External link:
http://arxiv.org/abs/2403.20266
Author:
Osaba, Eneko, Benguria, Gorka, Lobo, Jesus L., Diaz-de-Arcaya, Josu, Alonso, Juncal, Etxaniz, Iñaki
In recent years, one of the most investigated fields of artificial intelligence has been nature-inspired computing. The research done on this topic showcases the interest it sparks in researchers and practitioners, who put th
External link:
http://arxiv.org/abs/2311.10767
Author:
Sainz, Oscar, Campos, Jon Ander, García-Ferrero, Iker, Etxaniz, Julen, de Lacalle, Oier Lopez, Agirre, Eneko
In this position paper, we argue that the classical evaluation on Natural Language Processing (NLP) tasks using annotated benchmarks is in trouble. The worst kind of data contamination happens when a Large Language Model (LLM) is trained on the test
External link:
http://arxiv.org/abs/2310.18018
Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system, and running inference over the translated input.
External link:
http://arxiv.org/abs/2308.01223
Author:
Alberto Luengo, Mikel Etxaniz
Published in:
Munibe Ciencias Naturales, Vol 72 (2024)
We report the first known breeding record of the black-crowned night heron Nycticorax nycticorax (Linnaeus, 1758) in the province of Gipuzkoa. In July 2023, two juveniles that were not yet fully developed were observed in the Parque Ecoló
External link:
https://doaj.org/article/5a224fdc29974bb5923063af95ad20e2
Author:
Olga Romero-Clarà, Clara Madrid, Juan Carlos Pardo, Vicenç Ruiz de Porras, Olatz Etxaniz, Deborah Moreno-Alonso, Albert Font
Published in:
Frontiers in Pharmacology, Vol 15 (2024)
Background: The high incidence and mortality rates of urothelial carcinoma mean it remains a significant global health concern. Its prevalence is notably pronounced in industrialized countries, with Spain registering one of the highest incidences in Eu
External link:
https://doaj.org/article/ab827142f4f74762818a3ab766c2a1be
Published in:
Open Research Europe, Vol 4 (2024)
To address current challenges in the security certification of European ICT products, processes and services, the European Commission, through ENISA (European Union Agency for Cybersecurity), has developed the European Cybersecurity Certification
External link:
https://doaj.org/article/8d96698ed86e4e65b7a61c0397c6bd94