Spanish Biomedical and Clinical Language Embeddings

Authors: Gutiérrez-Fandiño, Asier, Armengol-Estapé, Jordi, Carrino, Casimiro Pio, De Gibert, Ona, Gonzalez-Agirre, Aitor, Villegas, Marta
Year of publication: 2021
Subject:
Document type: Working Paper
Description: We computed both word and sub-word embeddings using FastText. For the sub-word embeddings, we used the Byte Pair Encoding (BPE) algorithm to represent the sub-words. We evaluated the biomedical word embeddings and obtained better results than previous versions, which suggests that more data yields better representations.
Database: arXiv
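
The description mentions Byte Pair Encoding as the way sub-words are represented. As a rough illustration only, the sketch below shows the general BPE idea in plain Python: repeatedly merge the most frequent adjacent symbol pair, then segment words with the learned merges. It is not the authors' pipeline; the function names (`merge_pair`, `learn_bpe`, `segment`), the `</w>` end-of-word marker, the tie-breaking rule, and the toy Spanish word list are all assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of words."""
    # Each word starts as a sequence of characters plus an end-of-word marker
    # (the "</w>" marker is an assumption of this sketch).
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties are broken by the pair itself so the
        # result is deterministic (an arbitrary choice for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into sub-words by applying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus, purely illustrative (not taken from the paper's data).
corpus = ["cardiopatia", "cardiologia", "cardiaco", "neurologia", "neuropatia"]
merges = learn_bpe(corpus, 20)
print(segment("cardiologia", merges))
print(segment("neurocardio", merges))  # a word not seen during learning
```

In a real setting the resulting sub-word sequences, rather than whole words, would be the tokens passed to the embedding trainer; how the authors configured that step is not stated in this record.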