Showing 1 - 10 of 23 for search: '"Constantine Lignos"'
Author:
David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D’souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen H. Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Aremu Anuoluwapo, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Rabiu Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei
Published in:
Transactions of the Association for Computational Linguistics, Vol 9, Pp 1116-1131 (2021)
We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition
External link:
https://doaj.org/article/07e7ec6a468e4df198aa92181a9205b3
This volume explores how the patterning of surface variation can shed light on the grammatical representation of variable phenomena. The authors explore variation in several domains, addressing intra- and inter-dialectal patterns, using diverse sourc
Author:
Jonne Sälevä, Constantine Lignos
We introduce ParaNames, a multilingual parallel name resource consisting of 118 million names spanning across 400 languages. Names are provided for 13.6 million entities which are mapped to standardized entity types (PER/LOC/ORG). Using Wikidata as a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::85a1c9c6231adb4074853c834d3ce392
http://arxiv.org/abs/2202.14035
This work presents a new resource for borrowing identification and analyzes the performance and errors of several models on this task. We introduce a new annotated corpus of Spanish newswire rich in unassimilated lexical borrowings -- words from one
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::66dda1c1f1832e11b7e4635a30c716e6
Author:
Julia Kreutzer, Ayodele Awokoya, Ignatius Ezeani, Rubungo Andre Niyongabo, Happy Buzaaba, Adewale Akinfaderin, Samuel Oyerinde, Stephen Mayhew, Emmanuel Anebi, Mofetoluwa Adeyemi, Kelechi Ogueji, Abdoulaye Diallo, Seid Muhie Yimam, Jade Abbott, Joyce Nakatumba-Nabende, Victor Akinode, Blessing Sibanda, Catherine Gitau, Chester Palen-Michel, Shamsuddeen Hassan Muhammad, Degaga Wolde, Graham Neubig, Tendai Marengereke, Paul Rayson, Derguene Mbaye, Eric Peter Wairagala, Daniel D'souza, Tosin P. Adewumi, Jonathan Mukiibi, Chris Chinenye Emezue, David Ifeoluwa Adelani, Shruti Rijhwani, Iroro Orife, Verrah Otiende, Maurice Katusiime, Yvonne Wambui, Dibora Gebreyohannes, Kelechi Nwaike, Salomey Osei, Chiamaka Chukwuneke, Henok Tilaye, Deborah Nabagereka, Thierno Ibrahima Diop, Orevaoghene Ahia, Jesujoba O. Alabi, Sebastian Ruder, Davis David, Mouhamadane Mboup, Samba Ngom, Tajuddeen R. Gwadabe, Bonaventure F. P. Dossou, Temilola Oloyede, Perez Ogayo, Clemencia Siro, Gerald Muriuki, Aremu Anuoluwapo, Nkiruka Odu, Tobius Saul Bateesa, Abdoulaye Faye, Israel Abebe Azime, Constantine Lignos
Published in:
Transactions of the Association for Computational Linguistics
Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩
Transactions of the Association for Computational Linguistics, 2021, ⟨10.1162/tacl⟩
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a v
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d1930ae0735e2b37961957cb5eb49a8e
https://hal.inria.fr/hal-03350962/file/adelani_TACL2021.pdf
Published in:
NAACL-HLT
While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy. Model-based MT metrics trained on segment-level human judgments have emerged as an attractive replacement d
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ef5c09b95e6dbfdf1060f7aa9d077525
https://aclanthology.org/2021.naacl-main.90
Author:
Jingxuan Tu, Constantine Lignos
Published in:
EACL (Student Research Workshop)
We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation by examining recall on specific subsets of "tough" mentions: unseen mentions, those whose tokens or token/type combination were not
Author:
Constantine Lignos, Jonne Sälevä
Published in:
EACL (Student Research Workshop)
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting. We compare segmentations produced by applying BPE at the token or sentence level with morphologically-based segm
Author:
Marjan Kamyab, Constantine Lignos
Published in:
Insights
We attempt to replicate a named entity recognition (NER) model implemented in a popular toolkit and discover that a critical barrier to doing so is the inconsistent evaluation of improper label sequences. We define these sequences and examine how two
Published in:
Machine Translation. 32:31-43
We describe a multifaceted approach to named entity recognition that can be deployed with minimal data resources and a handful of hours of non-expert annotation. We describe how this approach was applied in the 2016 LoReHLT evaluation and demonstrate