Showing 1 - 10 of 84 for search: '"Gorman, Kyle"'
Author:
Gorman, Kyle, Pinter, Yuval
The common practice of preprocessing text before feeding it into NLP models introduces many decision points which have unintended consequences on model performance. In this opinion piece, we focus on the handling of diacritics in texts originating in
External link:
http://arxiv.org/abs/2410.24140
Author:
Batsuren, Khuyagbaatar, Bella, Gábor, Arora, Aryaman, Martinović, Viktor, Gorman, Kyle, Žabokrtský, Zdeněk, Ganbold, Amarsanaa, Dohnalová, Šárka, Ševčíková, Magda, Pelegrinová, Kateřina, Giunchiglia, Fausto, Cotterell, Ryan, Vylomova, Ekaterina
The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to decompose a word into a sequence of morphemes and covered most types of morphology: compounds, derivations, and inflections. Subtask 1, word-level morpheme segmentation, c
External link:
http://arxiv.org/abs/2206.07615
Author:
Batsuren, Khuyagbaatar, Goldman, Omer, Khalifa, Salam, Habash, Nizar, Kieraś, Witold, Bella, Gábor, Leonard, Brian, Nicolai, Garrett, Gorman, Kyle, Ate, Yustinus Ghanggo, Ryskina, Maria, Mielke, Sabrina J., Budianskaya, Elena, El-Khaissi, Charbel, Pimentel, Tiago, Gasser, Michael, Lane, William, Raj, Mohit, Coler, Matt, Samame, Jaime Rafael Montoya, Camaiteri, Delio Siticonatzi, Sagot, Benoît, Rojas, Esaú Zumaeta, Francis, Didier López, Oncevay, Arturo, Bautista, Juan López, Villegas, Gema Celeste Silva, Hennigen, Lucas Torroba, Ek, Adam, Guriel, David, Dirix, Peter, Bernardy, Jean-Philippe, Scherbakov, Andrey, Bayyr-ool, Aziyana, Anastasopoulos, Antonios, Zariquiey, Roberto, Sheifer, Karina, Ganieva, Sofya, Cruz, Hilaria, Karahóǧa, Ritván, Markantonatou, Stella, Pavlidis, George, Plugaryov, Matvey, Klyachko, Elena, Salehi, Ali, Angulo, Candy, Baxi, Jatayu, Krizhanovsky, Andrew, Krizhanovskaya, Natalia, Salesky, Elizabeth, Vania, Clara, Ivanova, Sardana, White, Jennifer, Maudslay, Rowan Hall, Valvoda, Josef, Zmigrod, Ran, Czarnowska, Paula, Nikkarinen, Irene, Salchak, Aelita, Bhatt, Brijesh, Straughn, Christopher, Liu, Zoey, Washington, Jonathan North, Pinter, Yuval, Ataman, Duygu, Wolinski, Marcin, Suhardijanto, Totok, Yablonskaya, Anna, Stoehr, Niklas, Dolatian, Hossep, Nuriah, Zahroh, Ratan, Shyam, Tyers, Francis M., Ponti, Edoardo M., Aiton, Grant, Arora, Aryaman, Hatcher, Richard J., Kumar, Ritesh, Young, Jeremiah, Rodionova, Daria, Yemelina, Anastasia, Andrushko, Taras, Marchenko, Igor, Mashkovtseva, Polina, Serova, Alexandra, Prud'hommeaux, Emily, Nepomniashchaya, Maria, Giunchiglia, Fausto, Chodroff, Eleanor, Hulden, Mans, Silfverberg, Miikka, McCarthy, Arya D., Yarowsky, David, Cotterell, Ryan, Tsarfaty, Reut, Vylomova, Ekaterina
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-indepe
External link:
http://arxiv.org/abs/2205.03608
Author:
Gorman, Kyle, Allauzen, Cyril
The single shortest path algorithm is undefined for weighted finite-state automata over non-idempotent semirings because such semirings do not guarantee the existence of a shortest path. However, in non-idempotent semirings admitting an order satisfy
External link:
http://arxiv.org/abs/2204.07236
We consider the problem of constructing matched groups such that the resulting groups are statistically similar with respect to their average values for multiple covariates. This group-matching problem arises in many cases, including quasi-experiment
External link:
http://arxiv.org/abs/2110.04432
Ad hoc abbreviations are commonly found in informal communication channels that favor shorter messages. We consider the task of reversing these abbreviations in context to recover normalized, expanded versions of abbreviated messages. The problem is
External link:
http://arxiv.org/abs/2110.01140
Inverse text normalization (ITN) converts spoken-domain automatic speech recognition (ASR) output into written-domain text to improve the readability of the ASR output. Many state-of-the-art ITN systems use hand-written weighted finite-state transduc
External link:
http://arxiv.org/abs/2104.05055
Author:
Waller, Angie, Gorman, Kyle
Student reviews often make reference to professors' physical appearances. Until recently RateMyProfessors.com, the website of this study's focus, used a design feature to encourage a "hot or not" rating of college professors. In the wake of recent #M
External link:
http://arxiv.org/abs/2010.08540
Author:
Szymański, Piotr, Gorman, Kyle
Recent work raises concerns about the use of standard splits to compare natural language processing models. We propose a Bayesian statistical model comparison technique which uses k-fold cross-validation across multiple data sets to estimate the like
External link:
http://arxiv.org/abs/2010.03088
How language-agnostic are current state-of-the-art NLP tools? Are there some types of language that are easier to model with current methods? In prior work (Cotterell et al., 2018) we attempted to address this question for language modeling, and obse
External link:
http://arxiv.org/abs/1906.04726