Showing 1 - 10 of 26 for search: '"Park, Kyubyong"'
Numerous datasets have been proposed to combat the spread of online hate. Despite these efforts, a majority of these resources are English-centric, primarily focusing on overt forms of hate. This research gap calls for developing high-quality corpora
External link:
http://arxiv.org/abs/2310.15439
Author:
Ko, Hyunwoong, Yang, Kichang, Ryu, Minho, Choi, Taekyoon, Yang, Seungmu, Hyun, Jiwung, Park, Sungho, Park, Kyubyong
Polyglot is a pioneering project aimed at enhancing the non-English language performance of multilingual language models. Despite the availability of various multilingual models such as mBERT (Devlin et al., 2019), XGLM (Lin et al., 2022), and BLOOM
External link:
http://arxiv.org/abs/2306.02254
Typically, tokenization is the very first step in most text processing works. As a token serves as an atomic unit that embeds the contextual information of text, how to define a token plays a decisive role in the performance of a model. Even though By
External link:
http://arxiv.org/abs/2010.02534
Author:
Park, Kyubyong
Korean is a morphologically rich language. Korean verbs change their forms in a fickle manner depending on tense, mood, speech level, meaning, etc. Therefore, it is challenging to construct comprehensive conjugation paradigms of Korean verbs. In this
External link:
http://arxiv.org/abs/2004.13221
Invariant risk minimization (IRM) (Arjovsky et al., 2019) is a recently proposed framework designed for learning predictors that are invariant to spurious correlations across different training environments. Yet, despite its theoretical justification
External link:
http://arxiv.org/abs/2004.05007