Showing 1 - 4 of 4 for search: '"Dalsgaard, Jacob Aarup"'
In the realm of Computational Social Science (CSS), practitioners often navigate complex, low-resource domains and face the costly and time-intensive challenges of acquiring and annotating data. We aim to establish a set of guidelines to address such
External link:
http://arxiv.org/abs/2304.13861
Author:
Strømberg-Derczynski, Leon, Ciosici, Manuel R., Baglini, Rebekah, Christiansen, Morten H., Dalsgaard, Jacob Aarup, Fusaroli, Riccardo, Henrichsen, Peter Juel, Hvingelby, Rasmus, Kirkedal, Andreas, Kjeldsen, Alex Speed, Ladefoged, Claus, Nielsen, Finn Årup, Petersen, Malte Lau, Rystrøm, Jonathan Hvithamar, Varab, Daniel
Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion wo
External link:
http://arxiv.org/abs/2005.03521
Obtaining and annotating data can be expensive and time-consuming, especially in complex, low-resource domains. We use GPT-4 and ChatGPT to augment small labeled datasets with synthetic data via simple prompts, in three different classification tasks
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::44e655f7ce6a0ccdd0b0ce964b53ae7d
http://arxiv.org/abs/2304.13861
Author:
Strømberg-Derczynski, Leon, Ciosici, Manuel Rafael, Christiansen, Morten H., Baglini, Rebekah Brita, Dalsgaard, Jacob Aarup, Fusaroli, Riccardo, Henrichsen, Peter Juel, Hvingelby, Rasmus, Kirkedal, Andreas, Kjeldsen, Alex Speed, Ladefoged, Claus, Nielsen, Finn Arup, Madsen, Jens, Petersen, Malte Lau, Rystrøm, Jonathan Hvithamar, Varab, Daniel
Published in:
Strømberg-Derczynski, L, Ciosici, M R, Christiansen, M H, Baglini, R B, Dalsgaard, J A, Fusaroli, R, Henrichsen, P J, Hvingelby, R, Kirkedal, A, Kjeldsen, A S, Ladefoged, C, Nielsen, F A, Madsen, J, Petersen, M L, Rystrøm, J H & Varab, D 2021, The Danish Gigaword Corpus. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa). Linköping University Electronic Press, pp. 413-421. <https://www.aclweb.org/anthology/2021.nodalida-main.46>
Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion wo
External link:
https://explore.openaire.eu/search/publication?articleId=od______2751::40d0d6ef0a8ba840b1ac2f6cc3e8a1eb
https://curis.ku.dk/ws/files/305020079/2021.nodalida_main.46v2.pdf