Showing 1 - 10 of 199 for search: '"Oberski, Daniel L."'
Many existing benchmarks of large (multimodal) language models (LLMs) focus on measuring LLMs' academic proficiency, often with an additional interest in comparing model performance with human test takers. While these benchmarks have proven key to the deve
External link:
http://arxiv.org/abs/2404.01799
Author:
Fang, Qixiang, Zhou, Zhihan, Barbieri, Francesco, Liu, Yozen, Neves, Leonardo, Nguyen, Dong, Oberski, Daniel L., Bos, Maarten W., Dotsch, Ron
Learning general-purpose user representations based on user behavioral logs is an increasingly popular user modeling approach. It benefits from easily available, privacy-friendly yet expressive data, and does not require extensive re-tuning of the up
External link:
http://arxiv.org/abs/2312.12111
Author:
Fang, Qixiang, Giachanou, Anastasia, Bagheri, Ayoub, Boeschoten, Laura, van Kesteren, Erik-Jan, Kamalabad, Mahdi Shafiee, Oberski, Daniel L
Text-based personality computing (TPC) has attracted considerable research interest in NLP. In this paper, we describe 15 challenges that we consider deserving of the research community's attention. These challenges are organized by the following topics: per
External link:
http://arxiv.org/abs/2212.06711
Published in:
EPJ Data Sci. 11, 39 (2022)
Text embedding models from Natural Language Processing can map text data (e.g. words, sentences, documents) to supposedly meaningful numerical representations (a.k.a. text embeddings). While such models are increasingly applied in social science rese
External link:
http://arxiv.org/abs/2202.09166
A potentially powerful method of social-scientific data collection and investigation has been created by an unexpected institution: the law. Article 15 of the EU's 2018 General Data Protection Regulation (GDPR) mandates that individuals have electron
External link:
http://arxiv.org/abs/2011.09851
Published in:
In Informatics in Medicine Unlocked 2024 50
Author:
Bagheri, Ayoub, Groenhof, T. Katrien J., Veldhuis, Wouter B., de Jong, Pim A., Asselbergs, Folkert W., Oberski, Daniel L.
Electronic health records (EHRs) contain structured and unstructured data of significant clinical and research value. Various machine learning approaches have been developed to employ information in EHRs for risk prediction. The majority of these att
External link:
http://arxiv.org/abs/2008.11979
Author:
Pankowska, Paulina, Oberski, Daniel L.
Clustering consists of a popular set of techniques used to separate data into interesting groups for further analysis. Many data sources on which clustering is performed are well-known to contain random and systematic measurement errors. Such errors
External link:
http://arxiv.org/abs/2005.11743
Fair inference in supervised learning is an important and active area of research, yielding a range of useful methods to assess and account for fairness criteria when predicting ground truth targets. As shown in recent work, however, when target labe
External link:
http://arxiv.org/abs/2003.07621
Combining data from varied sources has considerable potential for knowledge discovery: collaborating data parties can mine data in an expanded feature space, allowing them to explore a larger range of scientific questions. However, data sharing among
External link:
http://arxiv.org/abs/1911.03183