Showing 1 - 10 of 5,542 for search: '"Text dataset"'
We present Public Domain 12M (PD12M), a dataset of 12.4 million high-quality public domain and CC0-licensed images with synthetic captions, designed for training text-to-image models. PD12M is the largest public domain image-text dataset to date, wit
External link:
http://arxiv.org/abs/2410.23144
High-quality video-text preference data is crucial for Multimodal Large Language Models (MLLMs) alignment. However, existing preference data is very scarce. Obtaining VQA preference data for preference training is costly, and manually annotating resp
External link:
http://arxiv.org/abs/2411.16201
Currently, image-text-driven multi-modal deep learning models have demonstrated their outstanding potential in many fields. In practice, tasks centered around facial images have broad application prospects. This paper presents FaceCaption-15M
External link:
http://arxiv.org/abs/2407.08515
Large pre-trained language models have become popular for many applications and form an important backbone of many downstream tasks in natural language processing (NLP). Applying 'explainable artificial intelligence' (XAI) techniques to enrich such m
External link:
http://arxiv.org/abs/2406.11547
Age-related language patterns play a crucial role in understanding linguistic differences and developing age-appropriate communication strategies. However, the lack of comprehensive and diverse datasets has hindered the progress of research in this a
External link:
http://arxiv.org/abs/2406.16890
Author:
Yang, Yuchen, Duan, Yingxuan
A more robust and holistic language-video representation is the key to pushing video understanding forward. Despite improvements in training strategies, the quality of language-video datasets has received less attention. The current plain and simple t
External link:
http://arxiv.org/abs/2406.13809
Interleaved image-text generation has emerged as a crucial multimodal task, aiming at creating sequences of interleaved visual and textual content given a query. Despite notable advancements in recent multimodal large language models (MLLMs), generat
External link:
http://arxiv.org/abs/2406.10462
An in-depth comprehension of global land cover is essential in Earth observation, forming the foundation for a multitude of applications. Although remote sensing technology has advanced rapidly, leading to a proliferation of satellite imagery, the in
External link:
http://arxiv.org/abs/2402.11325
Making sense of unstructured text datasets is perennially difficult, yet increasingly relevant with Large Language Models. Data workers often rely on dataset summaries, especially distributions of various derived features. Some features, like toxicit
External link:
http://arxiv.org/abs/2402.14880