Description: |
Generative artificial intelligence (GAI) has recently impressed the world with its ability to create text, images, and videos. However, there remain areas in which GAI produces undesirable or unintended results because it is “uncertain” about the underlying concepts. Before AI-generated content sees wider use, it is important to identify the concepts about which GAI is uncertain, both to ensure its use is ethical and to direct improvement efforts. This study proposes a general pipeline for automatically quantifying uncertainty in GAI. To measure uncertainty, the textual prompt given to a text-to-image model is compared with captions produced by four image-to-text models (GIT, BLIP, BLIP-2, and InstructBLIP). The comparison is evaluated with machine translation metrics (BLEU, ROUGE, METEOR, and SPICE) and the cosine similarity of text embeddings (Word2Vec, GloVe, FastText, DistilRoBERTa, MiniLM-6, and MiniLM-12). The generative AI models performed consistently across the metrics; however, the vector space models yielded the highest average similarity, close to 80%, suggesting results closer to the ideal, “certain” case. Suggested future work includes identifying the metrics that best align with a human baseline to ensure quality and evaluating additional GAI models. This work can be used to automatically identify concepts for which GAI is “uncertain” and thereby direct research toward increasing confidence in those areas.
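
To make the scoring step concrete, below is a minimal sketch of the embedding-similarity comparison described above, assuming the sentence-transformers library with the "all-MiniLM-L6-v2" checkpoint as a stand-in for the MiniLM-6 model; the prompt and captions are hypothetical placeholders, not the paper's data.

    # Minimal sketch: compare a text-to-image prompt against captions
    # returned by image-to-text models using embedding cosine similarity.
    # Assumes the sentence-transformers package; model choice and all
    # example strings are illustrative, not taken from the study.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # Hypothetical prompt sent to the text-to-image model.
    prompt = "a red bicycle leaning against a brick wall"

    # Hypothetical captions from the four image-to-text models.
    captions = [
        "a red bike propped against a brick wall",     # e.g. GIT
        "a bicycle standing next to a wall",           # e.g. BLIP
        "a red bicycle leaning on a brick building",   # e.g. BLIP-2
        "there is a red bicycle beside a brick wall",  # e.g. InstructBLIP
    ]

    # Embed the prompt and captions, then score cosine similarity.
    prompt_emb = model.encode(prompt, convert_to_tensor=True)
    caption_embs = model.encode(captions, convert_to_tensor=True)
    scores = util.cos_sim(prompt_emb, caption_embs)[0]  # shape: (4,)

    # The mean similarity serves as a proxy for "certainty"; low values
    # flag concepts where the generative model may be uncertain.
    certainty = scores.mean().item()
    print([round(s.item(), 3) for s in scores])
    print(f"mean similarity (certainty proxy): {certainty:.3f}")

In this sketch, a mean similarity near 1.0 would indicate that the round trip preserved the prompt's meaning (a "certain" concept), while a low mean would flag the prompt's concept for further investigation; the machine translation metrics (BLEU, ROUGE, METEOR, SPICE) could be computed over the same prompt-caption pairs in the same loop.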