The Potential of a Visual Dialogue Agent In a Tandem Automated Audio Description System for Videos.

Authors:	Stangl, Abigale; Ihorn, Shasta; Siu, Yue-Ting; Bodi, Aditya; Castanon, Mar; Narins, Lothar D; Yoon, Ilmi
Subject:	SOUND systems; LOW vision; VIDEO production & direction; ARTIFICIAL intelligence; MACHINE learning
Source:	ACM SIGACCESS Conference on Computers & Accessibility; 2023, p1-17, 17p
Abstract:	The relentless pace of video production exacerbates the digital accessibility gap that individuals who are blind or low vision (BLV) face on a daily basis, resulting in disproportionate exclusion from community opportunities and risk management. Whereas previous automated audio description (AD) systems provide single-tool approaches for delivering minimum viable description (MVD) or delivering on-demand visual question answering (VQA), we present a tandem AI-based AD tool that combines MVD and on-demand VQA. A user study with 26 BLV individuals explored how the tandem system may be used under the conditions of delivering MVD and/or on-demand VQA with AI-only or human-in-the-loop support. When each tool was used in isolation, AI-only conditions scored significantly lower in both user enjoyment and comprehension. When used in tandem, AI-only conditions matched outcomes delivered with human-in-the-loop, which suggests that AI-only AD tools may be most effective when both types of tools are used in tandem. A multimodal analysis of interactions with the tandem system revealed areas for system improvement in terms of the timing of AD delivery and accurate content delivery. We discuss how the use of both types of tools in a tandem system can mitigate some of the digital frictions that have plagued efforts in machine learning and automated tools for accessibility. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index