Audio Narration of a Scene for Visually Disabled using Smart Goggle

Author:	Singh, Pratyush Pratap; Hegde, Sharath S.; Varun, R.; Hegde, Vivek; Devi, K. A. Sumithra
Language:	English
Year of publication:	2022
Subject:	Natural Language Generation; Text to Speech (TTS) engine; Object detection; Raspberry Pi; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Optical Character Recognition (OCR); Tesseract OCR engine; Raspberry Pi camera board; OpenCV; Natural Language Processing
Source:	International Journal of Research in Engineering, Science and Management; Vol. 5 No. 4 (2022); 73-75
ISSN:	2581-5792
Description:	This work helps visually disabled people get an idea of what is in a captured image. Using multimedia information processing techniques, the proposed device first acquires image attributes via the Pi Camera, then performs image-to-text conversion using the Tesseract and OpenCV libraries. Previously proposed approaches either used computer vision to determine labels or exploited existing descriptions of the training images to transfer or compose a new description for the test image. We propose an approach that uses image annotations to generate image descriptions, and show that with accurate object and attribute detection, human-like descriptions of images can be generated. We use a text-to-speech (TTS) engine to convert the generated text to speech, and the Python programming language for the implementation.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=issn25815792::b9ab69ecb206173dbe52ca43dcc71967 https://www.journals.resaim.com/ijresm/article/view/1943
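
The pipeline outlined in the abstract above (read text from a captured image, turn detected annotations into a sentence, speak the result) might be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the record names Tesseract, OpenCV, and Python, but the `pytesseract` and `pyttsx3` packages, the Otsu thresholding step, the sentence template, and all function names (`describe_objects`, `build_narration`, `read_text`, `speak`) are assumptions. Object detection is not shown; the labels are assumed to come from a separate detector.

```python
from collections import Counter


def describe_objects(labels):
    """Turn a list of detected object labels into one English sentence.

    The template and the naive "add an s" plural are assumptions made
    for illustration only.
    """
    if not labels:
        return "No objects were detected."
    counts = Counter(labels)  # keeps labels in first-seen order
    parts = [f"{n} {name}" + ("s" if n > 1 else "") for name, n in counts.items()]
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"The scene contains {listing}."


def build_narration(labels, ocr_text):
    """Combine the object sentence with any text recognised by OCR."""
    sentence = describe_objects(labels)
    text = " ".join(ocr_text.split())  # collapse newlines and extra spaces
    if text:
        sentence += f" Visible text reads: {text}"
    return sentence


def read_text(image_path):
    """Run Tesseract OCR on an image file (assumes opencv-python and
    pytesseract are installed, plus the Tesseract binary itself)."""
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu binarisation is one common pre-processing choice before OCR.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


def speak(text):
    """Speak the narration aloud (assumes the pyttsx3 offline TTS package)."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

On a device, a call such as `speak(build_narration(labels, read_text("capture.jpg")))` would chain the steps, where `capture.jpg` is a hypothetical file saved from the camera and `labels` is the detector's output. The third-party imports are kept inside the functions so that the sentence-building part can be tried without any camera, OCR, or audio hardware.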