Showing 1 - 10 of 37 for search: '"Bose, Digbalay"'
Author:
Kommineni, Aditya; Bose, Digbalay; Feng, Tiantian; Kim, So Hyun; Tager-Flusberg, Helen; Bishop, Somer; Lord, Catherine; Kadiri, Sudarsana; Narayanan, Shrikanth
Clinical videos in the context of Autism Spectrum Disorder are often long-form interactions between children and caregivers/clinical professionals, encompassing complex verbal and non-verbal behaviors. Objective analyses of these videos could provide
External link:
http://arxiv.org/abs/2409.13606
Multi-modal learning has emerged as an increasingly promising avenue in vision recognition, driving innovations across diverse domains ranging from media and education to healthcare and transportation. Despite its success, the robustness of multi-mod
External link:
http://arxiv.org/abs/2402.09036
Author:
Nam, Yoonsoo; Lehavi, Adam; Yang, Daniel; Bose, Digbalay; Swayamdipta, Swabha; Narayanan, Shrikanth
Video summarization remains a huge challenge in computer vision due to the size of the input videos to be summarized. We propose an efficient, language-only video summarizer that achieves competitive accuracy with high data efficiency. Using only tex
External link:
http://arxiv.org/abs/2309.09405
Author:
Bose, Digbalay; Hebbar, Rajat; Feng, Tiantian; Somandepalli, Krishna; Xu, Anfeng; Narayanan, Shrikanth
Advertisement videos (ads) play an integral part in the domain of Internet e-commerce as they amplify the reach of particular products to a broad audience or can serve as a medium to raise awareness about specific issues through concise narrative str
External link:
http://arxiv.org/abs/2308.14052
Author:
Feng, Tiantian; Bose, Digbalay; Zhang, Tuo; Hebbar, Rajat; Ramakrishna, Anil; Gupta, Rahul; Zhang, Mi; Avestimehr, Salman; Narayanan, Shrikanth
Over the past few years, Federated Learning (FL) has become an emerging machine learning technique to tackle data privacy challenges through collaborative training. In the Federated Learning algorithm, the clients submit a locally trained model, and
External link:
http://arxiv.org/abs/2306.09486
Automatic Speech Understanding (ASU) leverages the power of deep learning models for accurate interpretation of human speech, leading to a wide range of speech applications that enrich the human experience. However, training a robust ASU model requir
External link:
http://arxiv.org/abs/2306.07791
This paper presents the approach and results of USC SAIL's submission to the Signal Processing Grand Challenge 2023 - e-Prevention (Task 2), on detecting relapses in psychotic patients. Relapse prediction has proven to be challenging, primarily due t
External link:
http://arxiv.org/abs/2304.08614
The process of human affect understanding involves the ability to infer person specific emotional states from various sources including images, speech, and language. Affect perception from images has predominantly focused on expressions extracted fro
External link:
http://arxiv.org/abs/2303.06904
Audio event detection is a widely studied audio processing task, with applications ranging from self-driving cars to healthcare. In-the-wild datasets such as Audioset have propelled research in this field. However, many efforts typically involve manu
External link:
http://arxiv.org/abs/2302.07315
Detecting unsafe driving states, such as stress, drowsiness, and fatigue, is an important component of ensuring driving safety and an essential prerequisite for automatic intervention systems in vehicles. These concerning conditions are primarily con
External link:
http://arxiv.org/abs/2210.15826