Showing 1 - 10 of 1,166 results for search: '"68t45"'
Author:
Agarwal, Amit, Panda, Srikant, Charles, Angeline, Kumar, Bhargava, Patel, Hitesh, Pattnayak, Priyanranjan, Rafi, Taki Hasan, Kumar, Tejaswini, Chae, Dong-Kyu
Recent advancements in Vision-Language Models (VLMs) have enabled significant progress in complex video understanding tasks. However, their robustness to real-world manipulations remains underexplored, limiting their reliability in critical applicati
External link:
http://arxiv.org/abs/2412.19794
Existing Sign Language Learning applications focus on the demonstration of the sign in the hope that the student will copy a sign correctly. In these cases, only a teacher can confirm that the sign was completed correctly, by reviewing a video captur
External link:
http://arxiv.org/abs/2412.18187
Author:
Phukan, Orchid Chetia, Singh, Drishti, Behera, Swarup Ranjan, Buduru, Arun Balaji, Sharma, Rajesh
In this work, we investigate various state-of-the-art (SOTA) speech pre-trained models (PTMs) for their capability to capture prosodic signatures of the generative sources for audio deepfake source attribution (ADSD). These prosodic characteristics c
External link:
http://arxiv.org/abs/2412.17796
Author:
Jenkins, Marcus, Franklin, Kirsty A., Nicoll, Malcolm A. C., Cole, Nik C., Ruhomaun, Kevin, Tatayah, Vikash, Mackiewicz, Michal
Published in:
Sensors 2024, 24, 8002
Monitoring animal populations is crucial for assessing the health of ecosystems. Traditional methods, which require extensive fieldwork, are increasingly being supplemented by time-lapse camera-trap imagery combined with an automatic analysis of the
External link:
http://arxiv.org/abs/2412.16329
Author:
Zhuang, Yiyu, Lv, Jiaxi, Wen, Hao, Shuai, Qing, Zeng, Ailing, Zhu, Hao, Chen, Shifeng, Yang, Yujiu, Cao, Xun, Liu, Wei
Creating a high-fidelity, animatable 3D full-body avatar from a single image is a challenging task due to the diverse appearance and poses of humans and the limited availability of high-quality training data. To achieve fast and high-quality human re
External link:
http://arxiv.org/abs/2412.14963
Monitoring growth behavior of maize plants such as the development of ears can give key insights into the plant's health and development. Traditionally, the measurement of the angle of ears is performed manually, which can be time-consuming and prone
External link:
http://arxiv.org/abs/2412.14954
Author:
Beltran, Tommy D., Villao, Raul J., Chuquimarca, Luis E., Vintimilla, Boris X., Velastin, Sergio A.
Published in:
Iberoamerican Congress on Pattern Recognition. Cham: Springer Nature Switzerland, 2024. p. 46-62
The present study focuses on detecting the degree of deformity in fruits such as apples, mangoes, and strawberries during the process of inspecting their external quality, employing Single-Input and Multi-Input architectures based on convolutional ne
External link:
http://arxiv.org/abs/2412.12966
Convolutional Neural Networks (CNNs) have been the standard for image classification tasks for a long time, but more recently attention-based mechanisms have gained traction. This project aims to compare traditional CNNs with attention-augmented CNNs
External link:
http://arxiv.org/abs/2412.11657
Continual learning aims to update a model so that it can sequentially learn new tasks without forgetting previously acquired knowledge. Recent continual learning approaches often leverage the vision-language model CLIP for its high-dimensional featur
External link:
http://arxiv.org/abs/2412.05840
Recent advancements in models linking natural language with human motions have shown significant promise in motion generation and editing based on instructional text. Motivated by applications in sports coaching and motor skill learning, we investiga
External link:
http://arxiv.org/abs/2412.05460