Showing 1 - 10 of 123 for search: '"Joshi, Dhiraj"'
Author:
Wood, David, Lublinsky, Boris, Roytman, Alexy, Singh, Shivdeep, Adam, Constantin, Adebayo, Abdulhamid, An, Sungeun, Chang, Yuan Chi, Dang, Xuan-Hong, Desai, Nirmit, Dolfi, Michele, Emami-Gohari, Hajar, Eres, Revital, Goto, Takuya, Joshi, Dhiraj, Koyfman, Yan, Nassar, Mohammad, Patel, Hima, Selvam, Paramesvaran, Shah, Yousaf, Surendran, Saptha, Tsuzuku, Daiki, Zerfos, Petros, Daijavad, Shahrokh
Data preparation is the first and a very important step towards any Large Language Model (LLM) development. This paper introduces an easy-to-use, extensible, and scale-flexible open-source data preparation toolkit called Data Prep Kit (DPK). DPK is a
External link:
http://arxiv.org/abs/2409.18164
The human visual perception system demonstrates exceptional capabilities in learning without explicit supervision and understanding the part-to-whole composition of objects. Drawing inspiration from these two abilities, we propose Hierarchical Adapti
External link:
http://arxiv.org/abs/2402.03311
Object detectors often suffer from the domain gap between training (source domain) and real-world applications (target domain). Mean-teacher self-training is a powerful paradigm in unsupervised domain adaptation for object detection, but it struggles
External link:
http://arxiv.org/abs/2305.03034
Predictions made by deep learning models are prone to data perturbations, adversarial attacks, and out-of-distribution inputs. To build a trusted AI system, it is therefore critical to accurately quantify the prediction uncertainties. While current e
External link:
http://arxiv.org/abs/2304.04824
Author:
Rouditchenko, Andrew, Boggust, Angie, Harwath, David, Chen, Brian, Joshi, Dhiraj, Thomas, Samuel, Audhkhasi, Kartik, Kuehne, Hilde, Panda, Rameswar, Feris, Rogerio, Kingsbury, Brian, Picheny, Michael, Torralba, Antonio, Glass, James
Current methods for learning visually grounded language from videos often rely on text annotation, such as human generated captions or machine generated automatic speech recognition (ASR) transcripts. In this work, we introduce the Audio-Video Langua
External link:
http://arxiv.org/abs/2006.09199
The wide popularity of digital photography and social networks has generated a rapidly growing volume of multimedia data (i.e., image, music, and video), resulting in a great demand for managing, retrieving, and understanding these data. Affective co
External link:
http://arxiv.org/abs/1911.05609
Author:
Mac, Khoi-Nguyen C., Joshi, Dhiraj, Yeh, Raymond A., Xiong, Jinjun, Feris, Rogerio S., Do, Minh N.
Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction. Existing methods typically utilize a two-stage approach including extraction of local spatio-temporal features followed by tempo
External link:
http://arxiv.org/abs/1811.08815
Author:
Merler, Michele, Joshi, Dhiraj, Nguyen, Quoc-Bao, Hammer, Stephen, Kent, John, Smith, John R., Feris, Rogerio S.
The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media. Yet, it requires labor-intensive video editing. We propose a novel approach for auto-curating sports highlights, and use
External link:
http://arxiv.org/abs/1707.07075
Author:
Joshi, Dhiraj
Thesis (Ph.D.)--Pennsylvania State University, 2007.
Mode of access: World Wide Web.
Published in:
BMJ Case Reports; Jul2024, Vol. 17 Issue 7, p1-4, 4p