Showing 1 - 10 of 22 for search: '"Uppal, Shagun"'
While there has been remarkable progress recently in the fields of manipulation and locomotion, mobile manipulation remains a long-standing challenge. Compared to locomotion or static manipulation, a mobile system must make a diverse range of long-ho
External link:
http://arxiv.org/abs/2405.07991
While there have been significant strides in dexterous manipulation, most of it is limited to benchmark tasks like in-hand reorientation which are of limited utility in the real world. The main benefit of dexterous hands over two-fingered ones is the
External link:
http://arxiv.org/abs/2312.02975
Several works have developed end-to-end pipelines for generating lip-synced talking faces with various real-world applications, such as teaching and language translation in videos. However, these prior works fail to create realistic-looking videos si
External link:
http://arxiv.org/abs/2303.11548
Author:
Gupta, Devansh, Saini, Aditya, Bhasin, Drishti, Bhagat, Sarthak, Uppal, Shagun, Jain, Rishi Raj, Kumaraguru, Ponnurangam, Shah, Rajiv Ratn
Retrieving facial images from attributes plays a vital role in various systems such as face recognition and suspect identification. Compared to other image retrieval tasks, facial image retrieval is more challenging due to the high subjectivity invol
External link:
http://arxiv.org/abs/2205.15870
Author:
Liu, I-Chun Arthur, Uppal, Shagun, Sukhatme, Gaurav S., Lim, Joseph J., Englert, Peter, Lee, Youngwoon
Learning complex manipulation tasks in realistic, obstructed environments is a challenging problem due to hard exploration in the presence of obstacles and high-dimensional visual observations. Prior work tackles the exploration problem by integratin
External link:
http://arxiv.org/abs/2111.06383
Author:
Uppal, Shagun, Bhagat, Sarthak, Hazarika, Devamanyu, Majumdar, Navonil, Poria, Soujanya, Zimmermann, Roger, Zadeh, Amir
Deep Learning and its applications have cascaded impactful research and development with a diverse range of modalities present in the real-world data. More recently, this has enhanced research interests in the intersection of the Vision and Language
External link:
http://arxiv.org/abs/2010.09522
Disentangling the underlying feature attributes within an image with no prior supervision is a challenging task. Models that can disentangle attributes well provide greater interpretability and control. In this paper, we propose a self-supervised fra
External link:
http://arxiv.org/abs/2006.05895
Visual Question Generation (VQG) is the task of generating natural questions based on an image. Popular methods in the past have explored image-to-sequence architectures trained with maximum likelihood which have demonstrated meaningful generated que
External link:
http://arxiv.org/abs/2005.07771
We introduce MGP-VAE (Multi-disentangled-features Gaussian Processes Variational AutoEncoder), a variational autoencoder which uses Gaussian processes (GP) to model the latent space for the unsupervised learning of disentangled representations in vid
External link:
http://arxiv.org/abs/2001.02408
Author:
Sikka, Jagriti, Satya, Kushal, Kumar, Yaman, Uppal, Shagun, Shah, Rajiv Ratn, Zimmermann, Roger
Predicting the runtime complexity of a programming code is an arduous task. In fact, even for humans, it requires a subtle analysis and comprehensive knowledge of algorithms to predict time complexity with high fidelity, given any code. As per Turing
External link:
http://arxiv.org/abs/1911.01155