Showing 1 - 10 of 158 for search: '"İşcen P"'
Web-scale visual entity recognition, the task of associating images with their corresponding entities within vast knowledge bases like Wikipedia, presents significant challenges due to the lack of clean, large-scale training data. In this paper, we p
External link:
http://arxiv.org/abs/2410.23676
Author:
D'Ambrosio, David B., Abeyruwan, Saminda, Graesser, Laura, Iscen, Atil, Amor, Heni Ben, Bewley, Alex, Reed, Barney J., Reymann, Krista, Takayama, Leila, Tassa, Yuval, Choromanski, Krzysztof, Coumans, Erwin, Jain, Deepali, Jaitly, Navdeep, Jaques, Natasha, Kataoka, Satoshi, Kuang, Yuheng, Lazic, Nevena, Mahjourian, Reza, Moore, Sherry, Oslund, Kenneth, Shankar, Anish, Sindhwani, Vikas, Vanhoucke, Vincent, Vesom, Grace, Xu, Peng, Sanketi, Pannag R.
Achieving human-level speed and performance on real world tasks is a north star for the robotics research community. This work takes a step towards that goal and presents the first learned robot agent that reaches amateur human-level performance in c
External link:
http://arxiv.org/abs/2408.03906
This work investigates the problem of instance-level image retrieval re-ranking with the constraint of memory efficiency, ultimately aiming to limit memory usage to 1KB per image. Departing from the prevalent focus on performance enhancements, this w
External link:
http://arxiv.org/abs/2408.03282
Author:
Sanoubari, Elaheh, Iscen, Atil, Takayama, Leila, Saliceti, Stefano, Cunningham, Corbin, Caluwaerts, Ken
In this paper, we investigate the use of 'prosody' (the musical elements of speech) as a communicative signal for intuitive human-robot interaction interfaces. Our approach, rooted in Research through Design (RtD), examines the application of prosody
External link:
http://arxiv.org/abs/2403.08144
In this paper, we address web-scale visual entity recognition, specifically the task of mapping a given query image to one of the 6 million existing entities in Wikipedia. One way of approaching a problem of such scale is using dual-encoder models (e
External link:
http://arxiv.org/abs/2403.02041
Author:
Hu, Ziniu, Iscen, Ahmet, Jain, Aashi, Kipf, Thomas, Yue, Yisong, Ross, David A., Schmid, Cordelia, Fathi, Alireza
This paper introduces SceneCraft, a Large Language Model (LLM) Agent converting text descriptions into Blender-executable Python scripts which render complex scenes with up to a hundred 3D assets. This process requires complex spatial planning and ar
External link:
http://arxiv.org/abs/2403.01248
Author:
D'Ambrosio, David B., Abelian, Jonathan, Abeyruwan, Saminda, Ahn, Michael, Bewley, Alex, Boyd, Justin, Choromanski, Krzysztof, Cortes, Omar, Coumans, Erwin, Ding, Tianli, Gao, Wenbo, Graesser, Laura, Iscen, Atil, Jaitly, Navdeep, Jain, Deepali, Kangaspunta, Juhana, Kataoka, Satoshi, Kouretas, Gus, Kuang, Yuheng, Lazic, Nevena, Lynch, Corey, Mahjourian, Reza, Moore, Sherry Q., Nguyen, Thinh, Oslund, Ken, Reed, Barney J, Reymann, Krista, Sanketi, Pannag R., Shankar, Anish, Sermanet, Pierre, Sindhwani, Vikas, Singh, Avi, Vanhoucke, Vincent, Vesom, Grace, Xu, Peng
We present a deep-dive into a real-world robotic learning system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and has the ability to precisely return the ball to desired targets. This system puts to
External link:
http://arxiv.org/abs/2309.03315
Author:
Hu, Ziniu, Iscen, Ahmet, Sun, Chen, Chang, Kai-Wei, Sun, Yizhou, Ross, David A, Schmid, Cordelia, Fathi, Alireza
In this paper, we propose an autonomous information seeking visual question answering framework, AVIS. Our method leverages a Large Language Model (LLM) to dynamically strategize the utilization of external tools and to investigate their outputs, the
External link:
http://arxiv.org/abs/2306.08129
Contrastive image-text models such as CLIP form the building blocks of many state-of-the-art systems. While they excel at recognizing common generic concepts, they still struggle on fine-grained entities which are rare, or even absent from the pre-tr
External link:
http://arxiv.org/abs/2306.07196
Author:
Caluwaerts, Ken, Iscen, Atil, Kew, J. Chase, Yu, Wenhao, Zhang, Tingnan, Freeman, Daniel, Lee, Kuang-Huei, Lee, Lisa, Saliceti, Stefano, Zhuang, Vincent, Batchelor, Nathan, Bohez, Steven, Casarini, Federico, Chen, Jose Enrique, Cortes, Omar, Coumans, Erwin, Dostmohamed, Adil, Dulac-Arnold, Gabriel, Escontrela, Alejandro, Frey, Erik, Hafner, Roland, Jain, Deepali, Jyenis, Bauyrjan, Kuang, Yuheng, Lee, Edward, Luu, Linda, Nachum, Ofir, Oslund, Ken, Powell, Jason, Reyes, Diego, Romano, Francesco, Sadeghi, Feresteh, Sloat, Ron, Tabanpour, Baruch, Zheng, Daniel, Neunert, Michael, Hadsell, Raia, Heess, Nicolas, Nori, Francesco, Seto, Jeff, Parada, Carolina, Sindhwani, Vikas, Vanhoucke, Vincent, Tan, Jie
Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is a growing interest in developing legged robots that move like their biological counterparts and show various agile skills to navigate complex
External link:
http://arxiv.org/abs/2305.14654