Showing 1 - 4 of 4 for search: '"Roberts, Josselin Somerville"'
Authors:
Roberts, Josselin Somerville, Lee, Tony, Wong, Chi Heem, Yasunaga, Michihiro, Mai, Yifan, Liang, Percy
We introduce Image2Struct, a benchmark to evaluate vision-language models (VLMs) on extracting structure from images. Our benchmark 1) captures real-world use cases, 2) is fully automatic and does not require human judgment, and 3) is based on a rene
External link:
http://arxiv.org/abs/2410.22456
Authors:
Lee, Tony, Tu, Haoqin, Wong, Chi Heem, Zheng, Wenhao, Zhou, Yiyang, Mai, Yifan, Roberts, Josselin Somerville, Yasunaga, Michihiro, Yao, Huaxiu, Xie, Cihang, Liang, Percy
Current benchmarks for assessing vision-language models (VLMs) often focus on their perception or problem-solving capabilities and neglect other critical aspects such as fairness, multilinguality, or toxicity. Furthermore, they differ in their evalua
External link:
http://arxiv.org/abs/2410.07112
Published in:
ICRA 2024
Multi-task reinforcement learning could enable robots to scale across a wide variety of manipulation tasks in homes and workplaces. However, generalizing from one task to another and mitigating negative task interference still remains a challenge. Ad
External link:
http://arxiv.org/abs/2309.08776
Published in:
IEEE IRC 2023
Conventional wheeled robots are unable to traverse scientifically interesting, but dangerous, cave environments. Multi-limbed climbing robot designs, such as ReachBot, are able to grasp irregular surface features and execute climbing motions to overc
External link:
http://arxiv.org/abs/2309.05139