Search results

Report

LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning

Authors: Niu, Dantong; Sharma, Yuvan; Biamby, Giscard; Quenum, Jerome; Bai, Yutong; Shi, Baifeng; Darrell, Trevor; Herzig, Roei

In recent years, instruction-tuned Large Multimodal Models (LMMs) have been successful at several tasks, including image captioning and visual question answering; yet leveraging these models remains an open question for robotics. Prior LMMs for robot

External link: http://arxiv.org/abs/2406.11815

Report

When Do We Not Need Larger Vision Models?

Authors: Shi, Baifeng; Wu, Ziyang; Mao, Maolin; Wang, Xin; Darrell, Trevor

Scaling up the size of vision models has been the de facto standard to obtain more powerful visual representations. In this work, we discuss the point beyond which larger vision models are not necessary. First, we demonstrate the power of Scaling on

External link: http://arxiv.org/abs/2403.13043

Report

Humanoid Locomotion as Next Token Prediction

Authors: Radosavovic, Ilija; Zhang, Bike; Shi, Baifeng; Rajasegaran, Jathushan; Kamat, Sarthak; Darrell, Trevor; Sreenath, Koushil; Malik, Jitendra

We cast real-world humanoid control as a next token prediction problem, akin to predicting the next word in language. Our model is a causal transformer trained via autoregressive prediction of sensorimotor trajectories. To account for the multi-modal

External link: http://arxiv.org/abs/2402.19469

Report

Rethinking Patch Dependence for Masked Autoencoders

Authors: Fu, Letian; Lian, Long; Wang, Renhao; Shi, Baifeng; Wang, Xudong; Yala, Adam; Darrell, Trevor; Efros, Alexei A.; Goldberg, Ken

In this work, we re-examine inter-patch dependencies in the decoding mechanism of masked autoencoders (MAE). We decompose this decoding mechanism for masked patch reconstruction in MAE into self-attention and cross-attention. Our investigations sugge

External link: http://arxiv.org/abs/2401.14391

Report

Recursive Visual Programming

Authors: Ge, Jiaxin; Subramanian, Sanjay; Shi, Baifeng; Herzig, Roei; Darrell, Trevor

Visual Programming (VP) has emerged as a powerful framework for Visual Question Answering (VQA). By generating and executing bespoke code for each question, these methods demonstrate impressive compositional and reasoning capabilities, especially in

External link: http://arxiv.org/abs/2312.02249

Report

LLM-grounded Video Diffusion Models

Authors: Lian, Long; Shi, Baifeng; Yala, Adam; Darrell, Trevor; Li, Boyi

Text-conditioned diffusion models have emerged as a promising tool for neural video generation. However, current models still struggle with intricate spatiotemporal prompts and often generate restricted or incorrect motion. To address these limitatio

External link: http://arxiv.org/abs/2309.17444

Report

Mandarin Lombard Flavor Classification

Authors: Liu, Qingmu; Yang, Yuhong; Li, Baifeng; Chen, Hongyang; Tu, Weiping; Lin, Song

The Lombard effect refers to individuals' unconscious modulation of vocal effort in response to variations in the ambient noise levels, intending to enhance speech intelligibility. The impact of different decibel levels and types of background noise

External link: http://arxiv.org/abs/2309.07419

Report

EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful Sentences

Authors: Li, Baifeng; Liu, Qingmu; Yang, Yuhong; Chen, Hongyang; Tu, Weiping; Lin, Song

This study investigates the Lombard effect, where individuals adapt their speech in noisy environments. We introduce an enhanced Mandarin Lombard grid (EMALG) corpus with meaningful sentences, enhancing the Mandarin Lombard grid (MALG) corpus. EMALG

External link: http://arxiv.org/abs/2309.06858

Academic article

GPRC5A promotes lung colonization of esophageal squamous cell carcinoma

Authors: Hongyu Zhou, Licheng Tan, Baifeng Zhang, Dora Lai Wan Kwong, Ching Ngar Wong, Yu Zhang, Beibei Ru, Yingchen Lyu, Kin To Hugo Siu, Jie Luo, Yuma Yang, Qin Liu, Yixin Chen, Weiguang Zhang, Chaohui He, Peng Jiang, Yanru Qin, Beilei Liu, Xin-Yuan Guan

Published in: Nature Communications, Vol 15, Iss 1, Pp 1-19 (2024)

Abstract: Emerging evidence suggests that cancer cells may disseminate early, prior to the formation of traditional macro-metastases. However, the mechanisms underlying the seeding and transition of early disseminated cancer cells (DCCs) into metastat

External link: https://doaj.org/article/141f9175d6024219ae7a8318b41af7f6

Academic article

Direct anterior decompression in patients with ossification of the posterior longitudinal ligament significantly relieves short-segment spinal cord high signal

Authors: Zichuan Wu, Xuhong Zhang, Hanlin Song, Aochen Xu, Baifeng Sun, Chen Xu, Min Qi, Yang Liu

Published in: BMC Musculoskeletal Disorders, Vol 25, Iss 1, Pp 1-10 (2024)

Abstract: Background: In patients with ossification of the posterior longitudinal ligament of the cervical spine (OPLL), high spinal cord signal (HCS) is frequently observed in the spinal cord of the corresponding segment. However, studies on the diffe

External link: https://doaj.org/article/b582d296f1674586a1266e96c20f1b53
