Search results

Report

WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

Authors: Li, Yiheng; Ge, Chongjian; Li, Chenran; Xu, Chenfeng; Tomizuka, Masayoshi; Tang, Chen; Ding, Mingyu; Zhan, Wei

We propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a language annotation dataset built on WOMD, with a focus on describing and reasoning interactions and intentions in driving scenarios. Previous language datasets primarily captured int

External link: http://arxiv.org/abs/2407.04281

Report

HGNET: A Hierarchical Feature Guided Network for Occupancy Flow Field Prediction

Authors: Chen, Zhan; Tang, Chen; Xiong, Lu

Predicting the motion of multiple traffic participants has always been one of the most challenging tasks in autonomous driving. The recently proposed occupancy flow field prediction method has been shown to be a more effective and scalable representation

External link: http://arxiv.org/abs/2407.01097

Report

Residual-MPPI: Online Policy Customization for Continuous Control

Authors: Wang, Pengcheng; Li, Chenran; Weaver, Catherine; Kawamoto, Kenta; Tomizuka, Masayoshi; Tang, Chen; Zhan, Wei

Policies learned through Reinforcement Learning (RL) and Imitation Learning (IL) have demonstrated significant potential in achieving advanced performance in continuous control tasks. However, in real-world environments, it is often necessary to furt

External link: http://arxiv.org/abs/2407.00898

Report

BioMNER: A Dataset for Biomedical Method Entity Recognition

Authors: Tang, Chen; Yang, Bohao; Zhao, Kun; Lv, Bo; Xiao, Chenghao; Guerin, Frank; Lin, Chenghua

Named entity recognition (NER) stands as a fundamental and pivotal task within the realm of Natural Language Processing. Particularly within the domain of Biomedical Method NER, this task presents notable challenges, stemming from the continual influ

External link: http://arxiv.org/abs/2406.20038

Report

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

Authors: Chen, Lei; Meng, Yuan; Tang, Chen; Ma, Xinzhu; Jiang, Jingyan; Wang, Xin; Wang, Zhi; Zhu, Wenwu

Recent advancements in diffusion models, particularly the trend of architectural transformation from UNet-based Diffusion to Diffusion Transformer (DiT), have significantly improved the quality and scalability of image synthesis. Despite the incredib

External link: http://arxiv.org/abs/2406.17343

Report

SimsChat: A Customisable Persona-Driven Role-Playing Agent

Authors: Yang, Bohao; Liu, Dong; Tang, Chen; Xiao, Chenghao; Zhao, Kun; Li, Chao; Yuan, Lin; Yang, Guang; Huang, Lanxiao; Lin, Chenghua

Large Language Models (LLMs) possess the remarkable capability to understand human instructions and generate high-quality text, enabling them to act as agents that simulate human behaviours. This capability allows LLMs to emulate human beings in a mo

External link: http://arxiv.org/abs/2406.17962

Report

X-ray Made Simple: Radiology Report Generation and Evaluation with Layman's Terms

Authors: Zhao, Kun; Xiao, Chenghao; Tang, Chen; Yang, Bohao; Ye, Kai; Moubayed, Noura Al; Zhan, Liang; Lin, Chenghua

Radiology Report Generation (RRG) has achieved significant progress with the advancements of multimodal generative models. However, the evaluation in the domain suffers from a lack of fair and robust metrics. We reveal that, high performance on RRG w

External link: http://arxiv.org/abs/2406.17911

Report

MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention

Authors: Chen, Yuxin; Tang, Chen; Li, Chenran; Tian, Ran; Stone, Peter; Tomizuka, Masayoshi; Zhan, Wei

Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's exec

External link: http://arxiv.org/abs/2406.16258

Report

Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox

Authors: Liu, Yijun; Meng, Yuan; Wu, Fang; Peng, Shenhao; Yao, Hang; Guan, Chaoyu; Tang, Chen; Ma, Xinzhu; Wang, Zhi; Zhu, Wenwu

Large language models (LLMs) have exhibited exciting progress in multiple scenarios, while the huge computational demands hinder their deployments in lots of real-world applications. As an effective means to reduce memory footprint and inference cost

External link: http://arxiv.org/abs/2406.12928

Report

STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting

Authors: Chai, Zenghao; Tang, Chen; Wong, Yongkang; Kankanhalli, Mohan

The creation of 4D avatars (i.e., animated 3D avatars) from text description typically uses text-to-image (T2I) diffusion models to synthesize 3D avatars in the canonical space and subsequently applies animation with target motions. However, such an

External link: http://arxiv.org/abs/2406.04629
