Search results

Report

Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver

Authors: Zhang, Zeren; Cheng, Jo-Ku; Deng, Jingyang; Tian, Lu; Ma, Jinwen; Qin, Ziran; Zhang, Xiaokai; Zhu, Na; Leng, Tuo

Mathematical reasoning remains an ongoing challenge for AI models, especially for geometry problems that require both linguistic and visual signals. As the vision encoders of most MLLMs are trained on natural scenes, they often struggle to understand

External link: http://arxiv.org/abs/2409.04214

Report

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Authors: Wang, Jing; Ma, Ao; Feng, Jiasong; Leng, Dawei; Yin, Yuhui; Liang, Xiaodan

The global self-attention mechanism in diffusion transformers involves redundant computation due to the sparse and redundant nature of visual information, and the attention map of tokens within a spatial window shows significant similarity. To addres

External link: http://arxiv.org/abs/2409.04005

Report

QHDOPT: A Software for Nonlinear Optimization with Quantum Hamiltonian Descent

Authors: Kushnir, Samuel; Leng, Jiaqi; Peng, Yuxiang; Fan, Lei; Wu, Xiaodi

We develop an open-source, end-to-end software package (named QHDOPT), which can solve nonlinear optimization problems using the quantum Hamiltonian descent (QHD) algorithm. QHDOPT offers an accessible interface and automatically maps tasks to various suppor

External link: http://arxiv.org/abs/2409.03121

Report

Vortex: Efficient Sample-Free Dynamic Tensor Program Optimization via Hardware-aware Strategy Space Hierarchization

Authors: Zhou, Yangjie; Zhu, Honglin; Qiu, Qian; Cui, Weihao; Liu, Zihan; Guo, Cong; Feng, Siyuan; Meng, Jintao; Lan, Haidong; Leng, Jingwen; Zhu, Wenxi; Deng, Minwen

Dynamic-shape deep neural networks (DNNs) are rapidly evolving, attracting attention for their ability to handle variable input sizes in real-time applications. However, existing compilation optimization methods for such networks often rely heavily o

External link: http://arxiv.org/abs/2409.01075

Report

PR2: A Physics- and Photo-realistic Testbed for Embodied AI and Humanoid Robots

Authors: Liu, Hangxin; Xie, Qi; Zhang, Zeyu; Yuan, Tao; Leng, Xiaokun; Sun, Lining; Zhu, Song-Chun; Zhang, Jingwen; He, Zhicheng; Su, Yao

This paper presents the development of a Physics-realistic and Photo-realistic humanoid robot testbed, PR2, to facilitate collaborative research between Embodied Artificial Intelligence (Embodied AI) and robotics. PR2 offers high-quality

External link: http://arxiv.org/abs/2409.01559

Report

SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models

Authors: Shen, Shuaijie; Wang, Chao; Huang, Renzhuo; Zhong, Yan; Guo, Qinghai; Lu, Zhichao; Zhang, Jianguo; Leng, Luziwei

Known for their low energy consumption, spiking neural networks (SNNs) have gained a lot of attention within the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) for vision tasks, they are rarely used for l

External link: http://arxiv.org/abs/2408.14909

Report

Multi-watt long-wavelength infrared femtosecond lasers and resonant enamel ablation

Authors: Yang, Xuemei; Zhang, Dunxiang; Wang, Weizhe; Tian, Kan; He, Linzhen; Guo, Jinmiao; Hu, Bo; Pu, Tao; Li, Wenlong; Sun, Shiran; Ding, Chunmei; Wu, Han; Li, Kenkai; Peng, Yujie; Li, Jianshu; Leng, Yuxin; Liang, Houkun

High-power broadband tunable long-wavelength infrared (LWIR) femtosecond lasers operating at fingerprint wavelengths of 7-14 μm hold significant promise across a range of applications, including molecular hyperspectral imaging, strong-field light

External link: http://arxiv.org/abs/2408.13789

Report

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities

Authors: Wang, Bin; Xie, Chunyu; Leng, Dawei; Yin, Yuhui

In the field of multimodal large language models (MLLMs), common methods typically involve unfreezing the language model during training to foster profound visual understanding. However, the fine-tuning of such models with vision-language data often

External link: http://arxiv.org/abs/2408.12902

Report

Generating Synthetic Fair Syntax-agnostic Data by Learning and Distilling Fair Representation

Authors: Sikder, Md Fahim; Ramachandranpillai, Resmi; de Leng, Daniel; Heintz, Fredrik

Data fairness is a crucial topic due to the recent wide usage of AI-powered applications. Most real-world data is filled with human or machine biases, and when such data is used to train AI models, there is a chance that the model will

External link: http://arxiv.org/abs/2408.10755

Report

CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models

Authors: Yu, Linhao; Leng, Yongqi; Huang, Yufei; Wu, Shang; Liu, Haixin; Ji, Xinmeng; Zhao, Jiahui; Song, Jinwang; Cui, Tingting; Cheng, Xiaoqing; Liu, Tao; Xiong, Deyi

How would a large language model (LLM) respond in ethically relevant contexts? In this paper, we curate a large benchmark, CMoralEval, for morality evaluation of Chinese LLMs. The data sources of CMoralEval are two-fold: 1) a Chinese TV program discuss

External link: http://arxiv.org/abs/2408.09819
