Showing 1 - 10
of 238
for search: '"Hu, Tianyang"'
In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter…
External link:
http://arxiv.org/abs/2406.02847
Author:
Wang, Zhiwei, Wang, Yunji, Zhang, Zhongwang, Zhou, Zhangchen, Jin, Hui, Hu, Tianyang, Sun, Jiacheng, Li, Zhenguo, Zhang, Yaoyu, Xu, Zhi-Qin John
Large language models have consistently struggled with complex reasoning tasks, such as mathematical problem-solving. Investigating the internal reasoning mechanisms of these models can help us design better model architectures and training strategies…
External link:
http://arxiv.org/abs/2405.15302
Published in:
Phys. Rev. D 109, 114024 (2024)
We investigate the gravitational form factors of charmonium. Our method is based on a Hamiltonian formalism on the light front known as basis light-front quantization. The charmonium mass spectrum and light-front wave functions were obtained from diagonalizing…
External link:
http://arxiv.org/abs/2404.06259
Author:
Xue, Shuchen, Liu, Zhaoqiang, Chen, Fei, Zhang, Shifeng, Hu, Tianyang, Xie, Enze, Li, Zhenguo
Diffusion probabilistic models (DPMs) have shown remarkable performance in high-resolution image synthesis, but their sampling efficiency still leaves much to be desired due to the typically large number of sampling steps. Recent advancements in high-order numerical…
External link:
http://arxiv.org/abs/2402.17376
As a dominant force in text-to-image generation tasks, Diffusion Probabilistic Models (DPMs) face a critical challenge in controllability, struggling to adhere strictly to complex, multi-faceted instructions. In this work, we aim to address this alignment…
External link:
http://arxiv.org/abs/2402.16305
Author:
Ma, Jiajun, Xue, Shuchen, Hu, Tianyang, Wang, Wenjia, Liu, Zhaoqiang, Li, Zhenguo, Ma, Zhi-Ming, Kawaguchi, Kenji
With the incorporation of the UNet architecture, diffusion probabilistic models have become a dominant force in image generation tasks. One key design in UNet is the skip connections between the encoder and decoder blocks. Although skip connections have…
External link:
http://arxiv.org/abs/2402.15170
Author:
Gao, Yihang, Zheng, Chuanyang, Xie, Enze, Shi, Han, Hu, Tianyang, Li, Yu, Ng, Michael K., Li, Zhenguo, Liu, Zhaoqiang
Besides natural language processing, transformers exhibit extraordinary performance in solving broader applications, including scientific computing and computer vision. Previous works try to explain this from the expressive power and capability perspectives…
External link:
http://arxiv.org/abs/2402.13572
Guidance in conditional diffusion generation is of great importance for sample quality and controllability. However, existing guidance schemes leave much to be desired. On one hand, mainstream methods such as classifier guidance and classifier-free guidance…
External link:
http://arxiv.org/abs/2310.11311
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion models the latent space induced by an encoder and generates images through a paired decoder. Although the selection of the latent space…
External link:
http://arxiv.org/abs/2307.08283
Energy-Based Models (EBMs) have been widely used for generative modeling. Contrastive Divergence (CD), a prevailing training objective for EBMs, requires sampling from the EBM with Markov Chain Monte Carlo methods (MCMCs), which leads to an irreconcilable…
External link:
http://arxiv.org/abs/2307.01668