Výsledky vyhledávání - "Zhong, Yiqiao"

Report

Assessing and improving reliability of neighbor embedding methods: a map-continuity perspective

Autor: Liu, Zhexuan, Ma, Rong, Zhong, Yiqiao

Visualizing high-dimensional data is an important routine for understanding biomedical data and interpreting deep learning models. Neighbor embedding methods, such as t-SNE, UMAP, and LargeVis, among others, are a family of popular visualization meth

Externí odkaz: http://arxiv.org/abs/2410.16608

Zobrazit plný text záznamu

Report

Out-of-distribution generalization via composition: a lens through induction heads in Transformers

Autor: Song, Jiajun, Xu, Zhuoyan, Zhong, Yiqiao

Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving novel tasks often with a few demonstrations in the prompt. These tasks require the models to generalize on distributions different from those from training data -- wh

Externí odkaz: http://arxiv.org/abs/2408.09503

Zobrazit plný text záznamu

Report

Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity

Autor: Uppaal, Rheeya, Dey, Apratim, He, Yiting, Zhong, Yiqiao, Hu, Junjie

Recent alignment algorithms such as direct preference optimization (DPO) have been developed to improve the safety of large language models (LLMs) by training these models to match human behaviors exemplified by preference data. However, these method

Externí odkaz: http://arxiv.org/abs/2405.13967

Zobrazit plný text záznamu

Report

How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes

Autor: Bhasin, Harmon, Ossowski, Timothy, Zhong, Yiqiao, Hu, Junjie

Large language models (LLM) have recently shown the extraordinary ability to perform unseen tasks based on few-shot examples provided as text, also known as in-context learning (ICL). While recent works have attempted to understand the mechanisms dri

Externí odkaz: http://arxiv.org/abs/2404.03558

Zobrazit plný text záznamu

Report

Uncovering hidden geometry in Transformers via disentangling position and context

Autor: Song, Jiajun, Zhong, Yiqiao

Transformers are widely used to extract semantic meanings from input tokens, yet they usually operate as black-box models. In this paper, we present a simple yet informative decomposition of hidden states (or embeddings) of trained transformers into

Externí odkaz: http://arxiv.org/abs/2310.04861

Zobrazit plný text záznamu

Report

Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage

Autor: Gui, Yu, Ma, Cong, Zhong, Yiqiao

We investigate the role of projection heads, also known as projectors, within the encoder-projector framework (e.g., SimCLR) used in contrastive learning. We aim to demystify the observed phenomenon where representations learned before projectors out

Externí odkaz: http://arxiv.org/abs/2306.03335

Zobrazit plný text záznamu

Report

Tractability from overparametrization: The example of the negative perceptron

Autor: Montanari, Andrea, Zhong, Yiqiao, Zhou, Kangjie

In the negative perceptron problem we are given $n$ data points $({\boldsymbol x}_i,y_i)$, where ${\boldsymbol x}_i$ is a $d$-dimensional vector and $y_i\in\{+1,-1\}$ is a binary label. The data are not linearly separable and hence we content ourselv

Externí odkaz: http://arxiv.org/abs/2110.15824

Zobrazit plný text záznamu

Report

The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training

Autor: Montanari, Andrea, Zhong, Yiqiao

Modern neural networks are often operated in a strongly overparametrized regime: they comprise so many parameters that they can interpolate the training set, even if actual labels are replaced by purely random ones. Despite this, they achieve good pr

Externí odkaz: http://arxiv.org/abs/2007.12826

Zobrazit plný text záznamu

Report

A Selective Overview of Deep Learning

Autor: Fan, Jianqing, Ma, Cong, Zhong, Yiqiao

Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a lo

Externí odkaz: http://arxiv.org/abs/1904.05526

Zobrazit plný text záznamu

Report

Robust high dimensional factor models with applications to statistical machine learning

Autor: Fan, Jianqing, Wang, Kaizheng, Zhong, Yiqiao, Zhu, Ziwei

Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance. As data are collected at

Externí odkaz: http://arxiv.org/abs/1808.03889

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání