Showing 1 - 10 of 278 for search: '"Blaschko, Matthew"'
Author:
Liu, Enshu, Zhu, Junyi, Lin, Zinan, Ning, Xuefei, Blaschko, Matthew B., Yan, Shengen, Dai, Guohao, Yang, Huazhong, Wang, Yu
The rapid advancement of large language models (LLMs) has led to architectures with billions to trillions of parameters, posing significant deployment challenges due to their substantial demands on memory, processing power, and energy consumption. …
External link:
http://arxiv.org/abs/2407.00945
Author:
Zhu, Junyi, Liu, Shuochen, Yu, Yu, Tang, Bo, Yan, Yibo, Li, Zhiyu, Xiong, Feiyu, Xu, Tong, Blaschko, Matthew B.
Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to …
External link:
http://arxiv.org/abs/2406.16069
Author:
Ning, Xuefei, Wang, Zifu, Li, Shiyao, Lin, Zinan, Yao, Peiran, Fu, Tianyu, Blaschko, Matthew B., Dai, Guohao, Yang, Huazhong, Wang, Yu
Teaching to improve student models (e.g., knowledge distillation) is an extensively studied methodology in LLMs. However, for humans, teaching not only improves students but also improves teachers. We ask: Can LLMs also learn by teaching (LbT)? …
External link:
http://arxiv.org/abs/2406.14629
Author:
Tian, Chang, Blaschko, Matthew B., Yin, Wenpeng, Xing, Mingzhe, Yue, Yinliang, Moens, Marie-Francine
Fine-grained category discovery using only coarse-grained supervision is a cost-effective yet challenging task. Previous training methods focus on aligning query samples with positive samples and distancing them from negatives. They often neglect …
External link:
http://arxiv.org/abs/2406.13103
Author:
Van Landeghem, Jordy, Maity, Subhajit, Banerjee, Ayan, Blaschko, Matthew, Moens, Marie-Francine, Lladós, Josep, Biswas, Sanket
This work explores knowledge distillation (KD) for visually-rich document (VRD) applications such as document layout analysis (DLA) and document image classification (DIC). While VRD research is dependent on increasingly sophisticated and cumbersome …
External link:
http://arxiv.org/abs/2406.08226
Author:
Hamed, Omar, Bakkali, Souhail, Moens, Marie-Francine, Blaschko, Matthew, Van Landeghem, Jordy
This work addresses the need for a balanced approach between performance and efficiency in scalable production environments for visually-rich document understanding (VDU) tasks. Currently, there is a reliance on large document foundation models that …
External link:
http://arxiv.org/abs/2405.12705
Learning from multiple modalities, such as audio and video, offers opportunities for leveraging complementary information, enhancing robustness, and improving contextual understanding and performance. However, combining such modalities presents …
External link:
http://arxiv.org/abs/2405.07930
Computed Tomography (CT) is pivotal in industrial quality control and medical diagnostics. Sparse-view CT, offering reduced ionizing radiation, faces challenges due to its under-sampled nature, leading to ill-posed reconstruction problems. …
External link:
http://arxiv.org/abs/2405.02509
Author:
Liu, Enshu, Zhu, Junyi, Lin, Zinan, Ning, Xuefei, Blaschko, Matthew B., Yekhanin, Sergey, Yan, Shengen, Dai, Guohao, Yang, Huazhong, Wang, Yu
Diffusion Models (DM) and Consistency Models (CM) are two popular types of generative models with good generation quality on various tasks. When training DM and CM, intermediate weight checkpoints are not fully utilized and only the last converged …
External link:
http://arxiv.org/abs/2404.02241
The last couple of years have witnessed tremendous progress in self-supervised learning (SSL), the success of which can be attributed to the introduction of useful inductive biases in the learning process to learn meaningful visual representations …
External link:
http://arxiv.org/abs/2402.14957