Showing 1 - 10
of 588
for the search: '"Hoi, Steven"'
Author:
Tiong, Anthony Meng Huat, Zhao, Junqi, Li, Boyang, Li, Junnan, Hoi, Steven C. H., Xiong, Caiming
Vision-language (VL) models, pretrained on colossal image-text datasets, have attained broad VL competence that is difficult to evaluate. A common belief is that a small number of VL skills underlie the variety of VL tests. In this paper, we perform
External link:
http://arxiv.org/abs/2404.02415
Author:
Pham, Quang, Do, Giang, Nguyen, Huy, Nguyen, TrungTin, Liu, Chenghao, Sartipi, Mina, Nguyen, Binh T., Ramasamy, Savitha, Li, Xiaoli, Hoi, Steven, Ho, Nhat
Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width. However, effective training of SMoE has proven to be challenging due to the representation coll
External link:
http://arxiv.org/abs/2402.02526
Author:
Yu, Xingtong, Fang, Yuan, Liu, Zemin, Wu, Yuxia, Wen, Zhihao, Bo, Jianyuan, Zhang, Xinming, Hoi, Steven C. H.
Graph representation learning, a critical step in graph-centric tasks, has seen significant advancements. Earlier techniques often operate in an end-to-end setting, where performance heavily relies on the availability of ample labeled data. This cons
External link:
http://arxiv.org/abs/2402.01440
Author:
Do, Giang, Le, Khiem, Pham, Quang, Nguyen, TrungTin, Doan, Thanh-Nam, Nguyen, Binh T., Liu, Chenghao, Ramasamy, Savitha, Li, Xiaoli, Hoi, Steven
By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models. Recent findings suggest that fixing the routers can achieve competitive performance by alleviating the collapsing
External link:
http://arxiv.org/abs/2312.07035
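The two SMoE snippets above both hinge on the router that sends each input token to only a few experts. As an illustration only (not the routing scheme of either paper), a minimal top-k softmax router can be sketched as follows; `W_router` is a hypothetical learned routing matrix:

```python
import numpy as np

def topk_route(x, W_router, k=2):
    """Sketch of sparse MoE token routing: each token is assigned to its
    top-k experts by router logits, and a softmax over only the selected
    logits gives the combination weights.

    x        : (tokens, d_model) token representations
    W_router : (d_model, n_experts) hypothetical routing matrix
    """
    logits = x @ W_router                       # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]  # indices of k largest logits
    sel = np.take_along_axis(logits, topk, axis=-1)
    # numerically stable softmax over the k selected logits only
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)          # weights sum to 1 per token
    return topk, w
```

Because the softmax is taken over the selected logits only, each token's expert weights always sum to 1 regardless of how many experts exist; "fixing the routers", as the last snippet mentions, corresponds to freezing `W_router` during training.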
With the rise of powerful closed-source LLMs (ChatGPT, GPT-4), there is increasing interest in distilling the capabilities of closed-source LLMs into smaller open-source LLMs. Previous distillation methods usually prompt ChatGPT to generate a set of
External link:
http://arxiv.org/abs/2310.18628
Automatic program repair (APR) is crucial to reduce manual debugging efforts for developers and improve software reliability. While conventional search-based techniques typically rely on heuristic rules or a redundancy assumption to mine fix patterns
External link:
http://arxiv.org/abs/2309.06057
Author:
Liu, Chenghao, Yang, Wenzhuo, Mittal, Himanshu, Singh, Manpreet, Sahoo, Doyen, Hoi, Steven C. H.
We introduce PyRCA, an open-source Python machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps). It provides a holistic framework to uncover the complicated metric causal dependencies and automati
External link:
http://arxiv.org/abs/2306.11417
Dynamic Time Warping (DTW) has become the pragmatic choice for measuring distance between time series. However, it suffers from unavoidable quadratic time complexity when the optimal alignment matrix needs to be computed exactly. This hinders its use
External link:
http://arxiv.org/abs/2306.00620
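The quadratic cost that the DTW snippet above refers to comes from filling the full alignment matrix. A minimal sketch of the classic exact computation (not the approximation proposed in the linked paper) makes this explicit:

```python
def dtw_distance(a, b):
    """Classic exact Dynamic Time Warping distance between two 1-D series.

    Fills the full (len(a)+1) x (len(b)+1) alignment matrix, hence the
    O(len(a) * len(b)) time complexity mentioned in the abstract.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = DTW distance between prefixes a[:i] and b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

For example, `dtw_distance([0, 0, 1], [0, 1])` is 0.0, since the warping path can repeat the first element of the shorter series; computing this exactly for two series of length n requires on the order of n² cell updates.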
Author:
Bui, Nghi D. Q., Le, Hung, Wang, Yue, Li, Junnan, Gotmare, Akhilesh Deepak, Hoi, Steven C. H.
Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstrated remarkable potential in tackling these tasks by leverag
External link:
http://arxiv.org/abs/2306.00029
Subject-driven text-to-image generation models create novel renditions of an input subject based on text prompts. Existing models suffer from lengthy fine-tuning and difficulties preserving the subject fidelity. To overcome these limitations, we intr
External link:
http://arxiv.org/abs/2305.14720