Search results - "Yao Benjamin"

Report

Bringing Multimodality to Amazon Visual Search System

Authors: Zhu, Xinliang, Huang, Michael, Ding, Han, Yang, Jinyu, Chen, Kelvin, Zhou, Tao, Neiman, Tal, Xie, Ouye, Tran, Son, Yao, Benjamin, Gray, Doug, Bindal, Anuj, Dhua, Arnab

Published in: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Image to image matching has been well studied in the computer vision community. Previous studies mainly focus on training a deep metric learning model matching visual patterns between the query image and gallery images. In this study, we show that pu

External link: http://arxiv.org/abs/2412.13364

Report

Diffusion Models For Multi-Modal Generative Modeling

Authors: Chen, Changyou, Ding, Han, Sisman, Bunyamin, Xu, Yi, Xie, Ouye, Yao, Benjamin Z., Tran, Son Dinh, Zeng, Belinda

Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to a single-generation modeling. Can we generalize diffusion models with the ability of multi-mod

External link: http://arxiv.org/abs/2407.17571

Report

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

Authors: Swetha, Sirnam, Yang, Jinyu, Neiman, Tal, Rizve, Mamshad Nayeem, Tran, Son, Yao, Benjamin, Chilimbi, Trishul, Shah, Mubarak

Recent advancements in Multimodal Large Language Models (MLLMs) have revolutionized the field of vision-language understanding by integrating visual perception capabilities into Large Language Models (LLMs). The prevailing trend in this field involve

External link: http://arxiv.org/abs/2407.13851

Report

Open Vocabulary Multi-Label Video Classification

Authors: Gupta, Rohit, Rizve, Mamshad Nayeem, Unnikrishnan, Jayakrishnan, Tawari, Ashish, Tran, Son, Shah, Mubarak, Yao, Benjamin, Chilimbi, Trishul

Pre-trained vision-language models (VLMs) have enabled significant progress in open vocabulary computer vision tasks such as image classification, object detection and image segmentation. Some recent works have focused on extending VLMs to open vocab

External link: http://arxiv.org/abs/2407.09073

Report

VidLA: Video-Language Alignment at Scale

Authors: Rizve, Mamshad Nayeem, Fei, Fan, Unnikrishnan, Jayakrishnan, Tran, Son, Yao, Benjamin Z., Zeng, Belinda, Shah, Mubarak, Chilimbi, Trishul

In this paper, we propose VidLA, an approach for video-language alignment at scale. There are two major limitations of previous video-language alignment approaches. First, they do not capture both short-range and long-range temporal dependencies and

External link: http://arxiv.org/abs/2403.14870

Report

PersonaPKT: Building Personalized Dialogue Agents via Parameter-efficient Knowledge Transfer

Authors: Han, Xu, Guo, Bin, Jung, Yoon, Yao, Benjamin, Zhang, Yu, Liu, Xiaohu, Guo, Chenlei

Personalized dialogue agents (DAs) powered by large pre-trained language models (PLMs) often rely on explicit persona descriptions to maintain personality consistency. However, such descriptions may not always be available or may pose privacy concern

External link: http://arxiv.org/abs/2306.08126

Report

KEPLET: Knowledge-Enhanced Pretrained Language Model with Topic Entity Awareness

Authors: Li, Yichuan, Han, Jialong, Lee, Kyumin, Ma, Chengyuan, Yao, Benjamin, Liu, Derek

In recent years, Pre-trained Language Models (PLMs) have shown their superiority by pre-training on unstructured text corpus and then fine-tuning on downstream tasks. On entity-rich textual resources like Wikipedia, Knowledge-Enhanced PLMs (KEPLMs) i

External link: http://arxiv.org/abs/2305.01810

Report

Large-scale Hybrid Approach for Predicting User Satisfaction with Conversational Agents

Authors: Park, Dookun, Yuan, Hao, Kim, Dongmin, Zhang, Yinglei, Spyros, Matsoukas, Kim, Young-Bum, Sarikaya, Ruhi, Guo, Edward, Ling, Yuan, Quinn, Kevin, Hung, Pham, Yao, Benjamin, Lee, Sungjin

Measuring user satisfaction level is a challenging task, and a critical component in developing large-scale conversational agent systems serving the needs of real users. A widely used approach to tackle this is to collect human annotation data and u

External link: http://arxiv.org/abs/2006.07113

Report

Knowledge Distillation from Internal Representations

Authors: Aguilar, Gustavo, Ling, Yuan, Zhang, Yu, Yao, Benjamin, Fan, Xing, Guo, Chenlei

Knowledge distillation is typically conducted by training a small model (the student) to mimic a large and cumbersome model (the teacher). The idea is to compress the knowledge from the teacher by using its output probabilities as soft-labels to opti

External link: http://arxiv.org/abs/1910.03723
