Showing 1 - 10 of 169 for search: '"Zhou, YiChao"'
Author:
Bochkovskii, Aleksei, Delaunoy, Amaël, Germain, Hugo, Santos, Marcel, Zhou, Yichao, Richter, Stephan R., Koltun, Vladlen
We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency details. The predictions are metric, with absolute scale, witho
External link:
http://arxiv.org/abs/2410.02073
Author:
Hwang, EunJeong, Zhou, Yichao, Wendt, James Bradley, Gunel, Beliz, Vo, Nguyen, Xie, Jing, Tata, Sandeep
Large language models (LLMs) often struggle with processing extensive input contexts, which can lead to redundant, inaccurate, or incoherent summaries. Recent methods have used unstructured memory to incrementally process these contexts, but they sti
External link:
http://arxiv.org/abs/2407.15021
No existing dataset adequately tests how well language models can incrementally update entity summaries - a crucial ability as these models rapidly advance. The Incremental Entity Summarization (IES) task is vital for maintaining accurate, up-to-date
External link:
http://arxiv.org/abs/2406.05079
Author:
Gunel, Beliz, Wendt, James B., Xie, Jing, Zhou, Yichao, Vo, Nguyen, Fisher, Zachary, Tata, Sandeep
Users often struggle with decision-making between two options (A vs B), as it usually requires time-consuming research across multiple web pages. We propose STRUM-LLM that addresses this challenge by generating attributed, structured, and helpful con
External link:
http://arxiv.org/abs/2403.19710
There has been significant attention devoted to the effectiveness of various domains, such as semi-supervised learning, contrastive learning, and meta-learning, in enhancing the performance of methods for noisy label learning (NLL) tasks. However, mo
External link:
http://arxiv.org/abs/2312.09505
This study investigates identity-preserving image synthesis, an intriguing task in image generation that seeks to maintain a subject's identity while adding a personalized, stylistic touch. Traditional methods, such as Textual Inversion and DreamBoot
External link:
http://arxiv.org/abs/2312.02663
Many business workflows require extracting important fields from form-like documents (e.g. bank statements, bills of lading, purchase orders, etc.). Recent techniques for automating this task work well only when trained with large datasets. In this w
External link:
http://arxiv.org/abs/2212.10047
Understanding visually-rich business documents to extract structured data and automate business workflows has been receiving attention both in academia and industry. Although recent multi-modal language models have achieved impressive results, we fin
External link:
http://arxiv.org/abs/2211.15421
A key bottleneck in building automatic extraction models for visually rich documents like invoices is the cost of acquiring the several thousand high-quality labeled documents that are needed to train a model with acceptable accuracy. We propose Sele
External link:
http://arxiv.org/abs/2210.16391
Author:
Li, Xiaoguang, Zhou, Yichao, Yin, Hongxia, Zhao, Pengfei, Lv, Han, Tang, Ruowei, Qin, Yating, Zhuo, Li, Wang, Suyu, Wang, Zhenchang
Published in:
Biomedical Signal Processing and Control, December 2024, 98