Search results

Report

AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding

Authors: Zhang, Xing, Gu, Jiaxi, Zhao, Haoyu, Wang, Shicong, Xu, Hang, Pei, Renjing, Xu, Songcen, Wu, Zuxuan, Jiang, Yu-Gang

Temporal Video Grounding (TVG) aims to localize a moment from an untrimmed video given the language description. Since the annotation of TVG is labor-intensive, TVG under limited supervision has received attention in recent years. The great success o

External link: http://arxiv.org/abs/2406.07091

Report

HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction

Authors: Zhao, Haoyu, Zhao, Xingyue, Zhu, Lingting, Zheng, Weixi, Xu, Yongchao

Robot-assisted minimally invasive surgery benefits from enhancing dynamic scene reconstruction, as it improves surgical outcomes. While Neural Radiance Fields (NeRF) have been effective in scene reconstruction, their slow inference speeds and lengthy

External link: http://arxiv.org/abs/2405.17872

Report

LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding

Authors: Zhao, Haoyu, Ge, Wenhang, Chen, Ying-cong

Visual grounding is an essential tool that links user-provided text queries with query-specific regions within an image. Despite advancements in visual grounding models, their ability to comprehend complex queries remains limited. To overcome this li

External link: http://arxiv.org/abs/2405.17104

Report

MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation

Authors: Zhao, Haoyu, Dong, Wenhui, Yu, Rui, Zhao, Zhou, Bo, Du, Xu, Yongchao

The task of single-source domain generalization (SDG) in medical image segmentation is crucial due to frequent domain shifts in clinical image datasets. To address the challenge of poor generalization across different domains, we introduce a Plug-and

External link: http://arxiv.org/abs/2403.11689

Report

WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Authors: Zhao, Haoyu, Gu, Yuliang, Zhao, Zhou, Du, Bo, Xu, Yongchao, Yu, Rui

In clinical examinations and diagnoses, low-dose computed tomography (LDCT) is crucial for minimizing health risks compared with normal-dose computed tomography (NDCT). However, reducing the radiation dose compromises the signal-to-noise ratio, leadi

External link: http://arxiv.org/abs/2403.11672

Report

Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates

Authors: Lyu, Kaifeng, Zhao, Haoyu, Gu, Xinran, Yu, Dingli, Goyal, Anirudh, Arora, Sanjeev

Public LLMs such as the Llama 2-Chat have driven huge activity in LLM research. These models underwent alignment training and were considered safe. Recently Qi et al. (2023) reported that even benign fine-tuning (e.g., on seemingly safe datasets) can

External link: http://arxiv.org/abs/2402.18540

Academic article

Online Mindfulness-Based Cognitive Behavioral Therapy Intervention for Youth With Major Depressive Disorders: Randomized Controlled Trial

Authors: Ritvo, Paul, Knyahnytska, Yuliya, Pirbaglou, Meysam, Wang, Wei, Tomlinson, George, Zhao, Haoyu, Linklater, Renee, Bai, Shari, Kirk, Megan, Katz, Joel, Harber, Lillian, Daskalakis, Zafiris

Published in: Journal of Medical Internet Research, Vol 23, Iss 3, p e24380 (2021)

Background: Approximately 70% of mental health disorders appear prior to 25 years of age and can become chronic when ineffectively treated. Individuals between 18 and 25 years old are significantly more likely to experience mental health disorders, sub

External link: https://doaj.org/article/ea2caf10ae4742faa8334eeb697a5368

Report

VideoAssembler: Identity-Consistent Video Generation with Reference Entities using Diffusion Model

Authors: Zhao, Haoyu, Lu, Tianyi, Gu, Jiaxi, Zhang, Xing, Wu, Zuxuan, Xu, Hang, Jiang, Yu-Gang

Identity-consistent video generation seeks to synthesize videos that are guided by both textual prompts and reference images of entities. Current approaches typically utilize cross-attention layers to integrate the appearance of the entity, which pre

External link: http://arxiv.org/abs/2311.17338

Report

Adversarial Attacks on Combinatorial Multi-Armed Bandits

Authors: Balasubramanian, Rishab, Li, Jiawei, Tadepalli, Prasad, Wang, Huazheng, Wu, Qingyun, Zhao, Haoyu

We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture the vulnerability and robustness of CMAB. The attackability condit

External link: http://arxiv.org/abs/2310.05308

Report

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

Authors: Gu, Jiaxi, Wang, Shicong, Zhao, Haoyu, Lu, Tianyi, Zhang, Xing, Wu, Zuxuan, Xu, Songcen, Zhang, Wei, Jiang, Yu-Gang, Xu, Hang

Inspired by the remarkable success of Latent Diffusion Models (LDMs) for image synthesis, we study LDM for text-to-video generation, which is a formidable challenge due to the computational and memory constraints during both model training and infere

External link: http://arxiv.org/abs/2309.03549
