Showing 1 - 10 of 4,610 for search: '"GU Yi"'
Neural surface reconstruction relies heavily on accurate camera poses as input. Despite utilizing advanced pose estimators like COLMAP or ARKit, camera poses can still be noisy. Existing pose-NeRF joint optimization methods handle poses with small no
External link:
http://arxiv.org/abs/2411.13620
Score identity Distillation (SiD) is a data-free method that has achieved SOTA performance in image generation by leveraging only a pretrained diffusion model, without requiring any training data. However, its ultimate performance is constrained by h
External link:
http://arxiv.org/abs/2410.14919
The generation and editing of floor plans are critical in architectural planning, requiring a high degree of flexibility and efficiency. Existing methods demand extensive input information and lack the capability for interactive adaptation to user mo
External link:
http://arxiv.org/abs/2410.11908
Medical anomaly detection (AD) is crucial in pathological identification and localization. Current methods typically rely on uncertainty estimation in deep ensembles to detect anomalies, assuming that ensemble learners should agree on normal samples
External link:
http://arxiv.org/abs/2409.17485
Author:
Gu, Yi, Otake, Yoshito, Uemura, Keisuke, Takao, Masaki, Soufi, Mazen, Okada, Seiji, Sugano, Nobuhiko, Talbot, Hugues, Sato, Yoshinobu
Radiography is widely used in orthopedics for its affordability and low radiation exposure. 3D reconstruction from a single radiograph, so-called 2D-3D reconstruction, offers the possibility of various clinical applications, but achieving clinically
External link:
http://arxiv.org/abs/2409.16702
Author:
Shivakumar, Prashanth Gurunath, Kolehmainen, Jari, Gourav, Aditya, Gu, Yi, Gandhe, Ankur, Rastrow, Ariya, Bulyko, Ivan
Large language models (LLMs) have demonstrated the ability to understand human language by leveraging large amounts of text data. Automatic speech recognition (ASR) systems are often limited by available transcribed speech data and benefit from a secon
External link:
http://arxiv.org/abs/2409.16654
Author:
Gu, Yi, Otake, Yoshito, Uemura, Keisuke, Takao, Masaki, Soufi, Mazen, Okada, Seiji, Sugano, Nobuhiko, Talbot, Hugues, Sato, Yoshinobu
While most vision tasks are essentially visual in nature (for recognition), some important tasks, especially in the medical field, also require quantitative analysis (for quantification) using quantitative images. Unlike in visual analysis, pixel val
External link:
http://arxiv.org/abs/2407.20495
For extremely weakly supervised text classification, pioneering research generates pseudo labels by mining texts similar to the class names from the raw corpus, which may end up with very limited or even no samples for the minority classes. Recent works h
External link:
http://arxiv.org/abs/2406.11115
Author:
Xiang, Jiannan, Liu, Guangyi, Gu, Yi, Gao, Qiyue, Ning, Yuting, Zha, Yuheng, Feng, Zeyu, Tao, Tianhua, Hao, Shibo, Shi, Yemin, Liu, Zhengzhong, Xing, Eric P., Hu, Zhiting
World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provide a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabiliti
External link:
http://arxiv.org/abs/2406.09455
Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory. The Diffusio
External link:
http://arxiv.org/abs/2406.06382