Showing 1 - 10 of 828 for search: '"Li Yuheng"'
Author:
Zhang Wenjuan, Li Yuheng, Ran Mengyu, Wang Youliang, Ding Yezhi, Zhang Bobo, Feng Qiancheng, Chu Qianqian, Shen Yongqian, Sheng Wang
Published in:
Reviews on Advanced Materials Science, Vol 63, Iss 1, Pp id. 116748-41438 (2024)
Fe nanoparticle-functionalized ordered mesoporous carbon (Fe0/OMC) was synthesized using triblock copolymers as templates and through solvent evaporation self-assembly, followed by a carbothermal reduction. Fe0/OMC had three microstructures of two-di…
External link:
https://doaj.org/article/591e4968d76940a3965e9e18126c35ee
Author:
Li, Yuheng, Liu, Haotian, Cai, Mu, Li, Yijun, Shechtman, Eli, Lin, Zhe, Lee, Yong Jae, Singh, Krishna Kumar
In this paper, we introduce a model designed to improve the prediction of image-text alignment, targeting the challenge of compositional understanding in current visual-language models. Our approach focuses on generating high-quality training dataset…
External link:
http://arxiv.org/abs/2410.00905
Author:
Chen, Weitong, Li, Yuheng
In recent years, digital watermarking techniques based on deep learning have been widely studied. To achieve both imperceptibility and robustness of image watermarks, most current methods employ convolutional neural networks to build robust watermark…
External link:
http://arxiv.org/abs/2409.14829
Author:
Shang, Yuzhang, Xu, Bingxin, Kang, Weitai, Cai, Mu, Li, Yuheng, Wen, Zehao, Dong, Zhen, Keutzer, Kurt, Lee, Yong Jae, Yan, Yan
Advancements in Large Language Models (LLMs) inspire various strategies for integrating video modalities. A key approach is Video-LLMs, which incorporate an optimizable interface linking sophisticated video encoders to LLMs. However, due to computati…
External link:
http://arxiv.org/abs/2409.12963
Due to the scarcity of labeled data, self-supervised learning (SSL) has gained much attention in 3D medical image segmentation by extracting semantic representations from unlabeled data. Among SSL strategies, masked image modeling (MIM) has shown ef…
External link:
http://arxiv.org/abs/2407.06468
Large Multimodal Models (LMMs) have shown remarkable capabilities across a variety of tasks (e.g., image captioning, visual question answering). While broad, their knowledge remains generic (e.g., recognizing a dog), and they are unable to handle per…
External link:
http://arxiv.org/abs/2406.09400
Author:
Chen, Xuxin, Li, Yuheng, Hu, Mingzhe, Salari, Ella, Chen, Xiaoqian, Qiu, Richard L. J., Zheng, Bin, Yang, Xiaofeng
Although fusing information from multiple views of mammograms plays an important role in increasing the accuracy of breast cancer detection, developing multi-view mammogram-based computer-aided diagnosis (CAD) schemes still faces challenges, and no such…
External link:
http://arxiv.org/abs/2404.15946
Large multimodal models (LMM) have recently shown encouraging progress with visual instruction tuning. In this note, we show that the fully-connected vision-language cross-modal connector in LLaVA is surprisingly powerful and data-efficient. With sim…
External link:
http://arxiv.org/abs/2310.03744
Text-conditioned image editing has emerged as a powerful tool for editing images. However, in many situations, language can be ambiguous and ineffective in describing specific image edits. When faced with such challenges, visual prompts can be a more…
External link:
http://arxiv.org/abs/2307.14331
Recent large language models (LLMs), e.g., ChatGPT, can generate human-like, fluent responses when provided with specific instructions. While acknowledging the convenience brought by this technological advancement, educators also have con…
External link:
http://arxiv.org/abs/2307.12267