Showing 1 - 3 of 3 for search: '"Guo, Xuechen"'
Multimodal Large Language Models (MLLMs) have recently garnered attention as a prominent research focus. By harnessing powerful LLMs, they facilitate a transition of conversational generative AI from unimodal text to performing multimodal tasks. This boom
External link:
http://arxiv.org/abs/2410.15074
Author:
Zhang, Zhenyu; Wang, Benlu; Liang, Weijie; Li, Yizhi; Guo, Xuechen; Wang, Guanhong; Li, Shiyan; Wang, Gaoang
With the development of multimodality and large language models, the deep learning-based technique for medical image captioning holds the potential to offer valuable diagnostic recommendations. However, current generic text and image pre-trained mode
External link:
http://arxiv.org/abs/2311.01004
Medical images often incorporate doctor-added markers that can hinder AI-based diagnosis. This issue highlights the need for inpainting techniques to restore the corrupted visual content. However, existing methods require manual mask annotation as in
External link:
http://arxiv.org/abs/2303.15124