Showing 1 - 10 of 9,395 results for search: '"In, Soyeon"'
Author:
Cabral, Rina Carines, Han, Soyeon Caren, Alhassan, Areej, Batista-Navarro, Riza, Nenadic, Goran, Poon, Josiah
Discontinuous Named Entity Recognition (DNER) presents a challenging problem where entities may be scattered across multiple non-adjacent tokens, making traditional sequence labelling approaches inadequate. Existing methods predominantly rely on cust
External link:
http://arxiv.org/abs/2411.01839
Existing Multimodal Large Language Models (MLLMs) and Visual Language Pretrained Models (VLPMs) have shown remarkable performances in the general Visual Question Answering (VQA). However, these models struggle with VQA questions that require external
External link:
http://arxiv.org/abs/2411.02722
Transformer-based models have achieved remarkable success in various Natural Language Processing (NLP) tasks, yet their ability to handle long documents is constrained by computational limitations. Traditional approaches, such as truncating inputs, s
External link:
http://arxiv.org/abs/2410.11119
This tutorial explores recent advancements in multimodal pretrained and large models, capable of integrating and processing diverse data forms such as text, images, audio, and video. Participants will gain an understanding of the foundational concept
External link:
http://arxiv.org/abs/2410.05608
Generative models must ensure both privacy and fairness for Trustworthy AI. While these goals have been pursued separately, recent studies propose to combine existing privacy and fairness techniques to achieve both goals. However, naively combining t
External link:
http://arxiv.org/abs/2410.02246
Visually-Rich Documents (VRDs), encompassing elements like charts, tables, and references, convey complex information across various fields. However, extracting information from these rich documents is labor-intensive, especially given their inconsis
External link:
http://arxiv.org/abs/2410.01609
Although Large Language Models (LLMs) can generate coherent and contextually relevant text, they often struggle to recognise the intent behind the human user's query. Natural Language Understanding (NLU) models, however, interpret the purpose and key
External link:
http://arxiv.org/abs/2408.08144
Automatic Chart Question Answering (ChartQA) is challenging due to the complex distribution of chart elements with patterns of the underlying data not explicitly displayed in charts. To address this challenge, we design a joint multimodal scene graph
External link:
http://arxiv.org/abs/2408.04852
Author:
LG AI Research, An, Soyoung, Bae, Kyunghoon, Choi, Eunbi, Choi, Stanley Jungkyu, Choi, Yemuk, Hong, Seokhee, Hong, Yeonjung, Hwang, Junwon, Jeon, Hyojin, Jo, Gerrard Jeongwon, Jo, Hyunjik, Jung, Jiyeon, Jung, Yountae, Kim, Euisoon, Kim, Hyosang, Kim, Joonkee, Kim, Seonghwan, Kim, Soyeon, Kim, Sunkyoung, Kim, Yireun, Kim, Youchul, Lee, Edward Hwayoung, Lee, Haeju, Lee, Honglak, Lee, Jinsik, Lee, Kyungmin, Lee, Moontae, Lee, Seungjun, Lim, Woohyung, Park, Sangha, Park, Sooyoun, Park, Yongmin, Seo, Boseong, Yang, Sihoon, Yeen, Heuiyeen, Yoo, Kyungjae, Yun, Hyeongu
We introduce EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B instruction-tuned model to promote ope
External link:
http://arxiv.org/abs/2408.03541
In this work, we tackle the problem of long-form video-language grounding (VLG). Given a long-form video and a natural language query, a model should temporally localize the precise moment that answers the query. Humans can easily solve VLG tasks, ev
External link:
http://arxiv.org/abs/2408.02336