Showing 1 - 10 of 149 results for the search: '"Nguyen, Dat Quoc"'
Retrieval-augmented generation (RAG) methods are viable solutions for addressing the static memory limits of pre-trained language models. Nevertheless, encountering conflicting sources of information within the retrieval context is an inevitable …
External link:
http://arxiv.org/abs/2410.15737
Author:
Ngo, Hoang, Nguyen, Dat Quoc
We present the first domain-adapted and fully-trained large language model, RecGPT-7B, and its instruction-following variant, RecGPT-7B-Instruct, for text-based recommendation. Experimental results on rating prediction and sequential recommendation …
External link:
http://arxiv.org/abs/2405.12715
Machine translation for Vietnamese-English in the medical domain is still an under-explored research area. In this paper, we introduce MedEV -- a high-quality Vietnamese-English parallel dataset constructed specifically for the medical domain, comprising …
External link:
http://arxiv.org/abs/2403.19161
We introduce PhoWhisper in five versions for Vietnamese automatic speech recognition. PhoWhisper's robustness is achieved through fine-tuning the Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents. Our experimental study …
External link:
http://arxiv.org/abs/2406.02555
Author:
Pham, Thinh, Nguyen, Dat Quoc
Profile-based intent detection and slot filling are important tasks aimed at reducing the ambiguity in user utterances by leveraging user-specific supporting profile information. However, research in these two tasks has not been extensively explored.
External link:
http://arxiv.org/abs/2312.08737
The study of detecting multiple intents and filling slots is becoming more popular because of its relevance to complicated real-world situations. Recent advanced approaches, which are joint models based on graphs, might still face two potential …
External link:
http://arxiv.org/abs/2312.05741
We open-source a state-of-the-art 4B-parameter generative model series for Vietnamese, which includes the base pre-trained monolingual model PhoGPT-4B and its chat variant, PhoGPT-4B-Chat. The base model, PhoGPT-4B, with exactly 3.7B parameters, is …
External link:
http://arxiv.org/abs/2311.02945
We present XPhoneBERT, the first multilingual model pre-trained to learn phoneme representations for the downstream text-to-speech (TTS) task. Our XPhoneBERT has the same model architecture as BERT-base, trained using the RoBERTa pre-training approach …
External link:
http://arxiv.org/abs/2305.19709
Author:
Tong, Vinh, Nguyen, Dat Quoc, Huynh, Trung Thanh, Nguyen, Tam Thanh, Nguyen, Quoc Viet Hung, Niepert, Mathias
Knowledge graph (KG) alignment and completion are usually treated as two independent tasks. While recent work has leveraged entity and relation alignments from multiple KGs, such as alignments between multilingual KGs with common entities and relations …
External link:
http://arxiv.org/abs/2210.08922
We present the first empirical study investigating the influence of disfluency detection on the downstream tasks of intent detection and slot filling. We perform this study for Vietnamese -- a low-resource language that has no previous study as well as …
External link:
http://arxiv.org/abs/2209.08359