Showing 1 - 10 of 24 for search: '"Luo, Huaishao"'
Recently, the contrastive language-image pre-training, e.g., CLIP, has demonstrated promising results on various downstream tasks. The pre-trained model can capture enriched visual concepts for images by learning from a large scale of text-image data
External link:
http://arxiv.org/abs/2211.14813
Few-shot dialogue state tracking (DST) is a realistic problem that trains the DST model with limited labeled data. Existing few-shot methods mainly transfer knowledge learned from external labeled dialogue data (e.g., from question answering, dialogu
External link:
http://arxiv.org/abs/2210.05146
Fusion technique is a key research topic in multimodal sentiment analysis. The recent attention-based fusion demonstrates advances over simple operation-based fusion. However, these fusion works adopt single-scale, i.e., token-level or utterance-leve
External link:
http://arxiv.org/abs/2112.01368
Author:
Su, Lin, Duan, Nan, Cui, Edward, Ji, Lei, Wu, Chenfei, Luo, Huaishao, Liu, Yongfei, Zhong, Ming, Bharti, Taroon, Sacheti, Arun
In this paper, we present GEM as a General Evaluation benchmark for Multimodal tasks. Different from existing datasets such as GLUE, SuperGLUE, XGLUE and XTREME that mainly focus on natural language tasks, GEM is a large-scale vision-language benchma
External link:
http://arxiv.org/abs/2106.09889
Video-text retrieval plays an essential role in multi-modal research and has been widely used in many real-world web applications. The CLIP (Contrastive Language-Image Pre-training), an image-language pre-training model, has demonstrated the power of
External link:
http://arxiv.org/abs/2104.08860
Span extraction is an essential problem in machine reading comprehension. Most of the existing algorithms predict the start and end positions of an answer span in the given corresponding context by generating two probability vectors. In this paper, w
External link:
http://arxiv.org/abs/2009.14348
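For readers unfamiliar with the setup mentioned in the entry above, the conventional way of turning two probability vectors (one over start positions, one over end positions) into an answer span can be sketched as below. This is only an illustration of that common baseline decoding step, not the method proposed in the paper, which the truncated abstract does not describe. The function name `best_span`, the `max_len` cap, and the sample probabilities are all hypothetical.

```python
from typing import List, Tuple


def best_span(
    start_probs: List[float],
    end_probs: List[float],
    max_len: int,
) -> Tuple[Tuple[int, int], float]:
    """Return the (start, end) pair with the highest joint score.

    Assumes the two vectors are scored independently, so the joint
    score of a span is start_probs[s] * end_probs[e], restricted to
    s <= e and a span length of at most max_len tokens.
    """
    best = (0, 0)
    best_score = -1.0
    for s, p_start in enumerate(start_probs):
        # Only consider end positions at or after the start position.
        last_end = min(s + max_len, len(end_probs))
        for e in range(s, last_end):
            score = p_start * end_probs[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best, best_score


# Hypothetical probabilities over a three-token context.
span, score = best_span([0.1, 0.7, 0.2], [0.05, 0.15, 0.8], max_len=3)
print(span, score)
```

The explicit search over pairs matters because taking the argmax of each vector separately can yield an end position that precedes the start position.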
In this paper, we focus on the imbalance issue, which is rarely studied in aspect term extraction and aspect sentiment classification when regarding them as sequence labeling tasks. Besides, previous works usually ignore the interaction between aspec
External link:
http://arxiv.org/abs/2009.10557
The detection of the abnormal area from urban data is a significant research problem. However, to the best of our knowledge, previous methods designed on spatio-temporal anomalies are road-based or grid-based, which usually causes the data sparsity p
External link:
http://arxiv.org/abs/2007.06794
Author:
Luo, Huaishao, Ji, Lei, Shi, Botian, Huang, Haoyang, Duan, Nan, Li, Tianrui, Li, Jason, Bharti, Taroon, Zhou, Ming
With the recent success of the pre-training technique for NLP and image-linguistic tasks, some video-linguistic pre-training works are gradually developed to improve video-text related downstream tasks. However, most of the existing multimodal models
External link:
http://arxiv.org/abs/2002.06353
This paper focuses on two related subtasks of aspect-based sentiment analysis, namely aspect term extraction and aspect sentiment classification, which we call aspect term-polarity co-extraction. The former task is to extract aspects of a product or
External link:
http://arxiv.org/abs/1906.01794