Author:
Dong, Qian; Niu, Shuzi; Yuan, Tao; Li, Yucheng
Subject:
Source:
Data Science & Engineering; Mar 2022, Vol. 7 Issue 1, p30-43, 14p
Abstract:
BERT-based ranking models are emerging owing to their superior natural language understanding ability. All word relations and representations in the concatenation of query and document are modeled in the self-attention matrix as latent knowledge. However, some of this latent knowledge has no effect, or a negative effect, on the relevance prediction between query and document. We model the observable and unobservable confounding factors in a causal graph and perform a do-query to predict the relevance label given an intervention over this graph. For the observed factors, we block the back-door path with an adaptive masking method in the transformer layer and refine word representations over the resulting disentangled word graph in the refinement layer. For the unobserved factors, we resolve the do-operation query along the front-door path by decomposing word representations into query-related and query-unrelated parts in the decomposition layer. A pairwise ranking loss is used for the ad hoc document ranking task, a triangle distance loss is introduced into both the transformer and refinement layers for more discriminative representations, and mutual information constraints are placed on the decomposition layer. Experimental results on the public benchmark datasets TREC Robust04 and WebTrack 2009-12 show that the proposed model, DGRe, outperforms state-of-the-art baselines by more than 2%, especially for short queries. [ABSTRACT FROM AUTHOR]
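The abstract invokes two standard identities from causal inference. For reference, these are the usual back-door and front-door adjustment formulas; mapping X to the query-document input, Y to the relevance label, Z to the observable confounders blocked by the adaptive masking, and M to the query-related part of the decomposed representations is one plausible reading of the abstract, not the paper's own notation:

P(Y \mid do(X)) = \sum_{z} P(Y \mid X, Z = z)\, P(Z = z)
P(Y \mid do(X)) = \sum_{m} P(M = m \mid X) \sum_{x'} P(Y \mid M = m, X = x')\, P(X = x')

The training objective combines a pairwise ranking loss with a triangle distance loss. The minimal PyTorch sketch below shows the standard pairwise hinge loss and a triplet-margin interpretation of the triangle distance term; the function names, margin values, and the triplet form are illustrative assumptions, since the abstract does not give the exact formulation.

import torch
import torch.nn.functional as F

def pairwise_hinge_loss(score_pos, score_neg, margin=1.0):
    # Standard pairwise ranking loss: the relevant document's score
    # should exceed the non-relevant one's by at least `margin`.
    return F.relu(margin - (score_pos - score_neg)).mean()

def triangle_distance_loss(anchor, positive, negative, margin=1.0):
    # Triplet-margin reading of the "triangle distance loss"
    # (assumed form, not taken from the paper): pull the anchor
    # toward the positive representation, away from the negative one.
    d_pos = (anchor - positive).pow(2).sum(dim=-1)
    d_neg = (anchor - negative).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with random 768-dimensional representations:
q = torch.randn(8, 768)          # query representations
d_pos = torch.randn(8, 768)      # relevant document representations
d_neg = torch.randn(8, 768)      # non-relevant document representations
s_pos = (q * d_pos).sum(dim=-1)  # dot-product relevance scores
s_neg = (q * d_neg).sum(dim=-1)
loss = pairwise_hinge_loss(s_pos, s_neg) + 0.1 * triangle_distance_loss(q, d_pos, d_neg)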
Database:
Complementary Index
External link: