Mixture of Expert Large Language Model for Legal Case Element Recognition

Author: YIN Hua, WU Zihao, LIU Tingting, ZHANG Jiajia, GAO Ziqian
Language: Chinese
Publication year: 2024
Subject:
Source: Jisuanji kexue yu tansuo, Vol 18, Iss 12, Pp 3260-3271 (2024)
Document type: article
ISSN: 1673-9418
DOI: 10.3778/j.issn.1673-9418.2406047
Description: Intelligent judicial decision-making is gradually aligning with the logic of legal adjudication, and case element recognition is a fundamental task proposed in recent years. Compared with earlier methods based on deep learning and machine reading comprehension, generative element recognition with large language models (LLMs) holds greater potential for complex reasoning. However, current judicial LLMs still perform suboptimally on this fundamental task. This paper introduces a conversational mixture-of-experts LLM for element recognition. The proposed model first designs prompts tailored to the characteristics of cases for the ChatGLM3-6B-base model. The LLM is then fine-tuned with full parameters to acquire basic element recognition capability, and its weights are shared among the subsequent mixture of experts to reduce learning cost. To handle different case types and label-imbalance scenarios, case-specific DoRA experts and label-specific DoRA experts are integrated into the LLM's attention layers, enhancing the model's ability to differentiate between tasks. A learnable gating mechanism is also designed to select among the label experts. The proposed model is tested on the CAIL2019 dataset and a desensitized theft-case element recognition dataset from a certain province, compared against nine benchmark models spanning three types of methods, and evaluated with ablation experiments. Experimental results show that the model's overall F1 score exceeds that of the best-performing baseline by 5.9 percentage points. On the label-imbalanced CAIL2019 dataset, the label experts effectively mitigate the impact of extreme data imbalance. Moreover, without repeated full-parameter fine-tuning, the base model trained on CAIL2019 achieves optimal results on the provincial theft cases after lightweight fine-tuning of only the case and label experts, demonstrating the model's scalability.
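The abstract describes DoRA (Weight-Decomposed Low-Rank Adaptation) experts attached to the attention layers and mixed by a learnable gate. The paper's exact expert placement and gating design are not given in the abstract, so the following is only an illustrative PyTorch sketch of the general pattern: several DoRA adapters sharing one frozen base projection, combined by a softmax gate. All class and parameter names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """DoRA adapter: decompose the frozen weight into magnitude and
    direction, and adapt the direction with a low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # base LLM weights stay frozen
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))   # B=0 => no-op at init
        # trainable magnitude, initialised to the column norms of W0
        self.m = nn.Parameter(base.weight.norm(dim=0, keepdim=True))

    def forward(self, x):
        w = self.base.weight + self.B @ self.A            # directional update
        w = self.m * w / w.norm(dim=0, keepdim=True)      # renormalise, rescale
        return F.linear(x, w, self.base.bias)

class GatedDoRAMoE(nn.Module):
    """Learnable gate mixing several label-specific DoRA experts
    that share one frozen base projection (illustrative sketch)."""
    def __init__(self, base: nn.Linear, n_experts: int = 4, rank: int = 8):
        super().__init__()
        self.experts = nn.ModuleList(
            DoRALinear(base, rank) for _ in range(n_experts))
        self.gate = nn.Linear(base.in_features, n_experts)

    def forward(self, x):
        scores = F.softmax(self.gate(x), dim=-1)          # (..., n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)
        return (outs * scores.unsqueeze(-2)).sum(-1)      # gate-weighted mix
```

Because each expert only trains the low-rank matrices, the gate, and the magnitude vector, adapting the shared base model to a new case type (as in the provincial theft-case experiment) touches only a small fraction of the parameters.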
Database: Directory of Open Access Journals