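Enhancing Semantic Role Labeling of Biomedical Literature with Markov Logic Networks and Automatically Generated Patterns (利用馬可夫邏輯網路模型與自動化生成的模板加強生醫文獻之語意角色標註)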
Author: | 賴柏廷 |
---|---|
Subject: | |
Document type: | Text |
Description: | Background: Biomedical semantic role labeling (SRL) is a natural language processing technique that represents sentences describing biological processes as predicate-argument structures (PASs). SRL typically suffers from an imbalance among arguments, and learning the dependencies between arguments consumes considerable time and memory. Method: We built a Markov Logic Network (MLN)-based SRL system that uses automatically generated SRL patterns to recognize constituents and their candidate semantic roles simultaneously. Results and conclusions: We evaluated our method on the BioProp corpus. The experimental results show that it outperforms the state-of-the-art system. Furthermore, applying SRL patterns greatly reduces time and memory costs, and automatic generation eases the construction of these patterns. We expect that the method can be improved further by adding new SRL patterns, such as patterns written manually by biologists, and that it offers a possible way to process large SRL corpora. |
Database: | Networked Digital Library of Theses & Dissertations |
External link: |