NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT - AN INFORMATION EXTRACTION TASK

Authors: N. Kanya, T. Ravi
Language: English
Year of publication: 2016
Subject:
Source: ICTACT Journal on Soft Computing, Vol 6, Iss 4, Pp 1303-1307 (2016)
Document type: article
ISSN: 0976-6561
2229-6956
Description: Biomedical text mining (BioTM) targets the extraction of significant information from biomedical archives. BioTM encompasses information retrieval (IR) and information extraction (IE). IR retrieves relevant biomedical literature from repositories such as PubMed and MedLine in response to a search query, and it ends with a corpus built from the documents retrieved from these publication databases. The IE task comprises document preprocessing, named entity recognition (NER), and relationship extraction; it draws on natural language processing, data mining techniques, and machine learning algorithms. Preprocessing includes tokenization, stop-word removal, shallow parsing, and part-of-speech tagging. The NER phase recognizes well-defined objects such as genes, proteins, and cell lines, and it feeds the next phase, relationship extraction. The work is based on the conditional random field (CRF) machine learning algorithm.
Database: Directory of Open Access Journals
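
The preprocessing steps named in the description (tokenization, stop-word removal) and the kind of per-token features a CRF tagger typically consumes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the tokenization pattern, the feature set, and all function names are hypothetical, and the record does not say which features or toolkit the paper used.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "of", "in", "and", "a", "an", "is", "to", "by", "was"}

# Assumed tokenization rule: keep hyphenated alphanumeric names such as
# "IL-2" together, and emit any other non-space symbol as its own token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*|[^\sA-Za-z0-9]")


def tokenize(text):
    """Split raw text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def remove_stop_words(tokens):
    """Drop tokens whose lowercase form is in the stop-word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def token_features(tokens, i):
    """Build a feature dict for token i, including its neighbours.

    Orthographic cues (capitals, digits, hyphens) are commonly used to
    spot gene and protein names; this particular set is an assumption.
    """
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_upper": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "suffix3": tok[-3:],
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens):
    """Feature dicts for a whole sentence, one per token."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    sentence = "Expression of IL-2 was induced by NF-kB."
    kept = remove_stop_words(tokenize(sentence))
    for tok, feats in zip(kept, sentence_features(kept)):
        print(tok, feats)
```

A sequence of such feature dicts, paired with per-token labels (for example a B/I/O scheme), is the usual input shape for a linear-chain CRF trainer; which trainer and label scheme to use is left open here.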