Named Entity Recognition for Rental Documents Using NLP

Authors: Devesh Rajadhyax, Dhiraj Chavan, Sharmila Sengupta, Chinmay Patil, Sushant Patil, Komal Nimbalkar
Year of publication: 2020
Subject:
Source: Information and Communication Technology for Intelligent Systems ISBN: 9789811570612
DOI: 10.1007/978-981-15-7062-9_38
Description: Information retrieval is the process of extracting a pertinent set of facts from a text or document. Such documents are typically unstructured, so information retrieval techniques aim to organize their data. Named Entity Recognition (NER) is one such technique; it classifies a word or phrase into its appropriate class. NER can therefore also be used to extract entities from legal documents, which helps represent these documents effectively and spares a lawyer from scrutinizing the same document multiple times for the same set of information. NER systems can be developed with different approaches, one of which is to use an NLP library. However, pretrained NLP libraries may not suit a particular use case. Hence, in this paper, we present an approach to analyzing rental documents by custom training the spaCy NLP library to tag named entities such as person, address, amount, and date. The system will provide an interface for the user to upload rent documents, and the analysis results will be stored for quick insights into the document.
Database: OpenAIRE
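
The record does not say how the authors prepared their training data, so the following is only a minimal sketch of one common step in custom NER training: turning labeled phrases into character-offset annotations of the form (text, {"entities": [(start, end, label), ...]}), which spaCy accepts as entity annotations. The label set, the helper name `annotate`, and the sample sentence are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical label set, loosely following the entity types named in the abstract.
LABELS = {"PERSON", "ADDRESS", "AMOUNT", "DATE"}

Annotation = Tuple[str, Dict[str, List[Tuple[int, int, str]]]]


def annotate(text: str, phrases: Dict[str, str]) -> Annotation:
    """Build one training record: (text, {"entities": [(start, end, label), ...]}).

    `phrases` maps an exact substring of `text` to its label. Only the first
    occurrence of each substring is annotated.
    """
    entities = []
    for phrase, label in phrases.items():
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"phrase not found in text: {phrase!r}")
        entities.append((start, start + len(phrase), label))
    entities.sort()
    # Entity spans used for NER training must not overlap, so reject overlaps.
    for (_, prev_end, _), (next_start, _, _) in zip(entities, entities[1:]):
        if next_start < prev_end:
            raise ValueError("overlapping entity spans")
    return text, {"entities": entities}


# Invented sample sentence, not taken from the paper's data set.
SAMPLE = "Asha Rao agrees to pay Rs. 15000 per month from 1 June 2020."
RECORD = annotate(
    SAMPLE,
    {"Asha Rao": "PERSON", "Rs. 15000": "AMOUNT", "1 June 2020": "DATE"},
)

if __name__ == "__main__":
    text, annotations = RECORD
    for start, end, label in annotations["entities"]:
        print(label, repr(text[start:end]))
```

Records in this shape would then be handed to the library's training loop; computing offsets from the text itself, rather than typing them by hand, avoids the misaligned spans that commonly break NER training data.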