Automated Disaster News Collection Classification and Geoparsing

Authors:	Shubham Raorane, Chelsea Fernandes, Sharon Mathew, Joshua Fernandes, Anuradha Srinivasaraghavan
Year of publication:	2021
Subject:	Focus (computing); Computer science; Event (computing); Human life; Supervised learning; World Wide Web; Named-entity recognition; The Internet; Geoparsing; Web scraping
Source:	SSRN Electronic Journal.
ISSN:	1556-5068
DOI:	10.2139/ssrn.3852688
Description:	A disaster is an unforeseen event that can have a tremendous impact on human life and the environment. Online sources publish huge numbers of news articles every day, and as that volume grows, it becomes harder for users to find disaster-relevant news, so articles need to be extracted and classified to be easily accessible. This paper presents an automated system that scrapes news from various online sources and identifies disaster-relevant articles. The articles are scraped with the Scrapy framework, and a model is trained using machine learning algorithms to classify each one as disaster or non-disaster. The system also uses a geoparsing model, built on Named Entity Recognition (NER), to identify the focus location of each extracted article.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::dc57e05ed9883ffa8baa824e2f29ef6e https://doi.org/10.2139/ssrn.3852688
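
The description names two processing steps after scraping: classifying an article as disaster or non-disaster, and finding its focus location. The record does not say which learning algorithm or NER model the authors used, so the following is only a minimal sketch under stated assumptions: a small multinomial naive Bayes classifier stands in for the unspecified machine learning model, and a frequency count over a hand-made place list stands in for NER-based geoparsing. The training headlines, the `GAZETTEER` set, and all function and class names are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # number of documents per class
        self.word_counts = {}               # class -> token frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical place list; a real system would use an NER model instead.
GAZETTEER = {"nepal", "kerala", "odisha", "mumbai", "delhi"}


def focus_location(text, gazetteer=GAZETTEER):
    """Return the most frequently mentioned known place, or None."""
    hits = Counter(tok for tok in tokenize(text) if tok in gazetteer)
    return hits.most_common(1)[0][0] if hits else None


# Made-up training headlines, for illustration only.
TRAIN = [
    ("Earthquake destroys homes in Nepal", "disaster"),
    ("Floods displace thousands in Kerala", "disaster"),
    ("Cyclone hits coastal Odisha villages", "disaster"),
    ("Team wins cricket final in Mumbai", "non-disaster"),
    ("New phone launched in Delhi", "non-disaster"),
    ("Stock market closes higher today", "non-disaster"),
]
model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])

article = "Floods in Kerala: Kerala rescue teams get help from Delhi"
print(model.predict(article), focus_location(article))
```

Picking the most frequently mentioned place is one simple heuristic for a focus location; it is an assumption of this sketch, not a rule stated in the record.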