Author: |
Lamine Faty, Marie Ndiaye, Edouard Ngor Sarr, Ousmane Sall |
Year of publication: |
2020 |
Source: |
SNAMS |
DOI: |
10.1109/snams52053.2020.9336576 |
Description: |
Information websites, known as the online press, contain a tremendous amount of potentially valuable data, all of it available in real time. Because of the velocity at which information is broadcast and the volume of available data, traditional data extraction methods appear unsuitable for extracting the relevant information from the right pages. It is in this context that we propose a new data extraction tool. OpinionScraper is a tool for collecting, merging, and categorizing journalistic data in order to store it in JSON format. The tool extracts information from Web pages in an optimal way and represents it according to the model defined for opinion mining. The purpose of implementing a scraping tool is to build an easily exploitable database from journalistic comments in order to address the algorithmic complexity of opinion mining. Our tool succeeds in efficiently extracting comments from 132 websites and classifying them in a JSON-format database. |
Database: |
OpenAIRE |