A Web News Classification Method: Fusion Noise Filtering and Convolutional Neural Network
Author: | Yanli Hu, Aixia Zhou, Zhen Tan, Chong Zhang, Bin Ge, Chunhui He |
---|---|
Year of publication: | 2020 |
Subject: |
Information transfer
business.industry; Computer science; Information sharing; 02 engineering and technology; Filter (signal processing); 010501 environmental sciences; computer.software_genre; 01 natural sciences; Convolutional neural network; Noise; Semantic similarity; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; The Internet; Data mining; business; F1 score; computer; 0105 earth and related environmental sciences |
Source: | SSPS |
DOI: | 10.1145/3421515.3421523 |
Description: | As a channel of Internet information transfer, web news plays a significant role in information sharing. Web news articles usually contain a large amount of content, and in-depth analysis shows that not all of it is related to the news topic: much web news contains noise content that seriously interferes with the text classification task. How to filter noise and purify web news content to improve classification accuracy has therefore become a challenging problem. In this paper, we propose a web news classification method that fuses noise detection, BERT-based semantic similarity noise filtering, and a convolutional neural network (NF-CNN) to solve this problem. To evaluate the method comprehensively, we use a public Chinese news classification dataset. The experimental results demonstrate that our method can effectively detect and filter a large amount of noise text, and that the average F1 score reaches 95.61% on the web news classification task. |
Database: | OpenAIRE |