Research on Film Data Preprocessing and Visualization

Authors: Tiansong Li, KeYin Cao, HuaXin Zhang, Yu Liu, Zituo Wang
Year of publication: 2020
Subject:
Source: 2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA).
DOI: 10.1109/iciba50161.2020.9276830
Description: Data is the core of information, and good data quality is a prerequisite for many kinds of data analysis. Data cleaning increases the fault tolerance rate by correcting erroneous values detected in the data. This paper addresses data set processing and visualization for recommendation algorithms, so that the data can be better applied in that field. The recommendation algorithm and the MovieLens and IMDB data sets are analyzed theoretically. First, data set A is processed through data reading and movie score calculation. Second, the IMDB data set is processed in four steps to make it more suitable for recommendation algorithms. Finally, the plot function is used to visualize the key information. Experiments show that data sets prepared with these methods have improved quality and availability, and provide a basis for better application in the algorithm.
Database: OpenAIRE
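
Note: The description mentions data reading and movie score calculation but gives no details. As a rough illustration only, the sketch below reads rating rows, drops rows with an unusable rating, and computes a mean score per movie. The column names (userId, movieId, rating), the sample rows, and the helper name mean_scores are assumptions made for this example; they are not taken from the paper and do not reproduce its method.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in an assumed MovieLens-style layout; not data from the paper.
SAMPLE = """userId,movieId,rating
1,10,4.0
2,10,5.0
1,20,3.0
3,20,
3,30,2.5
"""

def mean_scores(csv_text):
    """Return {movieId: mean rating}, skipping rows whose rating is missing or non-numeric."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            rating = float(row["rating"])
        except (TypeError, ValueError):
            continue  # simple cleaning step: drop rows with an unusable rating
        totals[row["movieId"]] += rating
        counts[row["movieId"]] += 1
    return {movie: totals[movie] / counts[movie] for movie in totals}

scores = mean_scores(SAMPLE)
for movie, score in sorted(scores.items()):
    print(movie, round(score, 2))
```

The mean is used here only because it is the simplest aggregate; the paper may calculate movie scores differently.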