Description: |
As the volume of unstructured data grows, classifying text documents has become increasingly important. In particular, classifying news texts and digital documents makes the information sought easier to find. This study used a large news text data set. After preprocessing, the texts were represented with four text representation methods: Bag of Words (BoW), TF-IDF, Word2Vec and Doc2Vec. In the classification phase, the Random Forest (RF), Multilayer Perceptron (MLP), Support Vector Machine (SVM) and Deep Neural Network (DNN) algorithms were applied. In the experiments, the combination of Word2Vec and the DNN algorithm achieved the best result. |