Information system for researching and evaluating user reviews of products
Language: | Russian |
---|---|
Year of publication: | 2022 |
Subject: | spam detection; information systems; deep learning; spam features; artificial intelligence; sentiment analysis; opinion mining; opinion analysis |
DOI: | 10.18720/spbpu/3/2022/vr/vr22-3990 |
Description: | The subject of the master's thesis is «Information system for researching and evaluating user reviews of products». The work is devoted to studying the tasks of opinion analysis and sentiment analysis and the detection of opinion spam, as well as to creating and implementing spam detection and opinion analysis modules at the review level. The research set the following goals: (1) demonstrating the impact of spam reviews on the results of an opinion analysis system and the importance of including a spam detection module; (2) creating and implementing a spam detection module that uses features extracted from the review text; (3) analyzing the role of opinion analysis systems in the modern competitive market and their various applications; (4) creating and implementing an opinion analysis system capable of determining the sentiment orientation of reviews; (5) analyzing the test results of the constructed modules using performance measures.
The work shows the importance of opinions as a driving force of human behavior and of the market as a whole. It examines the tasks addressed in the field of opinion analysis, the various approaches to solving them, and the levels at which opinion analysis is performed, and it proposes an opinion analysis system that includes spam detection and sentiment analysis modules at the review level. Both modules were built and implemented in the Python programming language. Two models were used: a machine learning model for spam detection and a deep learning model for sentiment analysis. The spam detection module processes the data and extracts all the necessary features from the review domain before training the model, while the sentiment analysis module processes the data and converts it to numeric values before analysis. In addition, two publicly available datasets were used to train the machine learning and deep learning models. Finally, the two models were evaluated to demonstrate their effectiveness, reliability, and usability together with the proposed approach, as well as their readiness for use in real applications. |
Database: | OpenAIRE |
External link: |