Application of word embedding and machine learning in detecting phishing websites

Authors:	Routhu Srinivasa Rao, Alwyn R. Pais, Amey Umarekar
Year of publication:	2021
Subjects:	Password; Source code; Word embedding; Information retrieval; Plain text; Computer science; Phishing; Information sensitivity; Credit card; Electrical and Electronic Engineering; Personally identifiable information
Source:	Telecommunication Systems. 79:33-45
ISSN:	1572-9451 1018-4864
DOI:	10.1007/s11235-021-00850-6
Description:	Phishing is an attack that aims to obtain personal information, such as passwords and credit card details, from online users by deceiving them through fake websites, emails, or any legitimate internet service. Many techniques exist to detect phishing sites, including third-party based techniques, source code based methods, and URL based methods, yet users are still tricked into revealing their sensitive information. In this paper, we propose a new technique that detects phishing sites with word embeddings, using plain text and domain-specific text extracted from the source code. We applied various word embeddings to evaluate our model using ensemble and multimodal approaches. In the experimental evaluation, the multimodal approach with domain-specific text achieved an accuracy of 99.34%, with a TPR of 99.59%, an FPR of 0.93%, and an MCC of 98.68%.
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_________::dfb78542dd29a507f96082ec7a21dbfe https://doi.org/10.1007/s11235-021-00850-6
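
Note on the reported metrics: the description quotes accuracy, TPR (true positive rate), FPR (false positive rate), and MCC (Matthews correlation coefficient). As a minimal sketch of how these four figures are conventionally derived from a binary confusion matrix, the Python below uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical, chosen only for illustration. They are not taken from the paper, and the record does not state how the authors computed their values.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, TPR, FPR, and MCC from confusion-matrix counts.

    Here "positive" would mean "phishing". Ratios whose denominator is
    zero are reported as 0.0 rather than raising an error.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # share of phishing sites caught
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of legitimate sites flagged
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr, "mcc": mcc}


# Hypothetical counts for illustration only (not the paper's data).
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

MCC ranges from -1 to 1, so a figure such as 98.68% presumably corresponds to a coefficient of about 0.9868 expressed as a percentage.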