Text data mining: a proposed framework and future perspectives

Author:	Sana'a A. Alwidian, Hani A. Bani Salameh, Ala'a N. Alslaity
Year of publication:	2015
Subject:	Information Systems and Management; Information retrieval; Data stream mining; Computer science; Process (engineering); Semantic analysis (machine learning); Semantics; Data science; Field (computer science); Management Information Systems; Text mining; Management of Technology and Innovation; Data mining; Cluster analysis; Interdisciplinarity
Source:	International Journal of Business Information Systems. 18:127
ISSN:	1746-0980 1746-0972
DOI:	10.1504/ijbis.2015.067261
Description:	With rapid advances in technology and the emergence of many kinds of applications, the amount of available data has become enormous. There is therefore an essential need for techniques that can interact with data and extract useful information and patterns from it. Text data mining (TDM) is the process of extracting desired information from large volumes of inherently unstructured textual data without having to read all of it. In this paper, we shed light on the state of the art in text mining as an interdisciplinary field spanning several related areas. To facilitate the understanding of text data mining, this paper proposes a framework that visualises the field in a step-wise manner, taking into consideration the semantics of the extracted text. In addition, this paper surveys a number of useful applications and proposes a new approach to spam detection based on the proposed TDM framework.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::077effdb0995736b499862e823ab8e88 https://doi.org/10.1504/ijbis.2015.067261