Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms
Authors: | Beatriz Wilges, Silvia Modesto Nassar, Rogério Cid Bastos, Gustavo Pereira Mateus, Renato Cislaghi |
---|---|
Year of publication: | 2016 |
Subject: | Fuzzy classification; Neuro-fuzzy; Computer Networks and Communications; Computer science; Fuzzy set; Decision tree; 02 engineering and technology; Machine learning; Fuzzy logic; Defuzzification; Knowledge extraction; Artificial intelligence; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Fuzzy associative matrix; Adaptive neuro fuzzy inference system; Training set; Unstructured data; Statistical classification; Knowledge base; Fuzzy set operations; 020201 artificial intelligence & image processing; Data mining; Software |
Source: | Journal of Computer Science. 12:341-349 |
ISSN: | 1549-3636 |
DOI: | 10.3844/jcssp.2016.341.349 |
Description: | The ever-increasing amount of information on the Web is organized as structured, semi-structured and unstructured data. Text classification systems capable of handling these different structures can support important tasks such as indexing and information retrieval in search engines. The objective of this research is to develop a method that uses fuzzy logic to classify documents into multiple categories. The method was built from a pattern recognition process and uses two variables, similarity and accuracy, which express how well a document matches a database of terms. The database of terms is generated from a collection of documents pre-classified into the categories of interest. Documents processed against the database of terms according to their similarity and accuracy compose a training set, also called the knowledge base. Data mining of this knowledge base, as part of a knowledge discovery process, identifies a pattern that specifies a set of rules. This yields a general model that is used to create the rules and membership functions of the fuzzy model for classifying documents into multiple categories. The general model of the rules considers the most significant variables and also contributes to the specification of the membership functions, such as the definition of the linguistic terms of the fuzzy sets. As a result, the inputs, membership functions and inference rules of the fuzzy model could be defined with a more deterministic approach. The results of the proposed method are relevant because the classification of documents reaches a satisfactory accuracy rate. |
Database: | OpenAIRE |
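The fuzzy classification step summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not give the membership functions, the rules, or how similarity and accuracy are computed, so everything concrete here is an assumption made for illustration. In particular, the triangular shapes in `TERMS`, the two rules in `RULES`, the 0.5 threshold, and the example scores are hypothetical, and similarity and accuracy are taken as precomputed values in [0, 1] for each category.

```python
from typing import Dict, List, Tuple

# Hypothetical linguistic terms for both inputs, as (left foot, peak, right foot).
TERMS: Dict[str, Tuple[float, float, float]] = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Hypothetical rule base: (similarity term, accuracy term) pairs that
# support assigning the document to a category.
RULES: List[Tuple[str, str]] = [
    ("high", "high"),
    ("medium", "high"),
]


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x: float) -> Dict[str, float]:
    """Map a crisp value in [0, 1] to a degree for each linguistic term."""
    return {name: triangular(x, *params) for name, params in TERMS.items()}


def membership_degree(similarity: float, accuracy: float) -> float:
    """Rule activation: min for AND within a rule, max across rules."""
    sim = fuzzify(similarity)
    acc = fuzzify(accuracy)
    return max(min(sim[s], acc[a]) for s, a in RULES)


def classify(
    scores: Dict[str, Tuple[float, float]], threshold: float = 0.5
) -> List[str]:
    """Return every category whose activation reaches the threshold.

    Each category is evaluated independently, so a document can
    receive several labels (multi-label classification).
    """
    return [
        category
        for category, (similarity, accuracy) in scores.items()
        if membership_degree(similarity, accuracy) >= threshold
    ]


if __name__ == "__main__":
    # Made-up (similarity, accuracy) pairs for one document.
    example = {
        "sports": (0.9, 0.8),
        "politics": (0.2, 0.3),
        "economy": (0.6, 0.9),
    }
    print(classify(example))
```

Evaluating each category on its own is what allows one document to receive several labels. In the method the description outlines, the rules and membership functions would be derived from data mining of the knowledge base rather than written by hand as they are here.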