On the design of hardware architectures for parallel frequent itemsets mining
Author: | Raudel Hernández-León, Claudia Feregrino-Uribe, Lázaro Bustio-Martínez, Martin Letras, René Cumplido |
---|---|
Year of publication: | 2020 |
Subject: |
0209 industrial biotechnology; Speedup; business.industry; Data stream mining; Computer science; General Engineering; 02 engineering and technology; Disjoint sets; Field (computer science); Computer Science Applications; Task (computing); 020901 industrial engineering & automation; Software; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; business; Field-programmable gate array; Computer hardware |
Source: | Expert Systems with Applications. 157:113440 |
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2020.113440 |
Description: | Algorithms for Frequent Itemsets Mining have proved effective at extracting frequent sets of patterns from datasets. However, in some specific cases, they do not obtain the expected results in an acceptable time. For this reason, Field Programmable Gate Array (FPGA)-based architectures for Frequent Itemsets Mining have been proposed to accelerate this task. This paper proposes a search strategy for Frequent Itemsets Mining based on equivalence class partitioning. Partitioning into equivalence classes divides the search space into disjoint sets that can be processed in parallel. Building on this strategy, the paper presents the design and implementation of two hardware architectures that exploit the nested parallelism in the proposed search strategy. These hardware architectures are capable of obtaining frequent itemsets regardless of the number of distinct items and the number of transactions in the dataset, which are the main limitations reported in the reviewed literature. Furthermore, the proposed architectures explore the trade-off between acceleration and hardware resource utilization. The experimental results demonstrate that the proposed search strategy can be scaled to achieve a speedup of 40 times in processing time over software-based implementations. |
Database: | OpenAIRE |
External link: |
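
The description states that equivalence class partitioning splits the search space into disjoint sets that can be processed in parallel. The following is a minimal software sketch of that general idea, assuming a prefix-based (Eclat-style) partitioning over a vertical tidset layout. It is not the hardware architecture from the paper, and the names `vertical_layout`, `mine_class`, and `frequent_itemsets` are hypothetical. Python threads are used only to show that the classes are independent; they do not by themselves give a real speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def vertical_layout(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    tidsets = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets[item].add(tid)
    return tidsets


def mine_class(prefix, prefix_tids, candidates, min_support):
    """Mine every frequent itemset that starts with `prefix` (one equivalence class).

    `candidates` holds (item, tidset) pairs that sort after the last item of
    `prefix`, so two different classes never produce the same itemset.
    """
    found = {prefix: len(prefix_tids)}
    for i, (item, tids) in enumerate(candidates):
        shared = prefix_tids & tids  # transactions containing prefix + item
        if len(shared) >= min_support:
            found.update(
                mine_class(prefix + (item,), shared, candidates[i + 1:], min_support)
            )
    return found


def frequent_itemsets(transactions, min_support, workers=4):
    """Return {itemset tuple: support}, mining one class per frequent item."""
    tidsets = vertical_layout(transactions)
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each class depends only on its own prefix and the items after it,
        # so the classes can be submitted as independent tasks.
        futures = [
            pool.submit(mine_class, (item,), tids, frequent[i + 1:], min_support)
            for i, (item, tids) in enumerate(frequent)
        ]
        for future in futures:
            result.update(future.result())
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(frequent_itemsets(data, min_support=3).items()):
        print(itemset, support)
```

Because each class is identified by its first item and only extends with later items, the results of the classes can be merged without any deduplication step, which is the property that makes the partitioning attractive for parallel processing.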