Minimum threshold determination method based on dataset characteristics in association rule mining

Bibliographic Details
Main authors:	Erna Hikmawati, Nur Ulfa Maulidevi, Kridanto Surendro
Format:	article
Language:	EN
Published:	SpringerOpen 2021
Subjects:	Minimum threshold; Adaptive rule; Association rule; Computer engineering. Computer hardware (TK7885-7895); Information technology (T58.5-58.64); Electronic computers. Computer science (QA75.5-76.95)
Online access:	https://doaj.org/article/1a6effe4028547f6974d48e0b242a861

Description
Summary:	Association rule mining is a widely used data mining technique. It identifies interesting relationships between sets of items in a dataset and predicts associative behavior for new data. Before rules are formed, the items that will be involved, called the frequent itemset, must be determined. In this step, a threshold known as the minimum support is used to eliminate items that do not belong to the frequent itemset. The threshold also plays an important role in determining the number of rules generated, and setting it wrongly causes association rule mining to fail to obtain rules. Currently, the user sets the minimum support value randomly. The problem becomes worse for a user who does not know the dataset characteristics, and it consumes a lot of memory and time, because the rule formation process is repeated until the desired number of rules is found. In the adaptive support model, the minimum support value is determined from the average and total number of items in each transaction, as well as their support values. The proposed method also uses certain criteria as thresholds, so that the resulting rules match user needs. The minimum support value in the proposed method is obtained by dividing the average utility value by the total number of transactions. Experiments were carried out on 8 datasets with different characteristics to determine the association rules. The proposed adaptive support method was tested with 2 basic association rule algorithms, namely Apriori and FP-Growth. The test was repeated to determine the highest and lowest minimum support values. The results showed that 6 out of 8 datasets produced minimum and maximum support values for both the Apriori and FP-Growth algorithms. This means that the proposed adaptive support value is able to generate rules of acceptable quality, since the rules produced with adaptive support have a lift ratio greater than 1. The dataset characteristics obtained from the experimental results can be used as a factor in determining the minimum threshold value.
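
The quantities named in the summary can be illustrated with a minimal Python sketch. The `support` and `lift` functions follow the standard textbook definitions. The `adaptive_min_support` function only mirrors the one sentence in the summary (average utility divided by the total number of transactions); this record does not say how utility values are computed, so the `utilities` argument, the function names, and the toy transactions are all hypothetical and should not be read as the authors' implementation.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """Lift of the rule antecedent -> consequent.

    A value above 1 means the two sides co-occur more often than
    they would if they were independent.
    """
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / (
        support(antecedent, transactions) * support(consequent, transactions)
    )


def adaptive_min_support(utilities, n_transactions):
    """Average utility divided by the number of transactions.

    ASSUMPTION: `utilities` is a hypothetical list of per-item utility
    values; the record does not define how they are obtained.
    """
    return (sum(utilities) / len(utilities)) / n_transactions


# Toy data, invented for illustration only.
transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"milk"},
    {"milk", "bread"},
]

bread_support = support({"bread"}, transactions)
rule_lift = lift({"bread"}, {"butter"}, transactions)
threshold = adaptive_min_support([0.5, 1.0, 1.5], len(transactions))
```

With this toy data the rule bread -> butter has a lift above 1, which is the quality criterion the summary refers to.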
