Part of #An Enhanced SMOTE Algorithm Using Entropy and Clustering for Imbalanced Accident Data# :
Publishing year : 2014
Conference : Second National Conference on Applied Research in Computer Science and Information Technology
Number of pages : 6
Abstract: Real-world applications involving imbalanced data have been emerging steadily over the course of this century. One such application, considered here for the first time in this context, is imbalanced accident data. In this paper, we consider transport and accident data from the Tehran-Bazargan highway between 2010 and 2015. In the pre-processing step, SMOTE is regarded as one of the most important over-sampling techniques for effectively balancing imbalanced data. However, it introduces noise and other problems, so there is a clear need to improve the method. To address these problems, this study proposes several techniques, including a combination of dynamic selection, attribute weighting, and distance weighting, along with a mixture of classification and clustering techniques. The performance of the proposed algorithm is measured by the F-measure and the ROC curve, and the results are compared with those of Weka's SMOTE using different algorithms.
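For readers unfamiliar with the baseline, the sketch below illustrates the interpolation step of standard SMOTE, the technique the abstract sets out to improve. It is not the enhanced algorithm proposed in the paper: the entropy, weighting, and clustering components are not described in this record and are not reproduced here. The function name `smote`, its parameters, and the toy data are illustrative assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples (plain SMOTE, illustrative only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class (hypothetical values, not taken from the accident data)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, plain SMOTE can place new points in regions that overlap the majority class, which is one source of the noise the abstract mentions.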