Pruning Based Interestingness of Mined Classification Patterns

Al-Hegami, Ahmed
October 2009
International Arab Journal of Information Technology (IAJIT);Oct2009, Vol. 6 Issue 4, p336
Academic Journal
Classification is an important problem in data mining. Decision tree induction is one of the most common techniques applied to solve it. Many decision tree induction algorithms have been proposed, based on different attribute selection and pruning strategies. Although the patterns induced by decision trees are easier to interpret and comprehend than those induced by other classification algorithms, the constructed trees may contain hundreds or thousands of nodes, which makes them difficult for the user who examines the patterns to comprehend and interpret. For this reason, the question of how to construct a tree appropriately and provide good pruning criteria has long been a topic of considerable debate. The main objective of such criteria is to create a tree such that classification accuracy on unseen data is maximized and tree size is minimized. Most decision tree algorithms first apply splitting criteria to construct a tree and then prune it to obtain an accurate, simple, and comprehensible tree. Even after pruning, the constructed decision tree may be extremely large and may reflect patterns that are not interesting from the user's point of view. In many scenarios, users are interested only in patterns that are interesting to them; thus, they may prefer a simple, interpretable, but only approximate decision tree to an accurate tree that involves a lot of detail. In this paper, we propose a pruning approach that captures user subjectivity to discover interesting patterns. The approach computes subjective interestingness and uses it as a pruning criterion to prune away uninteresting patterns. The proposed framework helps in reducing the size of the induced model and maintaining the model. One of the features of the proposed approach is that it captures the user's background knowledge, which is monotonically augmented.
The experimental results are quite promising.
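
The abstract does not define the interestingness measure or the pruning procedure, so the following Python sketch is only an illustration of the general idea, not the paper's method. It assumes a hypothetical score: the fraction of attribute-value conditions on a path that are not already in the user's background knowledge. Branches whose score falls below a threshold are collapsed into a leaf. The names `Node`, `interestingness`, `prune`, and `count_nodes`, the toy tree, and the threshold are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # `label` is the majority class at this node.
    # A leaf has attribute=None; an internal node tests `attribute`
    # and maps each attribute value to a child node.
    label: str
    attribute: Optional[str] = None
    children: dict = field(default_factory=dict)


def interestingness(path, known):
    """Hypothetical subjective score (an assumption, not from the paper):
    share of (attribute, value) conditions on `path` absent from `known`."""
    novel = sum(1 for cond in path if cond not in known)
    return novel / len(path)


def prune(node, known, threshold, path=()):
    """Return a copy of the tree in which every branch scoring below
    `threshold` is replaced by a leaf carrying the child's majority class."""
    if node.attribute is None:
        return Node(label=node.label)
    new_children = {}
    for value, child in node.children.items():
        child_path = path + ((node.attribute, value),)
        if interestingness(child_path, known) < threshold:
            # Uninteresting to this user: collapse the whole subtree.
            new_children[value] = Node(label=child.label)
        else:
            new_children[value] = prune(child, known, threshold, child_path)
    return Node(label=node.label, attribute=node.attribute, children=new_children)


def count_nodes(node):
    """Tree size, used here to show the effect of pruning."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


# Toy tree (hypothetical data).
tree = Node("yes", "outlook", {
    "sunny": Node("no", "humidity", {"high": Node("no"), "normal": Node("yes")}),
    "rain": Node("yes", "wind", {"strong": Node("no"), "weak": Node("yes")}),
})

# Background knowledge: the user already knows about the "sunny" branch.
known = {("outlook", "sunny")}
pruned = prune(tree, known, threshold=0.5)
print(count_nodes(tree), count_nodes(pruned))  # 7 5
```

In this sketch the background knowledge is a plain set, so adding new conditions to `known` over time only ever grows it, which is one simple way to read "monotonically augmented".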


Related Articles

  • A NEW PRUNING APPROACH FOR BETTER AND COMPACT DECISION TREES. Mahmood, Ali Mirza; Gudapati, Pavani; Kavuluru, Venu Gopal; Kuppa, Mrithyumjaya Rao // International Journal on Computer Science & Engineering;2010, p2551 

    The development of computer technology has enhanced people's ability to produce and collect data. Data mining techniques can be effectively utilized for analyzing the data to discover hidden knowledge. One of the well known and efficient techniques is decision trees, due to easy...

  • EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORMATION ENTROPY. Ali, Mohd. Mahmood; Qaseem, Mohd. S.; Rajamani, Lakshmi; Govardhan, A. // International Journal of Information Sciences & Techniques;Jan2013, Vol. 3 Issue 1, p27 

    Classification is a widely used technique in the data mining domain, where scalability and efficiency are the immediate problems in classification algorithms for large databases. We suggest improvements to the existing C4.5 decision tree algorithm. In this paper attribute oriented induction (AOI)...

  • KNOWLEDGE CONFLICTS RESOLVING IN THE MULTI-AGENT DECISION SUPPORT SYSTEM USING MULTI-STAGE CONSENSUS DETERMINING. Sobieska-Karpińska, Jadwiga; Hernes, Marcin // Business Informatics / Informatyka Ekonomiczna;2012, Vol. 3 Issue 25, p182

    The present paper describes the use of a multi-stage consensus algorithm to resolve knowledge conflicts in multi-agent decision support systems. The problem of knowledge conflict between agents, structure of decision, profile and criteria of consensus determining are presented in the first part of...

  • Decision Support System for Medical Diagnosis Using Data Mining. Kumar, D. Senthil; Sathyadevi, G.; Sivanesh, S. // International Journal of Computer Science Issues (IJCSI);May2011, Vol. 8 Issue 3, p147 

    The healthcare industry collects a huge amount of data which is not properly mined and not put to optimum use. Discovery of these hidden patterns and relationships often goes unexploited. Our research focuses on this aspect of medical diagnosis by learning patterns through the collected data...

  • Decision Tree Induction & Clustering Techniques In SAS Enterprise Miner, SPSS Clementine, And IBM Intelligent Miner -- A Comparative Analysis. Al Ghoson, Abdullah M. // International Journal of Management & Information Systems;2010 3rd Quarter, Vol. 14 Issue 3, p57

    Decision tree induction and Clustering are two of the most prevalent data mining techniques used separately or together in many business applications. Most commercial data mining software tools provide these two techniques but few of them satisfy business needs. There are many criteria and...

  • A new web based data mining exploration and reporting tool for decision makers. Bozkir, Ahmet Selman; Sezer, Ebru Akcapinar // Artificial Intelligence Research;Sep2013, Vol. 2 Issue 3, p70 

    The current DSS tools are generally built as "desktop applications" and designed for the use of data mining experts. In this paper, design and implementation of ASMINER, a new web-based data mining exploration and reporting tool, is introduced. ASMINER enables both decision makers and also...

  • An Overview of Recent and Traditional Decision Tree Classifiers in Machine Learning. Mahmood, Ali Mirza; Satuluri, Naganjaneyulu; Kuppa, Mrithyumjaya Rao // International Journal of Research & Reviews in Ad hoc Networks;Mar2011, Vol. 1 Issue 1, p9 

    Data mining techniques can be effectively utilized for analyzing the data to discover hidden knowledge. Decision trees are considered to be one of the most popular approaches for representing classifiers. The ever growing presence of data has led to a large number of proposed algorithms for...

  • Meta-analysis and evaluation of visualization support to decision trees classification. Zuyi Chen; Taixiang Zhao // Advanced Materials Research;7/24/2014, Vol. 989-994, p1692 

    In the social sciences, meta-analysis has been used on a limited scale only, mainly because there still remains a gap between the knowledge available and its application in policymaking. The experimental results suggested that, compared to the automatic modeling process as typically applied in...

  • Comparison of Decision Tree Algorithms for Predicting Potential Air Pollutant Emissions with Data Mining Models. Birant, D. // Journal of Environmental Informatics;Mar2011, Vol. 17 Issue 1, p46 

    Predicting air pollutant emissions from potential industrial installations is important for controlling air pollution and future planning of air quality management. This paper proposes the classification and prediction of the emission levels of industrial air pollutant sources using decision...

