
Enhanced Classification via Clustering Techniques using Decision Tree for Feature Selection

Balogun Abdullateef O., Mabayoje Modinat A., Salihu Shakirat, Arinze Salvation A. Published in Information Sciences

International Journal of Applied Information Systems
Year of Publication: 2015
Publisher: Foundation of Computer Science (FCS), NY, USA
Authors: Balogun Abdullateef O., Mabayoje Modinat A., Salihu Shakirat, Arinze Salvation A.
DOI: 10.5120/ijais2015451425
  1. Balogun Abdullateef O., Mabayoje Modinat A., Salihu Shakirat and Arinze Salvation A. Article: Enhanced Classification via Clustering Techniques using Decision Tree for Feature Selection. International Journal of Applied Information Systems 9(6):11-16, September 2015. BibTeX

    @article{balogun2015enhanced,
    	author = "Balogun Abdullateef O. and Mabayoje Modinat A. and Salihu Shakirat and Arinze Salvation A.",
    	title = "Enhanced Classification via Clustering Techniques using Decision Tree for Feature Selection",
    	journal = "International Journal of Applied Information Systems",
    	year = 2015,
    	volume = 9,
    	number = 6,
    	pages = "11-16",
    	month = "September",
    	note = "Published by Foundation of Computer Science (FCS), NY, USA"
    }
    

Abstract

Information overload has increased sharply in recent years as a result of advances in storage capacity and data collection. The growth in the number of observations has strained traditional analytical methods, but the growth in the number of variables associated with each observation has overwhelmed them. The number of variables measured on each observation is referred to as the dimension of the data, and a major problem with high-dimensional datasets is that only a few of the measured variables are "important" for understanding the underlying phenomena of interest. Dimension reduction of the original data prior to any modeling is therefore a practical necessity. This paper presents a précis of K-Means, Expectation Maximization and the J48 decision tree classifier, together with a framework for measuring the performance of base classifiers with and without feature reduction. Performance was evaluated using F-Measure, Precision, Recall, True Positive Rate, False Positive Rate, ROC Area and the time taken to build each model. The experiment revealed that, after performing classification via clustering, the reduced dataset yielded better results than the full dataset.
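
To make the framework concrete, the following is a minimal Python sketch of "classification via clustering" with decision-tree feature selection, using scikit-learn's DecisionTreeClassifier as a stand-in for J48 and KMeans for the clustering step. The dataset, the ten-feature cutoff, and the library choices are illustrative assumptions rather than the authors' actual experimental setup; EM could be swapped in via scikit-learn's GaussianMixture.

    import time

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Illustrative binary-classification dataset (not the paper's data).
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Step 1 -- feature selection: a decision tree ranks the attributes;
    # keep the ten with the highest importance (the cutoff is an assumption).
    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    top = np.argsort(tree.feature_importances_)[::-1][:10]

    def classify_via_clustering(X_tr, X_te):
        """Fit K-Means, label each cluster with its majority training class,
        then classify test points by the label of their assigned cluster."""
        start = time.time()
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
        majority = {c: np.bincount(y_train[km.labels_ == c]).argmax()
                    for c in range(km.n_clusters)}
        build_time = time.time() - start
        pred = np.array([majority[c] for c in km.predict(X_te)])
        return {
            "precision": precision_score(y_test, pred),
            "recall (TPR)": recall_score(y_test, pred),
            "f-measure": f1_score(y_test, pred),
            "ROC area": roc_auc_score(y_test, pred),
            "build time (s)": round(build_time, 4),
        }

    # Step 2 -- compare the full attribute set against the reduced one.
    print("full dataset:   ", classify_via_clustering(X_train, X_test))
    print("reduced dataset:", classify_via_clustering(X_train[:, top], X_test[:, top]))

Under this mapping, the "time taken to build model" metric corresponds to fitting the clusterer, and Precision, Recall/TPR, F-Measure and ROC Area are computed from the cluster-derived predictions; FPR can be obtained analogously from a confusion matrix.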

References

  1. Patil, T. R. and Sherekar, S. S. (2013). Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification. International Journal of Computer Science and Applications, Vol. 6, No. 2.
  2. Preeti, K. and Rajeswari, K. (2014). Selection of Significant Features using Decision Tree Classifiers. International Journal for Engineering Research and Applications (IJERA).
  3. Osama, A. A. (2008). Comparisons between Data Clustering Algorithms. The International Arab Journal of Information Technology, Vol. 5, No. 3.
  4. Yong, G. J., Min, S. K. and Jun, H. (2014). Clustering Performance Comparison using K-Means and Expectation Maximization Algorithms. Biotechnology & Biotechnological Equipment, 28:sup1, S44-S48.
  5. Ian, H. W., Eibe, F. and Mark, A. H. (2011). Data Mining: Practical Machine Learning Tools and Techniques (3rd edition). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
  6. Bezdek, J. C. (1980). A Convergence Theorem for the Fuzzy C-Means Clustering Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  7. Mahdi, E. and Fazekas, G. (2011). Feature Selection as an Improving Step for Decision Tree Construction. 2009 International Conference on Machine Learning and Computing, IPCSIT Vol. 3, p. 35.
  8. Sang-Hyun, C. and Hee-Su, C. (2014). Feature Selection using Attribute Ratio in NSL-KDD Data. International Conference on Data Mining, Civil and Mechanical Engineering (ICDMCME'2014), Feb 4-5, 2014, Bali, Indonesia.
  9. Neil, A., Andrew, S. and Doug, T. (n.d.). Clustering with EM and K-Means.
  10. Mehmet, A., Cigdem, I. and Mutlu, A. (2010). A Hybrid Classification Method of K Nearest Neighbor, Bayesian Methods and Genetic Algorithm. Expert Systems with Applications, Vol. 37, pp. 5061-5067.
  11. Namita, B. and Deepti, M. (2013). Comparative Study of EM and K-Means Clustering Techniques in WEKA Interface. International Journal of Advanced Technology & Engineering Research (IJATER), Vol. 3, Issue 4, p. 40.
  12. Kesavalu, E., Reddy, V. N. and Rajulu, P. G. (2011). A Study of Intrusion Detection in Data Mining. Proceedings of the World Congress on Engineering 2011, Vol. III, WCE 2011, July 6-8, 2011, London, UK.

Keywords

K-Means (KM), Expectation Maximization (EM), Decision Tree, Feature Selection, Data Mining