Clustering Gene Expression Data using Quad Tree based Expectation Maximization Approach

Leela Rani. P, Rajalakshmi. P
Published in Pattern Recognition

International Journal of Applied Information Systems
Year of Publication 2012
© 2010 by IJAIS Journal
Leela Rani. P and Rajalakshmi. P. Clustering Gene Expression Data using Quad Tree based Expectation Maximization Approach. International Journal of Applied Information Systems 2(8):10-13, June 2012.

BibTeX

    author = "Leela Rani. P and Rajalakshmi. P",
    title = "Clustering Gene Expression Data using Quad Tree based Expectation Maximization Approach",
    journal = "International Journal of Applied Information Systems",
    year = 2012,
    volume = 2,
    number = 8,
    pages = "10-13",
    month = "June",
    note = "Published by Foundation of Computer Science, New York, USA"


In molecular biology, microarrays are used to monitor the expression levels of many genes simultaneously. They are applied in gene expression analysis, genome mapping, toxicity studies, pathogen identification and other biological applications. Clustering groups similar gene expression data together so that relationships between genes can be identified. It is a useful tool for finding co-expressed genes and biologically relevant groupings of genes, which is an important research area in bioinformatics. In this paper, a Quad Tree based Expectation Maximization (EM) algorithm is applied to cluster gene expression data. The Quad Tree is used to initialize the cluster centroids. Starting from these centroids, EM, which computes maximum likelihood estimates from incomplete samples, groups the data efficiently. The Silhouette measure, a method for interpreting and validating clusters, represents how well each object lies within its cluster. Experimental results show that the Quad Tree based Expectation Maximization algorithm finds more compact clusters than the K-Means algorithm.
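The pipeline described in the abstract (initial centroids, then EM, then Silhouette validation) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than the authors' method: the Quad Tree step is not reproduced, so `init_centroids` is a hand-picked, hypothetical stand-in for its output; the mixture uses spherical Gaussians for brevity; the eight 2-D points are made-up toy data, not gene expression values; and all function names are illustrative. The Silhouette value of an object is taken as (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its lowest mean distance to any other cluster.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def em_cluster(points, init_centroids, iterations=50):
    """EM for a mixture of spherical Gaussians, started from given centroids.

    init_centroids is a stand-in for the Quad Tree output (assumption).
    """
    n, d, k = len(points), len(points[0]), len(init_centroids)
    means = [list(c) for c in init_centroids]
    variances = [1.0] * k      # one variance per cluster (spherical)
    weights = [1.0 / k] * k    # mixing proportions
    resp = []
    for _ in range(max(1, iterations)):
        # E-step: posterior probability of each cluster for each point,
        # computed in log space for numerical stability.
        resp = []
        for p in points:
            logs = [
                math.log(weights[j])
                - 0.5 * d * math.log(2 * math.pi * variances[j])
                - sq_dist(p, means[j]) / (2 * variances[j])
                for j in range(k)
            ]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate means, variances and weights.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an effectively empty component unchanged
            means[j] = [
                sum(r[j] * p[c] for r, p in zip(resp, points)) / nj
                for c in range(d)
            ]
            spread = sum(r[j] * sq_dist(p, means[j]) for r, p in zip(resp, points))
            variances[j] = max(spread / (nj * d), 1e-6)
            weights[j] = max(nj / n, 1e-12)
    # Hard assignment: the most probable cluster for each point.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels

def mean_silhouette(points, labels):
    """Average Silhouette value over all points (0.0 for singleton clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        dist = lambda j: math.sqrt(sq_dist(points[i], points[j]))
        a = sum(dist(j) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist(j) for j in other) / len(other)
            for key, other in clusters.items() if key != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data: two well-separated groups (illustrative only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
init_centroids = [(1.0, 1.0), (9.0, 9.0)]  # hypothetical initial centroids
means, labels = em_cluster(points, init_centroids)
score = mean_silhouette(points, labels)
print(labels, round(score, 3))
```

A mean Silhouette value close to 1 indicates compact, well-separated clusters, so the same score can be computed for a K-Means labelling of the same data to compare the two methods.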




Keywords: Clustering, Quad Tree, Expectation Maximization Algorithm, K-Means, Silhouette Measure, Similarity