Research Article

Cluster Algorithm using Distributed Processing for Human Protein Function Prediction

by Manpreet Singh, Gurvinder Singh, Karanjeet Singh Kahlon
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 4 - Number 4
Year of Publication: 2012
Authors: Manpreet Singh, Gurvinder Singh, Karanjeet Singh Kahlon
10.5120/ijais12-450696

Manpreet Singh, Gurvinder Singh, Karanjeet Singh Kahlon. Cluster Algorithm using Distributed Processing for Human Protein Function Prediction. International Journal of Applied Information Systems. 4, 4 (October 2012), 24-28. DOI=10.5120/ijais12-450696

@article{ 10.5120/ijais12-450696,
author = { Manpreet Singh and Gurvinder Singh and Karanjeet Singh Kahlon },
title = { Cluster Algorithm using Distributed Processing for Human Protein Function Prediction },
journal = { International Journal of Applied Information Systems },
issue_date = { October 2012 },
volume = { 4 },
number = { 4 },
month = { October },
year = { 2012 },
issn = { 2249-0868 },
pages = { 24-28 },
numpages = {9},
url = { https://www.ijais.org/archives/volume4/number4/293-0696/ },
doi = { 10.5120/ijais12-450696 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T10:47:20.743705+05:30
%A Manpreet Singh
%A Gurvinder Singh
%A Karanjeet Singh Kahlon
%T Cluster Algorithm using Distributed Processing for Human Protein Function Prediction
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 4
%N 4
%P 24-28
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

For the pharmaceutical industry, discovering a new drug is an enormous scientific challenge that consists essentially of identifying the target responsible for the disease. Once the therapeutic target is identified, scientists look for one or more leads that interact with it. Leads are usually found through a long and costly process of trial and error. If the protein class of the target were known, however, finding a complementary lead for the responsible molecule would become much easier. In this work, 50 protein sequences belonging to 10 different molecular classes are obtained from the Human Protein Reference Database (HPRD). The Sequence Derived Features (SDFs) of each sequence are then obtained using different online tools. Across the whole SDF database, the variation in the obtained values is analyzed and priorities are assigned accordingly. A priority-based cluster algorithm is then used for human protein function prediction. Distributed processing with four MATLAB workers is applied to the different iterations of the algorithm. Two different methods for distributing the code are applied, and the CPU times are computed for each method.
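The abstract does not spell out the priority rule, the clustering step, or the two distribution methods, so the following is only a minimal Python sketch of the general idea, not the authors' MATLAB implementation. It assumes that priorities are ranked by coefficient of variation, that a query protein is assigned to the class with the nearest priority-weighted centroid, and that a four-thread pool stands in for the four MATLAB workers. The data values and all function names (`feature_priorities`, `build_model`, `predict`, `predict_all`) are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev

# Hypothetical toy data: each row holds three made-up SDF values for one protein.
TRAINING = {
    "class_A": [[100.0, 5.0, 0.10], [110.0, 5.2, 0.12]],
    "class_B": [[300.0, 8.9, 0.11], [320.0, 9.1, 0.13]],
}
QUERIES = [[105.0, 5.0, 0.12], [315.0, 9.0, 0.10]]


def columns(training):
    """Collect every training row and return the data feature by feature."""
    rows = [row for members in training.values() for row in members]
    return list(zip(*rows))


def feature_priorities(training):
    """Order feature indices from most to least variable.

    Assumption: variation is measured by the coefficient of variation.
    """
    cv = [pstdev(col) / abs(mean(col)) for col in columns(training)]
    return sorted(range(len(cv)), key=lambda i: cv[i], reverse=True)


def build_model(training):
    """Return (weights, scales, centroids) for priority-weighted matching."""
    order = feature_priorities(training)
    n = len(order)
    weights = [0] * n
    for rank, idx in enumerate(order):
        weights[idx] = n - rank  # highest priority gets the largest weight
    scales = [max(col) - min(col) for col in columns(training)]
    centroids = {
        label: [mean(col) for col in zip(*members)]
        for label, members in training.items()
    }
    return weights, scales, centroids


def predict(query, model):
    """Assign the query to the class with the nearest weighted centroid."""
    weights, scales, centroids = model

    def distance(centroid):
        return sum(
            w * abs(q - c) / s
            for q, c, w, s in zip(query, centroid, weights, scales)
        )

    return min(centroids, key=lambda label: distance(centroids[label]))


def predict_all(queries, model, workers=4):
    """Spread independent predictions over a pool of four workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: predict(q, model), queries))


if __name__ == "__main__":
    model = build_model(TRAINING)
    start = time.process_time()  # CPU time, not wall-clock time
    labels = predict_all(QUERIES, model)
    elapsed = time.process_time() - start
    print(labels, f"CPU time: {elapsed:.6f} s")
```

Threads are used here only to keep the sketch self-contained. A process pool or a cluster scheduler would be closer to MATLAB workers, and the measured CPU time on such a small input says nothing about the timings reported in the paper.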

Index Terms

Computer Science
Information Sciences

Keywords

Human Protein Function Prediction
Cluster Algorithm
Sequence Derived Features