Research Article

An Interpretable Predictive Model for Cervical Cancer Risk Prediction using Hybrid Feature Selection and Ensemble Learning

by A.S.M. Sabiqul Hassan, Md. Mohsin Uddin Azad, Goutam Paul, Muhammed Samsuddoha Alam, Md. Ruhul Amin
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 13 - Number 2
Year of Publication: 2026
10.5120/ijais09d1560bfb24

A.S.M. Sabiqul Hassan, Md. Mohsin Uddin Azad, Goutam Paul, Muhammed Samsuddoha Alam, Md. Ruhul Amin. An Interpretable Predictive Model for Cervical Cancer Risk Prediction using Hybrid Feature Selection and Ensemble Learning. International Journal of Applied Information Systems. 13, 2 (May 2026), 67-78. DOI=10.5120/ijais09d1560bfb24

@article{ 10.5120/ijais09d1560bfb24,
author = { A.S.M. Sabiqul Hassan and Md. Mohsin Uddin Azad and Goutam Paul and Muhammed Samsuddoha Alam and Md. Ruhul Amin },
title = { An Interpretable Predictive Model for Cervical Cancer Risk Prediction using Hybrid Feature Selection and Ensemble Learning },
journal = { International Journal of Applied Information Systems },
issue_date = { May 2026 },
volume = { 13 },
number = { 2 },
month = { May },
year = { 2026 },
issn = { 2249-0868 },
pages = { 67-78 },
numpages = {9},
url = { https://www.ijais.org/archives/volume13/number2/an-interpretable-predictive-model-for-cervical-cancer-risk-prediction-using-hybrid-feature-selection-and-ensemble-learning/ },
doi = { 10.5120/ijais09d1560bfb24 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-05-04T23:57:53.125225+05:30
%A A.S.M. Sabiqul Hassan
%A Md. Mohsin Uddin Azad
%A Goutam Paul
%A Muhammed Samsuddoha Alam
%A Md. Ruhul Amin
%T An Interpretable Predictive Model for Cervical Cancer Risk Prediction using Hybrid Feature Selection and Ensemble Learning
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 13
%N 2
%P 67-78
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cervical cancer has long been a persistent health threat to women. Screening techniques exist to detect the disease, but their high cost and time-consuming procedures limit their practicality. This study proposes an interpretable machine learning (ML) model that analyzes the risk factors of cervical cancer to support trustworthy decision making. Several ML algorithms, namely k-nearest neighbors (KNN), logistic regression (LR), decision tree (DT), random forest (RF), naive Bayes (NB), support vector machine (SVM), XGBoost, and a soft-voting ensemble, were applied to a publicly available cervical cancer dataset from the UCI Machine Learning Repository. The results show that the tree-based and hyperplane-based models (DT, RF, and SVM) achieved the best accuracy, whereas the ensemble model maintained balanced results across all evaluation metrics (accuracy: 0.959, precision: 0.643, recall: 0.818, F1-score: 0.720, and ROC-AUC: 0.882). In addition, a web version of the model was deployed, with explanations based on the top features. With further improvement, a decision-making tool of this kind could be used in the healthcare sector.
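
As a rough illustration of two ideas named in the abstract, the sketch below shows how soft voting combines base-model outputs and how the F1-score follows from precision and recall. It is a minimal sketch, not the authors' implementation: the equal model weights, the 0.5 decision threshold, the example probabilities, and the helper names soft_vote and f1_score are all assumptions made for illustration.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average positive-class probabilities from several models, then threshold.

    Equal weighting and the 0.5 threshold are assumptions for this sketch.
    """
    avg = mean(probabilities)
    return avg, int(avg >= threshold)


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical positive-class probabilities from three base models for one
# patient (illustrative values, not taken from the paper).
model_probs = [0.62, 0.41, 0.70]
avg_prob, label = soft_vote(model_probs)

# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_score(precision=0.643, recall=0.818)

print(f"mean probability = {avg_prob:.3f}, predicted label = {label}")
print(f"F1 from reported precision/recall = {reported_f1:.3f}")
```

Note that one base model in the example votes below the threshold on its own, yet the averaged probability still yields a positive prediction; this is the sense in which soft voting smooths over disagreement between models. The last line recomputes the F1-score from the abstract's precision (0.643) and recall (0.818), which agrees with the reported 0.720 to three decimals.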

Index Terms

Computer Science
Information Sciences

Keywords

Cervical Cancer Prediction, Risk Factors Analysis, Class Imbalance, Explainable AI (XAI), Decision Making