Ensemble-based Predictive Model for Financial Fraud Detection

V.O. Olaleye; O.A. Odeniyi; B.K. Alese

Research Article

Ensemble-based Predictive Model for Financial Fraud Detection

by V.O. Olaleye, O.A. Odeniyi, B.K. Alese

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 12 - Number 42

Year of Publication: 2024

10.5120/ijais2024451961

V.O. Olaleye, O.A. Odeniyi, B.K. Alese. Ensemble-based Predictive Model for Financial Fraud Detection. International Journal of Applied Information Systems. 12, 42 (Jan 2024), 54-62. DOI=10.5120/ijais2024451961

@article{10.5120/ijais2024451961,
  author = {V.O. Olaleye and O.A. Odeniyi and B.K. Alese},
  title = {Ensemble-based Predictive Model for Financial Fraud Detection},
  journal = {International Journal of Applied Information Systems},
  issue_date = {Jan 2024},
  volume = {12},
  number = {42},
  month = {Jan},
  year = {2024},
  issn = {2249-0868},
  pages = {54-62},
  numpages = {9},
  url = {https://www.ijais.org/archives/volume12/number42/ensemble-based-predictive-model-for-financial-fraud-detection/},
  doi = {10.5120/ijais2024451961},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-01-27T22:32:21.391180+05:30
%A V.O. Olaleye
%A O.A. Odeniyi
%A B.K. Alese
%T Ensemble-based Predictive Model for Financial Fraud Detection
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 42
%P 54-62
%D 2024
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The financial industry remains a persistent target for fraudulent activities. Research in this area is hampered by data privacy concerns and the scarcity of publicly available datasets that contain instances of fraud. Researchers and practitioners have proposed various fraud detection techniques, applying diverse algorithms to uncover fraudulent patterns. To help address these challenges, the study introduces a synthetic fraud-related dataset featuring five distinct fraud scenarios and comprising about 2.5 million transactions. The primary objective is to analyze the intricacies of account transaction behaviour in a financial dataset. The authors propose an ensemble of three gradient boosting algorithms: CatBoost, Extreme Gradient Boosting (XGBoost), and LightGBM. The models developed demonstrate promising results, with several achieving an average Area Under the Curve (AUC) exceeding 0.9 and the ensemble achieving a predictive accuracy of 98.60%. Further evaluation through an application programming interface indicates a response time of less than 300 milliseconds and efficient memory usage, making the approach promising for practical use in real-world scenarios.
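
The abstract does not state how the three models' outputs are combined or how the metrics were computed. As an illustration only, the sketch below assumes soft voting (averaging each model's predicted fraud probability) and evaluates the result with AUC and accuracy. The labels, the probability lists, the 0.5 threshold, and the function names are all hypothetical stand-ins, not the authors' data or code.

```python
from statistics import mean


def soft_vote(*model_probs):
    """Average per-transaction fraud probabilities across models."""
    return [mean(p) for p in zip(*model_probs)]


def roc_auc(labels, scores):
    """AUC as the fraction of (fraud, legitimate) pairs in which the
    fraud case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of transactions classified correctly at the given threshold."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical ground truth (1 = fraud) and made-up model outputs.
labels = [0, 0, 1, 0, 1, 0]
catboost_p = [0.10, 0.40, 0.80, 0.20, 0.55, 0.05]
xgboost_p = [0.20, 0.30, 0.70, 0.60, 0.65, 0.10]
lightgbm_p = [0.15, 0.35, 0.90, 0.25, 0.45, 0.15]

ensemble_p = soft_vote(catboost_p, xgboost_p, lightgbm_p)
print("ensemble AUC:", roc_auc(labels, ensemble_p))
print("ensemble accuracy:", accuracy(labels, ensemble_p))
print("XGBoost-only accuracy:", accuracy(labels, xgboost_p))
```

In this toy data one model flags a legitimate transaction and another misses a fraud case at the 0.5 threshold, while the averaged score classifies both correctly, which is the usual motivation for combining models. In practice the probabilities would come from fitted CatBoost, XGBoost, and LightGBM classifiers.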

Index Terms

Computer Science

Information Sciences

Data mining

Fraud Detection

Financial Industry

Keywords

Machine Learning, Synthetic Data, Financial Fraud, Ensemble Learning, Gradient Boosting