Prediction of Diabetes using Supervised Learning Approach
Keywords:
diabetes prediction, diagnosis, data mining, algorithms
Abstract
This paper evaluates several supervised machine learning models for predicting diabetes and discusses the strengths and limitations of each: Decision Trees, Random Forest, Rotation Forest, Ensemble Classifier, K-Star, Simple Bayes, Logistic Regression, Functional Tree, and Perceptron Neural Network. The study uses a publicly available diabetes dataset from chistio.ir containing 520 samples (200 diabetic and 320 non-diabetic patients), each described by 16 features. The models are run on the open-source Weka 3.6 platform and compared using AUC, classification accuracy (CA), F1 score, precision, and recall.
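As an illustration of the evaluation metrics named in the abstract, the following is a minimal Python sketch of how CA, precision, recall, F1, and AUC can be computed for a binary classifier. It is not the study's code (the paper reports using Weka 3.6), and the labels, scores, 0.5 threshold, and function names are hypothetical values chosen only for this example, not data or results from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Compute CA, precision, recall and F1 for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    ca = (tp + tn) / len(pairs)  # classification accuracy
    return {"ca": ca, "precision": precision, "recall": recall, "f1": f1}


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: NOT taken from the paper's dataset or results.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]  # predicted P(diabetic)
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Precision, recall, and F1 depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both kinds of metric for a class-imbalanced dataset such as the one described here.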
License
Copyright (c) 2024 Nasim Khozouie (Corresponding Author); Omid Rahmani Seryasat, Sadegh Moshrefzadeh (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

