Predictive Modeling of Diabetes Mellitus Utilizing Machine Learning Techniques

  • N Nagarjuna
  • Dr. Lakshmi HN

Abstract

Diabetes mellitus is a chronic metabolic condition characterized by elevated blood sugar levels, which result from the body's inability to secrete or respond to insulin adequately, leading to health risks and frequent hospitalizations. Accurate predictive models are vital for targeted interventions that reduce readmissions, improve healthcare quality, and lower cost. Early prediction can mitigate the impact of the disease, aid in its control, and potentially save lives. Machine learning algorithms show promise in medical applications, including diabetes prediction and diagnosis. However, poor data quality, in particular missing values and inconsistencies, hinders accurate diabetes prediction. This paper investigates the potential of machine learning for predicting and diagnosing diabetes, aiming to enhance accuracy and efficiency in disease management. Feature engineering techniques are applied to preprocess the data and extract relevant features for model development. To address class imbalance, SMOTE (Synthetic Minority Oversampling Technique) is employed. Several machine learning algorithms, namely logistic regression, Naïve Bayes, random forest, support vector machines (SVM), K-Nearest Neighbors (KNN), and eXtreme Gradient Boosting (XGBoost), are used to build predictive models. Performance is evaluated with the standard metrics of accuracy, recall, precision, and F1-score. Notably, Random Forest achieves the highest accuracy (82%), followed by XGBoost (80%), outperforming the other algorithms evaluated.
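As an illustration of two techniques named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It shows the core idea of SMOTE (interpolating between a minority-class sample and one of its nearest minority-class neighbours) and how accuracy, precision, recall, and F1-score follow from the confusion-matrix counts. The function names `smote_sample` and `classification_metrics`, the parameter defaults, and the toy data are assumptions made for this example only.

```python
import random


def smote_sample(minority, k=3, n_new=4, seed=0):
    """Return n_new synthetic minority points (simplified SMOTE sketch).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy two-feature minority samples (hypothetical values).
    minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
    print(smote_sample(minority))

    # Toy labels: 1 = diabetic, 0 = non-diabetic (hypothetical values).
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(classification_metrics(y_true, y_pred))
```

In practice, oversampling of this kind is normally applied only to the training split, so that the reported metrics reflect performance on unmodified data. On imbalanced data, recall and F1-score are usually more informative than accuracy alone.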

Index Terms: Diabetes mellitus, Machine learning, Prediction, SVM, Logistic regression, Accuracy.

Published
2024-06-01