Abstract
Background: Diabetes mellitus (DM) is one of the most common diseases worldwide. Its complications
include nephropathy, cardiac arrest, blindness, and even limb amputation. Accurate diagnosis of this
condition is therefore very important.
Objectives: This study aimed to identify and provide a model for diagnosing DM using data mining.
Methods: The data used in this study were obtained from 768 women aged 21-83 years. Nine variables
were selected for investigation. Neural network, Bayesian network, C5.0, and support vector machine
models were compared in terms of their precision in predicting diabetes. Clementine 12 software
was used to analyze the data.
Results: Classification of records with the C5.0 algorithm yielded accuracies of 80.2% and 87.5%.
Compared with similar studies, the proposed method performed better at diagnosing people with
diabetes. Glucose, body mass index, and age were the important variables in this study.
Conclusion: The C5.0 algorithm showed the highest accuracy, specificity, and sensitivity among the
methods studied. Therefore, the C5.0 algorithm probably provides the best classification among
these algorithms and is recommended as the best method for diabetes prediction using the
available data.
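
The three criteria named in the conclusion (accuracy, sensitivity, and specificity) can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the Clementine 12 workflow used in the study; the function name classification_metrics and the example counts are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: diabetic cases correctly predicted as diabetic
    fp: non-diabetic cases wrongly predicted as diabetic
    tn: non-diabetic cases correctly predicted as non-diabetic
    fn: diabetic cases wrongly predicted as non-diabetic
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        # Share of all records classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of true diabetic cases that the model detects.
        "sensitivity": tp / (tp + fn),
        # Share of true non-diabetic cases that the model clears.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, chosen only for illustration (not study data).
metrics = classification_metrics(tp=40, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all three values for each model matters because accuracy alone can look high on an imbalanced sample even when many diabetic cases are missed.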