Submitted: 07 Apr 2018
Accepted: 09 Jun 2018
ePublished: 21 Jun 2018

Avicenna J Med Biochem. 2018;6(1): 3-7.
doi: 10.15171/ajmb.2018.02
  Abstract View: 408
  PDF Download: 301

Research Article

Prediction and Diagnosis of Diabetes by Using Data Mining Techniques

Seyede Somayeh Mirzajani 1,2*, Siamak Salimi 3

1 Research & Technology Deputy, Hamadan University of Medical Sciences, Hamadan, Iran
2 Masters of Department of Computer Engineering, Malayer Branch, Islamic Azad University, Hamadan, Iran
3 PhD Student of Bioinformatics, Tehran University, Tehran, Iran
*Corresponding author: Seyede Somayeh Mirzajani, so_mirzajani@yahoo.com


Background: Diabetes mellitus (DM) is one of the most common diseases in the world. Its complications include nephropathy, cardiac arrest, blindness, and even amputation. Accurate diagnosis of this condition is therefore very important.

Objectives: This study aimed to identify and provide a model for the diagnosis of DM using data mining.

Methods: The data used in this study were obtained from 768 women aged 21-83 years. Nine variables were selected for investigation. The neural network, Bayesian network, C5.0, and support vector machine models were compared in terms of their precision in predicting diabetes. Clementine 12 software was used to analyze the data.

Results: Using the C5.0 algorithm, the proposed method classified records with reported accuracies of 80.2% and 87.5%. Compared with similar studies, it performed better at diagnosing people with diabetes, and glucose, body mass index, and age were important variables in this study.

Conclusion: The C5.0 algorithm showed the highest accuracy, specificity, and sensitivity among the methods studied. Therefore, the C5.0 algorithm probably provides the best classification of these algorithms and is recommended as the best method for diabetes prediction using the available data.

Keywords: Diabetes mellitus, Bayesian network, Neural network, Decision tree, Support vector machine, Data mining
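
The abstract compares classifiers by accuracy, sensitivity, and specificity. As a minimal sketch of how these three measures are conventionally derived from a binary confusion matrix, the Python below may help; it is not the authors' Clementine workflow. The class name `ConfusionMatrix`, the helper `confusion_from_labels`, and the ten example labels are hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (1 = diabetic, 0 = non-diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fn: int  # diabetic, predicted non-diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic

    def accuracy(self) -> float:
        # Share of all records classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    def sensitivity(self) -> float:
        # Share of truly diabetic records that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of truly non-diabetic records that were cleared.
        return self.tn / (self.tn + self.fp)


def confusion_from_labels(actual: Sequence[int],
                          predicted: Sequence[int]) -> ConfusionMatrix:
    """Tally the four confusion-matrix cells from paired 0/1 labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 1 and p == 0:
            fn += 1
        elif a == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fn=fn, fp=fp, tn=tn)


if __name__ == "__main__":
    # Hypothetical labels for illustration only, not study data.
    actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    cm = confusion_from_labels(actual, predicted)
    print(f"accuracy={cm.accuracy():.3f}")
    print(f"sensitivity={cm.sensitivity():.3f}")
    print(f"specificity={cm.specificity():.3f}")
```

Reporting all three measures matters for a screening task like this one: a model can reach high accuracy on an imbalanced sample while still missing many diabetic cases, which sensitivity would reveal.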