Using the Distance in Logistic Regression Models for Predictor Ranking in Diabetes Detection


Sheikhi G., Altincay H.

International Conference on Medical and Biological Engineering in Bosnia and Herzegovina (CMBEBIH), Banja Luka, Bosnia and Herzegovina, 16-18 May 2019, vol. 73, pp. 665-670 (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • Volume: 73
  • DOI: 10.1007/978-3-030-17971-7_100
  • City of Publication: Banja Luka
  • Country of Publication: Bosnia and Herzegovina
  • Pages: pp. 665-670
  • Affiliated with Middle East Technical University Northern Cyprus Campus: No

Abstract

Logistic regression is widely used to model the relationship between a response variable and multiple independent variables. In practice, the most important variables for each problem domain are generally well known. However, many ongoing studies explore additional variables to improve prediction performance with an enriched model. In this article, a new method for ranking binary independent variables is proposed, based on the distance between two decision boundaries. The boundaries correspond to the cases where the value of the variable is zero or one. It is shown that, using age and body mass index as the base variables for diabetes prediction, these distances are effective for ranking additional variables, leading to better scores than several conventionally used approaches.
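
The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading. It assumes a single logistic model whose logit is b0 + b1*x1 + b2*x2 + g*z, where x1 and x2 are the base variables (for example age and body mass index), z is the binary candidate variable, and there are no interaction terms. Under that assumption the decision boundaries for z = 0 and z = 1 are parallel lines, and the distance between them is |g| / sqrt(b1^2 + b2^2). The function names and all coefficient values are hypothetical and are not taken from the paper.

```python
import math


def boundary_distance(base_coefs, binary_coef):
    """Distance between the z=0 and z=1 decision boundaries.

    Assumes a logit of the form b0 + sum(b_i * x_i) + g * z with no
    interaction terms, so the two boundaries are parallel hyperplanes
    separated by |g| / ||b||. This is an assumed reading of the abstract.
    """
    norm = math.sqrt(sum(b * b for b in base_coefs))
    if norm == 0.0:
        raise ValueError("base coefficients must not all be zero")
    return abs(binary_coef) / norm


def rank_candidates(fitted):
    """Return candidate names sorted by decreasing boundary distance.

    `fitted` maps a candidate name to (base_coefs, binary_coef), i.e. the
    coefficients of a model fitted with the base variables plus that
    single candidate.
    """
    scores = {name: boundary_distance(b, g) for name, (b, g) in fitted.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical coefficients, made up purely for illustration.
fitted = {
    "candidate_a": ((0.03, 0.04), 0.10),
    "candidate_b": ((0.03, 0.04), 0.25),
}
ranking = rank_candidates(fitted)
print(ranking)
```

In this sketch a larger distance means that switching the binary variable from zero to one shifts the boundary further in the space of the base variables. Because the distance depends on the scale of the base variables, comparing candidates this way presumably only makes sense when all models are fitted on the same, consistently scaled base features.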