Use of artificial intelligence techniques for the early prediction of diseases in healthcare
Date
2024-06-24
Authors
Özdemir, Yasemin
Publisher
Bursa Uludağ Üniversitesi
Abstract
Heart disease and diabetes have become increasingly common in recent years. This thesis aims to contribute to the literature on the use of artificial intelligence techniques in healthcare and on the early prediction of common diseases. For heart disease, the Cleveland data set from the UCI database was used; for diabetes, a data set obtained from the Vanderbilt University Department of Biostatistics was used. Random Forest, Logistic Regression, and XGBoost were used as prediction algorithms, and their performance was evaluated with 5-fold and 10-fold cross-validation. Measures such as accuracy, sensitivity, precision, and F1 score were used as performance criteria. To address class imbalance in the diabetes data set, synthetic data were generated to make the data set more balanced. In heart disease prediction, success rates of 93.00% with Random Forest, 94.00% with Logistic Regression, and 95.00% with XGBoost were obtained. In diabetes prediction, Random Forest reached 99.00%, Logistic Regression 96.00%, and XGBoost 98.00%. Subsequently, an XGBoost-based interface system was developed for heart disease prediction and a Random Forest-based interface system for diabetes prediction. The results indicate that artificial intelligence techniques performed well on these data sets and can be used effectively for the early prediction of diseases.
Keywords: Machine learning, heart disease prediction, artificial intelligence, diabetes prediction, random forest algorithm, XGBoost algorithm
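The evaluation protocol named in the abstract (k-fold cross-validation scored with accuracy, sensitivity, precision, and F1) can be illustrated with a minimal, standard-library-only sketch. It is not the thesis implementation: the function names `binary_metrics` and `k_fold_indices`, the unshuffled contiguous folds, and the toy labels are all assumptions made for illustration, and no classifier is trained here.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall), precision and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Hypothetical labels, not data from the thesis.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
    for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
        print(test_idx)
```

In practice each fold's training indices would be used to fit a model and the test indices to score it, with the per-fold metrics averaged; library implementations typically also shuffle or stratify the folds, which this sketch omits.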
Keywords
Machine learning, Heart disease prediction, Artificial intelligence, Diabetes prediction, Random forest algorithm, XGBoost algorithm