Comparison of Data Mining Model Performance in Heart Disease Detection with Feature Selection Application: Perbandingan Kinerja Model Data Mining Dalam Deteksi Penyakit Jantung Dengan Penerapan Feature Selection

Widya Cholid Wahyudin; Tole Sutikno; Rusydi Umar; Ahmad Ridwan

doi:10.21070/joincs.v8i1.1669

Authors

Widya Cholid Wahyudin, Universitas Muhammadiyah Kudus
Tole Sutikno, Universitas Ahmad Dahlan Yogyakarta, Indonesia
Rusydi Umar, Universitas Ahmad Dahlan Yogyakarta, Indonesia
Ahmad Ridwan, Universitas Muhammadiyah Kudus, Indonesia

Abstract

Heart disease is the leading cause of death worldwide, so early detection is essential for improving patient life expectancy. Advances in data mining and machine learning make it possible to predict heart disease more accurately. This study compares the predictive performance of Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM) models in detecting heart disease using the UCI Heart Disease Dataset. Three feature selection techniques, namely the Filter Method, the Wrapper Method (RFE), and the Embedded Method, were applied to improve prediction accuracy and reduce model complexity. The experimental results show that SVM achieved the highest accuracy at 91.2%, followed by Random Forest at 90.7%. Feature selection was found to improve model performance significantly by reducing data dimensionality and avoiding overfitting. These findings indicate the effectiveness of SVM and Random Forest for developing efficient heart disease prediction systems in clinical settings.

Keywords: Data Mining, Heart Disease Prediction, Feature Selection, Support Vector Machine
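
As a rough illustration of the filter-style feature selection mentioned in the abstract, the sketch below ranks features by the absolute Pearson correlation between each feature and the class label, then keeps the top k. This is not the authors' implementation: the scoring criterion, the function names `pearson` and `filter_select`, and the six toy rows are all assumptions made for the example. The column names `age` and `thalach` follow the UCI Heart Disease Dataset, but the values are invented and `constant` is a deliberately uninformative column.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def filter_select(rows, labels, names, k):
    """Score each feature independently of any classifier and keep the k best."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    ranked = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data, not real patient records.
names = ["age", "thalach", "constant"]
rows = [
    [63, 120, 1],
    [67, 110, 1],
    [41, 172, 1],
    [45, 165, 1],
    [58, 130, 1],
    [39, 180, 1],
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = heart disease present (assumed encoding)

selected, scores = filter_select(rows, labels, names, k=2)
print(selected, scores)
```

Because a filter method scores features without training a model, the reduced feature set could then be passed to any of the five classifiers compared in the study. Wrapper (RFE) and embedded methods differ in that they rely on a fitted model to judge feature importance.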
