Abstract:
Heart disease is one of the leading causes of death worldwide, and predicting it from clinical data remains a difficult task. Machine Learning (ML) supports decision-making and prediction over the large volumes of data generated by the healthcare industry. The main goal of this study is to compare machine learning algorithms for predicting heart disease and to identify the best-performing model. This work applies four supervised machine learning algorithms, namely Logistic Regression, Support Vector Machine, K-Nearest Neighbor, and Random Forest, to the Cleveland Heart Disease dataset. In our experimental analysis, after preprocessing and model hyperparameter tuning, Logistic Regression, Support Vector Machine, K-Nearest Neighbor, and Random Forest achieved accuracies of 90.16%, 86.89%, 86.89%, and 85.25%, respectively. Logistic Regression therefore outperforms the other algorithms evaluated in predicting heart disease on this dataset.
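
The comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the study's actual pipeline: the data are a synthetic stand-in (303 records with 13 features, sized to resemble the Cleveland dataset), and the scaling step, the 80/20 split, the hyperparameter grids, and the names `candidates` and `accuracies` are all assumptions made for the example. The accuracies it produces are unrelated to those reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in data, not the real Cleveland records.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)

# ASSUMPTION: stratified 80/20 hold-out split; the abstract does not state the split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# ASSUMPTION: illustrative hyperparameter grids, not the ones tuned in the study.
# Grid keys use the step names that make_pipeline derives from the class names.
candidates = {
    "Logistic Regression": (
        LogisticRegression(max_iter=1000),
        {"logisticregression__C": [0.1, 1, 10]},
    ),
    "Support Vector Machine": (
        SVC(),
        {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]},
    ),
    "K-Nearest Neighbor": (
        KNeighborsClassifier(),
        {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]},
    ),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"randomforestclassifier__n_estimators": [50, 100]},
    ),
}

accuracies = {}
for name, (model, grid) in candidates.items():
    # Preprocessing (standardization) and the classifier are tuned together,
    # so the scaler is fit only on the training folds during cross-validation.
    pipeline = make_pipeline(StandardScaler(), model)
    search = GridSearchCV(pipeline, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Accuracy of the best configuration on the held-out test set.
    accuracies[name] = search.score(X_test, y_test)

# Report the models from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

Wrapping the scaler and the classifier in one pipeline keeps the preprocessing inside each cross-validation fold, which avoids leaking test-fold statistics into the tuning step. To run the sketch on the real data, `X` and `y` would be replaced by the features and target loaded from the Cleveland Heart Disease dataset.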