Abstract
Researchers work with many datasets provided by various institutions and strive to build highly efficient Artificial Intelligence models. A common obstacle is an imbalanced distribution of classes in a particular feature of the selected dataset, which biases the resulting model towards one class at the expense of another that is no less important. Thalassemia is a disease that affects people of different ages, and its severity varies with the thalassemia class. This study presents a comprehensive comparison of different SMOTE techniques for improving class balance in thalassemia prediction, and builds a Machine Learning model that predicts the likelihood that an individual has thalassemia from data resampled with the selected SMOTE technique. According to the proposed model, the best SMOTE technique for datasets with such clear imbalance is SMOTE-ENN, with which the model achieved an accuracy of 99% and an F1-score of 97%. The study provides software developers with the steps and source code needed to build a mobile or computer application that helps people estimate their probability of having thalassemia. It also helps researchers choose the SMOTE technique best suited to imbalanced datasets.
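To illustrate the kind of resampling compared in this study, the following minimal Python sketch applies plain SMOTE interpolation followed by an Edited Nearest Neighbours (ENN) cleaning pass to synthetic two-dimensional toy data. It is not the authors' code: the function names (k_nearest, smote, enn), the toy data, and the choice k=3 are assumptions made purely for illustration, and the study's dataset, features, and classifier are not reproduced here.

```python
import math
import random
from collections import Counter


def k_nearest(i, samples, k):
    """Indices of the k samples closest to samples[i] (Euclidean), excluding i."""
    others = [j for j in range(len(samples)) if j != i]
    others.sort(key=lambda j: math.dist(samples[i][0], samples[j][0]))
    return others[:k]


def smote(samples, minority, k=3, rng=None):
    """Oversample `minority` by interpolation until it matches the largest class.

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours.
    """
    rng = rng or random.Random(0)
    counts = Counter(label for _, label in samples)
    pool = [s for s in samples if s[1] == minority]
    synthetic = []
    for _ in range(max(counts.values()) - counts[minority]):
        i = rng.randrange(len(pool))
        j = rng.choice(k_nearest(i, pool, k))
        gap = rng.random()  # position along the segment, in [0, 1)
        point = tuple(b + gap * (n - b) for b, n in zip(pool[i][0], pool[j][0]))
        synthetic.append((point, minority))
    return samples + synthetic


def enn(samples, k=3):
    """Keep only samples whose label matches the majority of their k neighbours."""
    kept = []
    for i, (_, label) in enumerate(samples):
        votes = Counter(samples[j][1] for j in k_nearest(i, samples, k))
        if votes.most_common(1)[0][0] == label:
            kept.append(samples[i])
    return kept


# Hypothetical toy data: 40 majority points (label 0) and 8 minority points (label 1).
rng = random.Random(42)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(40)]
data += [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(8)]

balanced = smote(data, minority=1, k=3, rng=rng)  # SMOTE step
cleaned = enn(balanced, k=3)                      # ENN cleaning step

for name, subset in (("original", data), ("after SMOTE", balanced), ("after ENN", cleaned)):
    print(name, dict(Counter(label for _, label in subset)))
```

In practice a maintained library implementation would normally be preferred over a hand-written one. The sketch is only meant to show why the combination can help: SMOTE equalises the class counts, and ENN then removes samples in overlapping regions that disagree with their neighbours.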
Recommended Citation
Merdas, Hussam Mezher and Mousa, Ayad Hameed, "Investigating the Differential Effects of SMOTE Variants on Class Imbalance and Exploring Their Applicability to a Thalassemia Prediction Model," AUIQ Technical Engineering Science: Vol. 2: Iss. 1, Article 2.
DOI: https://doi.org/10.70645/3078-3437.1021