A Systematic Literature Review on AI Approaches To Address Data Imbalance In Machine Learning

Authors

  • Kutub Uddin Apu
  • Mohammad Ali
  • Md Fakrul Islam
  • Masum Miah

DOI:

https://doi.org/10.70937/faet.v2i01.57

Keywords:

Data Imbalance, Machine Learning, Artificial Intelligence, Oversampling, Cost-Sensitive Learning

Abstract

Data imbalance is a pervasive issue in machine learning, where unequal class distributions often lead to biased models and poor predictive performance, particularly for underrepresented minority classes. This systematic review examines a range of strategies employed to address data imbalance, encompassing data-level methods, algorithm-level techniques, hybrid approaches, and advanced AI-driven solutions. A total of 92 peer-reviewed studies were analyzed, providing comprehensive insights into the methodologies, applications, and effectiveness of various techniques. Data-level approaches, such as SMOTE and its extensions, were identified as widely applied but prone to introducing noise and redundancy. Algorithm-level methods, including cost-sensitive learning and ensemble techniques, demonstrated robust performance but required careful parameter tuning and computational resources. Hybrid approaches combined the strengths of these strategies, offering enhanced accuracy and adaptability for complex imbalance scenarios. Advanced AI techniques, such as GANs, VAEs, and deep learning architectures, emerged as powerful tools for handling high-dimensional and imbalanced datasets but were often constrained by computational demands and overfitting risks. The review also identified significant gaps, including the lack of standardized evaluation metrics, which hinders the comparability of findings across studies. By synthesizing these insights, this study provides a foundation for addressing recurring challenges and advancing research in mitigating data imbalance across diverse applications.

Published

2025-01-20

How to Cite

Apu, K. U., Ali, M., Islam, M. F., & Miah, M. (2025). A Systematic Literature Review on AI Approaches To Address Data Imbalance In Machine Learning. Frontiers in Applied Engineering and Technology, 2(01), 58–77. https://doi.org/10.70937/faet.v2i01.57