Abstract

Accurate prediction of pan evaporation remains a significant challenge due to inconsistencies across climatic regions. This study aims to improve pan evaporation estimation by developing a robust hybrid machine learning (ML) model that integrates spectral clustering with two advanced regression techniques, the Histogram-based Gradient Boosting Regressor (HGBR) and the Extreme Gradient Boosting Regressor (XGBR), to improve prediction accuracy and adaptability across diverse environments. The methodology applies spectral clustering to enhance model performance, followed by rigorous hyperparameter tuning and a sensitivity analysis that assesses the impact of individual features on each model. Finally, the models underwent a lack-of-fit test to confirm their adequacy and usability. The HGBR model outperformed the XGBR model in consistency: its training results (R² of 0.94, RMSE of 1.34) and testing results (R² of 0.92, RMSE of 1.45) are close to each other, which indicates a robust model. The XGBR model (training R² of 0.96 and RMSE of 1.11; testing R² of 0.91 and RMSE of 1.48) shows a larger gap between its training and testing R² values, which raises the concern of overfitting. These results demonstrate the HGBR model's superior robustness and reliability for predicting pan evaporation. The research contributes to local and global water resource management strategies by providing a reliable predictive tool, and it lays a foundation for future studies to refine these models and explore their applicability in other geographical settings.
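The robustness comparison above rests on the gap between training and testing metrics. As a minimal sketch, the arithmetic can be reproduced from the figures reported in the abstract alone; the helper name `generalization_gap` and the dictionary layout are illustrative assumptions, not part of the study's code.

```python
# Metrics as reported in the abstract (RMSE units are not stated there).
reported = {
    "HGBR": {"train_r2": 0.94, "test_r2": 0.92, "train_rmse": 1.34, "test_rmse": 1.45},
    "XGBR": {"train_r2": 0.96, "test_r2": 0.91, "train_rmse": 1.11, "test_rmse": 1.48},
}


def generalization_gap(metrics):
    """Hypothetical helper: return (drop in R2, rise in RMSE) from training to testing."""
    r2_drop = round(metrics["train_r2"] - metrics["test_r2"], 2)
    rmse_rise = round(metrics["test_rmse"] - metrics["train_rmse"], 2)
    return r2_drop, rmse_rise


# Compute the train-to-test gap for each model.
gaps = {name: generalization_gap(m) for name, m in reported.items()}

for name, (r2_drop, rmse_rise) in gaps.items():
    print(f"{name}: R2 drop = {r2_drop:.2f}, RMSE rise = {rmse_rise:.2f}")
```

On these figures the R² drop is 0.02 for HGBR and 0.05 for XGBR, which is the gap the abstract cites when it flags possible overfitting in XGBR despite its higher training score.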

Included in

Engineering Commons
