In the past few decades, global industrial development and population growth have led to a scarcity of water resources, making sustainable management of groundwater a global challenge. The Water Quality Index (WQI) serves as a comprehensive method for assessing water quality and can provide valuable recommendations at the water quality level, optimizing policies for groundwater management. However, the subjectivity and uncertainty of the traditional WQI have negative impacts on evaluation outcomes, particularly in determining indicator weights and selecting aggregation functions. The proposed water quality index for groundwater based on the random forest (RFWQI) model in this study addresses these issues. It selects water quality indicators based on the actual pollution situation in the study area, employs an advanced random forest model to rank water quality indicators, determines indicator weights using the rank centroid method, scores the indicators using a sub-index function designed for groundwater development, and compares the results of two commonly used aggregation functions to identify the optimal one. Based on the aggregated scores, the water quality at 137 monitoring sites is classified into five levels: “Excellent”, “Good”, “Medium”, “Poor”, or “Unacceptable”. Among the 11 water quality indicators (sodium, sulfate, chloride, bicarbonate, total dissolved solids, fluoride, boron, nitrate, pH, CODMn, and hardness), chloride was given the highest weight (0.236), followed by total dissolved solids (0.156), and sodium was given the lowest weight (0.008). The random forest model exhibits a good prediction capability before hyperparameter tuning (86% accuracy, RMSE of 0.378), and after grid search and five-fold cross-validation, the optimal hyperparameter combination is determined, further improving the performance of the random forest model (94% accuracy, F1-Score of 0.967, AUC of 0.91, RMSE of 0.232). For the newly developed groundwater sub-index function, interpolation is used to score each indicator, and after comparing two aggregation functions, the NSF aggregation function is selected as the most suitable for groundwater assessment. Overall, most of the groundwater in the study area was of poor quality (52.5% of low quality) and not suitable for drinking.