University of North Dakota, North Dakota, United States
Abstract: This research project compares the effectiveness of regional and site-specific machine learning models for estimating water quality parameters, including nitrates, phosphates, and chlorophyll-a. Site-specific models are built for individual locations and use detailed empirical data to make predictions tuned to local environmental variables and conditions. Regional models, by contrast, aggregate data from multiple sites into a broader predictive framework that gives a more comprehensive view of water quality across geographic areas. They benefit from more diverse training data, which can help them generalize and account for extreme environmental events such as floods or droughts. However, they face challenges such as data heterogeneity and the risk of overfitting. To address these challenges, two techniques are employed: feature engineering, which improves model robustness by capturing relevant information across varied conditions, and domain adaptation, which transfers knowledge between sites. This project focuses on 30 basins in North Dakota, with a model developed and tuned for each basin. These site-specific models are benchmarked against a regional model trained on data from all 30 basins across the state, and the comparison yields interesting results.
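The benchmarking setup described above, one model per basin compared against a single model trained on pooled data, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a closed-form ridge regression stands in for whatever machine learning models the project actually uses, five basins are generated instead of 30, and the names `make_synthetic_basins`, `fit_ridge`, and `benchmark` are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; returns weights with the intercept last."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # do not penalize the intercept
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)


def predict(w, X):
    """Apply fitted weights (intercept last) to a feature matrix."""
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def make_synthetic_basins(n_basins=5, n_samples=200, n_features=3, seed=0):
    """Synthetic stand-in data (assumption): each basin shares a common
    response to the features plus a basin-specific deviation."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=n_features)
    basins = {}
    for b in range(n_basins):
        coef = shared + 0.3 * rng.normal(size=n_features)
        X = rng.normal(size=(n_samples, n_features))
        y = X @ coef + rng.normal(scale=0.5, size=n_samples)
        basins[f"basin_{b}"] = (X, y)
    return basins


def benchmark(basins, train_frac=0.7):
    """Compare per-basin models with one regional model on held-out data."""
    splits = {}
    for name, (X, y) in basins.items():
        k = int(train_frac * len(y))
        splits[name] = (X[:k], y[:k], X[k:], y[k:])

    # Regional model: trained on the pooled training data from every basin.
    X_all = np.vstack([s[0] for s in splits.values()])
    y_all = np.concatenate([s[1] for s in splits.values()])
    w_regional = fit_ridge(X_all, y_all)

    results = {}
    for name, (X_tr, y_tr, X_te, y_te) in splits.items():
        # Site-specific model: trained on this basin's training data only.
        w_site = fit_ridge(X_tr, y_tr)
        results[name] = {
            "site_rmse": rmse(y_te, predict(w_site, X_te)),
            "regional_rmse": rmse(y_te, predict(w_regional, X_te)),
        }
    return results


if __name__ == "__main__":
    for name, scores in benchmark(make_synthetic_basins()).items():
        print(f"{name}: site={scores['site_rmse']:.3f} "
              f"regional={scores['regional_rmse']:.3f}")
```

Both models are scored on the same held-out samples from each basin, so the two error columns are directly comparable basin by basin. The numbers this sketch prints reflect only the synthetic data and say nothing about the project's actual findings.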