Distinguished Professor of Civil Engineering, Purdue University, United States
Abstract Submission: Harmful algal blooms (HABs) are occurring with increasing frequency in water bodies worldwide, with detrimental impacts on lake ecosystems and on the socio-economic conditions of surrounding populations. Yet accurate HAB prediction remains challenging because of the complex interplay of influencing factors and the dynamic nature of bloom growth. Traditional process-based models require extensive calibration and often struggle when data are limited, leading to less effective predictions. To address this challenge, this study proposes data-driven approaches that use transfer learning to improve HAB prediction accuracy for lakes with sparse data. The study integrates data from multiple lakes, including satellite images of the cyanobacteria index (CI), historical meteorological data, and water quality measurements. A combination of data-driven methods, including Random Forest and Shapley Additive Explanations (SHAP), is used to evaluate and rank candidate features, identifying the key drivers of HAB dynamics and improving model performance. A Long Short-Term Memory (LSTM) model serves as the base for transfer learning, leveraging a well-established source-domain dataset to enrich the limited data from a target lake. Transferring knowledge from lakes with longer historical records in this way is intended to overcome data scarcity and improve predictions for lakes with fewer observations. By emphasizing key environmental factors and applying transfer learning, this study aims to develop a robust and generalizable prediction model for HAB occurrences. Our findings will contribute to the development of effective HAB management strategies and advance the field of HAB forecasting in aquatic ecosystems.
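
The pretrain-then-fine-tune idea behind the transfer learning step can be illustrated with a minimal sketch. This is not the study's implementation: a one-variable linear model stands in for the LSTM, the "lakes" are synthetic, and every name and number below (`make_lake`, `fit_linear`, slopes, sample counts) is a hypothetical chosen only for illustration. The sketch pretrains on a data-rich source lake, then compares a short fine-tuning run on a data-poor target lake against the same short run started from scratch.

```python
import random

def fit_linear(xs, ys, w=0.0, b=0.0, lr=0.05, epochs=20):
    """Gradient descent on mean squared error for y ~ w*x + b, starting from (w, b)."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(xs, ys, w, b):
    """Mean squared error of the linear model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_lake(n, slope, intercept, noise, rng):
    """Synthetic 'lake': x is a scaled driver, y a bloom index (both hypothetical)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

rng = random.Random(0)

# Source lake: long record. Target lake: similar dynamics, very few samples.
src_x, src_y = make_lake(500, slope=2.0, intercept=0.5, noise=0.1, rng=rng)
tgt_x, tgt_y = make_lake(8, slope=2.2, intercept=0.6, noise=0.1, rng=rng)
test_x, test_y = make_lake(200, slope=2.2, intercept=0.6, noise=0.1, rng=rng)

# 1) Pretrain on the source lake until the fit has essentially converged.
w_src, b_src = fit_linear(src_x, src_y, lr=0.2, epochs=500)

# 2) Fine-tune on the target lake, starting from the pretrained parameters.
w_ft, b_ft = fit_linear(tgt_x, tgt_y, w=w_src, b=b_src)

# 3) Baseline: same small training budget on the target lake, from scratch.
w_sc, b_sc = fit_linear(tgt_x, tgt_y)

mse_transfer = mse(test_x, test_y, w_ft, b_ft)
mse_scratch = mse(test_x, test_y, w_sc, b_sc)
print(f"held-out MSE, fine-tuned from source: {mse_transfer:.4f}")
print(f"held-out MSE, trained from scratch:   {mse_scratch:.4f}")
```

In an LSTM setting the same pattern would typically mean reusing the source-trained weights as the starting point and updating some or all layers on the target lake's short record; which layers to freeze is a design choice the abstract does not specify.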