Assistant Professor, University of Tennessee, Tennessee, United States
Abstract Submission: This study explores the application of large language models (LLMs) to automating water infrastructure design. Traditional engineering methods struggle with the complexity of fluid dynamics, soft constraints, regulatory requirements, and other non-algorithmic processes. LLMs, capable of processing natural language, generating reasoning traces, and adapting to dynamic contexts, offer a novel approach to these problems. We evaluate several models, including GPT-4o, Llama 8B, and Llama 70B, across "untethered," a priori prompting techniques and "tethered" knowledge-embedding methods. We use an LLM framework to generate a synthetic dataset whose design parameters are well dispersed within realistic ranges. The untethered prompting results highlight consumer LLMs' deficiencies in domain knowledge. Rather than retraining the model, which demands substantial time and energy and sacrifices generality, we embed the domain knowledge directly into the LLM's context window, yielding near-perfect performance on our synthetic dataset. Additionally, granting the model access to a code execution system mitigates LLMs' well-documented arithmetic deficiencies. This research integrates LLMs into engineering workflows, offering insight into their capabilities and limitations. Future developments in multimodal generative artificial intelligence, together with enhanced reasoning and validation paradigms, will be critical to advancing the use of LLMs in water infrastructure design.
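The "tethered" approach described above can be sketched in miniature: domain knowledge is prepended to the model's context window, and arithmetic is offloaded to a host-side code execution tool. The sketch below is illustrative, not the study's actual implementation; the use of the Hazen-Williams headloss relation as the embedded reference material, and the `CALC(...)` tool-call convention, are assumptions for the example.

```python
import ast
import operator

# Illustrative domain-knowledge snippet to embed in the context window.
# (Hazen-Williams is a standard hydraulics formula; its use here as the
# embedded reference is an assumption for this sketch.)
DOMAIN_KNOWLEDGE = """\
Hazen-Williams headloss (SI units):
h_f = 10.67 * L * Q**1.852 / (C**1.852 * D**4.87)
where L = pipe length (m), Q = flow (m^3/s),
C = roughness coefficient, D = pipe diameter (m).
"""

def build_tethered_prompt(task: str) -> str:
    """Embed domain knowledge directly into the prompt ("tethering")."""
    return (
        "You are a water-infrastructure design assistant.\n"
        "Reference material:\n" + DOMAIN_KNOWLEDGE +
        "\nWhen arithmetic is needed, emit CALC(<expression>) and the "
        "host will evaluate it exactly.\n\nTask: " + task
    )

# Minimal safe evaluator standing in for the code execution system that
# relieves the model of doing arithmetic in-token.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def calc(expression: str) -> float:
    """Evaluate a pure-arithmetic expression via the AST (no eval())."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval"))
```

In use, `build_tethered_prompt("Size a pipe for 0.05 m^3/s over 300 m.")` yields a prompt that carries the design formula verbatim, and any `CALC(...)` emitted by the model is routed to `calc`, so numeric precision no longer depends on the model's token-level arithmetic.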