ABSTRACT
The paper describes a web application developed for managing and presenting experiment data of the WIDER UPTAKE project funded by Horizon Europe. The project's goal is to promote water-smart and sustainable solutions among stakeholders in multiple countries. The application enhances data management and stakeholder engagement through the use of a third-party large language model. It integrates data from demonstration case studies with real-time sensor measurements and laboratory tests into a comprehensive cloud-based platform. It facilitates data visualization, regulatory compliance checks, and risk assessments for chemical and microbial hazards. The application significantly aided the coordination and communication of project findings among stakeholders. Key functionalities include interactive diagrams, risk assessment tools, and automated report generation using artificial intelligence (AI). The AI-generated reports, while maintaining confidentiality of data in most cases, provided clear and informative summaries of data compliance with regulatory standards. The web application extended stakeholder engagement, democratized access to complex data, and supported decision-making processes for implementing sustainable water management solutions. However, the development encountered significant challenges surrounding transparency, fairness, accountability, and privacy, which impede the refinement and scalability of this approach for broader use. Future research should focus on overcoming these obstacles to ensure a more effective and ethical application.
HIGHLIGHTS
Artificial intelligence (AI)-enhanced water management: Large language model (LLM) integration for better communication and decision-making in a water-smart circular economy project.
Advanced data management: Web app for efficient visualization of experimental data, improving stakeholder engagement.
User-friendly interface: Customizable visualization options and automated report generation.
AI-driven risk assessment: Augments risk analysis with AI-generated commentary and regulatory compliance checks.
INTRODUCTION
Water supports diverse ecosystems across the planet and holds our societies together (Dudgeon et al. 2006; Navarro-Ortega et al. 2015; Tahir et al. 2018; Morseletto et al. 2022). However, continual industrialization, urban growth, and climate change significantly worsen its availability and quality (Andrews et al. 2011; Colella et al. 2021). Moreover, the same factors jeopardizing the water sector favour further production of waste (Collivignarelli et al. 2019). This imbalance between resource consumption and replenishment – accentuated by the perceivably irresponsible use of potable water for tasks that do not require it (e.g., certain irrigation practices, street washing, and toilet flushing) – created demand for more sustainable water management approaches, as Wanner et al. (2023) demonstrate. Due to the scale of the problem, the proposed approaches became increasingly complex, involving a wide array of interconnected stakeholders, policymakers, and institutions. Initially, however, it was a simpler idea – the need to preserve limited natural resources – that drove different actors to work together in closing the loop through reuse and recycling. Over time, these collaborative efforts and interlinked approaches coalesced into what is now known as the Circular Economy (CE) framework, which aims to minimize resource waste and promote sustainable practices (Merli et al. 2018; Morseletto et al. 2022).
Recently, a great number of research and innovation projects dedicated to water reuse have sprung up, including those funded by Horizon Europe, an EU Framework Programme for research and innovation until 2027 (European Commission 2021). Given the disruptive nature of CE and the spirit of responsible research, these projects tend to include a wide array of stakeholders ranging from research institutions and large industries to small- and medium-sized enterprises (SMEs) and societies at large (Mazzonetto & Simone 2018). In other words, embracing CE principles in the water sector usually requires transformative efforts across the entire value chain, from researchers to end-users (Kirchherr et al. 2023).
Yet, the potential of these efforts can be hindered by inefficient data management and stakeholder engagement (Bellini & Bang 2022; Bellini et al. 2024). This is especially true for case studies, including pilot tests and demo sites, since they are designed to demonstrate the feasibility, viability, and safety of innovative water reuse solutions to a broader, external audience. Recent Horizon Europe initiatives have generated an extensive amount of valuable knowledge, which was delivered as various assessment frameworks/toolkits, analytic platforms, and technology databases, as seen in the works of Matta et al. (2021), Afghani et al. (2022), Akinsete et al. (2022), Bhambhani et al. (2022), Ghafourian et al. (2022) and Bouziotas et al. (2023). Consequently, while some projects, such as Water2Return (European Commission 2024b), SMART-Plant (European Commission 2024c), and the AWESOME project by PRIMA (European Commission 2024a), relied heavily on these technical factsheets, special deliverables, and frameworks for dissemination, others took a more innovative approach by developing digital tools to promote their findings, enhance dissemination efforts and engage more intensely with stakeholders (Table 1).
Table 1. Digital dissemination tools used in Horizon Europe projects
Dissemination tool | Project acronym | Description | Reference |
---|---|---|---|
Decision analytic platform (DAP) | Project Ô | Light data science cloud-based platform for polluted wells and water distribution network analyses. | Corti et al. (2022) |
Replication tool | HYDROUSA | The tool quickly assesses the replicability of six HYDROUSA solutions based on user-provided information through questionnaires. | Fatone et al. (2023) |
HYDROUSA Game | HYDROUSA | A browser game for increased public awareness. | Tsiropoulos et al. (2024) |
Serious game | NextGen | An educational browser game aiming to demonstrate the roles of water, energy and materials in the urban water cycle. | Khoury & Evans (2023) |
Augmented reality-enhanced demo cases | WATER-MINING | Educational and gamified augmented reality (AR) presentation of the project's case studies. | Katika & Tsiakou (2023) |
AR Tool | NextGen | An AR tool for engaging citizens on the topic of the CE. | Campos (2022) |
Water Europe Marketplace | NextGen, B-WaterSmart, ULTIMATE | A hub for innovative circular economy solutions in water, energy, and materials | Frijns (2023) |
However, despite these advancements, there remains untapped potential in utilizing large language models (LLMs) to get complex topics across to a broader audience – such as end-users, communities, stakeholders, potential clients, and policymakers – a method that has yet to be widely adopted within these resources. LLMs use a particular architecture that enables computers to understand and generate natural language (Rogers & Luccioni 2024), which is useful for helping broader audiences grasp engineering and environmental topics that are often riddled with jargon and complex data. Artificial intelligence (AI) and machine learning (ML) are closely associated with LLMs and have been used in water management before, albeit sparingly compared to other engineering fields. They have been successfully utilized to model wastewater treatment systems, optimize process parameters, predict performance outcomes, and detect and identify contamination, as demonstrated in the extensive study by Wang et al. (2023). Moreover, this falls in line with a recent trend towards data-based approaches to water management, as there is an established benefit from using technologies such as sensor networks, smart meters, and real-time monitoring systems to significantly enhance the efficiency, risk assessment, and sustainability of water management (Eggimann et al. 2017). Nevertheless, the trend stays purely technical in nature despite notorious communication issues with stakeholders and end-users in the field (Elelman et al. 2021). LLMs have every potential to bridge that gap. Furthermore, as Doorn (2021) notes, water professionals already cooperate closely with data scientists and social science experts, presenting an opportunity to address ethical concerns associated with stakeholder engagement in data-driven decision-making in other sectors.
LLMs have been shown to help democratize complex data access and facilitate decision-making. They address a key challenge in implementing ML for water management: democratizing data assets in a socially justified manner (Sun & Scanlon 2019) and supporting decision-making (Kamyab et al. 2023). By introducing interactivity and enhancing the communicability of data, LLMs can make data more accessible. In healthcare, for example, educational chatbots – which allow users to interact with LLMs – have demonstrated significant potential when supplementing rather than supplanting human expertise (Reddy 2024). To the best of the authors' knowledge, an AI-driven tool used to convey regulated pollution indicators, technical parameters, and technological principles of a water reuse demonstration case study to peripheral stakeholders is still an area to explore.
This article focuses on developing a web application for storing, processing, and showcasing the experimental and conceptual data generated in the WIDER UPTAKE project (hosted at https://database.wider-uptake.eu/). Funded by Horizon Europe, WIDER UPTAKE is a multi-disciplinary project dedicated to water-smart CE solutions spanning five countries. It supports wastewater treatment plants in resource recovery, water reuse, and energy sustainability by addressing technological, regulatory, organizational, social, and economic barriers (Mannina et al. 2021). The case study in Prague was tasked with demonstrating the technical and chemical feasibility of water reuse in the urban context. The team decided to utilize an LLM-powered web application to bridge the gap in sustainable water management communication, providing a valuable example for other similar Horizon Europe projects. The web application grants access to data, data visualization, and detailed information about case studies, as well as quantitative chemical risk assessment (QCRA) and quantitative microbial risk assessment (QMRA) conducted using the measured data. It then uses AI to process this and other data to generate a report containing a regulation compliance check for the measured parameters. This allows the data to be interpreted not only by trained professionals but also by indirect stakeholders involved in the project. Furthermore, it opens the possibility of user data input, potentially turning the web application into a full-fledged decision-making tool for those interested in water recycling.
This article covers the role of the web application in the WIDER UPTAKE project, the web application's architecture and development, as well as necessary considerations for future research.
METHODS
A photo of one of the ‘rain garden’ irrigation boxes, which includes sections of lawn, plants, and shrubs.
From the beginning of April until the end of November, during the 2022–2023 seasons, an experiment was conducted to study the short- and long-term effects of wastewater reuse for irrigation. To assess these effects accurately, a wide range of parameters was closely monitored. Some parameters were measured in real time, while others were evaluated through offline methods, such as monthly or semi-monthly sampling and laboratory tests (see Table 2).
The full construction process of the irrigation boxes – including further details on the sensors and the camera – is described in publicly available deliverables of the WIDER UPTAKE project (Pollert 2021; Pollert et al. 2023).
From sensor data to web application
The primary purpose of data aggregation was to create a comprehensive project database. This database not only had to aggregate data but also to facilitate its retrieval for visualization and risk assessment. Cloud storage was chosen as the best intermediary for the job due to its scalability, accessibility, and robust security features, ensuring that data could be efficiently managed, accessed from various locations, and protected against unauthorized access. The overarching goal was to enable project participants to ensure compliance with regulatory standards, assess risks for workers and end-users of the recycled water, and present data to stakeholders using advanced visualization techniques.
These goals implied the development of a web application that would authenticate the users, fetch the data, and allow the users to interact with it. Moreover, as web applications generally do not require installing anything on the user's device, they are accessible from anywhere, regardless of the operating system, which supports user-friendliness. It was decided to host the application on a Synology-brand server at the Czech Technical University (CTU) in Prague. The backend was written in the PHP programming language to manage data creation, storage, and modification. To safeguard data integrity and security, rigorous measures such as firewall rules, automated IP blocking for suspicious activities, and daily backups to a secondary server are implemented. From the architecture perspective of the web application, the server served two main purposes: user authentication and data fetching. Furthermore, the data fetching is limited by the role of a user. The core functionalities for interacting with the fetched data were completely transferred to the frontend, because the application evolved as a small-scale extension of the database and a simple frontend implementation was more practical given the limited audience. Thus, the frontend became responsible for processing the data, performing computations, and providing an interactive user interface, effectively offloading these tasks from the server and enhancing overall system performance in case of a large number of simultaneous active sessions. However, this advantage comes with higher security risks, especially if the application is opened to external users.
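The role-limited data fetching described above can be illustrated with a minimal sketch. The actual backend is written in PHP and its access rules are not published here, so the role names, the `DatasetRecord` shape, and the visibility policy below are all hypothetical and serve only to show the idea of filtering by role before any data reach the browser.

```typescript
// Hypothetical roles; the article does not list the real ones.
type UserRole = "admin" | "partner" | "stakeholder";

// Hypothetical dataset descriptor used only for this illustration.
interface DatasetRecord {
  id: string;
  caseStudy: string;
  confidential: boolean;
}

// Assumed policy: which roles may fetch confidential datasets.
const CAN_SEE_CONFIDENTIAL: Record<UserRole, boolean> = {
  admin: true,
  partner: true,
  stakeholder: false,
};

// Return only the datasets that the given role is allowed to fetch.
function datasetsForRole(
  role: UserRole,
  all: readonly DatasetRecord[],
): DatasetRecord[] {
  return all.filter((d) => !d.confidential || CAN_SEE_CONFIDENTIAL[role]);
}

// Invented catalogue entries, not real project datasets.
const catalogue: DatasetRecord[] = [
  { id: "prague-metals", caseStudy: "Prague", confidential: false },
  { id: "prague-micropollutants", caseStudy: "Prague", confidential: true },
];

const visibleToStakeholder = datasetsForRole("stakeholder", catalogue);
const visibleToAdmin = datasetsForRole("admin", catalogue);
```

Applying such a filter on the server rather than in the browser matters here, because anything delivered to the frontend can be inspected by the user.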
Frontend and functionalities
The visualization scripts, as well as the QCRA and QMRA algorithms, are client-side, i.e., the data are received and processed in the browser's memory by running the TypeScript code that initiates the web interface of the application. An authenticated user is welcomed by an uncluttered interface with several menus on the side to piece together the visualization of their choosing. Alternatively, they can toggle one of the predefined templates for some of the case studies (e.g., ‘Prague – Heavy Metals: Pb, Cd, Cr, Fe’ or ‘Stavanger – Cd, Ni and As, Hg – Across All Measured Sources’). In any case, the interaction with the data is guided by nine independent widgets (i.e., interface components that function independently and execute their own code), including the Report widget, which gathers the data from five other widgets for automatic report generation (see Table 3). Each widget allows for further clarification of the data to be shown if available, i.e., place of sampling, measured parameter, or additional legal and situational context.
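The relationship between the independent widgets and the Report widget might be sketched as follows. This is only an illustration under stated assumptions: the `Widget` interface, the `WidgetSelection` type, and the `summarize` method are hypothetical and not taken from the application's code; only the widget names and their use by the Report widget follow Table 3.

```typescript
// Hypothetical description of what the user has selected.
interface WidgetSelection {
  caseStudy: string;
  parameters: string[];
}

// Hypothetical widget contract: each widget works on its own
// and can summarise its view of the current selection.
interface Widget {
  title: string;
  usedByReport: boolean;
  summarize(selection: WidgetSelection): string;
}

function makeWidget(title: string, usedByReport: boolean): Widget {
  return {
    title,
    usedByReport,
    summarize: (s) => `${title}: ${s.caseStudy} [${s.parameters.join(", ")}]`,
  };
}

// Widget names and Report usage as listed in Table 3.
const widgets: Widget[] = [
  makeWidget("Overview of treatment plant design", true),
  makeWidget("Risk assessment", true),
  makeWidget("Visualization of processes and parameters", true),
  makeWidget("End-state snapshot", true),
  makeWidget("Measured values", true),
  makeWidget("Roadmap information", false),
  makeWidget("Process time-lapse", false),
  makeWidget("Data export", false),
];

// The Report widget collects one section per contributing widget.
function buildReportSections(
  all: readonly Widget[],
  selection: WidgetSelection,
): string[] {
  return all.filter((w) => w.usedByReport).map((w) => w.summarize(selection));
}

const reportSections = buildReportSections(widgets, {
  caseStudy: "Prague",
  parameters: ["Pb", "Cd"],
});
```

Keeping each widget behind a small common contract is one way to let the Report widget aggregate their output without knowing how each one renders its data.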
Table 2. Parameters measured during the Prague case experiment
Data collection | Group of parameters | Examples |
---|---|---|
Online | Related to water | Irrigation system flow rate, turbidity, conductivity, pH |
Online | Soil and air | Soil moisture and temperature, relative humidity and air temperature |
Online | Camera feed | Snapshots of vegetation |
Offline | Basic parameters | BOD5, CODCr, TOC, N-… |
Offline | Trace elements | B, Cd, Pb, As, Cr, Cu, Ni, Zn, Fe, Mn |
Offline | Microbiologic parameters | Escherichia coli, Clostridium perfringens, thermotolerant coliform bacteria, etc. |
Offline | Micropollutants | 50+ pharmaceuticals, personal care products, and other organic substances |
Table 3. Names and descriptions of widgets developed for the web application
Use in Report widget | Widget name | Description |
---|---|---|
Utilized by Report widget | Overview of treatment plant design | Interactive diagram of the plant at the monitoring site. |
Utilized by Report widget | Risk assessment | Quantifying chemical and biological exposure risks and probabilities via QCRA and QMRA developed during the project. |
Utilized by Report widget | Visualization of processes and parameters | Entire process breakdown for a selected case study and studied parameters. |
Utilized by Report widget | End-state snapshot | The last picture of the process taken at the end of the chosen time range. |
Utilized by Report widget | Measured values | Data presentation of measured values in line graph form over the current time range. |
Not utilized by Report widget | Roadmap information | Sustainable water management roadmap, showcasing models and key findings. |
Not utilized by Report widget | Process time-lapse | Time-lapse of the process over the chosen time range. |
Not utilized by Report widget | Data export | Tabular data export from the database. |
– | Report | Case study summary with an AI-generated text report. |
The web application's interface: custom layout with report (left-hand side), risk analysis (top right corner) and measured values (bottom right corner) widgets.
In Figure 4, the Report widget is displayed on the left side of the screen. The Report widget can be considered the backbone of the entire application, as it meets user expectations by providing an intelligible report on the data of greatest interest. Its primary function is to guide the user through the most informative widgets (e.g., End-State Snapshot, Measured Values, Visualization of Processes and Parameters, Overview of Treatment Plant Design, and Risk Assessment for the Prague case), focusing only on the parameters chosen in the initial step. The main advantage of the Report widget is the AI-generated description accompanying each widget it uses. An API call to OpenAI's ChatGPT-4 (previously ChatGPT-3) is initiated on the client side of the application, containing the relevant data, regulations, and standards for the selected parameters. Using AI for regulatory compliance checks allowed for storing legal documents as PDFs in the database without the need to transcribe them into separate CSV files. Although the documents were later injected into the prompt (see Supplementary material, Appendix 2), this process was also facilitated by AI. This approach not only eliminated unnecessary work and enabled the integration of various legislative texts but also provided essential context for the LLM to make recommendations for possible applications.
The prompt provided for the AI-based report generation is designed to ensure the confidentiality of the measured values while delivering an accurate and informative analysis. It includes specific instructions to avoid displaying the raw numerical values of the measurements directly in the generated report to prevent potential legal issues associated with disclosing confidential data. The prompt includes various standards with their names, levels, and units for reference. However, since the measured values are kept confidential, there is no need to output these standards directly in the report. The focus is on discussing the compliance of the measured values with relevant standards (such as European and Czech standards) while maintaining the confidentiality of the measured data. This approach focused on the use of AI in regulatory compliance and risk assessment within the WIDER UPTAKE project, balancing data confidentiality and informative reporting.
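To make the structure of such a prompt more concrete, the sketch below assembles confidentiality instructions, reference standards, and measured data into one text. It is not the project's prompt (which is given in Supplementary material, Appendix 2): the instruction wording, the `LimitStandard` and `MeasuredValue` types, and all names and numbers are invented placeholders, and the call to the LLM API is deliberately left out.

```typescript
// Hypothetical shapes for the data placed into the prompt.
interface LimitStandard {
  title: string; // placeholder name of a regulation or standard
  parameter: string;
  limit: number;
  unit: string;
}

interface MeasuredValue {
  parameter: string;
  value: number;
  unit: string;
}

// Assemble the prompt text. The instruction wording is an assumption.
function buildReportPrompt(
  measured: readonly MeasuredValue[],
  standards: readonly LimitStandard[],
): string {
  const instructions = [
    "Write a report section for non-expert stakeholders.",
    "Do NOT display raw numerical measurement values in the answer.",
    "State only whether each parameter complies with each standard.",
  ].join("\n");
  const standardLines = standards
    .map((s) => `- ${s.title} | ${s.parameter}: ${s.limit} ${s.unit}`)
    .join("\n");
  const dataLines = measured
    .map((m) => `- ${m.parameter}: ${m.value} ${m.unit}`)
    .join("\n");
  return (
    `${instructions}\n\nStandards:\n${standardLines}` +
    `\n\nMeasured data (confidential):\n${dataLines}`
  );
}

// Invented placeholder values, not project data or real limits.
const examplePrompt = buildReportPrompt(
  [{ parameter: "Pb", value: 1.5, unit: "ug/L" }],
  [{ title: "Hypothetical standard A", parameter: "Pb", limit: 10, unit: "ug/L" }],
);
```

Note that in this design the confidential values still leave the browser as part of the prompt; the instructions only ask the model not to repeat them in its answer.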
The Risk Assessment widget was developed in collaboration with colleagues from the Norwegian University of Science and Technology. Upon launch, it runs algorithms based on the selected type of water and exposure scenario (e.g., brief contact for a ‘visitor’ or prolonged exposure for a ‘worker’) to assess the risk posed by a specific compound (in the case of QCRA) or microbial contamination (in the case of QMRA). The widget presents the results as either the potential intake dose for chemicals or the Disability-Adjusted Life Years (DALY) per case for microbial contaminants. These procedures allow for a transparent and comprehensive risk evaluation by government bodies or public end-users. Therefore, the widget was at the core of the application and the key component of the Report widget. For more information about the risk assessment see Ardiyanti et al. (2021) and Ardiyanti et al. (2024).
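For readers unfamiliar with this kind of calculation, the following sketch shows a generic, textbook-style chain from concentration to intake dose and to a DALY estimate. It is not the algorithm developed in the project (see Ardiyanti et al. for that): the exponential dose-response model, the function names, and every parameter value are assumptions chosen only for illustration.

```typescript
// Hypothetical exposure scenario, e.g. a "visitor" or a "worker".
interface ExposureScenario {
  label: string;
  ingestedVolumeL: number; // water ingested per exposure event (L)
  eventsPerYear: number;
}

// Chemical side: intake dose per event, normalised to body weight (mg/kg).
function chemicalIntakeDose(
  concMgPerL: number,
  s: ExposureScenario,
  bodyWeightKg: number,
): number {
  return (concMgPerL * s.ingestedVolumeL) / bodyWeightKg;
}

// Microbial side: exponential dose-response model, P = 1 - exp(-r * dose).
function infectionProbability(
  organismsPerL: number,
  s: ExposureScenario,
  r: number,
): number {
  const dose = organismsPerL * s.ingestedVolumeL;
  return 1 - Math.exp(-r * dose);
}

// Probability of at least one infection over independent exposure events.
function annualProbability(pEvent: number, events: number): number {
  return 1 - Math.pow(1 - pEvent, events);
}

// Health burden in DALY per person per year.
function annualDaly(
  pAnnualInfection: number,
  pIllnessGivenInfection: number,
  dalyPerCase: number,
): number {
  return pAnnualInfection * pIllnessGivenInfection * dalyPerCase;
}

// Invented scenario values, not project parameters.
const visitor: ExposureScenario = { label: "visitor", ingestedVolumeL: 0.001, eventsPerYear: 10 };
const worker: ExposureScenario = { label: "worker", ingestedVolumeL: 0.01, eventsPerYear: 200 };

const pVisitor = infectionProbability(100, visitor, 0.01);
const pWorker = infectionProbability(100, worker, 0.01);
const dalyWorker = annualDaly(annualProbability(pWorker, worker.eventsPerYear), 0.5, 0.001);
```

With these placeholder inputs the worker scenario yields a higher per-event infection probability than the visitor scenario simply because more water is ingested per event, which is the kind of scenario dependence the widget exposes to the user.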
In summary, a system was implemented that uses sensors to collect real-time data on plant growth using treated wastewater for irrigation. This data is transmitted to a cloud platform for storage and management. A web application was developed to visualize this data, assess potential risks using custom QCRA and QMRA algorithms, and generate reports. This approach allows the evaluation of the feasibility and safety of using treated wastewater for urban agriculture.
RESULTS AND DISCUSSION
The web application served as the primary database throughout the entire duration of the WIDER UPTAKE project. It accompanied the Prague case experiment during the 2021–2022 measurement seasons and acted as a repository of knowledge for other cases and key findings of the project (e.g., the Roadmap widget). Specifically, it helped the Prague team coordinate efforts to overcome legal barriers in urban water reuse in the Czech Republic. Additionally, stakeholders who were approached with water reuse pilot solutions had the opportunity to familiarize themselves with and interact with the data and the explanations provided through the Report widget in their browsers (an example of a report for the Prague case is provided in Supplementary material, Appendix 3). This was especially valuable given the legal status of water reuse in the Czech Republic: the practice is not generally popular among the population, and there is no legal act that regulates it (Ramm & Smol 2023). Partly as a result of the Prague case study demonstration efforts (among which the web application was the main knowledge hub and demonstration unit), the Ministry of Agriculture decided to update its ‘Circular Czechia Action Plan’ (Ministry of the Environment of the Czech Republic 2022) strategy to emphasize water reuse and to accept the European water reuse regulation (i.e., Regulation 2020/741) in the future. Thus, the web application achieved its goal of extended stakeholder engagement, including legal bodies and peripheral stakeholders such as parks and one of Prague's district administrations.
The web application served as the foundational platform for managing data from the large experiment involving continuous real-time measurements and periodic laboratory tests. It structured the data within the Visualization of Processes and Parameters widget, allowing measured parameters to be traced across multiple levels, such as water, plants, soil, and filtrate. Additionally, it generated ideal material for AI analysis, which was instrumental in achieving the application's other two goals: ensuring compliance with regulations and generating risk assessment conclusions. At the same time, this analysis is unique in the field of data management for circular water economy projects, so it is the functionality whose performance is most important to describe and assess here.
AI conclusion inference for risk assessment
The QCRA and QMRA algorithms were specifically developed to account for potential adverse effects of the hazards contained in recycled water (Ardiyanti et al. 2021, 2024). The methodology was incorporated into Risk Analysis and Report widgets. The latter provided a conclusion generated on the basis of the results of the assessment on the right-hand side of the widget. The commentary usually had the following structure (as exemplified by the QMRA for thermotolerant coliform bacteria TCOLI in Supplementary material, Appendix 4):
introductory phrases,
description of QMRA methodology,
analysis of TCOLI values in different water sources,
visualization comment,
implications and standards comparison,
conclusion.
As seen in Supplementary material, Appendix 2, the prompt instructed the LLM to avoid displaying confidential values to the user. Therefore, the ‘Analysis of TCOLI values’ section of the report begins with a polite disclaimer regarding the impossibility of disclosing the full set of data. The commentary then proceeds to describe the relationship with all the legislative documents included in the prompt, indicating whether the values exceed or remain within the limits. Note that the standards are referred to by code names, a remnant of the data-processing pipeline. Further study of the provided commentary might leave the reader with a feeling of uncertainty, as most of the answers the LLM provides are very cautious.
AI response consistency analysis for risk assessment and regulatory compliance check
Due to the nature of the LLM, this commentary varied each time the user generated it. For the sake of analysing the perceived consistency of AI-generated reports, 20 reports were generated for the same set of parameters (TCOLI and Escherichia coli) and empirically assessed according to the following qualities:
1. compliance status (indicating whether the values are above or below regulation limits);
2. structural consistency (ensuring they have a common text structure and flow);
3. tone consistency (investigating how invariable the overall perceived tone of the texts is);
4. consistent terminology (ensuring the same terms are used across texts);
5. quantitative disclosure (indicating how many texts reveal numerical data).
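The five qualities listed above can be tallied as simple percentages once each report has been assessed. The sketch below shows one possible way to do so; the `ReportAssessment` shape, the sample of four assessments, and the regular-expression heuristic for spotting disclosed numbers are all hypothetical and do not reproduce the empirical assessment performed in the study.

```typescript
// Hypothetical per-report assessment, one flag per quality.
interface ReportAssessment {
  compliant: boolean; // report states values are within limits
  followsStructure: boolean;
  formalTone: boolean;
  consistentTerminology: boolean;
  revealsNumbers: boolean;
}

type Criterion = keyof ReportAssessment;

// Percentage of reports for which the given criterion holds.
function shareTrue(
  reports: readonly ReportAssessment[],
  criterion: Criterion,
): number {
  if (reports.length === 0) return 0;
  const hits = reports.filter((r) => r[criterion]).length;
  return (100 * hits) / reports.length;
}

// Crude heuristic (an assumption): a number directly followed by a
// concentration unit is treated as a disclosed measurement.
function revealsNumericValue(text: string): boolean {
  return /\d+(?:[.,]\d+)?\s*(?:CFU|MPN|mg\/L)/i.test(text);
}

// Invented sample of four assessments, not the study's 20 reports.
const sampleAssessments: ReportAssessment[] = [
  { compliant: true, followsStructure: true, formalTone: true, consistentTerminology: true, revealsNumbers: false },
  { compliant: true, followsStructure: true, formalTone: true, consistentTerminology: false, revealsNumbers: true },
  { compliant: true, followsStructure: true, formalTone: true, consistentTerminology: true, revealsNumbers: false },
  { compliant: true, followsStructure: true, formalTone: true, consistentTerminology: true, revealsNumbers: false },
];

const disclosureShare = shareTrue(sampleAssessments, "revealsNumbers");
const complianceShare = shareTrue(sampleAssessments, "compliant");
```

A pattern-based check like `revealsNumericValue` can only flag candidates for manual review; it would miss values written without units and could not replace a human reading of tone or terminology.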
The analysis revealed that all AI-generated reports uniformly indicated that the TCOLI levels were within the limits set by relevant regulatory standards, which is true. Moreover, the structural consistency across the reports was remarkably high. Each report typically included an introduction to the QMRA process, an analysis of TCOLI levels in different water sources, a comparison with regulatory standards, and a conclusion discussing the health implications – ultimately closely following the example mentioned above (see Supplementary material, Appendix 4). This uniform structure demonstrates the AI's ability to organize content coherently and systematically, which is essential for readability and comprehension in scientific reporting. The tone of the reports also remained consistently formal and scientific, appropriate for professional and academic contexts. Maintaining a consistent tone is important for ensuring that the reports are taken seriously and understood correctly by their intended audience.
Spider chart for the response consistency analysis. Each tick represents a 20% step with 0% in the centre.
Overall, the AI-generated QMRA reports exhibited high consistency in qualifying risk and in maintaining structure and tone, which underscores the reliability of AI in producing coherent and professional scientific reports. On the other hand, inconsistencies did arise in the disclosure of some of the less important numbers and in word choices, and these remain a source of doubt. It may nevertheless be concluded that the use of text generated by the LLM benefited the presentation of results. Not one instance of the overall conclusion changing completely (e.g., recommending a certain use case in one report that was discouraged in subsequently generated reports) was observed during the experiment. That is not to say that it cannot happen.
Concerns and future outlook
The use of recycled water in public spaces like parks and streets should be a decision made by the public, based on thorough risk analyses to understand potential health and environmental risks. Ideally, the participation of people affected by the decision, even slightly, ought to be facilitated by democratic use of data and extended stakeholder involvement. The database web application discussed in this article is the first step toward developing a publicly available consultancy solution, enabling people to collaborate with scientists in the data science process. Although AI and LLMs pave the way for more democratic data-driven decision-making of this kind, the use of this still emerging technology raises certain concerns that have to be addressed before it is implemented in a publicly available solution, namely transparency, justice and fairness, responsibility and accountability, privacy, and non-maleficence (Doorn 2021).
Transparency in LLM-powered applications is crucial for stakeholder engagement, addressing concerns about understanding model behaviour, accountability, and communication of capabilities and limitations. The complexity and unpredictability of LLMs, along with their massive and opaque architectures, make it challenging for stakeholders to fully grasp their functions. Proprietary models such as ChatGPT-4 further restrict independent verification (Liao & Wortman Vaughan 2024), which means there is no way for a third-party user to ensure that they are being trained on diverse and representative datasets. Moreover, in the case of the WIDER UPTAKE database web application, users may either overtrust or distrust LLM outputs, leading to overreliance on flawed AI or rejection of water reuse. Although our response consistency analysis did not show any signs of AI hallucination or non-compliance with regulations, such issues could still arise in the worst-case scenarios due to the nature of LLMs. One way to address this issue is by referring users of the application to the extensive repository of deliverables and milestone reports published in the project, emphasizing that the Report widget is merely a tool for obtaining a high-level overview of the results. However, it is encouraging to see the tool's reliability during the consistency test.
Justice and fairness concern the issues arising from bias in the training data used for publicly available AI chatbots or ML algorithms. In this application, for example, the commentary provided by the Report widget might be overly positive even when the data indicate potential concerns. To ensure a balanced perspective, it is crucial to encourage users to cross-reference the widget's commentary with the actual data and to critically evaluate any discrepancies. Supplementary documentation or a hand-out can inform users of the possible risks associated with the use of LLMs.
The issue of responsibility and accountability is largely sidestepped in the project because the descriptions provided by the web application do not carry any legal significance. Nonetheless, if the technology is to be scaled and opened to the public, the current lack of accountability could lead to significant ethical and legal challenges. Without clear guidelines and mechanisms for responsibility, users might misuse the application, resulting in misinformation, privacy violations, or even harm, which overlaps with the non-maleficence concern. To address these concerns, it is imperative to work together with legal teams to establish robust frameworks that define the scope and limitations of the technology and to find ways to incorporate them into the answers provided by the application.
Finally, the biggest concern associated with the use of third-party AI applications is privacy. Sending data to OpenAI servers does not provide the level of privacy required for projects with high degrees of confidentiality. However, Responsible Research and Innovation (RRI) programmes such as WIDER UPTAKE can allow for some degree of freedom in this regard. After all, it is similar to allowing an external person to view the data and draw their own conclusions. Nevertheless, some data within the project were confidential, and because the AI API call transmits data to an external server, access to certain data had to be restricted. Moreover, even though the prompt was engineered to prevent the display of raw data, raw data still appeared in approximately 30% of cases, as demonstrated in the response consistency analysis. Consequently, for projects where data confidentiality is a top priority, a custom or locally run LLM is necessary. Additionally, the application's scripts should be moved to the backend, as keeping them in the frontend undermines security. In the current implementation, however, access has been granted strictly on a need-to-know basis, limited only to the project stakeholders.
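Because prompt engineering alone did not reliably prevent raw data from appearing in the output, one complementary safeguard would be a backend check on each generated report before it is shown to the user. The following is a minimal Python sketch of that idea, not the project's implementation: the function names (`find_leaked_values`, `redact_leaked_values`, `leak_rate`) and the example values are hypothetical, and the check only catches numbers that are reproduced verbatim, not rounded or reformatted values.

```python
import re

# Matches non-negative integers and decimals such as "100" or "12.7".
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def find_leaked_values(report_text, raw_values, tolerance=1e-9):
    """Return the confidential raw values that appear as numbers in the report."""
    numbers_in_text = [float(tok) for tok in NUMBER_PATTERN.findall(report_text)]
    return [
        value
        for value in raw_values
        if any(abs(value - number) <= tolerance for number in numbers_in_text)
    ]


def redact_leaked_values(report_text, raw_values, placeholder="[redacted]", tolerance=1e-9):
    """Replace every number in the report that matches a confidential raw value."""
    def replace(match):
        number = float(match.group(0))
        if any(abs(number - value) <= tolerance for value in raw_values):
            return placeholder
        return match.group(0)

    return NUMBER_PATTERN.sub(replace, report_text)


def leak_rate(reports, raw_values):
    """Fraction of reports that contain at least one confidential raw value."""
    if not reports:
        return 0.0
    leaking = sum(1 for report in reports if find_leaked_values(report, raw_values))
    return leaking / len(reports)


if __name__ == "__main__":
    # Hypothetical confidential measurements and one generated report.
    confidential = [12.7, 3.4]
    report = "E. coli reached 12.7 CFU per 100 mL, above the limit."
    print(find_leaked_values(report, confidential))
    print(redact_leaked_values(report, confidential))
```

Running such a check on the server rather than in the browser is in line with the recommendation to move the application's scripts to the backend, and `leak_rate` could be applied to repeated responses to quantify how often raw data slips through, in the spirit of the response consistency analysis.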
Simply enumerating the weak points of using LLMs in similar projects reveals a range of challenges that need to be addressed and areas for improvement in future research. Unfortunately, projects of this size might not have enough time to fine-tune their data management applications, train a custom LLM, or even host computationally demanding models for the project or its experiments. This makes publications like this one, which showcase the drawbacks and imperfections of certain implementations, all the more valuable. Despite the numerous areas for improvement, the application did serve its purpose and achieved all the stated goals, including extended stakeholder engagement, democratized data access, risk analysis commentary, and regulatory compliance checks.
CONCLUSIONS
The development and deployment of a web application within the WIDER UPTAKE project have proven crucial in managing and visualizing complex datasets generated from case studies. This application not only facilitates real-time monitoring and risk assessment but also enhances stakeholder engagement by making data accessible and comprehensible to a broader audience. The use of LLMs for generating AI-driven reports has shown promise in democratizing data interpretation, ensuring compliance with regulations, and fostering informed decision-making among stakeholders.
One of the critical implications of this work is the potential for AI and machine learning to bridge the gap between technical data and practical application. The AI-generated reports, despite some variability, consistently provided accurate risk assessments and regulatory compliance checks, demonstrating the feasibility of using LLMs in environmental monitoring and decision-making processes. However, the study also highlights the need for transparency, accountability, and data privacy when employing AI in sensitive domains. Future research should focus on developing custom or locally run LLMs to address privacy concerns and improve the consistency and reliability of AI-generated outputs.
The nature of the WIDER UPTAKE project underscores the importance of collaborative efforts across various sectors, including research institutions, industries, and local communities. By involving a wide array of stakeholders, the project has created a more inclusive approach to water management, ensuring that the benefits of sustainable practices are widely recognized and adopted. This holistic approach can serve as a model for other Horizon Europe projects and similar initiatives.
The integration of CE principles in water management, supported by advanced technological solutions and stakeholder collaboration, offers a promising pathway toward sustainable urban development. The WIDER UPTAKE project has laid a strong foundation for future research and innovation in this field, highlighting the need for continuous improvement in data management practices and AI applications. By addressing the challenges identified in this study, future projects can build on these insights to achieve more resilient and sustainable water management systems.
ACKNOWLEDGEMENTS
Special thanks to M. Slezák and T. Burger for their invaluable contributions to this research. Their expertise and dedication were instrumental in the successful completion of this work. This research was supported by the Horizon 2020 project: Achieving wider uptake of water-smart solutions (H2020-SC5-2019-2), Grant agreement ID: 869283.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.