Transient flow issues, particularly pressure surges in air vessels, significantly challenge the safe and efficient operation of water systems. This paper explores a hybrid approach, integrating machine learning, including deep learning, to address these challenges through predictive analysis. Focusing on a prevalent but often overlooked transient flow issue, this method combines visual data, from a high-speed camera, with numerical data from pressure transducers and a velocity profiler. A U-Net deep learning model performs image segmentation to quantify air–water mixture patterns, providing crucial input for subsequent pressure predictions. Three neural network models are developed, incorporating visual information derived from the segmentation. These models predict pressure variations within an air vessel, crucial for managing pressure surges and ensuring system safety. Experimental data from transient flow tests are used for training and validation. Results demonstrate that incorporating visual data significantly improves pressure prediction accuracy, generalising to both interpolation and extrapolation scenarios. The models, despite being trained with limited data, yield satisfactory predictions. Key challenges include dataset limitations for image segmentation and the practical acquisition of high-resolution visual data in real-world settings. These findings lay the groundwork for more effective real-time monitoring and control of water systems, contributing to improved safety and efficiency.

  • Integration of visual and numerical features in the transient flow using deep learning.

  • Predictive models achieved 98.7% accuracy in transient flow pressure prediction.

  • Air–water phase segmentation accuracy exceeded 90% with U-Net architecture.

  • Demonstrated model generalisability beyond trained flow rates up to Reynolds 155,000.

  • Enhanced visualisation supports real-time transient flow monitoring in air vessels.

Water and sanitation systems are crucial for public health and sustainable development, particularly Sustainable Development Goal 6 (SDG6). Global challenges in this sector require greater ambition and innovation, especially through advancements in science and technology. These challenges span a wide range of issues, including enhancing service levels, addressing the role of sanitation in antimicrobial resistance, and building climate resilience (Howard 2021). This study focuses on water systems, which face a multitude of interconnected challenges such as ageing infrastructure, high maintenance costs, and significant energy consumption. These challenges often manifest in various forms, such as water loss and inefficient operations. Smart Water Systems (SWS) offer a promising solution by leveraging advanced technologies to address these issues and improve operations, management, and decision-making (Xiang et al. 2021). SWS can be defined in various ways, from a data analysis tool to a comprehensive transformation of decision-making (Xiang et al. 2021). Ingildsen & Olsson (2016) define SWS as a real-time data-driven decision-making system spanning the entire water cycle, optimising water quality and quantity while minimising resource consumption.

The current study adopts a data-centric definition of SWS as an intelligent framework across the water cycle, improving sustainability through rapid analysis and decision-making, emphasising real-time data analysis as a fundamental aspect. Therefore, this research focuses on machine learning and predictive analysis for complex water system data.

Water loss, particularly leakage, serves as a pertinent example of the complex challenges facing water systems. While significant, it is one facet of a broader set of issues. Water scarcity is increasing globally, with urban residents in water-scarce regions projected to rise dramatically by 2050 (He et al. 2021). Current trends suggest that a significant portion of the global population could face water scarcity by 2025 (Eliasson 2015; Abd Rahman et al. 2018). Non-revenue water (NRW), the difference between supplied and billed water (Frauendorfer & Liemberger 2010), often exceeds 40% in cities worldwide, and even higher in developing countries (Abd Rahman et al. 2018). While factors such as metering issues and water theft contribute to NRW, pipe leakage is a significant component (Ramos et al. 2023). The scale of water loss is substantial, with millions of cubic metres lost daily in the developing world (Bell 2016). Addressing water loss is important, but it must be viewed within the context of broader system optimisation.

Mitigating the various challenges facing water systems, including water loss, is complex. Water networks are vast and often underground, making observation and analysis difficult (Kim et al. 2016). Various techniques exist, including pressure management, active leakage control, leak detection, smart metering, and asset management (Ramos et al. 2023). Machine learning can play a crucial role in addressing these challenges, including those related to NRW (Conejos Fuertes et al. 2020). These challenges can stem from factors such as ageing pipes, geotechnical issues, faulty installation, high pressures, and transient events such as water hammer (Covas et al. 2005). Amongst them, transient events are particularly challenging, potentially causing pump failures, pipe ruptures, and other malfunctions (Boulos et al. 2005; Besharat et al. 2018). Factors such as entrapped air can exacerbate these events (Izquierdo et al. 1999; Boulos et al. 2005). Transient flows are unavoidable due to operational events like valve closures and pump activity (Chaudhry 2014).

Modelling transient flows, which is computationally intensive and depends on numerous variables and dynamic system characteristics (Chaudhry 2014), presents a significant challenge. Real-time monitoring and prediction of these variables are particularly difficult, especially within large and complex networks (Boulos et al. 2005; Chaudhry 2014). Furthermore, existing transient management methods often concentrate on mitigating the consequences of transients rather than preventing their occurrence (Chaudhry 2014). The dynamic nature of flow parameters, coupled with the inherent complexity of water networks, necessitates the development of responsive, real-time modelling and prediction capabilities. These challenges are further exacerbated by the presence of ageing infrastructure (Sanders et al. 2022), as water companies strive for significant improvements in overall water management (Sanders et al. 2022). However, technological advancements are now enabling more precise monitoring and analysis of water processes through the utilisation of real-time sensor data (Sanders et al. 2022), leading to a better understanding and modelling of previously uncertain phenomena. Smart networks, coupled with advanced sensors and analytics, are therefore key to achieving increased efficiency (Sanders et al. 2022). This approach is strongly supported by Water UK’s routemap (Sanders et al. 2022), which advocates for the integration of these technologies to reduce costs and improve system efficiency. This strategic direction aligns with the World Bank’s Utility of the Future program (Lombana Cordoba et al. 2022), which emphasises a new strategic management approach for water utilities worldwide.

Machine learning, particularly deep learning, has shown significant potential in tasks such as image classification and predictive analysis (Goodfellow et al. 2014). Deep learning models, such as convolutional neural networks (CNNs), have been applied to fluid dynamics, helping predict fluid flow characteristics and improve the management of water systems (Morimoto et al. 2021). Recent advancements in machine learning can also support the development of digital twins, providing real-time, high-fidelity representations of water systems, which are invaluable for monitoring and predicting failures (Wu et al. 2023). Furthermore, machine learning techniques are being applied to optimise various aspects of water systems. For example, Jafari-Asl et al. (2024) introduced a novel optimisation algorithm for pumping stations to reduce water loss and energy consumption. In hydrological forecasting, Khosravi et al. (2024) developed an ensemble machine learning model for enhanced daily river flow prediction. Machine learning is also proving valuable in broader environmental modelling, as demonstrated by Donnelly et al. (2024) who combined autoencoders and Gaussian Processes for improved spatiotemporal regression of global temperature and pressure. Khosravi et al. (2023) reviewed the applications of deep learning in hydrology, including rainfall–runoff simulation and soil moisture prediction, highlighting its potential for improved land management.

Existing research on SWS includes large-scale hydraulic models, but these often rely on simplifying assumptions (Tomić et al. 2022). While sophisticated unsteady flow models exist (He et al. 2022), their high computational demands limit applicability to large infrastructure. Simplified models, meanwhile, can be inadequate for real-time operations like pressure management (Vicente et al. 2015). This study addresses these limitations by developing machine learning and image processing models for transient flow prediction, trained on limited experimental data. Specifically, experimental data from transient flow tests – including water hammer induced by rapid valve closure – are used to predict pressure variation and air–water mixture patterns within an air vessel. The key innovation lies in achieving accurate predictions from a small amount of combined visual and numerical data, which significantly reduces computational intensity and potentially enables practical real-time SWS implementation. The trained models and methodologies are detailed within the paper, which follows a standard structure: the Methodology section outlines the experimental data and analysis techniques, the Results and Discussion sections present findings and implications, and the Conclusion summarises key achievements and future research directions.

This section introduces the data acquisition system, explains the structure of the data used in the research, and discusses the methods employed. The study focuses on the water hammer event, a computationally intensive phenomenon for simulation, which can trigger multiple issues in pipeline systems. Water hammer in pipelines refers to a sudden increase in pressure caused by an abrupt stop or change in the flow of water. This phenomenon generates hydraulic shock waves, leading to pressure surges that can potentially damage the pipeline system (Chaudhry 2014; Feng et al. 2024). A common solution for controlling the excessive pressure spikes caused by water hammer events is the use of air vessels, which help absorb and mitigate these surges (Besharat et al. 2016). The pressure surge is managed by the compression of gas within the air vessel, which helps dampen the shock. In addition to the design and sizing of air vessels, which directly benefit from accurate calculations of the pressure inside the vessel, the overall operation of the pipeline systems also benefits from predicting pressure variations within the air vessel. This ensures the system operates safely and without significant risks. For large-scale systems, conventional pressure calculations based on theoretical approaches may not be feasible due to the high computational load. Therefore, a predictive analysis approach proves to be quite beneficial. Accordingly, this section will delve into deep learning models, providing detailed information about the specific techniques utilised. The discussion will then transition to image-based models, which serve as another tool employed in the pursuit of predictive analysis within this study.

Experimental datasets

The data used in this study were obtained from the work by Besharat et al. (2017). The experimental facility emulates an undulating pipeline, as illustrated in Figure 1, comprising PVC pipes with a total length of approximately 8 m and a nominal diameter of 63 mm. The system incorporates a transparent PVC compressed air vessel with a diameter of 0.10 m and a height of 0.60 m. This air vessel features an air pocket at the top and a water column in the lower part. Flow is generated by a pump that can provide varying velocities for different tests. Three pressure transducers and an ultrasound velocity profiler (UVP) record pressure and flow velocity data at the locations shown in Figure 1. The pressure data are sampled at intervals of approximately 6 ms. The UVP samples at intervals of 100 ms, capturing 100 velocity profiles over time, with each profile containing 80 recorded velocity magnitudes.
Figure 1

Experimental setting for data acquisition.


Tests encompass various air pocket sizes and flow velocities: the 11 air pocket sizes listed in Table 1 (0, 2, 3, 4, 5, 7, 10, 15, 20, 30, and 40 cm) and seven flow velocities with corresponding Reynolds numbers (36,000, 56,000, 75,000, 93,000, 115,000, 132,000, and 155,000). These ranges resulted in a set of 11 tests, each containing 7 sub-tests, totalling 77 sub-tests, as demonstrated in Table 1.

Table 1

Sub-test parameters: Re number and air pocket size

| Test number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Air pocket size (cm) | 0 | 2 | 3 | 4 | 5 | 7 | 10 | 15 | 20 | 30 | 40 |
| Re 36,000 | 1 | 8 | 15 | 22 | 29 | 36 | 43 | 50 | 57 | 64 | 71 |
| Re 56,000 | 2 | 9 | 16 | 23 | 30 | 37 | 44 | 51 | 58 | 65 | 72 |
| Re 75,000 | 3 | 10 | 17 | 24 | 31 | 38 | 45 | 52 | 59 | 66 | 73 |
| Re 93,000 | 4 | 11 | 18 | 25 | 32 | 39 | 46 | 53 | 60 | 67 | 74 |
| Re 115,000 | 5 | 12 | 19 | 26 | 33 | 40 | 47 | 54 | 61 | 68 | 75 |
| Re 132,000 | 6 | 13 | 20 | 27 | 34 | 41 | 48 | 55 | 62 | 69 | 76 |
| Re 155,000 | 7 | 14 | 21 | 28 | 35 | 42 | 49 | 56 | 63 | 70 | 77 |
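
The numbering in Table 1 appears to run column by column, with each test holding seven consecutive sub-tests ordered by increasing Reynolds number. A minimal Python sketch of that mapping is given below. The function names are hypothetical and the column-wise rule is an assumption read off the table rather than something stated by the authors.

```python
from __future__ import annotations

# Reynolds numbers and air pocket sizes as listed in Table 1.
REYNOLDS = [36_000, 56_000, 75_000, 93_000, 115_000, 132_000, 155_000]
AIR_POCKET_CM = [0, 2, 3, 4, 5, 7, 10, 15, 20, 30, 40]


def subtest_id(test_number: int, reynolds: int) -> int:
    """Return the sub-test number for a test (1-11) and a Reynolds number.

    Assumes each test holds seven consecutive sub-tests ordered by
    increasing Reynolds number (the pattern visible in Table 1).
    """
    if not 1 <= test_number <= len(AIR_POCKET_CM):
        raise ValueError("test_number must be between 1 and 11")
    re_index = REYNOLDS.index(reynolds)  # raises ValueError if unknown
    return (test_number - 1) * len(REYNOLDS) + re_index + 1


def subtest_parameters(subtest: int) -> tuple[int, int]:
    """Inverse lookup: return (air pocket size in cm, Reynolds number)."""
    if not 1 <= subtest <= len(AIR_POCKET_CM) * len(REYNOLDS):
        raise ValueError("subtest must be between 1 and 77")
    test_index, re_index = divmod(subtest - 1, len(REYNOLDS))
    return AIR_POCKET_CM[test_index], REYNOLDS[re_index]
```

Under this rule, every sub-test whose number is a multiple of seven belongs to the highest Reynolds number, 155,000.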

Tests are conducted by rapidly closing and opening a fast-closure electro-pneumatic ball valve – labelled as the Main Operating Valve in Figure 1 – to induce a water hammer event in the system. The valve’s closing-opening action is controlled by an electrical trigger, with each manoeuvre lasting 0.20 seconds. This actuation generates a pressure surge that travels upstream. A check valve is installed in line with the pipe upstream of the air vessel to prevent water from flowing back while directing it toward the air vessel. Another check valve at the entrance of the air vessel opens toward the air vessel and closes toward the pipeline. This configuration allows water to enter the air vessel but prevents any flow from returning to the pipeline. The setup is designed to accumulate pressure within the air pocket so that its behaviour can be studied under fully confined high-pressure conditions. Consistent initial pressure conditions are maintained across tests, with the initial air pocket pressure set to atmospheric pressure.

Predicting the pressure within the air vessel and the pipe system is crucial to ensure it does not exceed the maximum allowable pressure of the system. The pressure variation pattern within the air vessel, referred to hereafter as the vessel pressure, exhibits a rapid pressure spike due to the sudden entrance of a pressure wave, followed by a gradual drop to a constant level during the expansion phase, as displayed in Figure 2(a) for selected sub-tests. The upstream and downstream pressure patterns are shown in Figure 2(b) and 2(c), respectively. Given the rapidly varying nature of transient flow, such as the water hammer effect in this study, live monitoring of a similar real system would be essential within the concept of SWS. This necessitates the ability to predict pressure ranges continuously, as opposed to relying solely on simulations.
Figure 2

Pressure variation patterns over time for selected sub-tests: (a) air vessel pressure variation; (b) upstream pressure variation; and (c) downstream pressure variation.

The training data for a single sub-test consisted of a 16,435-element pressure vector and a matrix of velocity profiles (100 profiles of 80 velocity magnitudes each). In addition to the quantitative data, a set of images of the air–water mixture within the air vessel, captured by a high-speed camera at 500 fps, has been used in this research. These images are used to train a separate deep learning model to extend the set of available quantitative data (Figure 3).
Figure 3

A sample photo of the air–water mixture inside the air vessel.


The quantitative data were divided into training and test sets for machine learning purposes, ensuring a broad selection of Reynolds numbers for a comprehensive representation. The test set constitutes a portion of the available data, with the remainder forming the training set.

Image segmentation

The deep learning approach integrates both visual and numerical data streams to create a more comprehensive understanding of air vessel dynamics. The methodology consists of two main components: an image segmentation model for processing visual data, and pressure prediction models that combine conventional sensor readings with extracted visual features. During pressure surge events, the water level in the air vessel rises rapidly, creating patterns that range from mild to highly turbulent depending on the flow velocity. To capture and analyse these complex air–water mixture patterns, a deep learning model was developed for image segmentation. The model processes high-speed camera feeds to identify and quantify different fluid phases, providing crucial input for subsequent pressure predictions. An encoder-decoder architecture was implemented to analyse each video frame and extract the proportions of different fluid phases. The model takes grayscale images as input and produces three-channel segmentation masks, as shown in Figure 4.
Figure 4

The gray-scale input (left) and three channels of the output array (right) in the image segmentation process.


The image segmentation model, known as U-Net (Ronneberger et al. 2015), consists of an encoder and a decoder. The encoder extracts features from the input image through a series of convolutional layers, each of which acts as a filter to detect different features of the image. By passing the image through multiple convolutional layers, the encoder transforms the image into lower-dimensional feature representations. The decoder then takes these encoded feature representations and restores the spatial resolution of the input through a series of upsampling and convolutional layers. Each step in the decoder uses the output of the corresponding encoder step to guide this reconstruction, which helps the decoder to produce an accurate pixel-wise mapping from the encoded features back to the input image grid. Within the current model, the encoder has three blocks. In the first block, two convolutional layers with 64 filters are applied to the input image, followed by max pooling to downsample the feature maps. The second block takes these feature maps as input, applies two convolutional layers with 128 filters, and downsamples again. The third block applies two convolutional layers with 256 filters, followed by a further downsampling step. After two more convolutional layers with 512 filters, the original image is encoded into 512 feature maps of reduced spatial size. This compressed representation captures the essential features that the decoder needs to rebuild a full-resolution output.

The decoder mirrors the encoder (Figure 5) and consists of three upsampling blocks, each followed by convolutional layers, which progressively rebuild a full-resolution output from the encoded feature maps. Skip connections concatenate the output of each encoder block with the input of the corresponding decoder block, so that the decoder can use both the encoded features and localised information from the encoder. The final model output contains three channels representing masks for the air, mixture, and water regions in the input image. By combining global features from the encoder path with localised details from the skip connections, the U-Net architecture is well suited to learning to segment images of the air vessel during a pressure surge.
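
To make the encoder bookkeeping concrete, the sketch below traces how the feature-map shape could evolve through the three encoder blocks and the 512-filter bottleneck. The filter counts come from the description above. The padding scheme, the 2 × 2 pooling factor, and the 256 × 256 input used in the example are assumptions, since the image sizes are not recoverable from this text.

```python
def trace_encoder_shapes(height, width, filters=(64, 128, 256), bottleneck=512, pool=2):
    """Return (stage, height, width, channels) tuples for the encoder path.

    Assumes 'same'-padded convolutions (spatial size unchanged) and
    pool x pool max pooling with floor division. Both are assumptions.
    """
    shapes = [("input", height, width, 1)]  # single grayscale channel
    for i, n_filters in enumerate(filters, start=1):
        # Two convolutions change the channel count but not the spatial size.
        shapes.append((f"block{i}_conv", height, width, n_filters))
        # Max pooling shrinks the spatial size and keeps the channels.
        height, width = height // pool, width // pool
        shapes.append((f"block{i}_pool", height, width, n_filters))
    shapes.append(("bottleneck", height, width, bottleneck))
    return shapes


if __name__ == "__main__":
    for stage in trace_encoder_shapes(256, 256):  # hypothetical input size
        print(stage)
```

In a mirrored decoder, each upsampling block would reverse one pooling step and concatenate the matching `block{i}_conv` output through its skip connection.
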
Figure 5

U-Net model structure for semantic segmentation of the high-speed camera feed.


Pressure prediction

Building on both conventional sensor data and the image segmentation results, three neural network models were developed for predicting the pressure within the air vessel. These models differ in their input features, which allows the impact of incorporating visual data into the prediction process to be evaluated systematically: each subsequent model builds on the previous one by adding new input features while maintaining the same underlying neural network architecture. The models are named Model #5, #6 and #7 according to their number of inputs. The models were trained using TensorFlow, a widely used deep learning framework for the Python programming language. The first model, Model #5, incorporated five input features: upstream pressure, downstream pressure, initial height of the air pocket in the vessel, average fluid velocity in the pipe before valve closure, and time (measured from the moment the main operating valve is activated). Experimental data were read and organised into Pandas data frames via Python. The test dataset is made up of interpolation and extrapolation tests. The interpolation tests are those whose lower and upper neighbouring tests are present in the training dataset, as colour coded in Figure 6. The extrapolation tests have the highest average fluid velocity and Reynolds number, as shown in Figure 6 alongside the training and validation tests.
Figure 6

Split of the tests between training, validation, and test data sets.


The features were selected on the basis of model requirements and normalised using Min-Max scaling to ensure uniformity and facilitate model convergence. Furthermore, to optimise model performance and training stability, a custom Huber loss function was used. This loss function is less sensitive to outliers in the data and seeks a balance between the mean squared error and the mean absolute error. By minimising this hybrid loss, the model was effectively trained to make accurate predictions while mitigating the influence of anomalous data points, which is common when using realistic measured flow and pressure data.
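
The two preprocessing and training ingredients named above can be sketched in a few lines of plain Python. This is an illustration only: the threshold `delta=1.0` is an assumed value, since the parameters of the authors' custom Huber loss are not given, and the function names are hypothetical.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones.

    delta is the error magnitude at which the loss switches from the
    squared-error regime to the absolute-error regime (assumed value).
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)
```

When scaling is used for prediction, the minimum and maximum would normally be taken from the training data and reused unchanged on the validation and test sets.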

Model #5 (Figure 7(a)) employed a sequential neural network architecture consisting of three dense layers, with activation functions applied in the hidden layers and in the output layer. Training was carried out with the Adam optimiser at a learning rate of 0.00005, using a batch size of 128 over 40 epochs. Early stopping based on validation loss was implemented as a stopping criterion. During training, a callback function monitored the validation loss and saved the version of the model with the minimum validation loss across the 40 epochs for use in the test and prediction stages.
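
The interplay between early stopping and keeping the best checkpoint can be illustrated without any deep learning framework. The sketch below replays a list of per-epoch validation losses and reports which epoch would be kept and where training would stop. The `patience` value and the function name are assumptions for illustration. The source does not state the patience used.

```python
def select_checkpoint(val_losses, patience=5):
    """Return (best_epoch, last_epoch_run), both 1-indexed.

    best_epoch is the epoch with the lowest validation loss seen so far.
    Training stops once `patience` epochs pass without an improvement.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New minimum: this is the model version that would be saved.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses)
```

Keeping the minimum-loss version means that later epochs, in which the validation loss rises again, do not affect the model used for testing.
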
Figure 7

Architecture of the deep learning models and data workflows in Models #5, #6 and #7.


In Model #6 (Figure 7(b)), an additional input feature was introduced, namely the air percentage in the vessel, calculated dynamically from the high-speed camera recordings using the output of the segmentation network described above. Specifically, the segmentation model outputs pixel-wise segmentation masks with three channels representing air, mixture, and water regions, which are post-processed to compute the proportional volumes of each phase. These computed proportions are normalised and used as input features alongside conventional parameters such as upstream and downstream pressures, fluid velocity, and time. This integration bridges the gap between discrete sensor readings and continuous spatial visual data, enabling the model to leverage both numerical measurements and visual information for pressure prediction. The model retained the architecture of Model #5 but incorporated this additional processed feature.
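
One plausible way to turn a three-channel mask into phase proportions is to assign each pixel to its highest-scoring channel and count pixels per phase. The sketch below does this with nested lists. The argmax rule, the channel order, and the use of pixel fractions as a stand-in for volume fractions are assumptions, as the post-processing is not specified in detail.

```python
PHASES = ("air", "mixture", "water")  # assumed channel order


def phase_fractions(mask):
    """Compute the fraction of pixels assigned to each phase.

    mask: rows of pixels, each pixel a (air, mixture, water) score triple.
    Each pixel is assigned to the channel with the highest score.
    """
    counts = [0, 0, 0]
    for row in mask:
        for pixel in row:
            label = max(range(3), key=lambda k: pixel[k])
            counts[label] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("mask is empty")
    return {name: count / total for name, count in zip(PHASES, counts)}


if __name__ == "__main__":
    tiny_mask = [
        [(0.9, 0.05, 0.05), (0.8, 0.1, 0.1)],
        [(0.1, 0.7, 0.2), (0.0, 0.1, 0.9)],
    ]
    print(phase_fractions(tiny_mask))
```

The `"air"` entry of the returned dictionary would correspond to the extra input of Model #6.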

Model #7 (Figure 7(c)) extended Model #6 by including one further input feature, namely the water percentage in the vessel, also calculated and normalised from the segmentation of the camera recordings. All other details of the model are the same as in Model #5. The trained models were evaluated on separate test datasets, including interpolation and extrapolation subsets, to assess their generalisation ability. Evaluation metrics such as loss and R-squared were calculated to quantify model performance. The predictions were compared with actual pressure values to visualise model accuracy.
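
For reference, the R-squared metric mentioned above can be computed from measured and predicted values as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard definition with a hypothetical function name.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination between measured and predicted values."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant targets")
    return 1.0 - ss_res / ss_tot
```

A value of 1 indicates perfect agreement, while a value of 0 means the model does no better than always predicting the mean of the measurements.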

The analysis follows a systematic evaluation of both the image segmentation and pressure prediction components of the deep learning framework. This two-stage approach first validates the accuracy of visual feature extraction before assessing how these features contribute to pressure predictions, which allows for verification of each component’s contribution to the overall methodology.

Image-based model

To generate training data for the image segmentation model, 27 photos capturing pressure surge states across multiple tests were carefully selected. The regions corresponding to air, mixture, and water in each photo were then manually annotated. To expand this limited dataset, the images were augmented through left-right mirroring and translation, yielding 25 augmented variants of each photo. In total, this augmentation process produced a training set of 675 labelled image pairs. Using this enriched dataset, the network was trained, and the training loss and accuracy over time were plotted, as shown in Figure 8(a).
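
The augmentation step (27 photos × 25 variants = 675 pairs) can be sketched as below. The key point is that the image and its annotated mask must receive the same transform. The exact set of shifts used to reach 25 variants per photo is not given, so the horizontal-only shifts, the fill values, and the function names here are assumptions.

```python
def mirror(image):
    """Flip an image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def translate(image, dx, fill=0):
    """Shift columns by dx pixels (positive = right), padding with fill."""
    width = len(image[0])
    shifted = []
    for row in image:
        if dx >= 0:
            new_row = [fill] * min(dx, width) + list(row[: max(width - dx, 0)])
        else:
            new_row = list(row[-dx:]) + [fill] * min(-dx, width)
        shifted.append(new_row)
    return shifted


def augment(image, mask, shifts, image_fill=0, mask_fill=0):
    """Return (image, mask) pairs for every mirror/shift combination.

    The same flip and the same shift are applied to the image and to its
    mask so that the labels stay aligned with the pixels.
    """
    pairs = []
    for flip in (False, True):
        img = mirror(image) if flip else image
        msk = mirror(mask) if flip else mask
        for dx in shifts:
            pairs.append((translate(img, dx, image_fill), translate(msk, dx, mask_fill)))
    return pairs
```

Because the variants are derived from the same 27 photos, they add geometric variety but no new flow conditions.
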
Figure 8

Training and prediction of the image segmentation model: (a) training loss and accuracy of the model; (b) variations of the phase fraction vs height of the vessel; and (c) segmentation masks and the input image.


The training curves show that as the epochs increase, the accuracy of the model steadily improves while the loss decreases. This behaviour may indicate overfitting, but given the difficulty of manual segmentation it was not feasible to create additional validation samples to check this directly. Early stopping at epoch 20 was therefore used to limit overfitting. Visual inspection of the model output revealed that further training brings negligible improvement. Rapid convergence within around 20 epochs points to insufficient data, as the limited 675-image dataset cannot support extensive training. For more robust predictions, an expansion of the dataset is necessary.

With the training completed, the model was applied to new images. Using the predicted masks, the proportional volumes of air, mixture, and water were calculated at varying water level heights and morphologies. These fractions were stored in an array of shape (738, 3, 1,500), where the first dimension represents the vertical position along the height of the vessel, the second dimension represents the categories air, mixture, and water, and the third dimension represents the frames recorded for each experiment. The fraction of each phase was then visualised against the height of the vessel, as shown in Figure 8(b).
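
An array with this layout could be assembled as sketched below, using nested lists indexed as `[row][phase][frame]`. The sketch assumes that each frame has already been reduced to integer labels per pixel (0 = air, 1 = mixture, 2 = water) and that the fraction at a given height is the share of pixels in that image row. Both points are assumptions, and the function name is hypothetical.

```python
def fraction_profiles(label_frames, n_phases=3):
    """Build out[row][phase][frame] from a sequence of label maps.

    label_frames: list of frames; each frame is a list of rows and each
    row a list of integer labels (0 = air, 1 = mixture, 2 = water).
    """
    n_frames = len(label_frames)
    n_rows = len(label_frames[0])
    out = [[[0.0] * n_frames for _ in range(n_phases)] for _ in range(n_rows)]
    for f, frame in enumerate(label_frames):
        for r, row in enumerate(frame):
            for phase in range(n_phases):
                # Share of this row's pixels that belong to the phase.
                out[r][phase][f] = row.count(phase) / len(row)
    return out
```

With 738 image rows and 1,500 frames, the result would have the (738, 3, 1,500) layout described above.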

Pressure prediction results

The performance of the pressure prediction models was evaluated during training by checking the validation loss. As shown in Figure 9(a)–9(c), overfitting was observed in all studied models, so only the version of each model with the lowest validation loss was used for prediction. In addition, scatter plots of predicted vs. measured pressure values provided insight into the performance of the models. According to Figure 9(d)–9(f), Model #6 had the highest determination coefficient (R²), while Model #5 had the lowest accuracy when tested on a combined dataset of extrapolation and interpolation sub-tests. To examine the capability of the models, Huber losses are presented in Table 2 for different groups of test data and models. Theoretically, the interpolation loss should be lower than the extrapolation loss, considering the fundamental limitation of neural network models in generalisation. This was observed in Models #5 and #7, while the reverse was seen for Model #6. This reversal could be due to the specific data splitting, so its repeatability should be tested with a different set of training and testing data in future studies. A few of the sub-tests were selected for comparison among the three models, namely sub-tests 9, 49 and 77. Sub-test 9 was an interpolation example, and sub-tests 49 and 77 were extrapolation examples, which were expected to show greater discrepancy.
Table 2

Huber losses and determination coefficients of different models on the test dataset

| Metrics/Models | Model #5 | Model #6 | Model #7 |
|---|---|---|---|
| Overall test loss | 4.04E-04 | 2.70E-04 | 3.40E-04 |
| Interpolation loss | 3.74E-04 | 3.44E-04 | 2.78E-04 |
| Extrapolation loss | 4.23E-04 | 2.25E-04 | 3.77E-04 |
| Determination coefficient | 0.981 | 0.987 | 0.984 |

Figure 9

Training and validation Huber loss for Models #5, #6 and #7 (a to c), scatter plots of predicted vs. measured vessel pressures (d to f), and sorted feature importance for each model (g to i).

Due to the reflection of the pressure waves in the air vessel, a slight oscillation of pressure was recorded in the measurements of the majority of the sub-tests. An interesting observation was that, in certain cases where visual input was also available, such as those shown in Figure 10(b), 10(c), 10(e), and 10(f), this behaviour appears to have been captured, possibly owing to the image-based features. Another observation was that in some predictions, such as in sub-test 9 (Figure 10(b) and 10(c)), there was a slight, unrealistic increase in the pressure, which could be resolved by constraining the model to produce a decreasing curve. However, this would limit the flexibility of the model for other possible experimental conditions. As illustrated in Figure 10(j), in such cases the water level after closing the valve was almost at the top of the vessel, which made it difficult to estimate the amount of trapped air, even manually.
Figure 10

Comparison of the pressure changes over time measured in sub-tests 9, 49 and 77 with those predicted using Models #5, #6 and #7.

As can be inferred from Figure 10(b), Model #6 is more capable of predicting pressure in the air vessel, especially in the extrapolation sub-tests. However, for sub-test 77, a disturbance in the predicted pressure is observed approximately 1 second after closing the valve. This deviation is postulated to originate from the image-based air percentage feature, which becomes disturbed by the highly turbulent and transient behaviour of the fluid in the vessel, thereby introducing inaccuracy into the phase segmentation (Figure 10(k)).

Feature importance and justification

The importance of the features in each of the models studied was also examined. A common technique for determining feature importance in neural network analyses combines the learnt weights connecting the inputs to the first hidden layer with the inherent variability of the features in the training data (Olden et al. 2004). The rationale is that features with stronger connections (larger weights) and higher variability across samples are more relevant for model predictions. In this approach, the feature importance score is defined as the absolute weight magnitude multiplied by the standard deviation of each input feature across the training examples. The scores are then averaged across the hidden units, and the features are ranked by these importance values. These scores are visualised in Figure 9(g)–9(i) for each of the trained models. It can be seen that the initial height of the air pocket is the most significant feature in predicting the final pressure of the vessel in all three models. This corroborates the theoretical understanding of adiabatic compression of gases. The adiabatic compression equation is given by Besharat & Ramos (2015):
$$P_1 V_1^{\gamma} = P_2 V_2^{\gamma} \qquad (1)$$
where $P$ and $V$ are pressure and volume, respectively, subscripts 1 and 2 denote the initial and final states, and $\gamma$ is the adiabatic index or ratio of specific heats.
In terms of a power-law formulation, it can be expressed as:
$$P_2 = P_1 \left(\frac{V_1}{V_2}\right)^{\gamma} \qquad (2)$$
In this case, since the cross-sectional area of the air vessel is constant, it can be written in terms of the height of the air inside the vessel:
$$P_2 = P_1 \left(\frac{h_1}{h_2}\right)^{\gamma} \qquad (3)$$
This equation allows for calculation of the final pressure $P_2$ based on the initial pressure $P_1$, the initial height $h_1$, the final height $h_2$, and the adiabatic index $\gamma$ of the gas. As the size of the air vessel remains constant, the initial height of the air pocket is proportional to the $h_1/h_2$ ratio, directly affecting the final pressure $P_2$. As shown in Figure 9(g)–9(i), the air percentage receives a higher score than the water percentage as an image-based feature. This can be explained by the disturbances in the predictions caused by high turbulence and bubbles created in the water, which introduced some uncertainty into the predicted water percentage.
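The two quantitative ideas in this subsection, the weight-times-variability importance score and the adiabatic height relation, can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function names, the use of the population standard deviation, the default adiabatic index of 1.4 (dry air), and all numbers in the usage example are assumptions made here for illustration.

```python
from statistics import pstdev


def feature_importance(first_layer_weights, training_inputs):
    """Score feature i as the mean over hidden units of |w_ij| * std(x_i).

    first_layer_weights[i][j]: weight from input feature i to hidden unit j.
    training_inputs[n][i]: value of feature i in training sample n.
    Returns the scores and the feature indices sorted from most to least important.
    """
    n_features = len(first_layer_weights)
    scores = []
    for i in range(n_features):
        # Variability of feature i across the training samples
        # (population standard deviation is an assumption of this sketch).
        spread = pstdev([row[i] for row in training_inputs])
        per_unit = [abs(w) * spread for w in first_layer_weights[i]]
        scores.append(sum(per_unit) / len(per_unit))
    ranking = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return scores, ranking


def adiabatic_final_pressure(p_initial, h_initial, h_final, gamma=1.4):
    """Final absolute pressure of an air column compressed adiabatically
    from h_initial to h_final in a vessel of constant cross-section."""
    if h_initial <= 0 or h_final <= 0:
        raise ValueError("air pocket heights must be positive")
    return p_initial * (h_initial / h_final) ** gamma


if __name__ == "__main__":
    # Hypothetical 3-feature, 2-hidden-unit network and 3 training samples.
    weights = [[0.8, -0.6], [0.1, 0.2], [-0.3, 0.4]]
    samples = [[0.10, 1.0, 5.0], [0.30, 1.2, 5.5], [0.50, 0.8, 4.5]]
    scores, ranking = feature_importance(weights, samples)
    print("scores:", scores, "ranking:", ranking)

    # Same 0.1 m compression applied to two hypothetical initial air heights:
    # the smaller initial pocket ends at the higher pressure.
    for h1 in (0.4, 0.2):
        print(h1, adiabatic_final_pressure(101.325, h1, h1 - 0.1))
```

The second loop illustrates why the initial air pocket height is expected to dominate: for the same reduction in air height, a smaller initial pocket gives a larger height ratio and therefore a higher final pressure.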

Comparing the results with existing studies in the field of transient flow analysis, several key advancements emerge. While traditional numerical models achieve satisfactory predictions of pressure variations in air vessels, they typically require significant computational resources and processing time, particularly when modelling complex phenomena such as entrapped air. Model #6, achieving an R-squared value of 0.987 and successfully handling both interpolation and extrapolation cases, demonstrates acceptable accuracy with substantially reduced computational overhead. The integration of image-based features with conventional sensor data represents a novel contribution to the field, as previous studies have typically relied solely on numerical measurements from pressure transducers and flow meters for transient analysis. This visual-numerical hybrid approach enables direct quantification of air–water interactions during pressure surges, providing insights that are typically approximated or simplified in conventional numerical models. Furthermore, the model's ability to extrapolate beyond trained flow rates (as evidenced by the extrapolation loss of 2.25E04 in Model #6) addresses a common limitation in hydraulic modelling, where performance often depends heavily on calibration within specific operating ranges. The combination of reasonable computational cost and robust prediction accuracy, even under varying flow conditions, makes this approach particularly suitable for real-time monitoring and control applications in water distribution systems. However, the availability of visual data in real field applications remains a limitation, which could be addressed via modern high-resolution sensors.
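For readers who wish to reproduce the two metrics quoted above, the Huber loss and the coefficient of determination can be computed as in the following minimal sketch. The threshold `delta = 1.0` and the sample values are assumptions made here for illustration; the study's actual Huber threshold and data scaling are not stated in this section.

```python
def huber_loss(measured, predicted, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear beyond delta."""
    total = 0.0
    for y, y_hat in zip(measured, predicted):
        err = abs(y - y_hat)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical normalised vessel pressures, for illustration only.
    measured = [0.20, 0.45, 0.70, 0.90]
    predicted = [0.22, 0.43, 0.71, 0.88]
    print("Huber loss:", huber_loss(measured, predicted))
    print("R-squared:", r_squared(measured, predicted))
```

Computing both metrics separately on the interpolation and extrapolation sub-tests gives the kind of breakdown reported for Models #5 to #7.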

Further applications in visualisation and digital twin

A valuable application of predictive analysis lies in its ability to visualise complex phenomena within pipeline systems, providing insights that extend beyond traditional methods. This visualisation capability plays a crucial role in various stages including design, education and operational management. By leveraging predictive analysis models, which are trained on historical data from real systems, engineers and operators can simulate and visualise the outcomes of adjusting flow parameters, thus enhancing decision-making processes. Moreover, these visualisation tools can be integrated into digital twin frameworks, where real-time data feeds into the predictive models to simulate the current state of the system and forecast future conditions.

In particular, visualisation through predictive analysis in the current study enables observation of the dynamic behaviour of air–water mixtures within an air vessel under different operational scenarios. This capability is essential for understanding the intricate interactions between pressure surges, air pockets, and water flow under transient conditions. As an exploratory extension of this concept, a simple interface supported by the machine learning models developed in this study is demonstrated in Figure 11. This interface, which was created using the MATLAB App Designer toolbox, enables visualisation of the air–water mixture within the air vessel for various flow velocities and air pocket heights. By inputting different velocities and air pocket heights, users can immediately see the resulting changes in the air–water mixture and navigate through the mixture visualised within the air vessel. This tool highlights machine learning applications in training operators and practitioners and in informing operational decision-making.
Figure 11

The designed visualisation interface.

It is important to note that the segmented images presented are predictions from the encoder-decoder machine learning model used in this study. Currently, the visualisation tool does not use live field readings but works retrospectively. It is proposed that, in the future, machine learning models could take sensor readings as input, allowing the creation of multiple frames depicting the interior of the air vessel. Such an approach would facilitate practical visualisation of the air vessel, enhancing the understanding and control of these systems.

Limitations

While this study presents promising results in applying machine learning techniques to predict and visualise transient flow phenomena in air vessels, several limitations should be acknowledged.

One significant limitation relates to the image segmentation process. As illustrated in Figure 12, the segmentation results exhibit some inaccuracies due to the presence of support structures surrounding the air vessel, which can interfere with the model's prediction capability and introduce errors in the results. A feasible solution to this issue is to increase the amount of training data. Currently, the training dataset, comprising just over 600 images, remains relatively small. Expanding the dataset in future work is expected to improve model performance.
Figure 12

Segmentation errors due to support structures in the experimental setup.

The models’ performance under extreme conditions or rare events not represented in the training data remains uncertain. This underscores the need for continuous model refinement and validation as more diverse data becomes available.

Finally, while the integration of visual and numerical data demonstrates potential, the approach may be constrained in real-world applications where visual data from inside pipes or vessels is not readily accessible. Future research could explore reverse modelling using generative AI techniques to create visualisations based solely on sensor readings. This method would enable internal observation of opaque pipes and vessels, providing valuable insights without requiring transparent experimental setups. However, a major challenge in implementing this method would be the scarcity of training data that pairs sensor readings with corresponding visual representations. To address this, transfer learning techniques could be explored. By leveraging pretrained models from related domains or simulated data, it may be possible to adapt and fine-tune the generative models with limited real-world data.

This study successfully demonstrated the potential of machine learning, including deep learning, for predicting and visualising complex transient flow phenomena in air vessels, specifically focusing on the challenging water hammer effect. A novel hybrid approach was developed, integrating conventional sensor data with visual information extracted from high-speed camera images using a U-Net segmentation model. This allowed for a more comprehensive understanding of the air–water dynamics within the vessel. Three neural network models were trained, progressively incorporating visual features (air and water percentages) derived from the image segmentation. The results showed that incorporating visual data, particularly air percentage, improved the accuracy of pressure predictions, achieving a high R-squared value and demonstrating the model’s ability to generalise to both interpolation and extrapolation scenarios. Feature importance analysis confirmed the significant role of initial air pocket height in pressure prediction, aligning with theoretical understanding. While limitations remain, including the need for larger image datasets for segmentation and the challenge of acquiring visual data in real-world applications, this work provides a valuable foundation for future research exploring generative AI for visualisation from sensor data and the development of robust, real-time monitoring and control systems for water distribution networks.

By bridging the gap between visual observations and pressure predictions in complex hydraulic systems, the research highlights the potential of machine learning to enhance water system management. Managing pressure surges is crucial for preventing significant water loss, and air vessels are highly effective in mitigating such risks. The models studied, particularly Model #6, demonstrated the ability to interpolate and extrapolate pressure beyond the trained flow rate, achieving R-squared values up to 0.987. This high accuracy, especially in extrapolation scenarios, underscores the potential of machine learning in real-world applications where operational conditions vary.

One key aspect explored was the relationship between training epochs and model loss. While accuracy improved as the number of epochs increased, diminishing returns were observed beyond 20 epochs due to dataset limitations. This highlights the need for expanding the dataset to enhance future model training. Additionally, the study identified several practical challenges. The experimental setup, while controlled, represents a simplified version of real-world systems, and the reliance on transparent vessels for image acquisition limits immediate industrial applications. However, future advances in high-resolution acoustic sensors could provide an alternative to transparent vessels. Moreover, leveraging correlations between image-based features and air pressure presents new possibilities for inverse modelling, enabling indirect predictions of air vessel behaviour from pressure and flow data.

For practical implementation, integrating these models into existing SCADA systems could enable real-time pressure prediction and system monitoring. Water utilities could leverage such approaches for predictive maintenance and operational optimisation, particularly in systems using air vessels. The visualisation capabilities demonstrated in this study also offer potential for operator training and system diagnostics, although adaptations would be required for opaque industrial systems.

Future research should focus on expanding training datasets to encompass a broader range of operational conditions and system configurations. The development of synthetic training data could help overcome experimental limitations, while transfer learning approaches could facilitate adaptation from laboratory-scale models to full-scale industrial applications. Additionally, generative AI techniques may enable the visualisation of internal flow conditions without requiring direct optical access.

The coupling of advanced visualisation techniques with the digital twin concept represents a forward-looking approach that could transform water system management. By continuously updating digital models with live data, operators can gain real-time insights into system performance, leading to more efficient and resilient management practices. In this context, machine learning and artificial intelligence offer a means to reduce computational intensity compared to traditional numerical methods, making large-scale system analysis feasible in live or near-live modes.

This interdisciplinary approach, combining hydraulic engineering, computer science, and data science, provides a novel framework for tackling current challenges in water systems. As water utilities worldwide strive for greater sustainability and efficiency, the integration of such technologies will be crucial in addressing issues related to ageing infrastructure, water loss, and the demand for more responsive and intelligent systems. Ultimately, this research paves the way for more intelligent and adaptive smart water systems (SWS), contributing to the ongoing evolution of smart infrastructure in the face of global water challenges.

The experiments were conducted in the Hydraulic Laboratory of Instituto Superior Técnico, University of Lisbon, Portugal, with support from the Civil Engineering Research and Innovation for Sustainability (CERIS) research centre.

This research was funded by the School of Civil Engineering, University of Leeds, UK.

All relevant data are available from an online repository or repositories.

The authors declare there is no conflict.

Abd Rahman N., Muhammad N. S. & Wan Mohtar W. H. M. (2018) Evolution of research on water leakage control strategies: where are we now? Urban Water Journal, 15(8), 812–826.
Bell C. (2016) The World Bank and the International Water Association to Establish a Partnership to Reduce Water Losses. World Bank. [accessed 24 April 2024].
Besharat M. & Ramos H. M. (2015) 'Theorical and experimental analysis of pressure surge in a two-phase compressed air vessel', Proceedings of the 12th International Conference on Pressure Surges, Dublin, Ireland, 18–20.
Besharat M., Tarinejad R. & Ramos H. M. (2016) The effect of water hammer on a confined air pocket towards flow energy storage system, Journal of Water Supply: Research and Technology-AQUA, 65(2), 116–126.
Besharat M., Coronado-Hernández O. E., Fuertes-Miquel V. S., Viseu M. T. & Ramos H. M. (2018) Backflow air and pressure analysis in emptying a pipeline containing an entrapped air pocket, Urban Water Journal, 15(8), 769–779.
Boulos P. F., Karney B. W., Wood D. J. & Lingireddy S. (2005) Hydraulic transient guidelines for protecting water distribution systems, Journal-American Water Works Association, 97(5), 111–124.
Chaudhry M. H. (2014) Applied Hydraulic Transients, Vol. 415. New York: Springer.
Conejos Fuertes P., Martínez Alzamora F., Hervás Carot M. & Alonso Campos J. (2020) Building and exploiting a digital twin for the management of drinking water distribution networks, Urban Water Journal, 17(8), 704–713.
Covas D., Ramos H. & De Almeida A. B. (2005) Standing wave difference method for leak detection in pipeline systems, Journal of Hydraulic Engineering, 131(12), 1106–1116.
Donnelly J., Daneshkhah A. & Abolfathi S. (2024) Forecasting global climate drivers using gaussian processes and convolutional autoencoders, Engineering Applications of Artificial Intelligence, 128, 107536.
Eliasson J. (2015) The rising pressure of global water shortages, Nature, 517(7532), 6–6.
Feng R.-L., Zhou L., Besharat M., Xue Z., Li Y., Chen Q., Hu Y. & Lu Y. (2024) Discrete air model for large scale rapid filling process contained entrapped air, Engineering Applications of Computational Fluid Mechanics, 18(1), 2428423.
Frauendorfer R. & Liemberger R. (2010) The issues and challenges of reducing non-revenue water.
Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A. & Bengio Y. (2014) Generative adversarial nets, Advances in neural information processing systems, 27.
He C., Liu Z., Wu J., Pan X., Fang Z., Li J. & Bryan B. A. (2021) Future global urban water scarcity and potential solutions, Nature Communications, 12(1), 4667.
Howard G. (2021) The future of water and sanitation: global challenges and the need for greater ambition, AQUA – Water Infrastructure, Ecosystems and Society, 70(4), 438–448.
Ingildsen P. & Olsson G. (2016) Smart Water Utilities: Complexity Made Simple. London: IWA Publishing.
Izquierdo J., Fuertes V., Cabrera E., Iglesias P. & Garcia-Serra J. (1999) Pipeline start-up with entrapped air, Journal of Hydraulic Research, 37(5), 579–590.
Jafari-Asl J., Hashemi Monfared S. A. & Abolfathi S. (2024) Reducing water conveyance footprint through an advanced optimization framework, Water, 16(6), 874.
Khosravi K., Rezaie F., Cooper J. R., Kalantari Z., Abolfathi S. & Hatamiafkoueieh J. (2023) Soil water erosion susceptibility assessment using deep learning algorithms, Journal of Hydrology, 618, 129229.
Khosravi K., Attar N., Bateni S. M., Jun C., Kim D., Safari M. J. S., Heddam S., Farooque A. & Abolfathi S. (2024) Daily river flow simulation using ensemble disjoint aggregating m5-prime model, Heliyon, 10(20).
Kim Y., Lee S. J., Park T., Lee G., Suh J. C. & Lee J. M. (2016) Robust leak detection and its localization using interval estimation for water distribution network, Computers & Chemical Engineering, 92, 1–17.
Lombana Cordoba C., Saltiel G. & Perez Penalosa F. (2022) Utility of the future 2.0.
Morimoto M., Fukami K., Zhang K., Nair A. G. & Fukagata K. (2021) Convolutional neural networks for fluid flow analysis: toward effective metamodeling and low dimensionalization, Theoretical and Computational Fluid Dynamics, 35(5), 633–658.
Olden J. D., Joy M. K. & Death R. G. (2004) An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data, Ecological modelling, 178(3–4), 389–397.
Ramos H. M., Kuriqi A., Besharat M., Creaco E., Tasca E., Coronado-Hernández O. E., Pienika R. & Iglesias-Rey P. (2023) Smart water grids and digital twin for the management of system efficiency in water distribution networks, Water, 15(6), 1129.
Ronneberger O., Fischer P. & Brox T. (2015) 'U-Net: convolutional networks for biomedical image segmentation', Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18, 234–241. Springer.
Sanders J., Marshallsay D., Mountfort G., Fox G. & Butler M. (2022) A Leakage Routemap to 2050. Water UK.
Tomić S., Karmous-Edwards G. & Kamojjala S. (2022) Digital twins: case studies in water distribution management, Journal American Water Works Association, 114(8), 44–56.
Vicente D., Garrote L., Sánchez R. & Santillán D. (2015) Pressure management in urban water distribution systems: current status, proposals and future trends, Journal of Water Resources Planning and Management, 142(2), 04015061.
Wu Z. Y., Chew A., Meng X., Cai J., Pok J., Kalfarisi R., Lai K. C., Hew S. F. & Wong J. J. (2023) High fidelity digital twin-based anomaly detection and localization for smart water grid operation management, Sustainable Cities and Society, 91, 104446.
Xiang X., Li Q., Khan S. & Khalaf O. I. (2021) Urban water resource management for sustainable environment planning using artificial intelligence techniques, Environmental Impact Assessment Review, 86, 106515.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).