This study introduces a novel framework for the estimation of creep function coefficients in viscoelastic pipelines, employing a combination of machine learning (ML) and transient hydraulics. Specifically, the efficacy of eXtreme Gradient Boosting (XGBoost) as an advanced regression method is evaluated within this framework. A transient simulation model, utilizing the method of characteristics, is formulated to create the reservoir-pipe-valve (RPV) scenarios under diverse boundary conditions. After conducting model calibration, the model is used to generate datasets with the transient hydraulic responses at the measurement points. The fast Fourier transform (FFT) is then applied to transform the generated samples into the frequency domain. Feature selection is accomplished through principal component analysis (PCA) to identify optimal input variables for XGBoost. In estimating the creep function, six coefficients are employed, with the feature selection analysis indicating that each coefficient is associated with a specific signal. Importantly, it is demonstrated that attempting to estimate all six coefficients using a single set of signals is unfeasible. The results affirm the accuracy of the proposed ML-based framework in determining creep function coefficients.

  • Determination of creep function coefficients of viscoelastic pipes.

  • Proposing a transient-guided machine learning model.

  • Using XGBoost as a regressor for predicting the creep function in pipes.

FFT: fast Fourier transform
GA: genetic algorithm
HDPE: high-density polyethylene
ITA: inverse transient analysis
KV: Kelvin–Voigt
MAE: mean absolute error
ML: machine learning
MOC: method of characteristics
MSE: mean square error
PCA: principal component analysis
PE: polyethylene
PVC: polyvinyl chloride
RMSE: root mean square error
RPV: reservoir-pipe-valve system
RSE: relative squared error
SD: standard deviation
XGBoost: eXtreme Gradient Boosting

In recent years, the use of plastic pipes (e.g., polyvinyl chloride (PVC) and polyethylene (PE)) in water supply and sewage transfer systems has become more popular than other types of pipes, such as steel, concrete, cast iron, and asbestos. This is because plastic pipes offer numerous advantages from both technical and economic perspectives. These include high resistance to heat, chemicals, and pressure, low cost and weight, excellent durability against erosion and scouring, and fast and easy installation (Apollonio et al. 2013). All plastic pipes have a unique property that makes them change their shape and size over time when stressed. This property is called viscoelasticity, and it affects how the pressure changes in the pipes when the water hammer occurs (Apollonio et al. 2013; Keramat et al. 2013).

It should be noted that this excess stress caused by pressure changes is called creep, which disappears after the pressure becomes uniform in the pipe. The viscoelastic properties of plastic pipes affect how they react to pressure waves, which are important for their structural and hydraulic performance. In a water supply system, pressure waves are generated when valves close abruptly, or pumps start or stop suddenly, resulting in periodic pressure fluctuations. This phenomenon, commonly referred to as ‘water hammer’, can potentially damage pipes (Pezzinga 2002; Ramos et al. 2004; Tricarico et al. 2007; Apollonio et al. 2013). Therefore, simulating transient flow in water transmission systems can contribute to the safe operation of transmission pipelines by determining the minimum and maximum water hammer pressures.

Numerous experimental and numerical studies have significantly enhanced our understanding of viscoelastic pipes. In Wahba (2017) and Bertaglia et al. (2018), the method of characteristics (MOC) was introduced as a means to model the viscoelastic behavior of the pipe wall in water hammer equations. These studies incorporated Kelvin–Voigt (KV) elements to represent the impact of circumferential strains on the pipe wall. Subsequently, several researchers (e.g., Huang et al. 2017; Capponi et al. 2020; Bostan et al. 2021; Kim 2023) validated and expanded upon this mathematical model.

Mechanical models have proven effective in mathematically simulating viscoelastic behavior, with springs and dashpots representing elastic and viscous deformations, respectively. Among the various mechanical models available, the generalized KV model is widely utilized for studying the creep and relaxation characteristics of solid viscoelastic materials. However, the main limitation of this method is that the creep compliance functions must be calibrated through pipeline testing. In a study by Soares et al. (2008), the creep compliance functions were calibrated using pressure fluctuations induced by a water hammer. Additionally, other works (e.g., Keramat et al. 2010; Soares et al. 2012) have examined the impact of fluid column separation both numerically and experimentally. Soares et al. (2010) utilized a genetic algorithm (GA)-based model to calibrate various KV models using data from laboratory experiments on a hydraulic model with PVC pipes.

Furthermore, Keramat et al. (2013) investigated the numerical modeling of water hammer in viscoelastic pipes with a time-dependent Poisson ratio. Their findings indicated that the viscoelastic data obtained from mechanical tests were more accurate when considering a time-dependent Poisson ratio and unsteady friction than when using the KV model for water hammer analysis. They also reported the successful calibration of the creep curve. Gong et al. (2016) introduced a new frequency-domain technique to estimate the creep function of viscoelastic pipes using a transient flow analysis. The basis of the proposed approach is the analytical relationship between the resonant frequencies and the viscoelastic and friction-related parameters of the pipeline. The technique's relatively high accuracy for the elastic wave velocity and viscoelastic creep compliances was confirmed by numerical simulations on a high-density polyethylene (HDPE) pipeline. Ferrante & Capponi (2017) developed and compared several viscoelastic models to simulate the transients in polymer pipes on a reservoir-pipe-valve system (RPV) with two plastic pipes (i.e., high-density PE and oriented PVC). Their study compared the efficiency of three different models, namely the standard linear solid model, the Maxwell model, and the generalized Maxwell model, in simulating transient flow in pipes. The findings indicated that the generalized Maxwell model performed better than the other models investigated when simulating transient flow in PE and PVC pipes.

Bertaglia et al. (2018) studied the efficiency of three numerical models, including the MOC, semi-implicit staggered finite volume method, and explicit path conservative finite volume method, to simulate the transient flow in the viscoelastic pipes. Their study compared the experimental and analytical results obtained using three-parameter and multi-parameter linear viscoelastic models. They also investigated the viscoelastic effects of unsteady friction loss, pipe wall, cavitation, and cross-sectional changes. They found that all numerical methods agreed well with experimental results. They suggested that the MOC method could be used for simple systems. However, for complex configurations, it was important to consider the specific aspects involved in a particular case and the maximum acceptable error for the results (Bertaglia et al. 2018). Cheshme et al. (2021) conducted a study to investigate KV elements' impact on creep behavior and pressure fluctuations by calibrating creep coefficients using creep function and transient pressure. The findings indicate that the accuracy of three- and four-element models in the first method is higher than that of the second method for pipes longer than 1,000 m. However, for pipes shorter than 1,000 m, the accuracy of the three and four-element models is comparable in both methods. Regarding the two-element model, the accuracy of the first method is higher than the second method for pipes longer than 720 m, while the opposite is observed for lengths less than 720 m. Furthermore, the error analysis using both approaches highlights that the one-element KV model is largely inaccurate, leading to up to 60% errors. Consequently, it is not suitable for predicting transient pressures in viscoelastic pipes. 
To study the viscoelastic properties of plastic pipes in complex fluids under different flow conditions, such as laminar or turbulent, extensive experimentation and simulation tools are needed to deal with the nonlinear interactions of multiple variables, which is computationally expensive.

Machine learning (ML) models have become increasingly prevalent in addressing engineering challenges, as they can simulate complex phenomena and enhance decision-making across various engineering domains. Some of the recent examples of such applications are leak detection in the pipeline (Kang et al. 2018; Ayati et al. 2022), structural analysis (Seghier et al. 2021; Seghier et al. 2023), and water resources and hydraulic engineering (Granata et al. 2022; Ohadi et al. 2022; Di Nunno et al. 2023a, b; Vatani et al. 2023). For example, Ayati et al. (2022) developed and used several ML-based models to analyze transient flow in water network systems to identify the location of a leak. The results showed that the proposed ML models have high potential. In more recent work, Asghari et al. (2023) introduced an efficient ML-based framework to detect leaks in pipes using transient waves, achieving an accuracy of 97%. Although the ML models have been widely used in transient flow analysis of viscoelastic pipes, some important challenges have not been solved so far, see Tjuatja et al. (2023).

As delineated in this introduction, ML-based methods have demonstrated efficacy in simulating transient flow within pipelines, thereby addressing the shortcomings of conventional methodologies. Despite these advancements, there remains a lack of research employing ML-based techniques for the estimation of creep function coefficients in viscoelastic pipes under unsteady flow conditions. Consequently, a review of the existing literature highlights the need for a study that applies ML strategies to accurately estimate the creep function coefficients in viscoelastic pipes. The primary contribution of this study is the development of a transient-guided ML model designed to estimate the creep function coefficients of viscoelastic pipes during transient flow analysis. To facilitate this, a laboratory model underwent calibration via the inverse transient analysis (ITA) method, yielding a dataset encompassing a broad spectrum of transient scenarios at various measurement points. This dataset includes critical pipe information such as diameter (m), length (m), thickness (mm), elastic wave velocity (m/s), Brownian decay coefficient, and fluid velocity (m/s). Following this, the pressure signal data were transformed into the frequency domain, with features selected through principal component analysis (PCA). The eXtreme Gradient Boosting (XGBoost) algorithm was then applied to estimate the creep function coefficients. A review of recent studies in the field of ML shows that XGBoost offers high performance, the ability to handle missing data, regularization to avoid overfitting, support for parallel processing, flexibility in performing various tasks, and complex tree pruning (Amjad et al. 2022; Nguyen et al. 2022; Han et al. 2023a, b). These features provide acceptable and stable results in solving complex problems, making XGBoost a suitable choice for this study.

The rest of this study is structured as follows. The second section provides a detailed explanation of the mathematical formulation of the problem. Subsequently, the methodology and the proposed framework are discussed, starting from initial developments to final verification. The fourth section is dedicated to the specifications of the case study. The results of the proposed methodology applied to the case study and a corresponding discussion are then presented. The paper concludes with a summary, conclusions, and recommendations for future research.

The continuity and momentum equations are used to describe the transient flow in viscoelastic pipes and are mathematically represented by Equations (1) and (2), respectively. Because flow and head depend on each other during the transient analysis period, these equations are hyperbolic partial differential equations (Apollonio et al. 2013):
$$\frac{\partial V}{\partial x} + \frac{\rho g}{K}\frac{\partial H}{\partial t} + 2\frac{\partial \varepsilon_\varphi}{\partial t} = 0 \tag{1}$$
$$\frac{\partial H}{\partial x} + \frac{1}{g}\frac{\partial V}{\partial t} + h_f = 0 \tag{2}$$
where V is the velocity in the pipe, H and $\varepsilon_\varphi$ are the piezometric head and the circumferential (hoop) strain, respectively. $h_f$ is the steady head loss per unit length, K is the bulk modulus of the fluid, g is the gravitational acceleration, t is the time, $\rho$ is the density of the fluid, and x is the coordinate along the pipeline axis. The head loss $h_f$ is calculated as follows:
$$h_f = \frac{f\,Q\,|Q|}{2 g D A^2} \tag{3}$$
where D and A are the inner diameter and the cross-sectional area of the pipe, respectively, f is the Darcy–Weisbach friction factor, and Q is the flow in the pipe. The model has a delayed behavior because of the hoop strain-rate term in the continuity equation for a viscoelastic pipe. The hoop strain depends on the three-dimensional stress–strain relations in a viscoelastic medium, which are expressed as follows in a cylindrical coordinate system with r, $\varphi$, and z axes:
$$\varepsilon_\varphi = J * \mathrm{d}\sigma_\varphi - \nu\, J * \mathrm{d}(\sigma_r + \sigma_z) \tag{4}$$
$$\varepsilon_z = J * \mathrm{d}\sigma_z - \nu\, J * \mathrm{d}(\sigma_r + \sigma_\varphi) \tag{5}$$
where $*$ represents the Stieltjes convolution operator, $\nu$ is the Poisson ratio, $\sigma_r$ is the radial stress, $\sigma_\varphi$ is the hoop stress, $\sigma_z$ is the axial stress, and J is the creep function based on the generalized KV model with N elements, which is calculated by Equation (6) as follows:
$$J(t) = J_0 + \sum_{k=1}^{N} J_k\left(1 - e^{-t/\tau_k}\right) \tag{6}$$
in which $J_0$ indicates the immediate response of the material, and $J_k$ and $\tau_k$ are the creep compliance of the spring and the retardation time of the k-th KV element, respectively. The pipeline is fully fixed against axial movements in the current research, so the fluid–structure interaction effects are insignificant. This means that $\varepsilon_z = 0$ in Equation (5), which simplifies Equation (4) to:
$$\varepsilon_\varphi = (1-\nu^2)\, J * \mathrm{d}\sigma_\varphi - \nu(1+\nu)\, J * \mathrm{d}\sigma_r \tag{7}$$
Also, the radial and hoop stress in a thin-walled pipe are estimated by Equations (8) and (9):
(8)
(9)
where is dynamic pressure head and is an averaging factor. The following steps are used to calculate the time derivative of the function that is shown by the Stieltjes convolution in Equation (7) (Apollonio et al. 2013):
(10)
According to the Leibniz rule, the integral term of Equation (10) can be estimated as:
(11)
As a result, Equation (1) can be written as the following:
(12)
where , L is the length of the pipe, and c is the wave speed. By using a linear combination of Equations (2) and (12) with the MOC, Equation (13) will appear at the valve location in the first half period of the transient period for the positive wave:
(13)

In the above equation, the length between the valve and the front of the positive pressure wave is , and is the velocity. The fluid velocity at the valve is zero, so where is the weight factor. This has a second-order accuracy. More details about the mathematical modeling of transient flow analysis in viscoelastic pipes can be found in Keramat & Haghighi (2014).
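As a rough illustration of the generalized KV creep function of Equation (6), taken here in its standard form J(t) = J0 + Σ Jk (1 − exp(−t/τk)), the sketch below evaluates J(t) in Python. The compliances J1 to J6 are the calibrated values reported in Table 1 (assuming they denote the KV element compliances); the retardation times and the instantaneous compliance J0 are not reported in the text, so the values used for them are purely hypothetical placeholders.

```python
import math

# KV compliances J1..J6 from Table 1, converted to Pa^-1.
J_K = [0.98e-10, 0.4e-10, 0.32e-10, 1.09e-10, 0.03e-10, 0.24e-10]

# ASSUMPTION: the text does not report retardation times or J0.
# These placeholders only make the example runnable.
TAU_K = [0.05, 0.5, 1.5, 5.0, 10.0, 50.0]  # s (hypothetical)
J_0 = 7.0e-10                               # Pa^-1 (hypothetical)


def creep_compliance(t, j0=J_0, jk=J_K, tau=TAU_K):
    """Generalized Kelvin-Voigt creep function J(t) for t >= 0."""
    return j0 + sum(j * (1.0 - math.exp(-t / tk)) for j, tk in zip(jk, tau))


for t in (0.0, 0.1, 1.0, 10.0):
    print(f"t = {t:5.1f} s   J = {creep_compliance(t):.3e} Pa^-1")
```

At t = 0 only the instantaneous term contributes, and for large t the function approaches J0 plus the sum of all element compliances, which is the behavior the six coefficients are meant to capture.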

Figure 1 illustrates the flowchart of the proposed ML-based model for prediction of creep function coefficients. The following steps describe the proposed model in detail.
  • I. A mathematical model (numerical solution) of hydraulic analysis of transient flow in pipelines is developed. This model simulates the RPV problem under different boundary conditions.

  • II. The developed mathematical model is calibrated and validated with the results of a well-known laboratory study.

  • III. The model, which has been developed and calibrated, receives various types of pipes with different characteristics (such as diameter, length, thickness, elastic wave velocity, Brownian decay coefficient, and fluid velocity). The model then produces the hydraulic responses (pressure signals) for each pipe at the measurement sites.

  • IV. Before further processing, the data generated in the time domain are converted to the frequency domain using the fast Fourier transform (FFT) method.

  • V. The input matrix dimension of ML is reduced by applying the PCA for the feature selection. The creep function coefficients are predicted by using the XGBoost method.

Figure 1

Flowchart of the proposed framework.


As depicted in the flowchart (Figure 1), the proposed framework comprises three distinct stages. Initially, a numerical model is developed utilizing the MOC to simulate the RPV system, with measurements derived via ITA-based GA. Subsequently, this model generates data pertinent to the creep coefficients of viscoelastic pipes, transforms pressure data from the time domain to the frequency domain, and selects relevant features. The final stage involves employing an XGBoost model to estimate the creep function coefficients.

Calibration and case study

Before the dataset of variables for calculating the creep functions was generated, the numerical hydraulic model was calibrated on a laboratory case study using an optimization–simulation ITA-based model. The calibration process is used to find the best values of the unknown parameters (i.e., the creep function coefficients and the unsteady friction loss coefficient). The calibration problem is defined as a single-objective optimization problem that minimizes the error between the pressure estimated by the numerical model and the pressure observed in the laboratory. The objective function of this problem is mathematically represented by:
(14)
where the calculated head is obtained from the simulation model, H is the observed piezometric head, and M is the number of pressure observation locations. In the calibration process, an optimizer (e.g., a gradient-based or metaheuristic method) is used to find the values of the unknown parameters. A GA (Holland 1992) was used in this study, following the standard steps of creating the initial population, computing the fitness (objective function value) of each chromosome, selecting parents with the tournament method, and generating offspring by uniform crossover.
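The GA steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the objective is assumed to be a sum of squared head residuals (consistent with the m² units in Table 1), the `toy_head` simulator is a hypothetical stand-in for the MOC model, and the elitism and random-reset mutation are added assumptions that the text does not describe.

```python
import math
import random


def sse(h_sim, h_obs):
    """Sum of squared differences between simulated and observed heads."""
    return sum((s - o) ** 2 for s, o in zip(h_sim, h_obs))


def tournament(pop, fit, rng, k=2):
    """Return the fittest of k randomly drawn individuals."""
    best = min(rng.sample(range(len(pop)), k), key=lambda i: fit[i])
    return pop[best]


def uniform_crossover(p1, p2, rng):
    """Each gene is copied from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]


def calibrate(objective, bounds, pop_size=30, generations=60,
              mut_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fit = [objective(ind) for ind in pop]
        elite = pop[min(range(pop_size), key=lambda i: fit[i])]
        history.append(min(fit))
        children = [elite]  # elitism (assumption): keep the best unchanged
        while len(children) < pop_size:
            child = uniform_crossover(tournament(pop, fit, rng),
                                      tournament(pop, fit, rng), rng)
            # random-reset mutation within bounds (assumption)
            child = [rng.uniform(lo, hi) if rng.random() < mut_rate else g
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    fit = [objective(ind) for ind in pop]
    best = min(range(pop_size), key=lambda i: fit[i])
    history.append(fit[best])
    return pop[best], fit[best], history


# Hypothetical stand-in for the transient simulator: a decaying head signal.
TIMES = [0.0, 0.5, 1.0, 1.5, 2.0]


def toy_head(params):
    amp, decay = params
    return [amp * math.exp(-decay * t) for t in TIMES]


OBSERVED = toy_head([2.0, 0.5])
BOUNDS = [(0.0, 5.0), (0.0, 2.0)]
best, best_f, history = calibrate(lambda p: sse(toy_head(p), OBSERVED), BOUNDS)
print(best, best_f)
```

In the actual calibration the decision variables would be the creep coefficients and the wave speed, and each objective evaluation would require one run of the transient model.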
Covas et al. (2004) introduced a water supply system that is used in this study to calibrate the numerical model for data generation. The system is a simple RPV system with an HDPE pipe. The pipeline has a length of 271.8 m, a wall thickness of 6.25 mm, and an internal diameter of 50.6 mm. The numerical transient flow simulator was calibrated against an experiment on this system with a flow rate of 1.01 L/s and a valve closing time of 0.04 s. Figures 2(a) and 2(b) show the schematic of the experimental model and the pressure signals at the upstream boundary (reservoir) and behind the downstream valve.
Figure 2

Schematic and measured pressure signal of the case study.

A GA is employed for automatic calibration of the simulation model, with the creep function coefficients (J1, J2, J3, J4, J5, and J6) and the elastic pressure wave velocity (a) as decision variables. Table 1 shows the optimal values of the decision variables and the objective value (Equation (14)). Figure 3(a)–3(c) displays the calibration results of the numerical hydraulic model, illustrating the model's accuracy in replicating the observed data.
Table 1

Results of calibration

| Parameter | J1 | J2 | J3 | J4 | J5 | J6 | Pressure wave speed, a (m/s) | Objective function (m²) |
|---|---|---|---|---|---|---|---|---|
| Value | 0.98 | 0.4 | 0.32 | 1.09 | 0.03 | 0.24 | 392 | 0.24 |

J1 to J6 are the coefficients of the creep function (×10⁻¹⁰ Pa⁻¹).
Figure 3

Results of calibration: (a) convergence curve of GA; (b) experimental and numerical signals; and (c) creep function.

Figure 3(b) shows the consistency between the numerical and experimental signals indicating the accuracy of the automatic calibration model in finding optimal values for coefficients of the simulation model. This calibrated model is then used to generate 5,000 data series for each parameter. These data series were used as input parameters of the proposed framework. Table 2 presents the statistical values (i.e., average, standard deviation (SD), minimum, and maximum) of the input parameters of the numerical model to extract the system response. The numbers related to the diameter of the pipes were generated using the values of standard diameters. The following equation shows the uniform distribution function that places the generated numbers in the specified range:
$$X = a + (b - a) \times \mathrm{rand} \tag{15}$$
where X represents the generated number, rand is a random number between 0 and 1, and b and a are the variables' upper bound and lower bound, respectively.
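Equation (15) maps a random number in [0, 1) onto the range of each input variable. The sketch below shows this sampling in Python, using the lower and upper bounds listed in Table 2 for the wave speed, pipe length, and creep coefficients; the dictionary keys, the function names, and the fixed seed are illustrative choices, not part of the original study.

```python
import random

# Lower and upper bounds taken from Table 2 (Min and Max rows).
BOUNDS = {
    "wave_speed_m_s": (200.0, 600.0),
    "length_m": (50.0, 1000.0),
    "creep_coeff_1e-10_per_Pa": (0.0, 5.0),
}


def sample_uniform(lower, upper, rng):
    """X = a + (b - a) * rand, with rand drawn from [0, 1)."""
    return lower + (upper - lower) * rng.random()


def generate_dataset(n_samples, bounds, seed=0):
    """Draw n_samples independent records, one value per variable."""
    rng = random.Random(seed)
    return [
        {name: sample_uniform(lo, hi, rng) for name, (lo, hi) in bounds.items()}
        for _ in range(n_samples)
    ]


dataset = generate_dataset(5000, BOUNDS)
print(dataset[0])
```

Pipe diameters would be handled differently, since the text states that they were drawn from standard diameter values rather than from a continuous range.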
Table 2

Statistical values of the numerical model's input parameters

| Input variables | a (m/s) | L (m) | D (cm) | V (m/s) | f | kp | J1–J6 (×10⁻¹⁰ Pa⁻¹) |
|---|---|---|---|---|---|---|---|
| Average | | – | – | 0.26 | 0.03 | 0.03 | 2.50 |
| SD | 116.70 | – | – | 0.27 | 0.03 | 0.00 | |
| Min | 200.00 | 50.00 | | 0.00 | 0.02 | – | 0.00 |
| Max | 600.00 | 1,000.00 | 100 | 1.18 | 0.05 | – | 5.00 |
The model received the generated dataset and extracted a pressure signal in the time domain for each input data series. The extracted pressure signals were converted to the frequency domain using FFT. Figure 4 compares a signal in the time domain and its equivalent signal in the frequency domain.
Figure 4

Response of system: (a) time domain and (b) frequency domain.
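The time-to-frequency conversion can be sketched as below. This is an illustrative example only: it uses a small pure-Python radix-2 FFT so that it runs without third-party packages (in practice a library routine such as `numpy.fft.rfft` would normally be used), the synthetic signal is hypothetical, and keeping the first 20 frequency bins including the mean component is an assumption about how the "20 primary frequencies" were taken.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def leading_magnitudes(signal, n_keep=20):
    """Normalized magnitudes of the first n_keep frequency bins."""
    spectrum = fft(signal)
    return [abs(c) / len(signal) for c in spectrum[:n_keep]]


# Hypothetical pressure-head signal: 40 m mean plus a 10 m oscillation
# that completes 3 cycles over the 64-sample record.
N = 64
signal = [40.0 + 10.0 * math.sin(2.0 * math.pi * 3 * n / N) for n in range(N)]
features = leading_magnitudes(signal)
print([round(v, 3) for v in features[:5]])
```

Each simulated pressure trace would be reduced in this way to a short vector of spectral magnitudes, which then serves as the candidate input set for feature selection.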


eXtreme Gradient Boosting

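In this study, XGBoost is employed to predict the coefficients of the creep function. XGBoost is a supervised, tree-based gradient boosting method used for regression and classification (Han et al. 2023a). It has a low computational cost compared with other methods when dealing with high-dimensional problems (Velthoen et al. 2023). In XGBoost, given a dataset $DS = \{(x_i, y_i)\}$ with m features and n examples ($|DS| = n$, $x_i \in \mathbb{R}^m$, $y_i \in \mathbb{R}$), the predicted output of an ensemble tree model, $\hat{y}_i$, is generated from the following equation:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i) \tag{16}$$
in which K is the number of trees, and $f_k$ is the k-th tree. To find the best set of functions, the following regularized objective, consisting of a loss term and a regularization term, is minimized:
$$\mathcal{L} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{17}$$
The loss function l measures the difference between the actual output $y_i$ and the predicted output $\hat{y}_i$. $\Omega$ is a complexity indicator of the model, which helps prevent overfitting and is calculated by:
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2 \tag{18}$$

In this equation, T is the number of leaves of the tree, w is the vector of leaf weights, and $\gamma$ and $\lambda$ are regularization coefficients.

The model uses boosting to train the decision trees and minimize the objective function. This means that the model adds a new function f at each training step. Therefore, a new function (tree) $f_t$ is added in the t-th iteration as follows:
$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{19}$$
$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \tag{20}$$
where:
$$g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}} \tag{21}$$
$$h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2} \tag{22}$$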
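To make the role of the first- and second-order terms in a boosting step concrete, the sketch below computes them for a squared-error loss and derives the resulting optimal weight of a single leaf. The choice of squared-error loss is an assumption (the text does not state which loss was used), and the closed-form leaf weight w* = −G/(H + λ) is the standard result from the XGBoost literature rather than something given in this section; the function names are illustrative.

```python
def grad_hess_squared_error(y_true, y_pred):
    """First and second derivatives of l = 0.5 * (y - y_hat)^2 w.r.t. y_hat."""
    g = [p - y for y, p in zip(y_true, y_pred)]
    h = [1.0] * len(y_true)
    return g, h


def optimal_leaf_weight(g, h, lam):
    """Weight minimizing the second-order objective for one leaf."""
    return -sum(g) / (sum(h) + lam)


def leaf_objective(g, h, lam):
    """Objective contribution of one leaf at its optimal weight."""
    return -0.5 * sum(g) ** 2 / (sum(h) + lam)


# Three samples falling in the same leaf, previous prediction 0 for all.
y_true = [1.0, 2.0, 3.0]
y_prev = [0.0, 0.0, 0.0]
g, h = grad_hess_squared_error(y_true, y_prev)
w_star = optimal_leaf_weight(g, h, lam=1.0)
print(w_star, leaf_objective(g, h, lam=1.0))
```

With λ = 0 the leaf weight would equal the mean residual (2.0 here); the regularization term shrinks it toward zero, which is how the complexity penalty limits overfitting.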

Evaluation metrics

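Several known metrics were used to evaluate the performance of the proposed transient-guided ML model for predicting the creep function coefficients, including the mean absolute error (MAE), relative squared error (RSE), root mean square error (RMSE), and coefficient of correlation (R). Equations (23)–(26) represent the formulas of these metrics (Rahmanshahi et al. 2023, 2024):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| P_i - O_i \right| \tag{23}$$
$$\mathrm{RSE} = \frac{\sum_{i=1}^{n} (P_i - O_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \tag{24}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (P_i - O_i)^2} \tag{25}$$
$$R = \frac{\sum_{i=1}^{n} (O_i - \bar{O})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2 \sum_{i=1}^{n} (P_i - \bar{P})^2}} \tag{26}$$
where $P_i$ and $O_i$ are the predicted and the observed values, respectively, and n is the number of observations. $\bar{O}$ is the average value of the observed parameters and $\bar{P}$ is the average value of the predicted parameters.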
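The four metrics can be computed with a few lines of Python, as in the sketch below. The formulas follow the usual definitions of MAE, RSE, RMSE, and the Pearson correlation coefficient; the function names and the sample values are illustrative only.

```python
import math


def mae(obs, pred):
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def rse(obs, pred):
    """Squared error relative to the spread of the observations."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return num / den


def correlation(obs, pred):
    """Pearson correlation coefficient R."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mae(observed, predicted), rmse(observed, predicted),
      rse(observed, predicted), correlation(observed, predicted))
```

MAE and RMSE carry the units of the target (here ×10⁻¹⁰ Pa⁻¹), whereas RSE and R are dimensionless, which is why only the first two are reported with units in the results.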

Given that the viscoelastic coefficients shape the overall structure of the signal, it can be concluded that low frequencies have the most significant impact on the coefficients of the creep function. As a result, the input variables for the ML model were chosen to be the 20 primary frequencies of the signals. The feature selection model was developed in MATLAB 2022, and the XGBoost model was developed in Python on the Visual Studio Code (VSC) platform. The complexity of any classification or regression model is influenced by its number of inputs. A feature selection process was therefore conducted in this study to enhance the accuracy of creep function coefficient estimation and reduce the computational time of the ML model. PCA was utilized to assess the impact of each input on the creep function coefficients, resulting in a graded ranking. Subsequently, the inputs were categorized into four groups based on their rankings. This categorization allows four separate forecasting models to be developed for each creep coefficient, so that the influence of the number of frequencies on the accuracy of the forecasts can be evaluated. The outcomes of the feature selection process are presented in Table 3. The numbers in Table 3 are the column numbers in the dataset. For example, to predict J1 based on Group 1, columns 8, 9, 7, 20, and 10 of the dataset, which contain the measured pressure signals, were used.

Table 3

The results of feature selection by PCA

Rank
Rank
IIIIIIVIVIIIIIIVIV
J1 Group 1 20 10 J4 Group 1 
Group 2 20 10 Group 2 
19 13 14 12 18 17 16 19 20 
Group 3 20 10 Group 3 
19 13 14 12 18 17 16 19 20 
18 15 17 15 14 
Group 4 20 10 Group 4 
19 13 14 12 18 17 16 19 20 
18 15 17 15 14 
11 16 11 10 12 13 
J2 Group 1 20 19 J5 Group 1 16 17 
Group 2 20 19 Group 2 16 17 
18 13 12 10 15 18 14 19 
Group 3 20 19 Group 3 16 17 
18 13 12 10 15 18 14 19 
17 14 11 20 11 
Group 4 20 19 Group 4 16 17 
18 13 12 10 15 18 14 19 
17 14 11 20 11 
16 15 13 10 12 
J3 Group 1 20 19 J6 Group 1 10 11 
Group 2 20 19 Group 2 10 11 
18 17 12 19 
Group 3 20 19 Group 3 10 11 
18 17 12 19 
16 12 11 20 15 16 14 
Group 4 20 19 Group 4 10 11 
18 17 12 19 
16 12 11 20 15 16 14 
10 15 13 14 17 18 13 
Table 4

Parameters of XGBoost

| Parameter | Value |
|---|---|
| Learning rate | 0.05 |
| Max depth of a tree | |
| Sample ratio of training data | |
| Sample ratio of features | |
| Number of estimators | 500 |
| Weight values of labels | |

The generated data were randomly divided into two sets: 80% for training the model and 20% for testing it. There are various methods for selecting and separating data, and random selection is a common approach which is used in this research. However, since the data ranges are unequal, this can potentially impact the accuracy of the model's predictions. When the model is trained on a specific data range, its performance may deteriorate when faced with a different range during the testing phase. To alleviate this, the data were normalized to be in the range of [0,1] by using Equation (27) which is mathematically written as:
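$$x_{n} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{27}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the parameter x, respectively, and $x_n$ is the normalized value.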
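The normalization of Equation (27) and the random 80/20 split can be sketched as follows. The function names, the fixed seed, and the handling of a constant column (mapped to zeros to avoid division by zero) are illustrative assumptions.

```python
import random


def min_max_normalize(values):
    """Scale values to [0, 1] using (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: no spread to scale (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into training and testing subsets."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    n_test = int(round(len(rows) * test_fraction))
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test


print(min_max_normalize([2.0, 4.0, 6.0]))
train, test = split_train_test(list(range(100)))
print(len(train), len(test))
```

Applying the same minimum and maximum to both subsets keeps the training and testing inputs on a common scale, which is the concern about unequal data ranges raised above.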

The proper values of the XGBoost hyperparameters were determined following a trial and error process and are presented in Table 4. After adjusting the parameters of the prediction model, the patterns obtained from the feature selection process were used to predict each of the coefficients of the creep function.

According to the groups listed in Table 3, four input patterns were formed: Pattern A consists of Group 1; Pattern B combines Groups 1 and 2; Pattern C encompasses Groups 1, 2, and 3; and Pattern D includes Groups 1, 2, 3, and 4. These patterns were used to estimate the creep coefficients.
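The cumulative construction of the four patterns can be illustrated with the short sketch below. Group 1 for J1 uses the column numbers quoted in the text (8, 9, 7, 20, and 10); the entries for Groups 2 to 4 are hypothetical placeholders, not values from Table 3, and the helper names are illustrative.

```python
def build_patterns(groups):
    """Pattern A = Group 1, B = Groups 1-2, C = Groups 1-3, D = Groups 1-4."""
    patterns = {}
    selected = []
    for label, group in zip("ABCD", groups):
        selected = selected + list(group)  # new list, earlier patterns unchanged
        patterns[label] = selected
    return patterns


def select_columns(row, columns):
    """Pick dataset columns by their 1-based column numbers."""
    return [row[c - 1] for c in columns]


groups_j1 = [
    [8, 9, 7, 20, 10],  # Group 1 for J1, as quoted in the text
    [11, 12],           # Group 2 (hypothetical placeholder)
    [13, 14],           # Group 3 (hypothetical placeholder)
    [15, 16],           # Group 4 (hypothetical placeholder)
]
patterns = build_patterns(groups_j1)
row = list(range(1, 21))  # dummy record whose value equals its column number
print(patterns["A"], select_columns(row, patterns["A"]))
```

Because each pattern contains all inputs of the previous one, comparing Patterns A to D isolates the effect of adding lower-ranked frequencies to the model inputs.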

Estimation of creep coefficients by using the XGBoost model

Table 5 presents the values of the statistical parameters calculated for the creep function coefficients during the training and testing phases of the XGBoost model. The model has demonstrated strong performance based on these statistical parameters in both phases. For the J1 coefficient, the model's accuracy improved as the number of input signals increased across the four patterns, leading to a more precise estimation of this coefficient. Specifically, the XGBoost model with Pattern D exhibited the lowest values of RMSE (Training = 0.032 × 10⁻¹⁰ Pa⁻¹; Testing = 0.044 × 10⁻¹⁰ Pa⁻¹) and MAE (Training = 0.004 × 10⁻¹⁰ Pa⁻¹; Testing = 0.005 × 10⁻¹⁰ Pa⁻¹) along with the highest R value, indicating superior estimation of the J1 coefficient compared with the other patterns. In contrast, for the J2 coefficient, the model's accuracy decreased as the number of input signals increased. Notably, the RMSE, MAE, and RSE values for Pattern A are zero during both the training and testing stages, and the R value is 1.00, signifying a high correlation between predicted and observed data. Conversely, for Pattern D, the testing RMSE, MAE, and RSE values are the highest, at 0.733 × 10⁻¹⁰ Pa⁻¹, 0.455 × 10⁻¹⁰ Pa⁻¹, and 0.402, respectively, and the R value is the lowest at 0.773.

Table 5

Performance assessment of the XGBoost model for predicting the creep function coefficients

RMSE and MAE are in ×10⁻¹⁰ Pa⁻¹; RSE and R are dimensionless.

| Pattern | Phase | Coefficient | RMSE | MAE | RSE | R | Coefficient | RMSE | MAE | RSE | R |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | Training | J1 | 0.641 | 0.391 | 0.201 | 0.894 | J2 | 0.000 | 0.000 | 0.000 | 1.000 |
| A | Testing | J1 | 0.642 | 0.388 | 0.195 | 0.897 | J2 | 0.000 | 0.000 | 0.000 | 1.000 |
| B | Training | J1 | 0.159 | 0.038 | 0.012 | 0.994 | J2 | 0.685 | 0.427 | 0.376 | 0.790 |
| B | Testing | J1 | 0.162 | 0.046 | 0.013 | 0.994 | J2 | 0.688 | 0.427 | 0.355 | 0.803 |
| C | Training | J1 | 0.070 | 0.012 | 0.002 | 0.999 | J2 | 0.724 | 0.450 | 0.421 | 0.761 |
| C | Testing | J1 | 0.073 | 0.012 | 0.003 | 0.999 | J2 | 0.725 | 0.451 | 0.393 | 0.779 |
| D | Training | J1 | 0.032 | 0.004 | 0.000 | 1.000 | J2 | 0.734 | 0.456 | 0.432 | 0.753 |
| D | Testing | J1 | 0.044 | 0.005 | 0.001 | 1.000 | J2 | 0.733 | 0.455 | 0.402 | 0.773 |
| A | Training | J3 | 0.628 | 0.375 | 0.187 | 0.902 | J4 | 0.443 | 0.220 | 0.093 | 0.953 |
| A | Testing | J3 | 0.657 | 0.385 | 0.202 | 0.893 | J4 | 0.442 | 0.222 | 0.096 | 0.951 |
| B | Training | J3 | 0.150 | 0.038 | 0.010 | 0.995 | J4 | 0.114 | 0.026 | 0.006 | 0.997 |
| B | Testing | J3 | 0.155 | 0.040 | 0.011 | 0.994 | J4 | 0.146 | 0.035 | 0.010 | 0.995 |
| C | Training | J3 | 0.053 | 0.008 | 0.001 | 0.999 | J4 | 0.047 | 0.007 | 0.001 | 0.999 |
| C | Testing | J3 | 0.061 | 0.010 | 0.002 | 0.999 | J4 | 0.064 | 0.010 | 0.002 | 0.999 |
| D | Training | J3 | 0.032 | 0.003 | 0.000 | 1.000 | J4 | 0.033 | 0.004 | 0.001 | 1.000 |
| D | Testing | J3 | 0.040 | 0.005 | 0.001 | 1.000 | J4 | 0.050 | 0.005 | 0.001 | 0.999 |
| A | Training | J5 | 0.456 | 0.205 | 0.100 | 0.949 | J6 | 0.313 | 0.126 | 0.047 | 0.976 |
| A | Testing | J5 | 0.417 | 0.189 | 0.086 | 0.956 | J6 | 0.295 | 0.117 | 0.044 | 0.978 |
| B | Training | J5 | 0.158 | 0.037 | 0.012 | 0.994 | J6 | 0.110 | 0.024 | 0.006 | 0.997 |
| B | Testing | J5 | 0.146 | 0.030 | 0.011 | 0.995 | J6 | 0.117 | 0.022 | 0.007 | 0.997 |
| C | Training | J5 | 0.069 | 0.010 | 0.002 | 0.999 | J6 | 0.060 | 0.009 | 0.002 | 0.999 |
| C | Testing | J5 | 0.063 | 0.009 | 0.002 | 0.999 | J6 | 0.070 | 0.010 | 0.002 | 0.999 |
| D | Training | J5 | 0.051 | 0.006 | 0.001 | 0.999 | J6 | 0.043 | 0.005 | 0.001 | 1.000 |
| D | Testing | J5 | 0.045 | 0.005 | 0.001 | 0.999 | J6 | 0.052 | 0.005 | 0.001 | 0.999 |

For the J3 coefficient, Table 5 indicates that Patterns C and D yield comparably accurate results, whereas Patterns A and B produce less accurate estimates. The precision of the XGBoost model in estimating J3 improves as the number of model inputs increases. The lowest RMSE, MAE, and RSE values are 0.040 × 10⁻¹⁰ Pa⁻¹, 0.005 × 10⁻¹⁰ Pa⁻¹, and 0.001, respectively, with the highest R value of 1.000, all obtained in the testing phase of Pattern D. For the J4 coefficient, Pattern D again gives the lowest RMSE, MAE, and RSE values: 0.050 × 10⁻¹⁰ Pa⁻¹, 0.005 × 10⁻¹⁰ Pa⁻¹, and 0.001 in the testing phase (0.033 × 10⁻¹⁰ Pa⁻¹, 0.004 × 10⁻¹⁰ Pa⁻¹, and 0.001 in training). The highest error values are observed with Pattern A, indicating that the model's accuracy in estimating J4 improves as the number of input signals increases. The correlation coefficient for J4 exceeds 0.95 for all four patterns and reaches 0.999 or higher for Patterns C and D, indicating a very strong correlation between the predicted and actual values.

The results for coefficient J5 in Table 5 show that the error measures are lowest for Pattern D and highest for Pattern A. Patterns D and C therefore provide the most accurate estimates of J5. Similarly, the prediction accuracy for coefficient J6 improves as the number of input signals increases, and the XGBoost model predicts J6 most accurately with Pattern D. In the testing phase, the RMSE, MAE, and RSE values for Pattern D are 0.052 × 10⁻¹⁰ Pa⁻¹, 0.005 × 10⁻¹⁰ Pa⁻¹, and 0.001, respectively, which are the lowest among the patterns evaluated. The R values for Pattern D (1.000 in training and 0.999 in testing) are also the highest in the table for J6, underscoring the superiority of this pattern.
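The four statistics reported in Table 5 can be computed as in the minimal sketch below. It assumes the usual definitions: RSE is the sum of squared errors divided by the total sum of squares about the mean of the actual values, and R is the Pearson correlation coefficient. The paper's own definitions are given outside this section and may differ in detail. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE, RSE and Pearson R for two equal-length sequences.

    Assumed definitions (not taken from the paper):
      RSE = sum((pred - act)^2) / sum((act - mean(act))^2)
      R   = Pearson correlation between actual and predicted values.
    A constant `actual` or `predicted` series raises ZeroDivisionError.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    sq_err = sum(e * e for e in errors)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    return {
        "RMSE": math.sqrt(sq_err / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "RSE": sq_err / ss_a,
        "R": cov / math.sqrt(ss_a * ss_p),
    }


# Made-up values for illustration only (same units as the coefficient).
sample_actual = [1.0, 2.0, 3.0, 4.0]
sample_predicted = [1.1, 1.9, 3.2, 3.8]
sample_metrics = regression_metrics(sample_actual, sample_predicted)
print(sample_metrics)
```

Under these definitions RSE is close to 1 − R² when the predictions are unbiased, which agrees with the Table 5 entries (for example, R = 0.894 and RSE = 0.201 for J1 with Pattern A in training).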

Table 6

Original and calibrated creep function parameter values for different models

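| Source | (s) | (s) | (s) | (×10⁻¹⁰ Pa⁻¹) | (×10⁻¹⁰ Pa⁻¹) | (×10⁻¹⁰ Pa⁻¹) | a (m/s) |
|---|---|---|---|---|---|---|---|
| Original values | 0.04 | 0.7 | 10 | 0.5 | 1.3 | | 400 |
| Predicted by XGBoost model | 0.04 | 0.7 | 10 | 0.51 | 1.28 | 1.19 | 400.12 |
| Predicted by ITA (run 1) | 0.04 | 0.7 | 10 | 0.59 | 0.82 | 5.71 | 401.58 |
| Predicted by ITA (run 2) | 0.04 | 0.7 | 10 | 0.71 | 0.80 | 4.33 | 404.21 |
| Predicted by ITA (run 3) | 0.04 | 0.7 | 10 | 0.50 | 1.25 | 1.61 | 399.99 |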

The modeling results show that each creep function coefficient depends on specific signals, and that selecting these signals optimally improves prediction accuracy. They also show that the proposed approach can accurately estimate the creep function coefficients of viscoelastic pipes from pressure signals. Using all 20 signals improved the accuracy and performance of XGBoost for J1, J3, J4, J5, and J6, whereas the best prediction of the J2 coefficient used only five signals (numbers 8, 9, 7, 20, and 19). The time series plots for the test phase in Figure 5 further illustrate how precisely the proposed framework estimates the creep function coefficients. The close match between the predicted and experimental values indicates that the proposed model has captured the relationship between the inputs and the coefficients of the creep function.
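The per-coefficient signal selection described above can be pictured with the following minimal sketch, which builds a separate input vector for each coefficient before it is passed to that coefficient's regressor. The signal numbers come from the text. The data layout (a mapping from signal number to a list of features), the names `SIGNALS_PER_COEFFICIENT` and `build_input_vector`, and the toy feature values are assumptions made for illustration.

```python
# Signal numbers are 1-based, as in the text.
ALL_SIGNALS = tuple(range(1, 21))

# Signals used for each coefficient, as reported in the text:
# all 20 signals for J1 and J3-J6, five signals for J2.
SIGNALS_PER_COEFFICIENT = {
    "J1": ALL_SIGNALS,
    "J2": (8, 9, 7, 20, 19),
    "J3": ALL_SIGNALS,
    "J4": ALL_SIGNALS,
    "J5": ALL_SIGNALS,
    "J6": ALL_SIGNALS,
}


def build_input_vector(sample, coefficient):
    """Concatenate the features of the signals selected for one coefficient.

    `sample` maps a signal number to its list of features; this layout is
    hypothetical and stands in for whatever features the signals provide.
    """
    vector = []
    for signal_id in SIGNALS_PER_COEFFICIENT[coefficient]:
        vector.extend(sample[signal_id])
    return vector


# Toy sample with two made-up features per signal.
toy_sample = {k: [float(k), float(k) / 10.0] for k in ALL_SIGNALS}
j1_vector = build_input_vector(toy_sample, "J1")
j2_vector = build_input_vector(toy_sample, "J2")
print(len(j1_vector), len(j2_vector))
```

Keeping one input vector per coefficient means that each regressor sees only the signals found useful for its target, rather than one shared input set for all six coefficients.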
Figure 5

Time series plot of prediction and experimental creep function coefficients.

Figure 6 shows the accuracy of the proposed ML-based framework in estimating the creep function coefficients. The creep function computed from the coefficients estimated by the XGBoost model is consistent with the actual function. Note that the functions in Figure 6 were selected at random from the full dataset.
Figure 6

Predicted and numerical creep function by using the XGBoost model.


Comparison of ITA and XGBoost models

As mentioned earlier, the numerical ITA method is the most commonly used method for determining the creep function coefficients of viscoelastic pipes. To demonstrate the effectiveness of the proposed XGBoost-based framework, the two methods are compared in this section. For this purpose, data from the water distribution system and usage patterns were employed to train and develop the XGBoost model for evaluating the ITA method. After training the XGBoost model and providing creep function coefficients for an example with , , and , the ITA model was run three times. Figure 7(a) shows that both methods accurately reproduce the pressure signal. However, as shown in Figure 7(b), the creep function determined by the XGBoost model matches the actual creep curve more closely. The ITA method, in contrast, produces non-unique results: each run calibrates a different set of creep coefficients.
Figure 7

(a) Comparison of the pressure signals in the time domain and (b) creep functions obtained from the different models.


For a better comparison, Table 6 shows the actual and recalibrated values of creep function coefficients using XGBoost and ITA models.
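To see how the parameter sets in Table 6 translate into different creep curves, the sketch below evaluates a generalized Kelvin–Voigt creep function, J(t) = J0 + Σ Jk (1 − exp(−t/τk)), for the XGBoost prediction and the three ITA runs. Several points are assumptions rather than statements from this section: that the paper's creep function has this commonly used form, that the three columns in seconds are the retardation times τk, that the three ×10⁻¹⁰ Pa⁻¹ columns are the matching compliances Jk in the same order, and that the elastic term J0 can be set to zero for the comparison. The names `kv_creep`, `TAUS`, and `CALIBRATIONS` are hypothetical.

```python
import math


def kv_creep(t, taus, js, j0=0.0):
    """Generalized Kelvin-Voigt creep compliance at time t (seconds).

    J(t) = j0 + sum_k js[k] * (1 - exp(-t / taus[k]))
    The result has the units of `js` (here 1e-10 Pa^-1).
    """
    if len(taus) != len(js):
        raise ValueError("taus and js must have the same length")
    return j0 + sum(j * (1.0 - math.exp(-t / tau)) for j, tau in zip(js, taus))


# Values copied from Table 6; their pairing with tau_k and J_k is assumed.
TAUS = (0.04, 0.7, 10.0)  # s
CALIBRATIONS = {
    "XGBoost": (0.51, 1.28, 1.19),
    "ITA run 1": (0.59, 0.82, 5.71),
    "ITA run 2": (0.71, 0.80, 4.33),
    "ITA run 3": (0.50, 1.25, 1.61),
}

# Print the retarded creep compliance of each calibration at a few times.
for t in (0.1, 1.0, 10.0):
    values = {name: round(kv_creep(t, TAUS, js), 3) for name, js in CALIBRATIONS.items()}
    print(f"t = {t:5.1f} s:", values)
```

Because the third compliance differs most between the ITA runs, their curves would be expected to diverge mainly at longer times under these assumptions, while the short-time response stays similar.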

Furthermore, in terms of computational efficiency, the XGBoost model, once trained, can be applied multiple times, whereas the ITA model requires repetition for each determination of the creep function. Given that the computational demand of the ITA model escalates with system complexity – unlike the XGBoost model – the latter exhibits superior capabilities in terms of computational cost. Future studies may delve into the resilience of these methods against noise, system defects, and other uncertainties in greater detail.

This study shows that the proposed ML-based framework can be applied successfully to estimate the creep function coefficients of viscoelastic pipes in transient flow. Future work can adapt these methods to other transient flow analyses (e.g., leak detection) under various conditions (e.g., deterministic and probabilistic). Such models can potentially supplement numerical and analytical techniques, or validate intricate water hammer analyses where substantial time savings are needed.

In this study, a novel approach was proposed that uses XGBoost, an efficient ML method, to predict the creep function coefficients of viscoelastic pipes under transient flow conditions. A numerical model based on the MOC was developed and automatically calibrated, and a large set of pipe specifications was considered. The hydraulic response of an RPV system was then calculated for these specifications. The XGBoost model took the measured signals as input and estimated the coefficients of the creep function. The outcomes of this study are as follows:

  • A dataset was generated by transient modeling of the system. It is worth noting that transient waves carry more information about the system specifications and faults than steady-state data. The proposed method showed that using transient data in an ML approach is promising for estimating the creep function coefficients accurately.

  • The proposed scheme requires most of the computational effort in the training stage. After this stage, it can instantly process new measurements and predict new creep function coefficients. Therefore, unlike the conventional methods, it does not need any complicated and lengthy optimization for real field application.

  • The feature selection revealed that each creep function coefficient depends on specific inputs and requires more than one input type for estimation.

Moreover, the study demonstrated the superior efficiency of the proposed ML-based model compared with the numerical ITA technique for estimating the creep function coefficients of viscoelastic pipes. However, such models (e.g., support vector regression, artificial neural networks, and random forest) operate as black boxes, meaning that the relationship between input and output variables remains unknown. This limitation is particularly significant where access to computers and soft computing models is restricted. Future research is therefore recommended to employ white-box methods (e.g., model tree, gene expression programming, and multivariate adaptive regression splines) to establish explicit relationships for estimating creep function coefficients in viscoelastic pipes. This approach would help in managing the complex and nonlinear water hammer phenomenon.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).