ABSTRACT
This study introduces a novel framework for the estimation of creep function coefficients in viscoelastic pipelines, employing a combination of machine learning (ML) and transient hydraulics. Specifically, the efficacy of eXtreme Gradient Boosting (XGBoost) as an advanced regression method is evaluated within this framework. A transient simulation model, utilizing the method of characteristics, is formulated to create the reservoir-pipe-valve (RPV) scenarios under diverse boundary conditions. After conducting model calibration, the model is used to generate datasets with the transient hydraulic responses at the measurement points. The fast Fourier transform (FFT) is then applied to transform the generated samples into the frequency domain. Feature selection is accomplished through principal component analysis (PCA) to identify optimal input variables for XGBoost. In estimating the creep function, six coefficients are employed, with the feature selection analysis indicating that each coefficient is associated with a specific signal. Importantly, it is demonstrated that attempting to estimate all six coefficients using a single set of signals is unfeasible. The results affirm the accuracy of the proposed ML-based framework in determining creep function coefficients.
HIGHLIGHTS
Determination of creep function coefficients of viscoelastic pipes.
Proposing a transient-guided machine learning model.
Using XGBoost as a regressor for predicting the creep function in pipes.
ABBREVIATIONS
- FFT: fast Fourier transform
- GA: genetic algorithm
- HDPE: high-density polyethylene
- ITA: inverse transient analysis
- KV: Kelvin–Voigt
- MAE: mean absolute error
- ML: machine learning
- MOC: method of characteristics
- MSE: mean square error
- PCA: principal component analysis
- PE: polyethylene
- PVC: polyvinyl chloride
- RMSE: root mean square error
- RPV: reservoir-pipe-valve system
- RSE: relative squared error
- SD: standard deviation
- XGBoost: eXtreme Gradient Boosting
INTRODUCTION
In recent years, the use of plastic pipes (e.g., polyvinyl chloride (PVC) and polyethylene (PE)) in water supply and sewage transfer systems has become more popular than other types of pipes, such as steel, concrete, cast iron, and asbestos. This is because plastic pipes offer numerous advantages from both technical and economic perspectives. These include high resistance to heat, chemicals, and pressure, low cost and weight, excellent durability against erosion and scouring, and fast and easy installation (Apollonio et al. 2013). All plastic pipes have a unique property that makes them change their shape and size over time when stressed. This property is called viscoelasticity, and it affects how the pressure changes in the pipes when the water hammer occurs (Apollonio et al. 2013; Keramat et al. 2013).
It should be noted that the time-dependent deformation of the pipe wall under the stress caused by pressure changes is called creep, and it recovers after the pressure becomes uniform in the pipe. The viscoelastic properties of plastic pipes affect how they react to pressure waves, which is important for their structural and hydraulic performance. In a water supply system, pressure waves are generated when valves close abruptly, or pumps start or stop suddenly, resulting in periodic pressure fluctuations. This phenomenon, commonly referred to as ‘water hammer’, can potentially damage pipes (Pezzinga 2002; Ramos et al. 2004; Tricarico et al. 2007; Apollonio et al. 2013). Therefore, simulating transient flow in water transmission systems can contribute to the safe operation of transmission pipelines by determining the minimum and maximum water hammer pressures.
Numerous experimental and numerical studies have significantly enhanced the understanding of viscoelastic pipes. In Wahba (2017) and Bertaglia et al. (2018), the method of characteristics (MOC) was introduced as a means to model the viscoelastic behavior of the pipe wall in water hammer equations. These studies incorporated Kelvin–Voigt (KV) elements to represent the impact of circumferential strains on the pipe wall. Subsequently, several researchers (e.g., Huang et al. 2017; Capponi et al. 2020; Bostan et al. 2021; Kim 2023) validated and expanded upon this mathematical model.
Mechanical models have proven effective in mathematically simulating viscoelastic behavior, with springs and dashpots representing elastic and viscous deformations, respectively. Among the various mechanical models available, the generalized KV model is widely utilized for studying the creep and relaxation characteristics of solid viscoelastic materials. However, it is important to note that the main limitation of this method is the requirement for calibration of the creep compliance functions through pipeline testing. In a study by Soares et al. (2008), the creep compliance functions were calibrated using pressure fluctuations induced by a water hammer. Additionally, other works (i.e., Keramat et al. 2010; Soares et al. 2012) have examined the impact of fluid column separation both numerically and experimentally. Soares et al. (2010) utilized a genetic algorithm (GA)-based model to calibrate various KV models using data from laboratory experiments on a hydraulic model with PVC pipes.
Furthermore, Keramat et al. (2013) investigated the numerical modeling of water hammers in viscoelastic pipes with a time-dependent Poisson ratio. Their findings indicated that the viscoelastic data obtained from mechanical tests were more accurate when considering a time-dependent Poisson ratio and unsteady friction than using the KV model for water hammer analysis. They also reported the successful calibration of the creep curve. Gong et al. (2016) introduced a new frequency-domain technique to estimate the creep function of viscoelastic pipes using a transient flow analysis. The basis of the proposed approach is the analytical relationship between the viscoelastic parameters of the pipeline and friction-related parameters and the resonant frequencies. The technique's relatively high accuracy for the elastic wave velocity and viscoelastic creep compliances was confirmed by numerical simulations on a high-density polyethylene (HDPE) pipeline. Ferrante & Capponi (2017) developed and compared several viscoelastic models to simulate the transients in polymer pipes on a reservoir-pipe-valve system (RPV) with two plastic pipes (i.e., high-density PE and oriented PVC). Their study compared the efficiency of three different models, namely the standard linear solid model, the Maxwell model, and the generalized Maxwell model, in simulating transient flow in pipes. The findings indicated that the generalized Maxwell model performed relatively superior to the other models investigated when simulating transient flow in PE and PVC pipes.
Bertaglia et al. (2018) studied the efficiency of three numerical models, including the MOC, semi-implicit staggered finite volume method, and explicit path conservative finite volume method, to simulate the transient flow in the viscoelastic pipes. Their study compared the experimental and analytical results obtained using three-parameter and multi-parameter linear viscoelastic models. They also investigated the viscoelastic effects of unsteady friction loss, pipe wall, cavitation, and cross-sectional changes. They found that all numerical methods agreed well with experimental results. They suggested that the MOC method could be used for simple systems. However, for complex configurations, it was important to consider the specific aspects involved in a particular case and the maximum acceptable error for the results (Bertaglia et al. 2018). Cheshme et al. (2021) conducted a study to investigate KV elements' impact on creep behavior and pressure fluctuations by calibrating creep coefficients using creep function and transient pressure. The findings indicate that the accuracy of three- and four-element models in the first method is higher than that of the second method for pipes longer than 1,000 m. However, for pipes shorter than 1,000 m, the accuracy of the three and four-element models is comparable in both methods. Regarding the two-element model, the accuracy of the first method is higher than the second method for pipes longer than 720 m, while the opposite is observed for lengths less than 720 m. Furthermore, the error analysis using both approaches highlights that the one-element KV model is largely inaccurate, leading to up to 60% errors. Consequently, it is not suitable for predicting transient pressures in viscoelastic pipes. 
To study the viscoelastic properties of plastic pipes in complex fluids under different flow conditions, such as laminar or turbulent, extensive experimentation and simulation tools are needed to deal with the nonlinear interactions of multiple variables, which is computationally expensive.
Machine learning (ML) models have become increasingly prevalent in addressing engineering challenges, as they can simulate complex phenomena and enhance decision-making across various engineering domains. Some of the recent examples of such applications are leak detection in the pipeline (Kang et al. 2018; Ayati et al. 2022), structural analysis (Seghier et al. 2021; Seghier et al. 2023), and water resources and hydraulic engineering (Granata et al. 2022; Ohadi et al. 2022; Di Nunno et al. 2023a, b; Vatani et al. 2023). For example, Ayati et al. (2022) developed and used several ML-based models to analyze transient flow in water network systems to identify the location of a leak. The results showed that the proposed ML models have high potential. In more recent work, Asghari et al. (2023) introduced an efficient ML-based framework to detect leaks in pipes using transient waves, achieving an accuracy of 97%. Although the ML models have been widely used in transient flow analysis of viscoelastic pipes, some important challenges have not been solved so far, see Tjuatja et al. (2023).
As outlined in this introduction, ML-based methods have proven effective in simulating transient flow within pipelines, thereby addressing shortcomings of conventional methodologies. Despite these advancements, research employing ML-based techniques to estimate creep function coefficients in viscoelastic pipes under unsteady flow conditions remains lacking. The existing literature therefore points to the need for a thorough study that applies ML strategies to accurately estimate the creep function coefficients in viscoelastic pipes. The primary contribution of this study is the development of a transient-guided ML model designed to estimate the creep function coefficients of viscoelastic pipes during transient flow analysis. To this end, a laboratory model was calibrated via the inverse transient analysis (ITA) method, and the calibrated model was used to generate a dataset covering a broad spectrum of transient scenarios at various measurement points. This dataset includes critical pipe information such as diameter (m), length (m), thickness (mm), elastic wave velocity (m/s), Brownian decay coefficient, and fluid velocity (m/s). The pressure signal data were then transformed into the frequency domain, with features selected through principal component analysis (PCA). Finally, the eXtreme Gradient Boosting (XGBoost) algorithm was applied to estimate the creep function coefficients. A review of recent studies in the field of ML shows that XGBoost offers high performance, the ability to handle missing data, regularization to avoid overfitting, support for parallel processing, flexibility in performing various tasks, and complex tree pruning (Amjad et al. 2022; Nguyen et al. 2022; Han et al. 2023a, b). These features provide acceptable and stable results in solving complex problems, making XGBoost a suitable choice for this study.
The rest of this study is structured as follows. The second section provides a detailed explanation of the mathematical formulation of the problem. Subsequently, the methodology and the proposed framework are discussed, starting from initial developments to final verification. The fourth section is dedicated to the specifications of the case study. The results of the proposed methodology applied to the case study are then presented and discussed. The paper concludes with a summary, conclusions, and recommendations for future research.
MATHEMATICAL MODEL
In the above equation, the length between the valve and the front of the positive pressure wave is , and is the velocity. The fluid velocity at the valve is zero, so where is the weight factor. This has a second-order accuracy. More details about the mathematical modeling of transient flow analysis in viscoelastic pipes can be found in Keramat & Haghighi (2014).
PROPOSED FRAMEWORK
I. A mathematical model (numerical solution) of hydraulic analysis of transient flow in pipelines is developed. This model simulates the RPV problem under different boundary conditions.
II. The developed mathematical model is calibrated and validated with the results of a well-known laboratory study.
III. The model, which has been developed and calibrated, receives various types of pipes with different characteristics (such as diameter, length, thickness, elastic wave velocity, Brownian decay coefficient, and fluid velocity). The model then produces the hydraulic responses (pressure signals) for each pipe at the measurement sites.
IV. Before further processing, the data generated in the time domain are converted to the frequency domain using the fast Fourier transform (FFT) method.
V. The input matrix dimension of ML is reduced by applying the PCA for the feature selection. The creep function coefficients are predicted by using the XGBoost method.
As depicted in the flowchart (Figure 1), the proposed framework comprises three distinct stages. Initially, a numerical model is developed utilizing the MOC to simulate the RPV system, with measurements derived via ITA-based GA. Subsequently, this model generates data pertinent to the creep coefficients of viscoelastic pipes, transforms pressure data from the time domain to the frequency domain, and selects relevant features. The final stage involves employing an XGBoost model to estimate the creep function coefficients.
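The time-to-frequency step of the framework can be illustrated with a minimal sketch. It assumes a uniformly sampled pressure record whose length is a power of two, and it reads the frequency-domain inputs as the magnitudes of the lowest-frequency bins. The function names, the synthetic signal, and the default of 20 bins are illustrative assumptions, not the study's implementation. The FFT is written out in plain Python only to keep the example dependency-free; in practice a library routine such as `numpy.fft.rfft` would normally be used.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def low_frequency_features(pressure, n_bins=20):
    """Magnitudes of the first n_bins frequency bins (bin 0 is the zero-frequency term)."""
    spectrum = fft(pressure)
    return [abs(c) for c in spectrum[:n_bins]]


# Hypothetical damped pressure oscillation around a 50 m steady head (not study data).
n_samples = 256
signal = [
    50.0 + 10.0 * math.exp(-4.0 * i / n_samples) * math.cos(2.0 * math.pi * 5 * i / n_samples)
    for i in range(n_samples)
]
features = low_frequency_features(signal)
print(len(features))  # number of candidate inputs produced for one signal
```

Each simulated signal would contribute one such feature row, and the rows for all simulated pipes would form the input matrix that feature selection then reduces.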
Calibration and case study
Creep coefficient J1 (×10⁻¹⁰ Pa⁻¹) | J2 (×10⁻¹⁰ Pa⁻¹) | J3 (×10⁻¹⁰ Pa⁻¹) | J4 (×10⁻¹⁰ Pa⁻¹) | J5 (×10⁻¹⁰ Pa⁻¹) | J6 (×10⁻¹⁰ Pa⁻¹) | Pressure wave, a | Objective function (m²)
---|---|---|---|---|---|---|---
0.98 | 0.4 | 0.32 | 1.09 | 0.03 | 0.24 | 392 | 0.24
Input variables | a (m/s) | L (m) | D (cm) | V (m/s) | f | kp | J1–J6 (×10⁻¹⁰ Pa⁻¹)
---|---|---|---|---|---|---|---
Average | – | – | 0.26 | 0.03 | 0.03 | 2.50 | |
SD | 116.70 | – | – | 0.27 | 0.03 | 0.00 | |
Min | 200.00 | 50.00 | 5 | 0.00 | 0.02 | – | 0.00 |
Max | 600.00 | 1,000.00 | 100 | 1.18 | 0.05 | – | 5.00 |
eXtreme Gradient Boosting
This equation shows that T is the number of leaves of the tree and w is the leaf weight.
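For reference, the regularized objective of the standard XGBoost formulation is reproduced below in its usual notation, which may differ from the notation of the equation referred to above. Here l is the loss between the observed value y_i and the prediction ŷ_i, f_k is the k-th tree, and γ and λ are regularization parameters that penalize the number of leaves T and the leaf weights w.

```latex
\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\,\lambda \lVert w \rVert^{2}
```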
Evaluation metrics
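The four measures reported later (RMSE, MAE, RSE, and the correlation coefficient R) can be sketched with their common textbook definitions. These definitions are an assumption: the study's exact normalization may differ, and the sample values below are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def rse(observed, predicted):
    """Relative squared error: squared error divided by that of predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return sse / sst


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical coefficient values in units of 1e-10 1/Pa (not taken from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), mae(observed, predicted),
      rse(observed, predicted), pearson_r(observed, predicted))
```

RMSE and MAE carry the units of the coefficient, whereas RSE and R are dimensionless, which is why only the first two are reported in ×10⁻¹⁰ Pa⁻¹.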
IMPLEMENTATION AND RESULTS
Given that the viscoelastic coefficients determine the overall structure of the signal, it can be concluded that low frequencies have the most significant impact on the coefficients of the creep function. As a result, the 20 primary frequencies of the signals were chosen as the input variables for the ML model. The feature selection model was developed in MATLAB 2022, and the XGBoost model was developed in Python on the Visual Studio Code (VSC) platform. The complexity of any classification or regression model is influenced by its number of inputs. A feature selection process was therefore conducted to enhance the accuracy of the creep function coefficient estimation and to reduce the computational time of the ML model. PCA was used to assess the impact of each input on the creep function coefficients, resulting in a graded ranking. The inputs were then categorized into four groups based on these rankings. This categorization allows four separate forecasting models to be developed for each creep coefficient, so that the influence of the number of frequencies on the forecast accuracy can be evaluated. The outcomes of the feature selection process are presented in Table 3. The numbers in Table 3 are the column numbers in the dataset. For example, to predict J1 based on Group 1, columns 8, 9, 7, 20, and 10 of the dataset, which contain the measured pressure signals, were used.
Coefficient | Group 1 | Added in Group 2 | Added in Group 3 | Added in Group 4
---|---|---|---|---
J1 | 8, 9, 7, 20, 10 | 19, 13, 6, 14, 12 | 18, 4, 3, 15, 17 | 5, 2, 11, 16, 1
J2 | 8, 9, 7, 20, 19 | 6, 18, 13, 12, 10 | 3, 4, 17, 14, 11 | 2, 5, 16, 1, 15
J3 | 7, 6, 8, 20, 19 | 9, 18, 5, 17, 3 | 16, 2, 12, 11, 1 | 10, 15, 13, 4, 14
J4 | 5, 6, 7, 8, 4 | 18, 17, 16, 19, 20 | 15, 3, 9, 2, 14 | 11, 1, 10, 12, 13
J5 | 5, 6, 4, 16, 17 | 15, 18, 7, 14, 19 | 8, 3, 20, 11, 1 | 13, 10, 9, 2, 12
J6 | 5, 10, 11, 4, 6 | 1, 2, 9, 12, 19 | 20, 15, 3, 16, 14 | 7, 17, 18, 13, 8
Note: entries are dataset column numbers, listed in the order of the rank columns. Each group contains the columns of all preceding groups plus the five listed for it.
Parameter | Value
---|---
Learning rate | 0.05
Max depth of a tree | 6
Sample ratio of training data | 1
Sample ratio of features | 1
Number of estimators | 500
Weight values of labels | 1
The proper values of the XGBoost hyperparameters were determined following a trial and error process and are presented in Table 4. After adjusting the parameters of the prediction model, the patterns obtained from the feature selection process were used to predict each of the coefficients of the creep function.
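As a hedged illustration, the Table 4 settings could be expressed with the parameter names of the scikit-learn-style interface of the `xgboost` package. The mapping from table labels to parameter names, the `build_regressor` helper, and the squared-error objective are assumptions made for this sketch; the study's own script is not reproduced here.

```python
# Table 4 settings expressed with xgboost's scikit-learn-style parameter names.
# The name mapping is an interpretation of the table labels, not the study's code.
XGB_PARAMS = {
    "learning_rate": 0.05,    # Learning rate
    "max_depth": 6,           # Max depth of a tree
    "subsample": 1.0,         # Sample ratio of training data
    "colsample_bytree": 1.0,  # Sample ratio of features
    "n_estimators": 500,      # Number of estimators
}
# "Weight values of labels = 1" is left unmapped because the corresponding
# library parameter cannot be identified from the table alone.


def build_regressor(params=XGB_PARAMS):
    """Create one regressor (hypothetical helper; needs the third-party xgboost package)."""
    from xgboost import XGBRegressor  # imported lazily so the dict is usable without it

    # The squared-error objective is an assumption for this regression task.
    return XGBRegressor(objective="reg:squarederror", **params)
```

One such regressor would be built per creep coefficient and per input pattern, so that each model is fitted only on the columns selected for it.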
According to the groups listed in Table 3, four patterns have been identified: the first pattern consists of Group 1; the second pattern combines Groups 1 and 2; the third pattern encompasses Groups 1, 2, and 3; and the fourth pattern includes Groups 1, 2, 3, and 4. These patterns were formed and utilized to estimate creep coefficients.
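The cumulative construction of the four patterns can be sketched as follows, using the ranked columns listed for J1 in Table 3. The labels A to D are assumed to correspond to the first to fourth patterns, and `build_patterns` is a hypothetical helper introduced only for illustration.

```python
# Ranked dataset columns for J1, five per tier, as listed in Table 3.
J1_TIERS = [
    [8, 9, 7, 20, 10],
    [19, 13, 6, 14, 12],
    [18, 4, 3, 15, 17],
    [5, 2, 11, 16, 1],
]


def build_patterns(tiers, labels="ABCD"):
    """Pattern k is the concatenation of the first k tiers (cumulative)."""
    patterns, selected = {}, []
    for label, tier in zip(labels, tiers):
        selected = selected + tier  # new list each time, so earlier patterns stay intact
        patterns[label] = selected
    return patterns


patterns = build_patterns(J1_TIERS)
for label, columns in patterns.items():
    print(label, len(columns), columns)
```

With this construction the first pattern uses 5 input columns and the fourth uses all 20, which is what allows the effect of the number of inputs on accuracy to be compared.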
Estimation of creep coefficients by using the XGBoost model
Table 5 presents the statistical parameters calculated for the creep function coefficients during the training and testing phases of the XGBoost model. The model performs well in both phases for most combinations of coefficient and pattern. For the J1 coefficient, the model's accuracy improved as the number of input signals increased across the four patterns. Specifically, the XGBoost model with Pattern D exhibited the lowest values of RMSE (training = 0.032 ×10⁻¹⁰ Pa⁻¹; testing = 0.044 ×10⁻¹⁰ Pa⁻¹) and MAE (training = 0.004 ×10⁻¹⁰ Pa⁻¹; testing = 0.005 ×10⁻¹⁰ Pa⁻¹) along with the highest R value, indicating superior estimation of the J1 coefficient compared with the other patterns. In contrast, for the J2 coefficient, the model's accuracy decreased as the number of input signals increased. Notably, the RMSE, MAE, and RSE values for Pattern A are zero to three decimal places during both the training and testing stages, and the R value is 1.00, signifying a very high correlation between predicted and observed data. Conversely, for Pattern D, the testing RMSE, MAE, and RSE values are the highest, at 0.733 ×10⁻¹⁰ Pa⁻¹, 0.455 ×10⁻¹⁰ Pa⁻¹, and 0.402, respectively, and the testing R value is the lowest at 0.773.
Pattern | Phase | Coefficient | RMSE (×10⁻¹⁰ Pa⁻¹) | MAE (×10⁻¹⁰ Pa⁻¹) | RSE | R | Coefficient | RMSE (×10⁻¹⁰ Pa⁻¹) | MAE (×10⁻¹⁰ Pa⁻¹) | RSE | R
---|---|---|---|---|---|---|---|---|---|---|---
A | Training | J1 | 0.641 | 0.391 | 0.201 | 0.894 | J2 | 0.000 | 0.000 | 0.000 | 1.000
A | Testing | J1 | 0.642 | 0.388 | 0.195 | 0.897 | J2 | 0.000 | 0.000 | 0.000 | 1.000
B | Training | J1 | 0.159 | 0.038 | 0.012 | 0.994 | J2 | 0.685 | 0.427 | 0.376 | 0.790
B | Testing | J1 | 0.162 | 0.046 | 0.013 | 0.994 | J2 | 0.688 | 0.427 | 0.355 | 0.803
C | Training | J1 | 0.070 | 0.012 | 0.002 | 0.999 | J2 | 0.724 | 0.450 | 0.421 | 0.761
C | Testing | J1 | 0.073 | 0.012 | 0.003 | 0.999 | J2 | 0.725 | 0.451 | 0.393 | 0.779
D | Training | J1 | 0.032 | 0.004 | 0.000 | 1.000 | J2 | 0.734 | 0.456 | 0.432 | 0.753
D | Testing | J1 | 0.044 | 0.005 | 0.001 | 1.000 | J2 | 0.733 | 0.455 | 0.402 | 0.773
A | Training | J3 | 0.628 | 0.375 | 0.187 | 0.902 | J4 | 0.443 | 0.220 | 0.093 | 0.953
A | Testing | J3 | 0.657 | 0.385 | 0.202 | 0.893 | J4 | 0.442 | 0.222 | 0.096 | 0.951
B | Training | J3 | 0.150 | 0.038 | 0.010 | 0.995 | J4 | 0.114 | 0.026 | 0.006 | 0.997
B | Testing | J3 | 0.155 | 0.040 | 0.011 | 0.994 | J4 | 0.146 | 0.035 | 0.010 | 0.995
C | Training | J3 | 0.053 | 0.008 | 0.001 | 0.999 | J4 | 0.047 | 0.007 | 0.001 | 0.999
C | Testing | J3 | 0.061 | 0.010 | 0.002 | 0.999 | J4 | 0.064 | 0.010 | 0.002 | 0.999
D | Training | J3 | 0.032 | 0.003 | 0.000 | 1.000 | J4 | 0.033 | 0.004 | 0.001 | 1.000
D | Testing | J3 | 0.040 | 0.005 | 0.001 | 1.000 | J4 | 0.050 | 0.005 | 0.001 | 0.999
A | Training | J5 | 0.456 | 0.205 | 0.100 | 0.949 | J6 | 0.313 | 0.126 | 0.047 | 0.976
A | Testing | J5 | 0.417 | 0.189 | 0.086 | 0.956 | J6 | 0.295 | 0.117 | 0.044 | 0.978
B | Training | J5 | 0.158 | 0.037 | 0.012 | 0.994 | J6 | 0.110 | 0.024 | 0.006 | 0.997
B | Testing | J5 | 0.146 | 0.030 | 0.011 | 0.995 | J6 | 0.117 | 0.022 | 0.007 | 0.997
C | Training | J5 | 0.069 | 0.010 | 0.002 | 0.999 | J6 | 0.060 | 0.009 | 0.002 | 0.999
C | Testing | J5 | 0.063 | 0.009 | 0.002 | 0.999 | J6 | 0.070 | 0.010 | 0.002 | 0.999
D | Training | J5 | 0.051 | 0.006 | 0.001 | 0.999 | J6 | 0.043 | 0.005 | 0.001 | 1.000
D | Testing | J5 | 0.045 | 0.005 | 0.001 | 0.999 | J6 | 0.052 | 0.005 | 0.001 | 0.999
For the J3 coefficient, Table 5 indicates that Patterns C and D yield comparably favorable results, whereas Patterns A and B produce less accurate estimates. The precision of the XGBoost model in estimating J3 improves as the number of model inputs increases. The lowest testing values of RMSE, MAE, and RSE are 0.040 ×10⁻¹⁰ Pa⁻¹, 0.005 ×10⁻¹⁰ Pa⁻¹, and 0.001, respectively, with the highest R value of 1.00, all obtained with Pattern D. Regarding the J4 coefficient, Pattern D also gives the lowest RMSE, MAE, and RSE values: 0.033 ×10⁻¹⁰ Pa⁻¹, 0.004 ×10⁻¹⁰ Pa⁻¹, and 0.001 in training, and 0.050 ×10⁻¹⁰ Pa⁻¹, 0.005 ×10⁻¹⁰ Pa⁻¹, and 0.001 in testing. Conversely, the highest values of these parameters are observed with Pattern A, indicating that the model's accuracy in estimating the J4 coefficient improves as the number of input signals increases. The correlation coefficient exceeds 0.95 for all four patterns and reaches 0.999 or higher for Patterns C and D, indicating a very strong correlation between the predicted and actual values.
The results for coefficient J5 in Table 5 show that the error metrics (RMSE, MAE, and RSE) are lowest for Pattern D and highest for Pattern A. Patterns C and D therefore provide the most accurate estimates of J5. Similarly, the model's prediction accuracy for coefficient J6 improves as the number of input signals increases. Table 5 shows that the XGBoost model predicts J6 most accurately when using Pattern D: the testing values of RMSE, MAE, and RSE are 0.052 ×10⁻¹⁰ Pa⁻¹, 0.005 ×10⁻¹⁰ Pa⁻¹, and 0.001, respectively, which are the lowest among the patterns evaluated. The R value, which reaches 1.000 in training, further underscores the superiority of Pattern D.
(s) . | (s) . | (s) . | (×10−10Pa−1) . | (×10−10Pa−1) . | (×10−10Pa−1) . | a (m/s) . |
---|---|---|---|---|---|---|
Original values | ||||||
0.04 | 0.7 | 10 | 0.5 | 1.3 | 1 | 400 |
Predicted by XGBoost model | ||||||
0.04 | 0.7 | 10 | 0.51 | 1.28 | 1.19 | 400.12 |
Predicted by ITA | ||||||
0.04 | 0.7 | 10 | 0.59 | 0.82 | 5.71 | 401.58 |
0.04 | 0.7 | 10 | 0.71 | 0.80 | 4.33 | 404.21 |
0.04 | 0.7 | 10 | 0.50 | 1.25 | 1.61 | 399.99 |
Comparison of ITA and XGBoost models
For a better comparison, Table 6 shows the actual and recalibrated values of creep function coefficients using XGBoost and ITA models.
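The comparison in Table 6 can be made concrete by computing the percentage error of each prediction relative to the original value. The sketch below uses the three creep-coefficient columns of Table 6 as transcribed; the variable names and the choice of percentage error as the measure are assumptions made for illustration.

```python
# Values transcribed from Table 6 (three creep-coefficient columns, 1e-10 1/Pa).
ORIGINAL = [0.5, 1.3, 1.0]
XGBOOST = [0.51, 1.28, 1.19]
ITA_ROWS = [
    [0.59, 0.82, 5.71],
    [0.71, 0.80, 4.33],
    [0.50, 1.25, 1.61],
]


def percent_errors(predicted, reference):
    """Absolute error of each prediction as a percentage of the reference value."""
    return [abs(p - r) / r * 100.0 for p, r in zip(predicted, reference)]


print("XGBoost:", [round(e, 1) for e in percent_errors(XGBOOST, ORIGINAL)])
for i, row in enumerate(ITA_ROWS, start=1):
    print(f"ITA row {i}:", [round(e, 1) for e in percent_errors(row, ORIGINAL)])
```

For these values, the XGBoost errors are about 2%, 1.5%, and 19% for the three coefficients, whereas the ITA error on the third coefficient ranges from 61% to 471% across the three listed rows.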
Furthermore, in terms of computational efficiency, the XGBoost model, once trained, can be applied multiple times, whereas the ITA model requires repetition for each determination of the creep function. Given that the computational demand of the ITA model escalates with system complexity – unlike the XGBoost model – the latter exhibits superior capabilities in terms of computational cost. Future studies may delve into the resilience of these methods against noise, system defects, and other uncertainties in greater detail.
This study shows that the proposed ML-based framework can be applied successfully to estimate the creep function coefficients of viscoelastic pipes in transient flow. Future work can therefore adapt these methods to other transient flow analyses (e.g., leak detection) under various conditions (e.g., deterministic and probabilistic). These models can potentially supplement numerical and analytical techniques, or be used to check intricate water hammer analyses when computation time is limited.
CONCLUSIONS
In this study, a novel approach was proposed that used XGBoost, an efficient ML method, to accurately predict the creep function coefficients of viscoelastic pipes under transient flow conditions. A numerical model based on the MOC was developed and automatically calibrated, and a large set of pipe specifications was considered. Then, the hydraulic response of an RPV system was calculated using these specifications. The XGBoost model used the measured signals as the input and estimated the coefficients of the creep function. The outcomes of this study are as follows:
- A dataset was generated by modeling the system under transient conditions. It is worth noting that transient waves carry more information about the system specifications and faults than steady-state measurements. The proposed method showed promising performance in using transient data with an ML approach to estimate the creep function coefficients accurately.
- The proposed scheme requires most of its computational effort in the training stage. After this stage, it can instantly process new measurements and predict new creep function coefficients. Therefore, unlike conventional methods, it does not need any complicated and lengthy optimization for real field application.
- The feature selection revealed that each creep function coefficient depends on specific inputs and requires more than one input type for estimation.
Moreover, the study demonstrated the superior efficiency of the proposed ML-based model in comparison with the numerical ITA technique for estimating the creep function coefficients of viscoelastic pipes. However, it is important to note that ML models of this kind (e.g., support vector regression, artificial neural networks, and random forest) operate as black boxes, meaning that the relationship between input and output variables is not explicit. This limitation is particularly significant in cases where access to computers and soft computing models is restricted. Therefore, it is recommended that future research focus on employing white-box methods (e.g., model tree, gene expression programming, and multivariate adaptive regression splines) to establish explicit relationships for estimating creep function coefficients in viscoelastic pipes. This approach would facilitate proper management of the complex and nonlinear water hammer phenomenon.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.