## Abstract

The non-revenue water (NRW) ratio is a key parameter for evaluating the performance of water distribution systems. To evaluate the NRW ratio, the variables that influence it must first be determined. The first aim of this paper is therefore to identify the variables that influence the estimation of the NRW ratio and to analyze them with the artificial neural network (ANN) methodology through 50 models with one, two, three and four input variables. Second, the NRW ratio has been predicted for the first time with the Kriging methodology, using only two variables. Using data measured in 12 district meter areas (DMAs) in Kocaeli, 60 models in total have been established for NRW ratio prediction with the ANN and Kriging methodologies. ANN models are closed-box models, so interpreting their results requires a higher level of expert opinion. The results show that the Kriging model graphs provide much more useful information than the ANN models in terms of application and interpretation.

## HIGHLIGHTS

The variables and combinations that have an impact on the prediction of NRW ratios have been investigated by using the ANN methodology.

The mean age of pipes and the mean pressure of the network have a greater impact on NRW prediction.

The Kriging method has been used for the first time in this study for NRW ratio prediction.

Kriging model results are much better than those of the ANN models.

## LIST OF ABBREVIATIONS

**Acronym** **Definition**

- ANN
Artificial neural network

- AWWA
American Water Works Association

- BI
Bias

- CC
Correlation coefficient

- DMA
District meter area

- FR
Failure ratio

- IBNET
International Benchmarking Network

- IWA
International Water Association

- MAP
Mean age of pipe

- MDP
Mean diameter of pipe

- MPN
Mean pressure of network

- MRA
Multiple regression analysis

- MSE
Mean square error

- NF
Number of failures

- NJ
Number of junctions

- NL
Network length

- NRW
Non-revenue water

- NRWR
Non-revenue water ratio

- NSC
Number of service connections

- NSE
Nash–Sutcliffe efficiency

- PMS
Pressure management system

- PRV
Pressure-reducing valve

- *R*^{2}
Coefficient of determination

- SCL
Service connection length

- SIV
System input volume

- ST
Storage tank

- WM
Water meter

## INTRODUCTION

Electricity, telephone lines, natural gas, fiber optics and water distribution systems are essential infrastructures in mega-cities. In particular, the water distribution system is one of the most important components of infrastructure systems (Shuang *et al.* 2017). The infrastructure of a water distribution system (WDS) consists of various components such as transmission and distribution lines, water supply pump stations, storage tanks, pumps, valves, fire hydrants, air release valves and drain valves. Random failures in these components lead to critical water losses and reduce the quality of service. Ensuring the reliability and sustainability of water distribution systems is necessary for people's quality of life. Therefore, innovative approaches are needed to decrease and control the NRW ratio in WDS.

Analyses of water distribution systems show that water losses of varying magnitude result from physical losses, water meter measurement errors, illegal use of water, lack of control over the operating pressure level, topography and consumption patterns (Rajani & Kleiner 2001; González-Gómez *et al.* 2011). The water loss, water balance and water loss performance indicators suggested by the American Water Works Association (AWWA) and the International Water Association (IWA) can be used to evaluate water distribution systems (AWWA 2003). The term non-revenue water (NRW), which is predicted and discussed with the models in this study, was first suggested by the IWA and consists of apparent losses, real losses and unbilled authorized consumption.

It is estimated that the annual total NRW in the world is roughly 126 billion m^{3} and that its financial cost is roughly 39 billion dollars (Liemberger & Wyatt 2019). A high level of NRW is an unacceptable situation for water management. In developed countries the NRW ratio is below 10%, whereas in developing countries it is between 20% and 60% (IBNET 2018). A high NRW ratio has a negative impact on the budget of water management and on investment plans in the short and long term. For this reason, attempts to reduce the NRW level in water distribution systems have gained importance, and innovative approaches have been suggested and prioritized in many countries.

Analyses of the losses that make up the NRW ratio show the impact of real losses, such as physical losses caused by random pipe bursts, leakages, pressure fluctuations in the network and pump failures caused by power cuts; of apparent losses, such as administrative losses caused by water meter errors; and of many uncertainties.

Leakage is one of the most significant indicators of real losses. Leakage in a network depends on the pressure: high pressure results in high leakage and thereby in high non-revenue water (Kanakoudis & Muhammetoglu 2014). Cascading failures and severe water losses are most likely to occur in high-pressure networks. Therefore, pressure management system (PMS) applications have been preferred as a cost-effective method to reduce operation costs, to increase service quality, and to minimize pipe bursts and leakages in water supply networks (Kanakoudis & Gonelas 2014; Kanakoudis & Gonelas 2016a, 2016b). An ideal PMS may be designed by installing pressure-reducing valves (PRVs) and establishing district meter areas (DMAs) (Kanakoudis & Gonelas 2016b; Patelis *et al.* 2016). Pressure-reducing valves help to reduce water losses in parallel with the network pressure (Patelis *et al.* 2017). It is also possible to reduce water losses by monitoring the night flow rate in district meter areas. Another method is partial network replacement to prevent leakages in a DMA. However, replacing worn-out parts of the network does not address the losses occurring in every part of the network. Consequently, the construction of district meter areas and pressure management systems is important and necessary for reducing leakage in water distribution systems and for bringing the NRW down to acceptable levels.

Studies in this field show that mostly statistical and stochastic methods have been developed (Kanakoudis & Tsitsifli 2012; Kanakoudis *et al.* 2013, 2015, 2016; van den Berg 2015; Tsitsifli *et al.* 2017; Güngör-Demirci *et al.* 2018; Tabesh *et al.* 2018). There are few studies in the literature on NRW ratio prediction. For example, Jang & Choi (2018) developed models through multiple regression analysis (MRA) and artificial neural networks (ANNs) in order to predict the NRW ratio. In the ANN model, the input variables are as follows: mean pipe diameter, water supply quantity per demand junction, pipe length per demand junction, deteriorated pipe ratio, demand energy ratio, and number of leaks. When the models developed with the ANN and MRA methods were compared, the *R*^{2} value of the ANN approach was 0.63 and the *R*^{2} value of the MRA approach was 0.19. In another study, the NRW ratio of Incheon (Republic of Korea) was predicted via the ANN method by using specific variables that have an impact on leakage in the water distribution system (Jang & Choi 2017). The inputs of these models consist of variables such as water demand quantity per junction, deteriorated pipe ratio and demand energy ratio. In these models, the hidden layers were designed with 10, 20 and 30 neurons, and the best model result was obtained with 20 neurons, with *R*^{2} = 0.397.

In this study, the variables and combinations that have an impact on the prediction of NRW ratios have been investigated with the ANN methodology by building 50 models with one, two, three and four input variables. In addition, the Kriging method has been used for the first time for NRW ratio prediction, and its results have been evaluated together with those of the ANN models.

## STUDY AREA AND DATA

The data for the models were collected in Kocaeli, an industrial city. The total length of the drinking water distribution system is 8,936 km and the number of service connections is 796,577. The water supplied by the Administration of Water and Sewage was used by 1,883,270 people in total in 2018. The infrastructure of the water distribution system of the city is controlled via a SCADA system. Throughout the city, there are 195 drinking water storage tanks, 105 drinking water supply pump stations and 11 drinking water treatment plants. In addition, the city has 12 main DMA regions.

At the end of 2018, the NRW ratio of Kocaeli was calculated for the first time by taking into consideration the water balance components suggested by Alegre *et al.* (2016). According to the records and calculations, real losses were 24.79%, apparent losses 6.03%, unbilled authorized consumption 1.49% and billed authorized consumption 67.69% (Table 1).

| System input volume | Authorized consumption / water losses | Component | Sub-component | Revenue / non-revenue water |
|---|---|---|---|---|
| System input volume (SIV): 163,627,918 m^{3}/year (100%) | Authorized consumption 69.18% | Billed authorized consumption 67.69% | Billed metered consumption 67.35% | Revenue water 67.69% |
| | | | Billed unmetered consumption 0.34% | |
| | | Unbilled authorized consumption 1.49% | Unbilled metered consumption 0.64% | Non-revenue water (NRW) 32.31% |
| | | | Unbilled unmetered consumption 0.86% | |
| | Water losses 30.82% | Apparent losses 6.03% | Unauthorized consumption 1.18% | |
| | | | Authorized consumption errors 4.85% | |
| | | Real losses 24.79% | Leakage on transmission and distribution mains and service connections 24.63% | |
| | | | Leakage and overflows at storage tanks 0.16% | |

The NRW ratio of Kocaeli, which is the difference between the SIV and the billed authorized consumption, is 32.31%. Metering errors of water meters account for 4.85% within the apparent losses. To ensure the accuracy of the NRW figures, the accuracy of meters of different diameter, age and model selected from the DMAs was measured at the Weights and Measures Center of the Ministry of Industry and Trade located in Kocaeli, at 50% flow (1/2 flow rate), 20% flow (1/5 flow rate) and 5% flow (1/20 flow rate). The loss amounts due to measurement errors for all DMAs in Kocaeli were calculated by taking these measurement results as a reference. In addition, end-of-life meters (10–15 years), defective meters and meters found to be very faulty according to the measurement results have been replaced in order to keep the measurements reliable and to reduce the effect of apparent losses.
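The water balance arithmetic can be checked with a short sketch. The percentages and the SIV are taken from Table 1; the variable and function names are illustrative only and are not part of the study.

```python
# Water balance check for Kocaeli (2018), values in percent of SIV (Table 1).
# Names below are illustrative; they do not come from the paper.
SIV_M3_PER_YEAR = 163_627_918

billed_authorized = 67.69
unbilled_authorized = 1.49
apparent_losses = 6.03
real_losses = 24.79


def nrw_percent(billed_pct: float) -> float:
    """NRW as the share of SIV that is not billed authorized consumption."""
    return 100.0 - billed_pct


# Route 1: SIV minus billed authorized consumption.
nrw_from_difference = nrw_percent(billed_authorized)

# Route 2: sum of the three NRW components.
nrw_from_components = unbilled_authorized + apparent_losses + real_losses

# Corresponding annual volume in cubic metres.
nrw_volume_m3 = SIV_M3_PER_YEAR * nrw_from_difference / 100.0

print(f"NRW (difference): {nrw_from_difference:.2f}%")
print(f"NRW (components): {nrw_from_components:.2f}%")
print(f"NRW volume: {nrw_volume_m3 / 1e6:.1f} million m3/year")
```

Both routes give the same ratio because the NRW is, by definition, everything in the water balance that is not billed.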

The model input parameters for the NRW ratio are as follows: system input volume (SIV), network length (NL), number of failures (NF), mean pressure of networks (MPN), mean age of pipes (MAP), failure ratio (FR = NF/NL), mean diameter of pipes (MDP), storage tank (ST), number of service connections (NSC), service connection length (SCL), number of water meters (WM) and number of junctions (NJ). The values of these parameters in Kocaeli for the year 2018 are presented in Table 2. A closer look at Table 2 shows that DMA 1 and DMA 5 stand out in terms of system input volume (m^{3}), number of service connections, service connection length, number of water meters and number of junctions. On the other hand, DMA 10 has the shortest network length and the least water consumption. The network length of DMA 2 is the longest at 1,600 km. According to the 2018 analyses, the highest network pressure was measured in DMA 5 and the lowest in DMA 3. The highest number of annual failures per km of pipe (FR = 1.69) was detected in DMA 4 and the lowest (FR = 0.13) in DMA 12. The greatest mean pipe diameter is 159 mm in DMA 3, and the smallest is 108 mm in DMA 8. DMA 11 is the newest DMA and has the lowest non-revenue water ratio, 23.01%. In contrast, the highest non-revenue water ratio is seen in DMA 3. Finally, the oldest district meter area is DMA 4.
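As a quick consistency check, the tabulated FR values correspond to the number of failures divided by the network length, that is, annual failures per km. The sketch below recomputes FR for four DMAs from the NF and NL columns of Table 2, reading the dots in those columns as thousands separators; the dictionary and function names are illustrative.

```python
# (NF, NL in km, tabulated FR) for selected DMAs, copied from Table 2.
# Names are illustrative; they do not come from the paper.
table2 = {
    1: (1106, 1114, 0.99),
    2: (900, 1600, 0.56),
    4: (1153, 680, 1.69),
    12: (48, 378, 0.13),
}


def failure_ratio(number_of_failures: int, network_length_km: float) -> float:
    """Annual failures per km of pipe."""
    return number_of_failures / network_length_km


computed_fr = {dma: failure_ratio(nf, nl) for dma, (nf, nl, _) in table2.items()}

for dma, (nf, nl, tabulated) in table2.items():
    print(f"DMA {dma}: computed {computed_fr[dma]:.3f}, tabulated {tabulated:.2f}")
```

The computed values agree with the table to within rounding in the second decimal place.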

| DMA | SIV (10^{5} × m^{3}) | NL (km) | MPN (m) | MAP (year) | FR (–) | MDP (mm) | ST (m^{3}) | NSC (number) | SCL (number) | WM (number) | NJ (number) | NF (number) | NRWR (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 30.8 | 1,114 | 54 | 12 | 0.99 | 134 | 85,901 | 51,735 | 527 | 160,135 | 52,565 | 1,106 | 29.74 |
| 2 | 6.6 | 1,600 | 56 | 14 | 0.56 | 121 | 19,250 | 31,496 | 388 | 32,487 | 27,110 | 900 | 34.05 |
| 3 | 14.1 | 918 | 43 | 19 | 1.21 | 159 | 15,910 | 29,783 | 342 | 51,962 | 32,380 | 1,114 | 44.36 |
| 4 | 14.7 | 680 | 58 | 30 | 1.69 | 125 | 29,730 | 22,738 | 225 | 66,121 | 25,841 | 1,153 | 32.28 |
| 5 | 29.8 | 1,050 | 64 | 16 | 0.64 | 144 | 36,420 | 45,302 | 410 | 147,275 | 45,513 | 672 | 25.40 |
| 6 | 8.9 | 854 | 53 | 22 | 0.57 | 119 | 40,880 | 21,324 | 232 | 43,254 | 27,354 | 484 | 28.50 |
| 7 | 14.2 | 803 | 60 | 24 | 0.70 | 124 | 28,800 | 21,549 | 225 | 70,595 | 21,867 | 560 | 37.02 |
| 8 | 5.1 | 443 | 45 | 27 | 0.61 | 108 | 17,615 | 12,022 | 134 | 29,018 | 9,660 | 269 | 41.10 |
| 9 | 11.1 | 447 | 47 | 26 | 1.26 | 125 | 14,705 | 17,319 | 171 | 55,088 | 18,766 | 563 | 40.11 |
| 10 | 6.2 | 310 | 51 | 18 | 1.42 | 125 | 14,750 | 9,657 | 134 | 15,471 | 10,216 | 439 | 26.16 |
| 11 | 9.4 | 339 | 49 | 8 | 0.51 | 133 | 12,000 | 11,873 | 130 | 49,244 | 17,483 | 173 | 23.01 |
| 12 | 12.2 | 378 | 57 | 25 | 0.13 | 121 | 4,000 | 21,927 | 197 | 75,927 | 24,282 | 48 | 26.01 |

## METHODOLOGY

In this study, the NRW ratios of 12 different DMA regions have been predicted with models developed for different combinations of input variables by using the artificial neural network (ANN) and Kriging methods. The model inputs used for NRW ratio prediction are as follows: system input volume, network length, service connection length, mean diameter of pipes, mean age of pipes, mean pressure of networks, number of service connections, number of junctions, storage tank, water meter and failure ratio. With the ANN method, single-input/single-output models were designed first, and then two-, three- and four-input models were developed by increasing the number of inputs. With the Kriging method, two-input/single-output models were developed.

### Artificial neural networks (ANNs)

Artificial neural networks (ANNs) consist of mathematical algorithms with learning ability. ANN methods offer many key features, for example the ability to learn from data, recognition, prediction, classification, generalization and working with an unlimited number of variables. In its simplest form, the ANN architecture consists of three main layers: input, hidden and output layers, as shown in Figure 1 (Piotrowski *et al.* 2015; Salami Shahid & Ehteshami 2016). Each layer has a different number of neurons, and the neurons of each layer are connected to the neurons of the previous layer through different weight values.

One of the most important features of ANNs is that they can be trained. Through training, the connection weights are determined by means of an appropriate learning method. Learning is carried out by repeatedly adjusting each connection weight in the neural network (Negnevitsky 2002).

A feed forward back propagation algorithm, which is frequently preferred as a network type in training, has been used in this study. The Levenberg–Marquardt back propagation algorithm has been selected because learning with this algorithm is often faster than with its alternatives (Kermani *et al.* 2005; Haykin 2008; Kızılöz *et al.* 2015). The Levenberg–Marquardt algorithm is essentially based on the nonlinear least squares approach and the maximum neighborhood principle. The errors between the expected values and the outputs generated by the network are propagated backward through the network, and the weights are readjusted according to these errors. This is repeated until the model results closest to the expected values are obtained.

In addition, the data in this method has been divided into training (55%), validation (35%) and testing (10%) sets. A three-layer feed forward back propagation (FFBP) network was considered the most appropriate ANN architecture, with a sigmoid function in the hidden layer and a linear function in the output layer, and all data has been trained in accordance with this structure. The ANN models have been designed with one hidden layer. There is no general rule for determining the number of neurons in the hidden layer, so the number of neurons was increased in an attempt to obtain the highest model accuracy.
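For orientation, the weight update commonly associated with the Levenberg–Marquardt algorithm can be written as below. This is the standard textbook form rather than a formula quoted from this study, and the symbols ($\mathbf{w}_{k}$ for the weight vector at iteration $k$, $\mathbf{J}$ for the Jacobian of the network errors with respect to the weights, $\mathbf{e}$ for the error vector and $\mu$ for the damping parameter) are introduced here for illustration.

$$\mathbf{w}_{k+1} = \mathbf{w}_{k} - \left(\mathbf{J}^{\top}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e}$$

When $\mu$ is small the step approaches a Gauss–Newton step, and when $\mu$ is large it approaches a short gradient-descent step, which is one way to read the combination of nonlinear least squares with a restricted neighborhood.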
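The network structure and the data split described above can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB implementation used in the study: the function names, the rounding rule for the split and the example weights are all assumptions, and no training step is included.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> sigmoid hidden layer -> linear output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def split_indices(n, train=0.55, val=0.35, seed=0):
    """Random 55/35/10 split; rounding to whole samples is an assumption."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train * n)
    n_val = round(val * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


# Hypothetical single-input network with four hidden neurons.
w_hidden = [[0.0], [0.0], [0.0], [0.0]]
b_hidden = [0.0, 0.0, 0.0, 0.0]
w_out = [1.0, 1.0, 1.0, 1.0]
b_out = 0.1

prediction = forward([0.5], w_hidden, b_hidden, w_out, b_out)
train_idx, val_idx, test_idx = split_indices(12)
print(prediction, len(train_idx), len(val_idx), len(test_idx))
```

With only 12 DMAs the three subsets are very small, so how fractional sample counts are rounded has a visible effect on the split.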

### Kriging modeling

*et al.* 2004). The weights have been determined with the condition that the estimation error is the minimum. The method uses a weight model in classical statistical theory which ensures being affected by the close points, similar to the weighted average method (Krige 1966). The Kriging method has been defined through Equation (1) (Yeşilkanat *et al.* 2015):

$$Z^{*}(x_{0}) = \sum_{i=1}^{N} \lambda_{i}\, Z(x_{i}) \qquad (1)$$

where $Z^{*}(x_{0})$ is the unknown but predicted value at the point $x_{0}$, $\lambda_{i}$ is the weight value corresponding to each $Z(x_{i})$ used in the calculation of $Z^{*}(x_{0})$, $Z(x_{i})$ is the measured data used in the prediction of $Z^{*}(x_{0})$ and *N* is the number of points used in the calculation of $Z^{*}(x_{0})$.
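To make the weighted-sum idea concrete, the sketch below implements point ordinary kriging with a linear variogram in plain Python. It is an illustration under stated assumptions, not the Surfer procedure used in the study: the linear variogram with unit slope, the helper names and the four sample points are all hypothetical, and no scaling between the two input axes is applied.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def variogram(h, slope=1.0):
    """Linear variogram (an assumption made for this sketch)."""
    return slope * h


def ordinary_kriging(points, values, target):
    """Return the kriging estimate at target and the weights of the data."""
    n = len(points)
    # Variogram matrix bordered by the unbiasedness constraint (weights sum to 1).
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights


# Hypothetical (x, y) inputs and NRW ratios (as fractions) at four points.
sample_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
sample_values = [0.30, 0.34, 0.44, 0.32]

estimate, weights = ordinary_kriging(sample_points, sample_values, (0.5, 0.5))
print(estimate, weights)
```

At the center of this symmetric layout every point receives the same weight, and at a measured location the estimate reproduces the measurement, which is the exact-interpolation property usually expected of point kriging.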

As in classical statistics, the data is usually expected to follow the normal distribution in geostatistical analyses (Clark 1979). Models developed from normally distributed data give the best results. For this reason, it is recommended first to check whether the data follows the normal distribution and, if it does not, to apply appropriate transformations that bring it closer to the normal distribution.

The literature contains many recent studies on variants of the Kriging method: Simple Kriging (Elbasiouny *et al.* 2014), Ordinary Kriging (Sanusi *et al.* 2014), Universal Kriging (Lark *et al.* 2014), Co-Kriging (Chica-Olmo *et al.* 2014), Indicator Kriging (Armstrong 1998), Punctual Kriging and Block Kriging. In this study, the Point Kriging and Block Kriging approaches available in the Surfer software have been used. Both perform gridding through interpolation. In the point technique, the probable points in the grid zones enter into the process. In the block technique, however, a rectangular block is created around each grid zone and only the average of the probable points within this block is included in the gridding process.

### Model efficiency formulation

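The model performance has been evaluated with statistical criteria including the Nash–Sutcliffe efficiency (NSE), the mean square error (MSE), the correlation coefficient (CC) and the coefficient of determination (*R*^{2}).

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(X_{m,i} - X_{p,i}\right)^{2}}{\sum_{i=1}^{n}\left(X_{m,i} - \bar{X}_{m}\right)^{2}}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(X_{p,i} - X_{m,i}\right)^{2}$$

$$\mathrm{CC} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_{m,i} - \bar{X}_{m}\right)\left(X_{p,i} - \bar{X}_{p}\right)}{\sigma_{m}\,\sigma_{p}}$$

$$R^{2} = \mathrm{CC}^{2}$$

where $X_{p,i}$ is the model prediction, $X_{m,i}$ is the measurement, $\bar{X}_{p}$ is the arithmetic average of the model predictions, $\bar{X}_{m}$ is the arithmetic average of the measurements, *n* is the total number of data, $\sigma_{m}$ is the standard deviation of the measurements and $\sigma_{p}$ is the standard deviation of the predictions.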

NSE takes values between minus infinity and 1. If the predictions match the measurements exactly, this value is 1. The MSE criterion, on the other hand, takes values between 0 and plus infinity, and its best value is 0. For *R*^{2}, the best prediction corresponds to a value of 1. Finally, CC takes values between −1 and +1, and the most successful predictions are obtained as CC approaches 1.
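The criteria discussed above can be computed with a few lines of plain Python. The sketch uses the standard definitions of NSE, MSE and the Pearson correlation coefficient and takes *R*^{2} as the square of CC, which is consistent with the values reported in the result tables; the function names and the small example series are illustrative.

```python
import math


def mse(measured, predicted):
    """Mean square error; 0 is a perfect fit."""
    return sum((p - m) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def nse(measured, predicted):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    mean_m = sum(measured) / len(measured)
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    spread = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / spread


def cc(measured, predicted):
    """Pearson correlation coefficient between measurements and predictions."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def r_squared(measured, predicted):
    """Coefficient of determination, taken here as CC squared."""
    return cc(measured, predicted) ** 2


# Hypothetical measured and predicted series.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(mse(measured, predicted), nse(measured, predicted),
      cc(measured, predicted), r_squared(measured, predicted))
```

Note that NSE penalizes bias as well as scatter, so a model can have a high CC and still a lower NSE if its predictions are systematically shifted.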

## RESULTS AND DISCUSSION

Before the models were set up, the distribution of the data was analyzed. The normal Q-Q plot given in Figure 2 demonstrates that the research data is consistent with the normal distribution.
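A normal Q-Q check of this kind can be sketched with the Python standard library. The example below applies it to the NRWR column of Table 2 only, which is an assumption, since the exact data set behind Figure 2 is not specified here; the plotting positions $(i + 0.5)/n$ and the function names are also choices made for this illustration.

```python
from statistics import NormalDist

# NRW ratios (%) of the 12 DMAs, copied from Table 2.
nrwr = [29.74, 34.05, 44.36, 32.28, 25.40, 28.50,
        37.02, 41.10, 40.11, 26.16, 23.01, 26.01]


def qq_points(sample):
    """Pair sorted sample values with standard normal quantiles."""
    n = len(sample)
    ordered = sorted(sample)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def pearson(x, y):
    """Pearson correlation, written out to avoid version-specific helpers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


theoretical, ordered = qq_points(nrwr)
r_qq = pearson(theoretical, ordered)
print(f"Q-Q correlation: {r_qq:.3f}")
```

A correlation close to 1 between the ordered data and the theoretical quantiles means the Q-Q points lie near a straight line; with 12 samples this is only an informal indication, not a formal normality test.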

### Artificial neural networks

For the prediction of NRW ratios, single-input/single-output ANN models were first set up in order to determine the suitable model variables. Then, model combinations with one, two, three and four inputs and a single output were investigated. The data was divided into 55% training, 35% validation and 10% testing sets through the existing algorithm in the MATLAB program. Before each training run, the models should be reset to their initial state with random starting weights and biases (Coulibaly *et al.* 2000; Kızılöz *et al.* 2015). Four neurons have been used in the hidden layer of the models. A large number of ANN models has been created for the prediction of the NRW ratio. The performance values of the model results are given in Tables 3–6.

| Model no | Input combinations | R^{2} | MSE | NSE | CC |
|---|---|---|---|---|---|
| ANN.1.1 | NL | 0.66 | 0.0027 | 0.659 | 0.812 |
| ANN.1.2 | NL/MPN | 0.63 | 0.0029 | 0.625 | 0.793 |
| ANN.1.3 | ST | 0.55 | 0.0035 | 0.552 | 0.745 |
| ANN.1.4 | ST/NJ | 0.53 | 0.0037 | 0.531 | 0.730 |
| ANN.1.5 | MDP | 0.52 | 0.0038 | 0.511 | 0.718 |
| ANN.1.6 | MPN | 0.46 | 0.0043 | 0.450 | 0.680 |
| ANN.1.7 | NL/MAP | 0.46 | 0.0043 | 0.454 | 0.674 |
| ANN.1.8 | NSC | 0.45 | 0.0044 | 0.441 | 0.669 |
| ANN.1.9 | SCL | 0.40 | 0.0048 | 0.388 | 0.631 |

| Model no | Input combinations | R^{2} | MSE | NSE | CC |
|---|---|---|---|---|---|
| ANN.2.1 | ST/NJ–NL | 0.73 | 0.0021 | 0.728 | 0.854 |
| ANN.2.2 | SIV/NL–WM | 0.67 | 0.0027 | 0.660 | 0.817 |
| ANN.2.3 | ST/NJ–FR | 0.63 | 0.0029 | 0.627 | 0.796 |
| ANN.2.4 | NL/MPN-MAP | 0.62 | 0.0030 | 0.622 | 0.789 |
| ANN.2.5 | SIV-MPN | 0.61 | 0.0031 | 0.608 | 0.780 |
| ANN.2.6 | NL-MPN | 0.61 | 0.0031 | 0.610 | 0.783 |
| ANN.2.7 | MPN-FR | 0.61 | 0.0031 | 0.606 | 0.780 |
| ANN.2.8 | SIV/NL-MPN | 0.59 | 0.0032 | 0.591 | 0.770 |
| ANN.2.9 | MDP-MPN | 0.55 | 0.0036 | 0.539 | 0.738 |
| ANN.2.10 | SIV/NJ-MPN | 0.55 | 0.0036 | 0.536 | 0.740 |
| ANN.2.11 | SIV/NJ–MDP | 0.54 | 0.0037 | 0.530 | 0.738 |
| ANN.2.12 | MAP-FR | 0.52 | 0.0037 | 0.523 | 0.725 |
| ANN.2.13 | NL/MAP-MPN | 0.50 | 0.0040 | 0.488 | 0.714 |

| Model no | Input combinations | R^{2} | MSE | NSE | CC |
|---|---|---|---|---|---|
| ANN.3.1 | ST/NJ-NL-MAP | 0.76 | 0.0019 | 0.754 | 0.869 |
| ANN.3.2 | NL-SCL-MPN | 0.76 | 0.0019 | 0.752 | 0.868 |
| ANN.3.3 | NL-MPN-MAP | 0.75 | 0.0020 | 0.744 | 0.865 |
| ANN.3.4 | NL/MPN-NSC-MAP | 0.75 | 0.0020 | 0.744 | 0.863 |
| ANN.3.5 | NL/MPN-MDP-MAP | 0.75 | 0.0020 | 0.747 | 0.865 |
| ANN.3.6 | ST/NJ-NL-MPN | 0.74 | 0.0021 | 0.736 | 0.860 |
| ANN.3.7 | SIV-NL-MPN | 0.73 | 0.0021 | 0.728 | 0.853 |
| ANN.3.8 | SIV-NL-MAP | 0.72 | 0.0022 | 0.722 | 0.850 |
| ANN.3.9 | SIV/NL-MDP-MPN | 0.72 | 0.0023 | 0.713 | 0.846 |
| ANN.3.10 | SIV/NL-MPN-MAP | 0.72 | 0.0022 | 0.719 | 0.849 |
| ANN.3.11 | SIV/NJ-MPN-MAP | 0.71 | 0.0023 | 0.709 | 0.845 |

| Model no | Input combinations | R^{2} | MSE | NSE | CC | d |
|---|---|---|---|---|---|---|
| ANN.4.1 | ST/NJ-NL-MPN-MAP | 0.75 | 0.0020 | 0.748 | 0.865 | 0.924 |
| ANN.4.2 | NL-MPN-MAP-MDP | 0.75 | 0.0021 | 0.736 | 0.865 | 0.926 |
| ANN.4.3 | NL-SCL-MPN-MAP | 0.75 | 0.0020 | 0.743 | 0.865 | 0.924 |
| ANN.4.4 | SIV-NL-MPN-MAP | 0.74 | 0.0021 | 0.738 | 0.860 | 0.918 |
| ANN.4.5 | SIV-NL-MPN-MDP | 0.74 | 0.0020 | 0.740 | 0.860 | 0.921 |
| ANN.4.6 | NL-NSC-MPN-MAP | 0.74 | 0.0020 | 0.740 | 0.861 | 0.924 |
| ANN.4.7 | SCL-MPN-MAP-MDP | 0.72 | 0.0023 | 0.713 | 0.846 | 0.914 |
| ANN.4.8 | SIV/NL-MDP-MPN-MAP | 0.72 | 0.0022 | 0.715 | 0.847 | 0.907 |
| ANN.4.9 | NJ-MAP-MPN-MDP | 0.72 | 0.0022 | 0.718 | 0.848 | 0.913 |
| ANN.4.10 | SIV-NJ-MPN-MDP | 0.71 | 0.0024 | 0.697 | 0.843 | 0.915 |

First, the outputs of the ANN models developed for the single-input parameters that have an impact on the NRW ratio can be seen in Table 3. Sixteen different models in total have been developed for each of 12 variables given in Table 2 and for five parameters derived from these variables, namely ST/NJ, SIV/NL, NL/MPN, SIV/NJ and NL/MAP. The models that perform best in terms of the statistical criteria for the estimation of NRW are ranked in Table 3. Among the single-input models, the most effective variables are, in order, NL, NL/MPN, ST, ST/NJ and MDP.

For the two-input models, the best 13 models have been ranked in terms of five performance evaluation benchmarks and are given in Table 4. According to this ranking, the best model is the ST/NJ-NL combination. A comparison of the two-input models with the single-input models shows that the two-input models perform better.

Among the three-input ANN models, a large number of combinations perform better still. The 11 models given in Table 5 are not distinctly separated from each other in terms of the *R*^{2}, MSE, NSE, CC and *d* benchmarks. The NRW estimates of the three-input models are markedly better than those of the two-input models.

When the performances of the four-input combinations given in Table 6 and of the three-input combinations are evaluated together, it can be seen that adding a further input to the three-input combinations does not considerably improve the prediction of the NRW ratio.

The graphs of the two combinations with the best measurement–prediction relationship among all ANN models are given in Figure 3.

A comparative evaluation of all models shows that, in general, the best predictions have been made with the three-input models. When the performances of the three-input and four-input models given in Tables 5 and 6 are compared, no significant improvement is observed in the four-input models. On the other hand, the two-input models cannot adequately explain the NRW predictions. In addition, the closed-box structure of the ANN method does not allow a detailed evaluation. Analysis of the best models leads to the conclusion that variables such as the storage tank, the number of junctions, the network length, the mean age of pipes, the service connection length and the mean pressure of networks have a greater impact on NRW prediction.
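To give a sense of the size of the search space, the sketch below counts the possible input combinations. It assumes 16 candidate inputs, namely the 11 model inputs listed in the Methodology section plus the five derived ratios, which matches the sixteen single-input models reported; the list names are illustrative.

```python
from itertools import combinations
from math import comb

# Assumed candidate inputs: 11 measured variables plus 5 derived ratios.
base_inputs = ["SIV", "NL", "SCL", "MDP", "MAP", "MPN", "NSC", "NJ", "ST", "WM", "FR"]
derived_inputs = ["ST/NJ", "SIV/NL", "NL/MPN", "SIV/NJ", "NL/MAP"]
candidates = base_inputs + derived_inputs

# Number of distinct input sets of each size.
counts = {k: comb(len(candidates), k) for k in (1, 2, 3, 4)}

# The two-input sets can also be enumerated explicitly.
two_input_sets = list(combinations(candidates, 2))

print(counts)
print(two_input_sets[:3])
```

Since the study reports 50 ANN models in total, the tabulated multi-input models appear to be a selection from this much larger space rather than an exhaustive search.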

### Kriging method

All data has been set according to the *XYZ* coordinate system and three-dimensional Kriging prediction maps (with high accuracy and different grid zones) have been obtained. In Kriging models, the first parameters have been represented on the *x*-axis and the second ones on the *y*-axis. In graphs, the nodes of Kriging Model 1 are given as *x*-direction 100, *y*-direction 95 and the nodes of Kriging Model 2 as *x*-direction 88, *y*-direction 100.

Cross-validation has been used for all gridding methods. Cross-validation helps to assess the relative quality of the grid by computing and investigating the gridding errors. In all Kriging models, cross-validation has been performed on the linear *Z* values, not on the transformed *Z* values. The NRW ratio model outputs have been obtained on the *Z* coordinate of the prediction maps developed with Surfer (2016). The Kriging model performance results are given in Table 7. The prediction maps developed for the best Kriging models (nos 1 and 2) are given in Figures 4 and 5. These maps make it easier to evaluate the relationship between the input variables and the NRW ratios.
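The idea behind cross-validating a Kriging model can be sketched as follows, assuming a leave-one-out scheme: each sample is withheld in turn, predicted from the remaining samples, and compared with its measured value. The linear variogram, the function names and the synthetic data are assumptions made for illustration; the study relied on Surfer's own cross-validation routine.

```python
import numpy as np

def krige_point(xy, z, target, slope=1.0):
    """Ordinary-Kriging estimate at one target (linear variogram assumed)."""
    n = len(z)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = slope * np.linalg.norm(xy[:, None] - xy[None, :], axis=2)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = slope * np.linalg.norm(xy - target, axis=1)
    weights = np.linalg.solve(A, b)[:n]   # drop the Lagrange multiplier
    return weights @ z

def loo_cross_validation(xy, z):
    """Predict every sample from all the other samples."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    predictions = np.empty_like(z)
    for i in range(len(z)):
        keep = np.arange(len(z)) != i     # withhold sample i
        predictions[i] = krige_point(xy[keep], z[keep], xy[i])
    return predictions

# Synthetic stand-in data: two scaled inputs and an NRW-like ratio.
rng = np.random.default_rng(1)
xy = rng.random((12, 2))
nrw = 0.2 + 0.3 * xy[:, 0] + 0.1 * xy[:, 1] ** 2

predictions = loo_cross_validation(xy, nrw)
errors = predictions - nrw                # errors on the untransformed Z values
print("leave-one-out MSE:", np.mean(errors ** 2))
```

The withheld predictions can then be scored with the same benchmarks used for the ANN models, which keeps the two methods comparable.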

| Model no. | Input combinations | *R*^{2} | MSE | NSE | CC | *d* |
|---|---|---|---|---|---|---|
| Kriging 1 | SIV/NJ-MPN | 0.95 | 0.0004 | 0.944 | 0.974 | 0.985 |
| Kriging 2 | SIV/NJ-MAP | 0.94 | 0.0005 | 0.937 | 0.971 | 0.982 |
| Kriging 3 | SIV/SCL-MAP | 0.93 | 0.0006 | 0.928 | 0.966 | 0.979 |
| Kriging 4 | SIV/SCL-MPN | 0.93 | 0.0006 | 0.929 | 0.966 | 0.979 |
| Kriging 5 | SIV/NL-MPN | 0.90 | 0.0008 | 0.901 | 0.949 | 0.973 |
| Kriging 6 | SIV/NL-MAP | 0.90 | 0.0008 | 0.899 | 0.949 | 0.972 |
| Kriging 7 | NL/MPN-MAP | 0.79 | 0.0017 | 0.782 | 0.886 | 0.939 |
| Kriging 8 | NL/MAP-MPN | 0.78 | 0.0017 | 0.778 | 0.884 | 0.932 |
| Kriging 9 | NL/MPN-MPD | 0.78 | 0.0017 | 0.779 | 0.883 | 0.936 |
| Kriging 10 | SCL/MPN-MAP | 0.74 | 0.0021 | 0.735 | 0.859 | 0.921 |

The best three-input ANN model (ANN.3.1) and the Kriging 1 model developed in this study are evaluated together in Figure 6, which shows that the Kriging model makes better predictions than the best ANN model. When all the ANN and Kriging results in Tables 5–7 are considered, the Kriging model results are much better than those of the ANN models. Furthermore, the graphical information that accompanies the Kriging predictions is highly valuable, because it makes expert evaluation of the model outputs possible.

The ANN and Kriging model results obtained with common inputs are compared in Table 8. The NRW ratio predictions of the Kriging method are better than those of the ANN models. The *R*^{2} value of the NL/MPN-MAP combination, the best of these combinations for estimating the NRW ratio with ANN, is 0.62, whereas Kriging reaches 0.79 with the same inputs. The best NRW ratio prediction in this study is obtained with the SIV/NJ-MPN combination and the Kriging methodology. When the same combination is used with ANN, the *R*^{2} value is 0.55, well below that of the Kriging model.

| Input combinations | Methodology | *R*^{2} | MSE | NSE | CC | *d* |
|---|---|---|---|---|---|---|
| SIV/NJ-MPN | Kriging 1 | 0.95 | 0.0004 | 0.944 | 0.974 | 0.985 |
| SIV/NJ-MPN | ANN.2.10 | 0.55 | 0.0036 | 0.536 | 0.740 | 0.829 |
| SIV/NL-MPN | Kriging 5 | 0.90 | 0.0008 | 0.901 | 0.949 | 0.973 |
| SIV/NL-MPN | ANN.2.8 | 0.59 | 0.0032 | 0.591 | 0.770 | 0.862 |
| NL/MPN-MAP | Kriging 7 | 0.79 | 0.0017 | 0.782 | 0.886 | 0.939 |
| NL/MPN-MAP | ANN.2.4 | 0.62 | 0.0030 | 0.622 | 0.789 | 0.872 |
| NL/MAP-MPN | Kriging 8 | 0.78 | 0.0017 | 0.778 | 0.884 | 0.932 |
| NL/MAP-MPN | ANN.2.13 | 0.50 | 0.0040 | 0.488 | 0.714 | 0.830 |
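The size of the gap between the two methods can be made explicit with a little arithmetic on the *R*^{2} values reported in Table 8. The snippet below is only a convenience sketch; the dictionary layout and variable names are illustrative, while the numbers are copied from the table.

```python
# (Kriging R2, ANN R2) for each shared input combination, copied from Table 8.
table8_r2 = {
    "SIV/NJ-MPN": (0.95, 0.55),
    "SIV/NL-MPN": (0.90, 0.59),
    "NL/MPN-MAP": (0.79, 0.62),
    "NL/MAP-MPN": (0.78, 0.50),
}

# Absolute R2 gain of Kriging over ANN, rounded to the table's precision.
r2_gain = {
    combo: round(kriging - ann, 2)
    for combo, (kriging, ann) in table8_r2.items()
}

for combo, gain in r2_gain.items():
    print(f"{combo}: Kriging R2 exceeds ANN R2 by {gain:.2f}")
```

Kriging is ahead for every shared combination, with the gain ranging from 0.17 for NL/MPN-MAP to 0.40 for SIV/NJ-MPN.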

Finally, the models used in this study for NRW estimation are compared with models reported in the literature in Table 9. The *R*^{2} values of the three-input and six-input ANN models developed by Jang & Choi (2017) are 0.40 and 0.63, respectively. In this study, a better three-input ANN model (*R*^{2} = 0.76) could be developed thanks to different combinations of input variables. In addition, a much better prediction (*R*^{2} = 0.95) is obtained with the Kriging approach, used here for the first time in NRW estimation.

| No. | Study | Model | Input number | *R*^{2} |
|---|---|---|---|---|
| 1 | Jang & Choi (2018) | ANN | 6 | 0.63 |
| 2 | Jang & Choi (2017) | ANN | 3 | 0.40 |
| 3 | This study | ANN | 3 | 0.76 |
| 4 | This study | Kriging | 2 | 0.95 |

## CONCLUSIONS

Firstly, this study shows that the new four-input variable suggested in addition to those examined in the literature on NRW ratio estimation has a positive effect on the MPN, MAP, NL/MPN and NL/MAP model performances. The results once again underline how important the selection of independent variables is for obtaining the best NRW ratio predictions. The most important contribution of this study, however, is the demonstrated effectiveness of an alternative method, namely Kriging, for NRW ratio estimation. Although the number of independent variables in the ANN models was increased, model performance could not be raised beyond a certain level. In contrast, the Kriging models, used here for the first time for NRW ratio estimation, give very good results, and with only two inputs they predict the NRW ratio much better than the ANN approaches. ANN predictions are also difficult to evaluate, because the models do not directly reveal why and how their results arise. The Kriging technique applied in this study, on the other hand, makes it possible to develop models that allow expert evaluation and interpretation.