Abstract

The non-revenue water (NRW) ratio is an important parameter for evaluating the performance of water distribution systems. To evaluate the NRW ratio, the variables that influence it must first be determined. The first aim of this paper is therefore to identify the variables that influence the estimation of the NRW ratio and to analyze them with the artificial neural network (ANN) methodology by means of 50 models with one, two, three and four input variables. Secondly, NRW ratios have been predicted for the first time with the Kriging methodology, using only two input variables. Using data measured in 12 district meter areas (DMAs) in Kocaeli, 60 models in total have been established for NRW ratio prediction with the ANN and Kriging methodologies. The ANN models are closed-box models, so interpreting their results requires greater expert judgment. The results show that the Kriging model graphs provide much more useful information than the ANN models in terms of application and interpretation.

HIGHLIGHTS

  • The variables and combinations which have an impact on the prediction of NRW ratios have been researched by using the ANN methodology.

  • The mean age of pipes and the mean pressure of the network have a greater impact on NRW prediction.

  • Kriging method has been used for the first time in this study for NRW ratio prediction.

  • Kriging model results are much better than those of the ANN models.

LIST OF ABBREVIATIONS

     
  • ANN: Artificial neural network
  • AWWA: American Water Works Association
  • BI: Bias
  • CC: Correlation coefficient
  • DMA: District meter area
  • FR: Failure ratio
  • IBNET: International Benchmarking Network
  • IWA: International Water Association
  • MAP: Mean age of pipe
  • MDP: Mean diameter of pipe
  • MPN: Mean pressure of network
  • MRA: Multiple regression analysis
  • MSE: Mean square error
  • NF: Number of failures
  • NJ: Number of junctions
  • NL: Network length
  • NRW: Non-revenue water
  • NRWR: Non-revenue water ratio
  • NSC: Number of service connections
  • NSE: Nash–Sutcliffe efficiency
  • PMS: Pressure management system
  • PRV: Pressure-reducing valve
  • R2: Coefficient of determination
  • SCL: Service connection length
  • SIV: System input volume
  • ST: Storage tank
  • WM: Water meter

INTRODUCTION

Electricity, telephone lines, natural gas, fiber optics and water distribution systems are essential infrastructures in mega-cities. The water distribution system in particular is one of the most important components of these infrastructure systems (Shuang et al. 2017). A water distribution system (WDS) consists of various components such as transmission and distribution lines, water supply pump stations, storage tanks, pumps, valves, fire hydrants, air release valves and drain valves. Random failures of these components lead to critical water loss and reduce the quality of service. Ensuring the reliability and sustainability of water distribution systems is necessary for people's quality of life. Therefore, innovative approaches are needed to reduce and control the NRW ratio in WDS.

Analyses of water distribution systems show that water losses of varying magnitude result from physical losses, water meter measurement errors, illegal use of water, lack of control over the operating pressure, topography and consumption patterns (Rajani & Kleiner 2001; González-Gómez et al. 2011). The water loss, water balance and water loss performance indicators suggested by the American Water Works Association (AWWA) and the International Water Association (IWA) can be used for the evaluation of water distribution systems (AWWA 2003). The term non-revenue water (NRW), which is predicted and discussed with the models in this study, was first suggested by the IWA; it consists of apparent losses, real losses and unbilled authorized consumption.

The annual total NRW in the world is estimated at roughly 126 billion m3, with a financial cost of roughly 39 billion dollars (Liemberger & Wyatt 2019). Such a high level of NRW is unacceptable for water management. In developed countries the NRW ratio is below 10%, whereas in developing countries it lies between 20% and 60% (IBNET 2018). A high NRW ratio has a negative impact on the budget of water management and on investment plans in the short and long term. For this reason, attempts to reduce the NRW level in water distribution systems have gained importance, and innovative approaches have been suggested and prioritized in many countries.

Analyses of the losses that make up the NRW ratio show the impact of real losses (physical losses caused by random pipe bursts, leakages, pressure fluctuations in the network, pump failures caused by power cuts, etc.), of apparent losses (such as administrative losses caused by water meter errors), and of many uncertainties.

Leakage is one of the most important indicators of real losses. Leakage in a network varies with pressure: high pressure results in high leakage and thereby in high non-revenue water (Kanakoudis & Muhammetoglu 2014). Cascading failures and severe water losses are most likely to occur in high-pressure networks. Therefore, pressure management system (PMS) applications have been preferred as a cost-effective way to reduce operating costs, increase service quality, and minimize pipe bursts and leakages in water supply networks (Kanakoudis & Gonelas 2014; Kanakoudis & Gonelas 2016a, 2016b). An ideal PMS may be designed by installing pressure-reducing valves (PRVs) and establishing district meter areas (DMAs) (Kanakoudis & Gonelas 2016b; Patelis et al. 2016). Pressure-reducing valves help to reduce water losses in parallel with the network pressure (Patelis et al. 2017). It is also possible to reduce water losses by monitoring the night flow rate in district meter areas. Another method is partial network replacement to prevent leakages in a DMA; replacing worn-out parts, however, does not affect losses occurring in other parts of the network. Consequently, the construction of district meter areas and pressure management systems is important and necessary for reducing leakage in water distribution systems and for bringing NRW down to acceptable levels.

The studies in this field show that mostly statistical and stochastic methods have been developed (Kanakoudis & Tsitsifli 2012; Kanakoudis et al. 2013, 2015, 2016; van den Berg 2015; Tsitsifli et al. 2017; Güngör-Demirci et al. 2018; Tabesh et al. 2018). There have been few studies in the literature on NRW ratio prediction. For example, Jang & Choi (2018) developed models through multiple regression analysis (MRA) and artificial neural networks (ANNs) in order to predict the NRW ratio. In the ANN model, the input variables were mean pipe diameter, water supply quantity per demand junction, pipe length per demand junction, deteriorated pipe ratio, demand energy ratio, and number of leaks. When the models developed with the ANN and MRA methods were compared, the R2 value of the ANN approach was 0.63 and that of the MRA approach was 0.19. In another study, the NRW ratio of Incheon (Republic of Korea) was predicted via the ANN method by using specific variables that have an impact on leakage in the water distribution system (Jang & Choi 2017). The inputs of these models were water demand quantity per junction, deteriorated pipe ratio and demand energy ratio. The hidden layers were designed with 10, 20 and 30 neurons, and the best model result was obtained with 20 neurons, with R2 = 0.397.

In this study, the variables and variable combinations that have an impact on the prediction of NRW ratios have been investigated with the ANN methodology by building 50 models with one-, two-, three- and four-variable input combinations. In addition, the Kriging method has been used for the first time for NRW ratio prediction, and its results have been evaluated together with those of the ANN models.

STUDY AREA AND DATA

The model data has been collected from Kocaeli, an industrial city. The total length of the drinking water distribution system is 8,936 km and the number of service connections is 796,577. The water supplied by the Administration of Water and Sewage was used by 1,883,270 people in total in 2018. The infrastructure of the water distribution system of the city is controlled via a SCADA system. Throughout the city, there are 195 drinking water storage tanks, 105 drinking water supply pump stations and 11 drinking water treatment plants. In addition, the city has 12 main DMA regions.

At the end of 2018, the NRW ratio of Kocaeli was calculated for the first time by taking into consideration the water balance components suggested by Alegre et al. (2016). According to the records and calculations, real losses were 24.79%, apparent losses 6.03%, unbilled authorized consumption 1.49% and billed authorized consumption 67.69% of the system input volume (Table 1).

Table 1

Water balance components of Kocaeli in 2018

System input volume (SIV): 163,627,918 m3/year (100%)
  Authorized consumption: 69.18%
    Billed authorized consumption: 67.69% (revenue water: 67.69%)
      Billed meter consumption: 67.35%
      Billed unmetered consumption: 0.34%
    Unbilled authorized consumption: 1.49%
      Unbilled meter consumption: 0.64%
      Unbilled unmetered consumption: 0.86%
  Water losses: 30.82%
    Apparent losses: 6.03%
      Unauthorized consumption: 1.18%
      Authorized consumption errors: 4.85%
    Real losses: 24.79%
      Leakage on transmission and distribution mains and service connections: 24.63%
      Leakage and overflows at storage tanks: 0.16%
Non-revenue water (NRW), comprising unbilled authorized consumption and water losses: 32.31%

The NRW ratio of Kocaeli, equal to the difference between SIV and billed authorized consumption, is 32.31%. Metering errors of water meters account for 4.85% of SIV, which is part of the apparent losses. To ensure the accuracy of the NRW figures, the accuracy of meters of different diameter, age and model selected from the DMAs was measured at the Weights and Measures Center of the Ministry of Industry and Trade located in Kocaeli at 50% flow (1/2 flow rate), 20% flow (1/5 flow rate) and 5% flow (1/20 flow rate). The loss amounts due to measurement errors for all DMAs in Kocaeli have also been calculated with these measurement results as a reference. In addition, end-of-life meters (10–15 years), defective meters and meters found to be very inaccurate in the measurements have been replaced to keep the measurements reliable and to reduce the effect of apparent losses.
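The water balance arithmetic behind the 32.31% figure can be checked in two ways: as SIV minus billed authorized consumption, and as the sum of the three NRW components. The short sketch below does both with the percentages from Table 1; the function and variable names are illustrative only and are not taken from the study.

```python
# Water balance shares for Kocaeli in 2018, in percent of system input volume (Table 1).
SIV_PERCENT = 100.00
BILLED_AUTHORIZED = 67.69
UNBILLED_AUTHORIZED = 1.49
APPARENT_LOSSES = 6.03
REAL_LOSSES = 24.79
SIV_M3_PER_YEAR = 163_627_918  # system input volume from Table 1


def nrw_from_revenue(siv_percent, billed_percent):
    """NRW as the share of SIV that is not billed."""
    return round(siv_percent - billed_percent, 2)


def nrw_from_components(unbilled, apparent, real):
    """NRW as the sum of unbilled authorized consumption and both loss types."""
    return round(unbilled + apparent + real, 2)


nrw_a = nrw_from_revenue(SIV_PERCENT, BILLED_AUTHORIZED)
nrw_b = nrw_from_components(UNBILLED_AUTHORIZED, APPARENT_LOSSES, REAL_LOSSES)

# Corresponding annual NRW volume in m3, derived from the same share.
nrw_volume_m3 = SIV_M3_PER_YEAR * nrw_a / 100.0

print(nrw_a, nrw_b, round(nrw_volume_m3))
```

Both routes give the same ratio, which is a quick consistency check on the component percentages.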

The model input parameters for the NRW ratio are as follows: system input volume (SIV), network length (NL), number of failures (NF), mean pressure of network (MPN), mean age of pipes (MAP), failure ratio (FR = NF/NL), mean diameter of pipes (MDP), storage tank (ST), number of service connections (NSC), service connection length (SCL), number of water meters (WM) and number of junctions (NJ). The values of these parameters in Kocaeli for the year 2018 are given in Table 2. When Table 2 is analyzed in more detail, DMA 1 and DMA 5 stand out in terms of system input volume (m3), number of service connections, service connection length, number of water meters and number of junctions. On the other hand, DMA 10 has the shortest network length and the least water consumption. The network length of DMA 2 is the longest at 1,600 km. According to the analyses of 2018, the highest network pressure was measured in DMA 5 and the lowest in DMA 3. The highest number of annual failures per km of pipe (FR = 1.69) was detected in DMA 4 and the lowest (FR = 0.13) in DMA 12. The greatest mean pipe diameter is 159 mm in DMA 3 and the smallest is 108 mm in DMA 8. DMA 11 is the newest DMA and has the lowest non-revenue water ratio, 23.01%. In contrast, the highest non-revenue water ratio is seen in DMA 3. Finally, the oldest district meter area is DMA 4.

Table 2

DMA characteristics and model parameters in 2018

Units: SIV in 10^5 × m3; NL in km; MPN in m; MAP in years; MDP in mm; ST in m3; NSC, WM, NJ and NF in number; NRWR in %.

DMA  SIV   NL     MPN  MAP  FR    MDP  ST      NSC     SCL  WM       NJ      NF     NRWR
1    30.8  1,114  54   12   0.99  134  85,901  51,735  527  160,135  52,565  1,106  29.74
2    6.6   1,600  56   14   0.56  121  19,250  31,496  388  32,487   27,110  900    34.05
3    14.1  918    43   19   1.21  159  15,910  29,783  342  51,962   32,380  1,114  44.36
4    14.7  680    58   30   1.69  125  29,730  22,738  225  66,121   25,841  1,153  32.28
5    29.8  1,050  64   16   0.64  144  36,420  45,302  410  147,275  45,513  672    25.40
6    8.9   854    53   22   0.57  119  40,880  21,324  232  43,254   27,354  484    28.50
7    14.2  803    60   24   0.70  124  28,800  21,549  225  70,595   21,867  560    37.02
8    5.1   443    45   27   0.61  108  17,615  12,022  134  29,018   9,660   269    41.10
9    11.1  447    47   26   1.26  125  14,705  17,319  171  55,088   18,766  563    40.11
10   6.2   310    51   18   1.42  125  14,750  9,657   134  15,471   10,216  439    26.16
11   9.4   339    49   n/a  0.51  133  12,000  11,873  130  49,244   17,483  173    23.01
12   12.2  378    57   25   0.13  121  4,000   21,927  197  75,927   24,282  48     26.01
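The FR column in Table 2 is consistent with the number of failures divided by the network length, that is, annual failures per km of pipe. The sketch below recomputes FR for four DMAs from their NL and NF values as a check; the helper name `failure_ratio` is hypothetical, and the small tolerance allows for rounding in the published table.

```python
# (DMA, NL in km, NF, FR as reported) for four DMAs, copied from Table 2.
ROWS = [
    (1, 1114, 1106, 0.99),
    (2, 1600, 900, 0.56),
    (4, 680, 1153, 1.69),
    (12, 378, 48, 0.13),
]


def failure_ratio(number_of_failures, network_length_km):
    """Annual failures per km of pipe."""
    return number_of_failures / network_length_km


# Largest gap between the recomputed and the reported FR values.
max_difference = max(
    abs(failure_ratio(nf, nl) - fr_reported) for _, nl, nf, fr_reported in ROWS
)

for dma, nl, nf, fr_reported in ROWS:
    print(dma, round(failure_ratio(nf, nl), 3), fr_reported)
```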

METHODOLOGY

In this study, the NRW ratios of 12 different DMA regions have been predicted with models developed for different combinations of input variables by using the ANN (artificial neural network) and Kriging methods. The model inputs used for NRW ratio prediction are as follows: system input volume, network length, service connection length, mean diameter of pipes, mean age of pipes, mean pressure of network, number of service connections, number of junctions, storage tank, number of water meters and failure ratio. With the ANN method, single-input/single-output models have been designed first, and then two-, three- and four-input models have been developed by increasing the number of inputs. With the Kriging method, two-input/single-output models have been developed.

Artificial neural networks (ANNs)

Artificial neural networks (ANNs) consist of mathematical algorithms with learning ability. ANN methods offer many key features, for example the ability to learn from data, recognition, prediction, classification, generalization and working with an unlimited number of variables. In its simplest form, the ANN architecture consists of three main layers: input, hidden and output layers, as shown in Figure 1 (Piotrowski et al. 2015; Salami Shahid & Ehteshami 2016). Each layer may have a different number of neurons, and each neuron is connected to the neurons of the previous layer through different weight values.

One of the most important features of ANNs is that they can be trained. Through training, the connection weights are determined by means of an appropriate learning method. Learning is carried out by repeatedly adjusting each connection weight in the neural network (Negnevitsky 2002).

A feed forward back propagation algorithm, which is frequently preferred as a network type in training, has been used in this study. The Levenberg–Marquardt back propagation algorithm has been selected because learning with this algorithm is often faster than with its alternatives (Kermani et al. 2005; Haykin 2008; Kızılöz et al. 2015). The Levenberg–Marquardt algorithm is essentially based on the nonlinear least squares approach and the maximum neighborhood principle. The errors between the expected values and the outputs generated by the network are propagated backward through the network, and the weights are readjusted on the basis of these errors. This is repeated until the model results are as close as possible to the expected values.
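For orientation, the Levenberg–Marquardt weight update is commonly written as follows. This is the usual textbook form and is given here as an assumption, since the update rule is not stated explicitly in this study; the symbols $\mathbf{w}$, $\mathbf{J}$, $\mathbf{e}$ and $\mu$ are introduced only for this illustration.

\[
\mathbf{w}_{k+1} = \mathbf{w}_{k} - \left(\mathbf{J}^{\mathsf{T}}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{\mathsf{T}}\mathbf{e}
\]

Here $\mathbf{w}_k$ is the vector of weights and biases at iteration $k$, $\mathbf{e}$ is the vector of network errors, $\mathbf{J}$ is the Jacobian of the errors with respect to the weights, and $\mu$ is a damping parameter. For small $\mu$ the step approaches the Gauss–Newton step, and for large $\mu$ it approaches a short gradient descent step, which is how the method restricts each update to a neighborhood in which the local approximation can be trusted.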

In addition, the data has been divided into training (55%), validation (35%) and testing (10%) subsets. A three-layer feed forward back propagation (FFBP) network was considered the most appropriate ANN architecture, with a sigmoid function in the hidden layer and a linear function in the output layer, and all data has been trained in accordance with this structure. The ANN models have been designed with one hidden layer. There is no general rule for determining the number of neurons in the hidden layer; the number of neurons has been increased step by step in an attempt to obtain the highest model accuracy.
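The architecture described above (one hidden layer with a sigmoid activation and a linear output neuron) can be sketched as a forward pass in a few lines. The sketch below is illustrative only: the weights are random rather than trained, the logistic form of the sigmoid is an assumption, the default of four hidden neurons follows the value reported in the results section, and all names and input values are hypothetical. Training with the Levenberg–Marquardt algorithm is not shown.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, assumed here as the hidden-layer activation."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_inputs, n_hidden=4, seed=0):
    """Create random (untrained) weights and biases for a three-layer network."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }


def forward(net, inputs):
    """Sigmoid hidden layer followed by a single linear output neuron."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


# Three hypothetical, already normalized input values (for example NL, MPN, MAP).
net = init_network(n_inputs=3)
prediction = forward(net, [0.5, 0.2, 0.8])
print(prediction)
```

Because the output layer is linear, the prediction is not confined to the range of the sigmoid, which suits a regression target such as the NRW ratio.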

Kriging modeling

Kriging is a geostatistical gridding method that is widely used in various disciplines and that produces two- and three-dimensional maps from irregularly distributed data. In this study, the Kriging methodology and the Surfer software program have been used for model application. Statistically, this methodology is the best linear unbiased estimator (McGrath et al. 2004). The weights are determined under the condition that the estimation error is minimal. Similar to the weighted average method of classical statistics, the method uses a weighting model in which nearby points have more influence on the estimate (Krige 1966). The Kriging method is defined through Equation (1) (Yeşilkanat et al. 2015):
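\[
Z^{*}(x_0) = \sum_{i=1}^{N} \lambda_i \, Z(x_i) \qquad (1)
\]
where $Z^{*}(x_0)$ is the unknown but predicted value at the point $x_0$, $\lambda_i$ is the weight corresponding to each $Z(x_i)$ used in the calculation of $Z^{*}(x_0)$, $Z(x_i)$ is the measured data used in the prediction of $Z^{*}(x_0)$, and N is the number of points used in the calculation of $Z^{*}(x_0)$.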
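As an illustration of how the weights in a Kriging estimate can be obtained, the sketch below solves a small ordinary Kriging system in plain Python. It is not the Surfer implementation used in this study: the linear variogram, the function names and the sample points are all assumptions made for the example. The two coordinates stand for the two input variables, and the value at each point stands for the NRW ratio.

```python
import math


def variogram(distance, slope=1.0):
    """Linear variogram gamma(h) = slope * h (an assumed model)."""
    return slope * distance


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def kriging_weights(points, target):
    """Ordinary Kriging weights; the extra row and column force them to sum to 1."""
    n = len(points)
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier


def kriging_predict(points, values, target):
    """Weighted sum of the measured values, as in Equation (1)."""
    weights = kriging_weights(points, target)
    return sum(w * z for w, z in zip(weights, values))


# Hypothetical sample: four points in a normalized two-variable plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [0.30, 0.34, 0.44, 0.32]
center_weights = kriging_weights(points, (0.5, 0.5))
center_estimate = kriging_predict(points, values, (0.5, 0.5))
print(center_weights, center_estimate)
```

At the center of this symmetric layout all four points receive the same weight, and at a sampled point the estimate reproduces the measured value, which is the exact-interpolation property of Kriging without a nugget effect.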

As in classical statistics, the data is usually expected to follow a normal distribution in geostatistical analyses (Clark 1979). Models developed with normally distributed data give the best results. For this reason, it is suggested that the data first be checked for normality. If the data does not follow a normal distribution, appropriate transformations should be applied to bring it closer to one.

Recent literature on the Kriging method covers many variants: Simple Kriging (Elbasiouny et al. 2014), Ordinary Kriging (Sanusi et al. 2014), Universal Kriging (Lark et al. 2014), Co-Kriging (Chica-Olmo et al. 2014), Indicator Kriging (Armstrong 1998), Punctual Kriging and Block Kriging. In this study, the Point Kriging and Block Kriging approaches implemented in the Surfer software program have been used. Both perform gridding through interpolation. In the point technique, the probable points in the grid zones enter into the process. In the block technique, however, a rectangular block is created around each grid zone and only the average of the probable points within this block is included in the gridding process.

Model efficiency formulation

The model prediction results and the measurement data have been evaluated through the statistical model efficiency equations below, and the predictive ability of the models has been presented. To assess the suitability of the developed models, several statistical measures have been calculated: mean square error (MSE), Nash–Sutcliffe efficiency (NSE), correlation coefficient (CC) and coefficient of determination (R2).
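\[
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(P_i - M_i\right)^{2} \qquad (2)
\]
\[
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(M_i - P_i\right)^{2}}{\sum_{i=1}^{n}\left(M_i - \bar{M}\right)^{2}} \qquad (3)
\]
\[
\mathrm{CC} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - \bar{P}\right)\left(M_i - \bar{M}\right)}{\sigma_P \, \sigma_M} \qquad (4)
\]
\[
R^{2} = \frac{\left[\sum_{i=1}^{n}\left(P_i - \bar{P}\right)\left(M_i - \bar{M}\right)\right]^{2}}{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^{2}\,\sum_{i=1}^{n}\left(M_i - \bar{M}\right)^{2}} \qquad (5)
\]
where $P_i$ is the model prediction, $M_i$ is the measurement, $\bar{P}$ is the arithmetic average of the model predictions, $\bar{M}$ is the arithmetic average of the measurements, n is the total number of data, $\sigma_M$ is the standard deviation of the measurements and $\sigma_P$ is the standard deviation of the predictions.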
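The four efficiency measures named above can be computed directly from paired predictions and measurements. The sketch below uses their standard definitions, with population standard deviations and R2 taken as the square of CC; these choices and the function names are assumptions. The measured values are the NRW ratios of DMA 1 to DMA 4 from Table 2 expressed as fractions, and the predicted values are hypothetical.

```python
import math


def mse(pred, meas):
    """Mean square error between predictions and measurements."""
    return sum((p - m) ** 2 for p, m in zip(pred, meas)) / len(meas)


def nse(pred, meas):
    """Nash-Sutcliffe efficiency: 1 minus error variance over measurement variance."""
    mean_m = sum(meas) / len(meas)
    numerator = sum((m - p) ** 2 for p, m in zip(pred, meas))
    denominator = sum((m - mean_m) ** 2 for m in meas)
    return 1.0 - numerator / denominator


def cc(pred, meas):
    """Pearson correlation coefficient, using population standard deviations."""
    n = len(meas)
    mean_p = sum(pred) / n
    mean_m = sum(meas) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(pred, meas)) / n
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred) / n)
    sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in meas) / n)
    return cov / (sd_p * sd_m)


def r2(pred, meas):
    """Coefficient of determination, taken here as the square of CC."""
    return cc(pred, meas) ** 2


measured = [0.2974, 0.3405, 0.4436, 0.3228]   # NRWR of DMA 1-4 (Table 2), as fractions
predicted = [0.3100, 0.3300, 0.4200, 0.3300]  # hypothetical model outputs

print(mse(predicted, measured), nse(predicted, measured),
      cc(predicted, measured), r2(predicted, measured))
```

A model that always predicts the mean of the measurements has NSE equal to 0, which makes NSE a convenient baseline comparison alongside MSE and CC.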

NSE takes values between minus infinity and 1, and equals 1 when the predictions match the measurements exactly. MSE takes values between 0 and plus infinity, and its best value is 0. For R2, the best value is 1. Finally, CC takes values between −1 and +1; the most successful predictions are obtained when CC approaches +1.

RESULTS AND DISCUSSION

Before the model setting, the distribution of data is first analyzed. The Normal Q-Q plot test given in Figure 2 demonstrates that the research data is congruent with the normal distribution.

Figure 1

ANN structure.

Figure 2

Test of normality.

Artificial neural networks

For the prediction of NRW ratios, single-input/single-output ANN models were set up first in order to determine suitable model variables. Then, model combinations with two, three and four inputs and a single output were investigated. The data is divided into 55% training, 35% validation and 10% testing through the existing algorithm in the MATLAB program. Before each training run, the network, which starts from random weights and biases, should be reset to its initial state (Coulibaly et al. 2000; Kızılöz et al. 2015). Four neurons have been used in the hidden layers of the models. A large number of ANN models has been created for the prediction of the NRW ratio. The performance values of the model results are given in Tables 3–6.

Table 3

Performance of ANN models with one input variable

Model no  Input combination  R2    MSE     NSE    CC
ANN.1.1   NL                 0.66  0.0027  0.659  0.812
ANN.1.2   NL/MPN             0.63  0.0029  0.625  0.793
ANN.1.3   ST                 0.55  0.0035  0.552  0.745
ANN.1.4   ST/NJ              0.53  0.0037  0.531  0.730
ANN.1.5   MDP                0.52  0.0038  0.511  0.718
ANN.1.6   MPN                0.46  0.0043  0.450  0.680
ANN.1.7   NL/MAP             0.46  0.0043  0.454  0.674
ANN.1.8   NSC                0.45  0.0044  0.441  0.669
ANN.1.9   SCL                0.40  0.0048  0.388  0.631
Table 4

Performance of ANN models with two input variables

Model no  Input combination  R2    MSE     NSE    CC
ANN.2.1   ST/NJ-NL           0.73  0.0021  0.728  0.854
ANN.2.2   SIV/NL-WM          0.67  0.0027  0.660  0.817
ANN.2.3   ST/NJ-FR           0.63  0.0029  0.627  0.796
ANN.2.4   NL/MPN-MAP         0.62  0.0030  0.622  0.789
ANN.2.5   SIV-MPN            0.61  0.0031  0.608  0.780
ANN.2.6   NL-MPN             0.61  0.0031  0.610  0.783
ANN.2.7   MPN-FR             0.61  0.0031  0.606  0.780
ANN.2.8   SIV/NL-MPN         0.59  0.0032  0.591  0.770
ANN.2.9   MDP-MPN            0.55  0.0036  0.539  0.738
ANN.2.10  SIV/NJ-MPN         0.55  0.0036  0.536  0.740
ANN.2.11  SIV/NJ-MDP         0.54  0.0037  0.530  0.738
ANN.2.12  MAP-FR             0.52  0.0037  0.523  0.725
ANN.2.13  NL/MAP-MPN         0.50  0.0040  0.488  0.714
Table 5

Performance of ANN models with three input variables

Model no  Input combination  R2    MSE     NSE    CC
ANN.3.1   ST/NJ-NL-MAP       0.76  0.0019  0.754  0.869
ANN.3.2   NL-SCL-MPN         0.76  0.0019  0.752  0.868
ANN.3.3   NL-MPN-MAP         0.75  0.0020  0.744  0.865
ANN.3.4   NL/MPN-NSC-MAP     0.75  0.0020  0.744  0.863
ANN.3.5   NL/MPN-MDP-MAP     0.75  0.0020  0.747  0.865
ANN.3.6   ST/NJ-NL-MPN       0.74  0.0021  0.736  0.860
ANN.3.7   SIV-NL-MPN         0.73  0.0021  0.728  0.853
ANN.3.8   SIV-NL-MAP         0.72  0.0022  0.722  0.850
ANN.3.9   SIV/NL-MDP-MPN     0.72  0.0023  0.713  0.846
ANN.3.10  SIV/NL-MPN-MAP     0.72  0.0022  0.719  0.849
ANN.3.11  SIV/NJ-MPN-MAP     0.71  0.0023  0.709  0.845
Table 6

Performance of ANN models with four input variables

Model no  Input combination   R2    MSE     NSE    CC     d
ANN.4.1   ST/NJ-NL-MPN-MAP    0.75  0.0020  0.748  0.865  0.924
ANN.4.2   NL-MPN-MAP-MDP      0.75  0.0021  0.736  0.865  0.926
ANN.4.3   NL-SCL-MPN-MAP      0.75  0.0020  0.743  0.865  0.924
ANN.4.4   SIV-NL-MPN-MAP      0.74  0.0021  0.738  0.860  0.918
ANN.4.5   SIV-NL-MPN-MDP      0.74  0.0020  0.740  0.860  0.921
ANN.4.6   NL-NSC-MPN-MAP      0.74  0.0020  0.740  0.861  0.924
ANN.4.7   SCL-MPN-MAP-MDP     0.72  0.0023  0.713  0.846  0.914
ANN.4.8   SIV/NL-MDP-MPN-MAP  0.72  0.0022  0.715  0.847  0.907
ANN.4.9   NJ-MAP-MPN-MDP      0.72  0.0022  0.718  0.848  0.913
ANN.4.10  SIV-NJ-MPN-MDP      0.71  0.0024  0.697  0.843  0.915

First of all, the outputs of the different ANN models developed for the single-input parameters that have an impact on the NRW ratio can be seen in Table 3. Sixteen different models in total have been developed from the 12 variables given in Table 2 and from five parameters derived from these variables, namely ST/NJ, SIV/NL, NL/MPN, SIV/NJ and NL/MAP. The best of these single-input models in terms of the statistical criteria are listed in Table 3. For the models with one input, the most effective variables are, in order, NL, NL/MPN, ST, ST/NJ and MDP.

For the models with two inputs, the best 13 models are ranked in Table 4 according to the performance evaluation benchmarks. According to this ranking, the best model is the ST/NJ-NL combination. A comparison of the two-input models with the one-input models shows that the two-input models perform better.

Among the ANN models with three inputs, a large number of combinations perform better. The 11 models given in Table 5 are not distinctly separated from each other in terms of the R2, MSE, NSE, CC and d benchmarks. The NRW estimates of the three-input models are markedly better than those of the two-input models.

When the performances of the three-input combinations (Table 5) and the four-input combinations (Table 6) are evaluated together, it can be seen that adding a fourth input variable to the three-input combinations does not bring a considerable improvement in the prediction of the NRW ratio.

The scatter graphs of the two combinations with the best agreement between measurements and model predictions among all ANN models are given in Figure 3.

Figure 3

Scatter graphs for ST/NJ-NL-MAP and NL-SCL-MPN models.

A comparative evaluation of all models shows that, in general, the best predictions have been made with the three-input models. When the three-input and four-input models are compared by means of the performances given in Tables 5 and 6, no significant improvement can be observed in the models with four inputs. On the other hand, the two-input models cannot adequately explain the NRW ratios. In addition, the closed-box structure of the ANN method does not allow a detailed evaluation. When the best models were analyzed, it was concluded that variables such as the storage tank, the number of junctions, the network length, the mean age of pipes, the service connection length and the mean pressure of the network have a greater impact on NRW prediction.

Kriging method

All data has been arranged according to the XYZ coordinate system, and three-dimensional Kriging prediction maps (with high accuracy and different grid zones) have been obtained. In the Kriging models, the first input parameter is represented on the x-axis and the second on the y-axis. The grid of Kriging Model 1 has 100 nodes in the x-direction and 95 in the y-direction; that of Kriging Model 2 has 88 nodes in the x-direction and 100 in the y-direction.

Cross-validation has been used for all gridding methods. Cross-validation helps to assess the relative quality of the grid by computing and investigating the gridding errors. Cross-validation has been performed on the linear Z values, not on the transformed Z values in all Kriging models. The model outputs of NRW ratios have been obtained over the Z coordinate system through the prediction model maps developed by Surfer (2016). The outcomes of Kriging model performance are given in Table 7. The prediction maps developed for the best Kriging models (nos 1 and 2) are given in Figures 4 and 5. Thanks to the obtained model figures, the evaluation of the relationship between the variables and the NRW ratios is made easier.
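The idea of cross-validating a gridding method can be sketched as a leave-one-out loop: each observation is removed in turn, its value is estimated from the remaining observations, and the difference is recorded as the gridding error. The sketch below is a generic illustration and not the Surfer routine; inverse-distance weighting is used only as a simple stand-in interpolator, and the sample points, values and function names are hypothetical.

```python
import math


def idw_predict(points, values, target, power=2.0):
    """Inverse-distance weighting, used here only as a stand-in interpolator."""
    numerator = 0.0
    denominator = 0.0
    for point, value in zip(points, values):
        distance = math.dist(point, target)
        if distance == 0.0:
            return value  # target coincides with a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def leave_one_out_errors(points, values, predict):
    """Estimate each observation from all the others and return estimate minus measurement."""
    errors = []
    for i in range(len(points)):
        other_points = points[:i] + points[i + 1:]
        other_values = values[:i] + values[i + 1:]
        estimate = predict(other_points, other_values, points[i])
        errors.append(estimate - values[i])
    return errors


# Hypothetical observations in a normalized two-variable plane.
points = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.3), (0.9, 0.8), (0.5, 0.5)]
values = [0.30, 0.34, 0.44, 0.32, 0.25]

errors = leave_one_out_errors(points, values, idw_predict)
mean_square_error = sum(e ** 2 for e in errors) / len(errors)
print(errors, mean_square_error)
```

Any interpolator with the same call signature, for example a Kriging estimator, could be passed in place of `idw_predict`.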

Table 7

Performance of Kriging models with different variables

Model no    Input combination  R2    MSE     NSE    CC     d
Kriging 1   SIV/NJ-MPN         0.95  0.0004  0.944  0.974  0.985
Kriging 2   SIV/NJ-MAP         0.94  0.0005  0.937  0.971  0.982
Kriging 3   SIV/SCL-MAP        0.93  0.0006  0.928  0.966  0.979
Kriging 4   SIV/SCL-MPN        0.93  0.0006  0.929  0.966  0.979
Kriging 5   SIV/NL-MPN         0.90  0.0008  0.901  0.949  0.973
Kriging 6   SIV/NL-MAP         0.90  0.0008  0.899  0.949  0.972
Kriging 7   NL/MPN-MAP         0.79  0.0017  0.782  0.886  0.939
Kriging 8   NL/MAP-MPN         0.78  0.0017  0.778  0.884  0.932
Kriging 9   NL/MPN-MDP         0.78  0.0017  0.779  0.883  0.936
Kriging 10  SCL/MPN-MAP        0.74  0.0021  0.735  0.859  0.921
Figure 4

SIV/NJ-MPN model.

Figure 5

SIV/NJ-MAP model.

The best three-input ANN model (ANN.3.1) and the Kriging 1 model developed in this study are evaluated together in Figure 6, which shows that the Kriging model makes better predictions than the best ANN model. When all the ANN and Kriging model results in Tables 5–7 are evaluated, it is seen that the Kriging model results are much better than those of the ANN models. Furthermore, in the Kriging models, the information provided by the prediction figures is highly important and makes expert evaluation of the model outputs possible.

Figure 6

Comparison of the best Kriging and ANN model results.

The ANN and Kriging model results obtained with common inputs are compared in Table 8. The NRW ratio predictions of the Kriging method are better than those of the ANN method. The R2 value of the NL/MPN-MAP combination, the best one for estimating the NRW ratio by ANN among these inputs, is 0.62. However, when the Kriging methodology is used with the same model inputs, this value is 0.79. In this study, the best NRW ratio prediction is obtained with the SIV/NJ-MPN combination and the Kriging methodology. When the same combination is used with ANN, the R2 value is 0.55, which is lower than that of the Kriging model.
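The performance measures reported in the tables (R2, MSE, NSE, CC and d) can be computed from observed and predicted NRW ratios as in the sketch below. The definitions used here are assumptions, since this section does not restate them: R2 is taken as the square of the Pearson correlation coefficient (consistent with the tabulated values, e.g. 0.974² ≈ 0.95), NSE as the Nash-Sutcliffe efficiency, and d as Willmott's index of agreement. The function name and the sample values are hypothetical.

```python
import numpy as np

def performance_metrics(observed, predicted):
    """Return R2, MSE, NSE, CC and d for paired observed/predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = o - p
    o_mean = o.mean()
    # Mean squared error
    mse = float(np.mean(err ** 2))
    # Pearson correlation coefficient and its square (assumed definition of R2)
    cc = float(np.corrcoef(o, p)[0, 1])
    r2 = cc ** 2
    # Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean
    nse = 1.0 - float(np.sum(err ** 2) / np.sum((o - o_mean) ** 2))
    # Willmott's index of agreement (assumed definition of d)
    potential = np.sum((np.abs(p - o_mean) + np.abs(o - o_mean)) ** 2)
    d = 1.0 - float(np.sum(err ** 2) / potential)
    return {"R2": r2, "MSE": mse, "NSE": nse, "CC": cc, "d": d}

# Hypothetical NRW ratios for illustration only (not data from the study).
observed = [0.22, 0.31, 0.18, 0.45, 0.38, 0.27]
predicted = [0.24, 0.29, 0.20, 0.41, 0.40, 0.25]
for name, value in performance_metrics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

Because NSE and d penalize bias while CC does not, a model can have a high CC and R2 but a noticeably lower NSE, which is one reason for reporting the measures side by side.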

Table 8

Comparison of the results by using ANN and Kriging methodologies for the same input data

Input combinations   Methodology   R2     MSE      NSE     CC      d
SIV/NJ-MPN           Kriging 1     0.95   0.0004   0.944   0.974   0.985
                     ANN.2.10      0.55   0.0036   0.536   0.740   0.829
SIV/NL-MPN           Kriging 5     0.90   0.0008   0.901   0.949   0.973
                     ANN.2.8       0.59   0.0032   0.591   0.770   0.862
NL/MPN-MAP           Kriging 7     0.79   0.0017   0.782   0.886   0.939
                     ANN.2.4       0.62   0.0030   0.622   0.789   0.872
NL/MAP-MPN           Kriging 8     0.78   0.0017   0.778   0.884   0.932
                     ANN.2.13      0.50   0.0040   0.488   0.714   0.830

Finally, the models used in this study for NRW estimation are compared with models from the literature in Table 9. The R2 values of the three-input and six-input ANN models developed by Jang & Choi (2017) are 0.40 and 0.63, respectively. In this study, a better three-input ANN model (R2 = 0.76) could be developed thanks to the combinations of different input variables. In addition, a much better prediction (R2 = 0.95) is obtained with Kriging, an approach used for the first time in NRW estimation.

Table 9

Comparison of the previous model results with the results of this study

No   Study                Model     Input number   R2
1    Jang & Choi (2018)   ANN       6              0.63
2    Jang & Choi (2017)   ANN       3              0.40
3    This study           ANN       3              0.76
4    This study           Kriging   2              0.95

CONCLUSIONS

Firstly, this study shows that the new four-input variable suggested in addition to those used in the literature on NRW ratio estimation has a positive effect on the performance of the MPN, MAP, NL/MPN and NL/MAP models. The results once again underline the importance of selecting appropriate independent variables for NRW ratio prediction. The most important contribution of this study, however, is to demonstrate the effectiveness of an alternative method, namely Kriging, for NRW ratio estimation. Although the number of independent variables in the ANN models was increased, model performance could not be raised beyond a certain level. In contrast, the Kriging models, used here for the first time for NRW ratio estimation, give very good results, and Kriging models with two inputs produce much better predictions than the ANN approaches. ANN models are difficult to evaluate through their predictions, because it is not possible to see directly why and how the model results occur. The Kriging technique applied in this study, on the other hand, makes it possible to develop models that allow expert evaluation and interpretation.

REFERENCES

Alegre H., Baptista J. M., Cabrera E., Cubillo F., Duarte P., Hirner W., Merkel W. & Parena R. 2016 Performance Indicators for Water Supply Services, 3rd edn. IWA Publishing, London, UK.
Armstrong M. 1998 Basic Linear Geostatistics. Springer-Verlag, Berlin, Germany.
AWWA 2003 Best practice in water loss control: improved concepts for 21st century water management. American Water Works Association, Denver, CO, USA.
Chica-Olmo M., Luque-Espinar J. A., Rodriguez-Galiano V., Pardo-Igúzquiza E. & Chica-Rivas L. 2014 Categorical Indicator Kriging for assessing the risk of groundwater nitrate pollution: the case of Vega de Granada aquifer (SE Spain). Science of the Total Environment 470–471, 229–239.
Clark I. 1979 Practical Geostatistics. Applied Science Publishers, London, UK.
Coulibaly P., Anctil F. & Bobée B. 2000 Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. Journal of Hydrology 230 (3–4), 244–257.
González-Gómez F., García-Rubio M. A. & Guardiola J. 2011 Why is non-revenue water so high in so many cities? International Journal of Water Resources Development 27 (2), 345–360.
Güngör-Demirci G., Lee J., Keck J., Guzzetta R. & Yang P. 2018 Determinants of non-revenue water for a water utility in California. Journal of Water Supply: Research and Technology – AQUA 67 (3), 270–278.
Haykin S. 2008 Neural Networks and Learning Machines, 3rd edn. Pearson Prentice Hall, Upper Saddle River, NJ, USA.
IBNET 2018 The International Benchmarking Networks.
Kanakoudis V. & Gonelas K. 2016a Non-revenue water reduction through pressure management in Kozani's water distribution network: from theory to practice. Desalination and Water Treatment 57 (25), 11436–11446.
Kanakoudis V. & Gonelas K. 2016b Assessing the results of a virtual pressure management project applied in Kos Town water distribution network. Desalination and Water Treatment 57 (25), 11472–11483.
Kanakoudis V. & Muhammetoglu H. 2014 Urban water pipe networks management towards non-revenue water reduction: two case studies from Greece and Turkey. CLEAN – Soil, Air, Water 42 (7), 880–892.
Kanakoudis V., Tsitsifli S. & Zouboulis A. I. 2015 WATERLOSS project: developing from theory to practice an integrated approach towards NRW reduction in urban water systems. Desalination and Water Treatment 54 (8), 2147–2157.
Kanakoudis V., Tsitsifli S. & Demetriou G. 2016 Applying an integrated methodology toward non-revenue water reduction: the case of Nicosia, Cyprus. Desalination and Water Treatment 57 (25), 11447–11461.
Kermani B. G., Schiffman S. S. & Nagle H. T. 2005 Performance of the Levenberg–Marquardt neural network training method in electronic nose applications. Sensors and Actuators B: Chemical 110 (1), 13–22.
Kızılöz B., Çevik E. & Aydoğan B. 2015 Estimation of scour around submarine pipelines with Artificial Neural Network. Applied Ocean Research 51, 241–251.
Krige D. G. 1966 Two-dimensional weighted moving average trend surfaces for ore evaluation. J. South Afr. Inst. Min. Metall. 66, 13–38.
Lark R. M., Ander E. L., Cave M. R., Knights K. V., Glennon M. M. & Scanlon R. P. 2014 Mapping trace element deficiency by cokriging from regional geochemical soil data: a case study on cobalt for grazing sheep in Ireland. Geoderma 226–227, 64–78.
Liemberger R. & Wyatt A. 2019 Quantifying the global non-revenue water problem. Water Science and Technology: Water Supply 19 (3), 831–837.
McGrath D., Zhang C. & Carton O. T. 2004 Geostatistical analyses and hazard assessment on soil lead in Silvermines area, Ireland. Environmental Pollution 127, 239–248.
Negnevitsky M. 2002 Artificial Intelligence: A Guide to Intelligent Systems, 1st edn. Addison Wesley, New York, USA.
Patelis M., Kanakoudis V. & Gonelas K. 2016 Pressure management and energy recovery capabilities using PATs. Procedia Engineering 162, 503–510.
Patelis M., Kanakoudis V. & Gonelas K. 2017 Combining pressure management and energy recovery benefits in a water distribution system installing PATs. Journal of Water Supply: Research and Technology – AQUA 66 (7), 520–527.
Piotrowski A. P., Napiorkowski M. J., Napiorkowski J. J. & Osuch M. 2015 Comparing various artificial neural network types for water temperature prediction in rivers. Journal of Hydrology 529 (P1), 302–315.
Salami Shahid E. & Ehteshami M. 2016 Application of artificial neural networks to estimating DO and salinity in San Joaquin River basin. Desalination and Water Treatment 57 (11), 4888–4897.
Sanusi M. S. M., Ramli A. T., Gabdo H. T., Garba N. N., Heryanshah A., Wagiran H. & Said M. N. 2014 Isodose mapping of terrestrial gamma radiation dose rate of Selangor state, Kuala Lumpur and Putrajaya, Malaysia. Journal of Environmental Radioactivity 135, 67–74.
Tabesh M., Roozbahani A., Roghani B., Faghihi N. R. & Heydarzadeh R. 2018 Risk assessment of factors influencing non-revenue water using Bayesian networks and fuzzy logic. Water Resources Management 32 (11), 3647–3670.
Tsitsifli S., Kanakoudis V., Kouziakis C., Demetriou G. & Lappos S. 2017 Reducing non-revenue water in urban water distribution networks using DSS tools. Water Utility Journal 16, 25–37.
van den Berg C. 2015 Drivers of non-revenue water: a cross-national analysis. Utilities Policy 36, 71–78.
Yeşilkanat C. M., Kobya Y., Taşkin H. & Çevik U. 2015 Dose rate estimates and spatial interpolation maps of outdoor gamma dose rate with geostatistical methods: a case study from Artvin, Turkey. Journal of Environmental Radioactivity 150, 132–144.