Forecasting water demand and equitably allocating local water resources help reduce and eliminate water shortages and waste. The key emphasis of this research article is to estimate water demand using a prediction model for the Peroorkada urban water distribution network. Characteristics related to water demand, such as head, pressure, and base demand, were the features of the prediction model. The prediction model was developed in Python. The water distribution network consists of 99 nodes. Demand at 6-h intervals was plotted and predicted for all nodes, and 24-h demand was plotted for the vulnerable nodes identified by the sensor placement toolkit. This study included 13 machine learning algorithms, including three hybrid/stacked regression techniques. The least absolute shrinkage and selection operator (Lasso)-based stacking regressor model performs best at demand prediction, and stacking regressor models outperformed single prediction models.

  • Water demand using the prediction model for the Peroorkada urban water distribution network is estimated.

  • The prediction model has been developed using Python software on a Jupyter notebook. The water distribution network consists of 99 nodes.

  • A total of 13 machine learning algorithms were used in this work, three of which were hybrid/stacked regression methods.

  • Demand prediction is best achieved with the Lasso-based stacking regressor model.

Many cities worldwide experience water scarcity (Greve et al. 2018). More than half of the population is expected to live in water-stressed areas by 2030 (Bakkes et al. 2008). Distribution systems therefore have to be managed sustainably, and water demand prediction is crucial to doing so (House-Peters & Chang 2011). Infrastructure decision-makers also rely on it to create effective water consumption plans and schedules (Derrible 2016; Pacchin et al. 2019; Pesantez et al. 2020). This is particularly crucial given the ongoing urbanization of the population and consumer pushback for lower energy and resource consumption. To design, plan, operate, and manage water distribution systems (WDSs), water utilities around the globe rely on urban water demand projections. Water demand estimates are crucial to the optimal functioning of a WDS as they provide the basis for efficient scheduling. For example, they help water enterprises evaluate pricing strategies, pump station operations, pipe network capacity, water distribution, and water production (Herrera et al. 2010). Historical water consumption, operating conditions, socioeconomic factors, and weather patterns frequently influence future water demand (Donkor et al. 2014). Water demand forecasting has been approached from many angles, but the approaches may be broadly divided into learning algorithms and classical procedures. Early articles used time-series and linear regression models, two standard statistical techniques, to address this problem (Zhou et al. 2000; Bougadis et al. 2005; Wong et al. 2010).

Learning algorithms are nonlinear approaches that largely hinge on previous data to establish the association between water demand and significant factors (such as wind speed, relative humidity, and temperature). Artificial intelligence and machine learning (ML) are examples of advanced data analysis approaches that enable learning algorithm models to accurately extract essential knowledge from water demand data. The success of learning algorithm models depends on two factors: the characteristics selected as model inputs and the prediction algorithms used during model development (Fan et al. 2017).

This research focuses on anticipating water demand in the metropolitan water distribution system of Peroorkada, Trivandrum. According to a study by the Kerala Water Authority, the Peroorkada network consists of 99 nodes (Sankaranarayanan et al. 2017, 2019, 2020). Peroorkada's commercial supply system, fed by one reservoir, has a water storage capacity of about 8 million litres. A total of 13 prediction models were developed, comprising both mathematical and machine learning models, including hybrid machine learning prediction schemes. The historical dataset was used to train the algorithms for predicting future water demand, and their performances were compared to choose the most suitable model.

The remainder of the article is structured as follows. Section 2 provides a synopsis of relevant research, the approaches are outlined in Section 3, the case study is presented in Section 4, the model test findings are reported in Section 5, and the study's conclusions and future research suggestions are delivered in Section 6.

A prediction model integrates geographic data with economic and social variables to estimate water demand. Explanatory variable selection and model construction are the two primary phases in prediction modelling (Shuang & Zhao 2021). The characteristics described by the model depend on the demand patterns. Model building involves a procedure to determine the correlation between the chosen characteristics and the prediction target – in this case, water demand.

Water demand predictions have already been made using machine learning prediction methods. Numerous studies have examined the effectiveness of support vector machine (SVM) regression, a popular machine learning technique that is frequently employed in water demand projections (Braun et al. 2014; Brentan et al. 2017). In addition to SVM, other machine learning approaches that have been exercised to anticipate WDS demand include artificial neural networks (ANNs) (Maier et al. 2010; Romano & Kapelan 2014; Baotić et al. 2015), random forests (RFs, Mouatadid & Adamowski 2016), and extreme learning machines (Li et al. 2017; Lu et al. 2020). In general, machine learning has proved to be promising and is often utilized in estimating water demand.

A prediction model's goal will dictate its prediction periodicity (the amount of time between consecutive forecasts) and its prediction horizon (the period of impending demand that should be projected) (Bakker et al. 2003). When planning or designing urban WDSs, long-term predictions – yearly projections for more than 10 years – are typically employed. To estimate the income or charge of a water supply and enhance the distribution system, medium-term predictions – that is, monthly or yearly forecasts for 1–10 years – are typically employed. The daily routine operations of water plants are normally grounded on short-term forecasts. Smart management of WDSs is evolving quickly, and there is a need for very accurate short-term predictions, particularly for real-time pipe burst detection and optimal WDS control (Bakker et al. 2013; Hutton & Kapelan 2015).

Factors considered in the study for water demand

In order to forecast the monthly water utilization of Austin, Texas, Lu et al. (2020) created a fusion model and noted that the demand was strongly correlated with the city's population, mean temperature, and mean humidity. Shanghai's yearly water consumption was examined by Li et al. (2017) using principal component analysis regression. The authors discovered that the city's population and gross domestic product (GDP) significantly affect yearly water distribution demand.

To forecast urban residents' demand for water using variables such as population size, GDP per capita, water price, the total yearly water supply from water companies, water temperature, and weather variations, Zhao & Chen (2014) used a neural network algorithm based on resilient backpropagation. To forecast yearly water consumption in the Chinese province of Guangdong using variables such as precipitation, total water resources, GDP, permanent resident population, the added value of the tertiary sector, the added value of the industrial sector, the added value of agriculture and the irrigated area, animal husbandry, forestry, and fishery, Tian & Xue (2017) created a neural network with backpropagation.

Zhi-Guo et al. (2010) developed methods to assess the base water demand of the Chinese cities of Beijing and Jinan, taking into account variables including GDP, per capita water resources, added value in the primary, secondary, and tertiary sectors, and population. Sun et al. (2017) assessed the sustainable use of Beijing's water resources, taking into account population, economy, water supply, demand, land resources, pollution, and management.

Models for water demand predictions

The selection of a suitable water demand estimation method is quite difficult since such methods are affected by surrounding factors (Fricke 2013). Statistical and machine learning models are the two main categories of predictive models. Probability theory and quantitative statistics are utilized in statistical models to determine the functional connection among various variables. Statistical regression techniques, such as least absolute shrinkage and selection operator (Lasso) regression, linear regression, and ridge regression, are often used.

The establishment of explicit links between explanatory variables and dependent variables is not necessary for machine learning models (Mitchell 1997). Rather, they exercise procedures such as RF regression, SVM regression, and decision trees (DTs) to identify patterns in training data and utilize those patterns to forecast future events. A number of statistical models are employed to anticipate water consumption (House-Peters et al. 2010; Kontokosta & Jain 2015; Arbues & Villanua 2016; Ashoori et al. 2016). Finding a single mathematical function that performs well on a variety of datasets is challenging because statistical models are primarily limited by the need to have a predefined structure (Hastie et al. 2009; Lee & Derrible 2020). Moreover, statistical models frequently fall short in handling intricate data linkages, and their prediction accuracy does not necessarily improve as data volume increases (Mu et al. 2020). When working with large and complicated datasets, other techniques have to be utilized (Villarin & Rodriguez-Galiano 2019). For instance, Rozos et al. (Makropoulos et al. 2016) used cellular automata modelling and integrated system dynamics to estimate water demand.

With their outstanding predictive performance in areas including urban infrastructure (Marvuglia & Messineo 2012; Ali et al. 2016; Golshani et al. 2018; Lee et al. 2018), financial risk (Xia et al. 2017; García et al. 2019), energy (Voyant et al. 2017), ecology (Archibald et al. 2009; Darling et al. 2012; Muñoz-Mas et al. 2019), and water resource management (Rozos 2019), machine learning techniques are gaining popularity. Depending on how many predictors are used, machine learning techniques may be further separated into ensemble algorithms and single machine learning predictors. A single predictor, such as a decision tree, SVM, or neural network, consists of just one predictor (or method). Many predictors are aggregated via ensemble algorithms, including the gradient-boosting tree and RF, and all contribute to the final prediction outcome.

The use of ensemble learning is growing in popularity (Nascimento et al. 2014; Lessmann et al. 2015). It trains several models using the concepts of statistical sampling. A fresh sample is predicted independently by each of these models, and a majority voting (or averaging) procedure then determines the final forecast for the new sample. Stated differently, ensemble learning combines several hypotheses from individual predictors into a single hypothesis. In order to forecast everyday domestic water consumption in response to the housing demand for water, Lee et al. (2018) looked into 12 statistical and ML methods in the field of water supply prediction. With the aim of forecasting the hourly water consumption of 90 accounts, Pesantez et al. (2020) applied SVMs, RFs, and ANNs to smart-meter data.

Support vector machine regression, a backpropagation ANN, and an extreme learning machine were utilized by Parisouj et al. (2020) to forecast the daily and monthly streamflow of four catchment reservoirs in the USA. Villarin & Rodriguez-Galiano (2019) developed a multivariate estimation model for water demand in Seville, Spain, using classification and regression trees and RFs. Sengupta et al. (2018) predicted changes in stream channel morphology using SVM regression, ANNs, and RF regression.

Structure of prediction model

At junctions or nodes, the input data obtained are elevation (above a reference), base demand, and initial water quality (binary: good or bad). Water quality is one of the factors that influence water demand for both domestic and commercial purposes; demand for good-quality water is higher than that for average- or bad-quality water. The outputs obtained are hydraulic head (intrinsic energy), pressure, water quality (binary: good or bad), and water demand. The measured parameters are hydraulic head and pressure; the other values are constant at particular nodes. At links or pipes, the input data that are constant are pipe length, diameter, and pipe roughness. The measured outputs are flow rate, velocity, and head loss (determined by roughness). The objective of this investigation was to evaluate 13 modelling approaches, three of which were hybrid/stacked models, for predicting water demand in the Peroorkada urban area. The data are collected every 6 h (frequency) for a day (24 h). The prediction is short term: the water demand at all nodes is predicted at specific hours such as 06:00 h, 12:00 h, and so on, and the water demand for the entire day is predicted for the identified vulnerable nodes. Data preparation, modelling, model preparation/training, k-fold cross-validation, and model testing were the five stages of the study design (Figure 1). The preparation of data reduces the sensitivity of the models.
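A minimal sketch of how the node-level inputs and target described above could be arranged for the models is shown below; the column names, values, and use of pandas are illustrative assumptions rather than the study's actual data files.

```python
import pandas as pd

# Hypothetical node-level records; column names and values are illustrative assumptions.
records = pd.DataFrame({
    "elevation":    [12.0, 18.5, 9.3],   # m above reference
    "base_demand":  [1.2, 0.8, 2.1],     # base demand at the node
    "quality_good": [1, 0, 1],           # binary initial water quality
    "head":         [45.2, 41.7, 47.9],  # measured hydraulic head
    "pressure":     [33.2, 23.2, 38.6],  # measured pressure
    "demand":       [1.4, 0.9, 2.3],     # target: water demand
})

# Explanatory variables (features) and the prediction target.
X = records[["elevation", "base_demand", "quality_good", "head", "pressure"]]
y = records["demand"]
```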
Figure 1

Flowchart for machine learning models.


Data preprocessing

To guarantee the comparability of the tested models' prediction outcomes, feature scaling was applied uniformly to all of them. The selected feature scaling technique was normalization. Let $P = (X, y)$ stand for the training set, where $y$ is the dependent variable and $X = (x_1, \ldots, x_n)$ is the n-dimensional explanatory set. The normalization of $x_i$ can be expressed as follows:
$$x_i' = \frac{x_i - \min(x)}{\max(x) - \min(x)} \tag{1}$$
where $x_i'$ is the standardized value of the explanatory variable $x$ for sample $i$, and $\max(x)$ and $\min(x)$ are the highest and lowest values of $x$, respectively. In this study, the dependent and explanatory variables were both standardized. Figure 2 depicts the data preprocessing steps.
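The following sketch applies the min-max normalization of Equation (1) with scikit-learn; the sample values are assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative explanatory variables; values are assumptions for the sketch.
X = np.array([[12.0, 1.2, 45.2],
              [18.5, 0.8, 41.7],
              [ 9.3, 2.1, 47.9]])

# Equation (1): x' = (x - min(x)) / (max(x) - min(x)), applied column-wise.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Equivalent manual computation of the same normalization.
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```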
Figure 2

Data preprocessing.

Figure 3

Lasso-based stacked regressor, Ridge-based stacked regressor, and Linear-based stacked regressor.


Modelling

In this study, 13 models were presented to predict water demand:

Numerical methods: ridge regression method, linear regression (LR) method, Lasso regression method, elastic net regression (ENR), and partial least squares (PLS) method.

Machine learning models: individual predictors: DT, SVM regression, and Gaussian process regression (GPR); ensemble approaches: gradient boosting tree regression and RF. While gradient boosting is a serial (boosting) ensemble technique, RF is a parallel (bagging) ensemble algorithm.

Hybrid/stacked models: the Lasso-based, ridge-based, and linear-based stacking regressors.

The techniques were implemented using the Python Scikit-Learn module (Pedregosa et al. 2011; Buitinck et al. 2013). For every technique mentioned in Section 3.3, a different prediction model was constructed. The hyperparameter values are used to train the models. The following is a brief description of each algorithm.
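A possible way of instantiating the 13 candidate models with the Scikit-Learn module is sketched below; the hyperparameter values are illustrative defaults rather than the tuned values used in the study, and the choice of base estimators inside the stacked regressors is an assumption.

```python
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.cross_decomposition import PLSRegression

# Base (single) estimators reused inside the stacked regressors (an assumption).
base_estimators = [
    ("rf", RandomForestRegressor()),
    ("gbr", GradientBoostingRegressor()),
    ("svr", SVR()),
]

# The 13 candidate models; hyperparameter values are illustrative defaults.
models = {
    "RF": RandomForestRegressor(),
    "GBR": GradientBoostingRegressor(),
    "GPR": GaussianProcessRegressor(),
    "Linear": LinearRegression(),
    "Lasso": Lasso(alpha=0.1),
    "Ridge": Ridge(alpha=1.0),
    "DT": DecisionTreeRegressor(),
    "SVR": SVR(kernel="rbf"),
    "ENR": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "PLS": PLSRegression(n_components=2),
    "Lasso SR": StackingRegressor(estimators=base_estimators, final_estimator=Lasso(alpha=0.1)),
    "Ridge SR": StackingRegressor(estimators=base_estimators, final_estimator=Ridge(alpha=1.0)),
    "Linear SR": StackingRegressor(estimators=base_estimators, final_estimator=LinearRegression()),
}
```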

Random forest

Breiman (2001) introduced RF, an ensemble learning method that selects features using the bagging approach. Specifically, a subset of features is chosen at random to build each individual tree in the ensemble, while sampling with replacement from the initial data is used to train the trees. To reach the final result, the prediction results given by every trained tree in the ensemble are combined for every new test sample using majority voting (or averaging, for regression).

Gradient boosting tree regression

Gradient boosting tree regression uses the notion of an ensemble method built from decision trees (Friedman 2001). A decision tree has a tree structure: the predicted outcome is the target leaf, reached by starting from the root and following the branches of the tree through a series of conditions until a leaf is reached.

Gaussian process regression

Gaussian processes (GPs) are a nonparametric supervised learning technique for solving regression and probabilistic classification problems. Gaussian process regression (GPR) is a sophisticated and adaptable nonparametric regression approach used in machine learning and statistics (Shuang & Zhao 2021). GPR is a Bayesian technique that can provide a measure of confidence in its forecasts, which makes it valuable for numerous applications involving optimization and time-series estimation. GPR is based on the concept of a Gaussian process, i.e. a set of random variables with a joint Gaussian distribution.

Linear regression

LR uses a linear function to represent the relationship between the explanatory variable(s) x and the dependent variable y (Guo et al. 2018). It uses a linear method to minimize the residual sum of squares between the observed and projected values of the dependent variable. The coefficients are calculated via the ordinary least squares (OLS) technique.

Lasso regression

Tibshirani (1996) suggested Lasso regression. The key feature of the method is that it entirely removes minimally significant attributes by setting their weights to zero (Sankaranarayanan et al. 2020). Automatic feature selection and sparse model generation are thus performed by Lasso regression.
$$\hat{\beta}^{\mathrm{lasso}} = \arg\min_{\beta}\left\{ \sum_{i=1}^{n}\left( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \right)^2 + \lambda \sum_{j=1}^{p} \left| \beta_j \right| \right\} \tag{2}$$
where $y_i$ is the result, $x_{ij}$ is the covariate, $\beta$ is the regression coefficient, and $\lambda$ is the shrinkage operator.
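A brief sketch of Lasso in scikit-learn, where the shrinkage operator λ of Equation (2) is exposed as alpha; the toy data and the alpha value are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy data for illustration only.
X = np.random.rand(50, 3)
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + np.random.normal(0, 0.05, 50)

# alpha plays the role of lambda in Equation (2); 0.1 is an illustrative value.
lasso = Lasso(alpha=0.1)
lasso.fit(X, y)
print(lasso.coef_)  # coefficients of weakly relevant features are driven to exactly zero
```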

Ridge regression

Ridge and Lasso are variants of the linear method (Hoerl & Kennard 1970), and ridge is an enhancement of the OLS approach. It is a more stable and accurate machine learning technique, despite the loss of unbiasedness (Khan et al. 2019). To balance variance and bias, an L2 regularization term is added to the sum of squared errors (SSE).

Decision tree

DT is a traditional machine learning algorithm. It creates a binary decision tree constructed from the sample characteristics. The way a prediction is reached is simple to grasp and interpret from the final decision tree structure, which resembles a flow diagram, with leaf nodes representing the predicted variables (Shuang & Zhao 2021).

Support vector machine

SVM regression locates a separating hyperplane and fits y on x with the largest margin. It employs a kernel function to translate the initial training set with nonlinear characteristics to a higher-dimensional attribute space, in which the values become linearly separable (Braun et al. 2014). The most common kernel functions are polynomial, linear, and radial basis functions.
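A short SVR sketch illustrating the kernel choice discussed above; the toy data and the C and epsilon values are assumptions.

```python
import numpy as np
from sklearn.svm import SVR

X = np.random.rand(60, 2)                    # toy explanatory variables
y = np.sin(3 * X[:, 0]) + 0.1 * X[:, 1]      # nonlinear toy target

# The kernel maps the inputs to a higher-dimensional feature space;
# 'rbf' (radial basis function) is one of the common choices named above.
svr = SVR(kernel="rbf", C=1.0, epsilon=0.05)  # C and epsilon are illustrative
svr.fit(X, y)
y_hat = svr.predict(X)
```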

Elastic net regression

Elastic net regression is a variation of the linear method that employs a combined penalty function. It is a regularized machine learning technique that comprises components of both Lasso and ridge (Khan et al. 2019), designed to address the issues of multicollinearity and overfitting that are common in big datasets. The technique works by including a penalty term in the ordinary least-squares objective function.

Partial least-squares regression

The PLS method is a strategy that reduces the predictors to a smaller set of uncorrelated components and conducts least squares on these components, instead of on the whole set of variables (Hanrahan et al. 2005).

Stacking regression

Stacking regression (Figure 3) is a prominent ensemble machine learning strategy, used here for predicting several nodes, with the aim of creating a new model and enhancing performance. Stacking trains numerous algorithms to address related problems and combines their outputs to generate a new algorithm with improved performance. Stacking regression involves stacking the outputs of distinct predictors and using a machine learning method to obtain the optimal estimate. It permits the strengths of every distinct estimator to be used by feeding their outputs as inputs to the final estimator (Anbananthen et al. 2021). It employs a meta-algorithm to integrate the best estimates from two or more basic algorithms. Cross-validation and non-negative least squares are used to obtain the stacking coefficients. It appears to be more efficient than conventional single ML procedures.
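A minimal sketch of a Lasso-based stacking regressor built with scikit-learn's StackingRegressor; the particular base estimators and hyperparameters are assumptions for illustration, not necessarily those used in the study.

```python
from sklearn.ensemble import StackingRegressor, RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.svm import SVR

# Distinct base predictors whose outputs are stacked (an illustrative combination).
base_estimators = [
    ("rf", RandomForestRegressor(n_estimators=100)),
    ("gbr", GradientBoostingRegressor()),
    ("svr", SVR(kernel="rbf")),
]

# Lasso-based stacking regressor: the meta-learner (final estimator) is Lasso,
# trained on cross-validated predictions of the base estimators.
lasso_sr = StackingRegressor(
    estimators=base_estimators,
    final_estimator=Lasso(alpha=0.1),  # illustrative shrinkage value
    cv=10,                             # internal cross-validation for the base predictions
)
# lasso_sr.fit(X_train, y_train); lasso_sr.predict(X_test)
```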

Model training

A typical aim in machine learning is to investigate and develop methods that learn from and predict data values (Kohavi 1998). Such methods work by producing data-driven predictions or assessments (Bishop 2006) by developing a mathematical model from incoming data. The input data required to develop the model are often separated into several datasets. Specifically, three datasets are frequently employed at various phases of model development: training, validation, and test sets. Using Python, the input ‘X’ data are divided into two parts, training (with k-fold validation) and testing, with the training group taking up 0.8 of the data and the test group taking up the remaining 0.2.
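The 80/20 split described above can be reproduced with scikit-learn as sketched below; the placeholder arrays and the random seed are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 3)   # placeholder for the normalized explanatory variables
y = np.random.rand(100)      # placeholder for the normalized water demand

# 80% of the samples for training (and k-fold validation), 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```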

k-fold cross-validation

When a model is fitted to the training group and validated using the test group, it is prone to overfitting, which implies that it functions excellently on observed data but poorly on unobserved data. To alleviate this problem, a 10-fold cross-validation (CV) was applied to the training datasets. The k-fold CV divides the training data into 10 distinct subgroups, each constituting one fold of the total. Each model is then trained on nine folds and tested on the remaining fold, repeated 10 times with the validation fold changing between iterations. The mean prediction result and standard deviation are calculated from the 10 fold scores.
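A sketch of the 10-fold cross-validation step for one candidate model; the placeholder data, the shuffling, and the random seed are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.ensemble import RandomForestRegressor

X_train = np.random.rand(80, 3)   # placeholder training features
y_train = np.random.rand(80)      # placeholder training target

# 10-fold CV: each model is trained on nine folds and validated on the
# held-out fold, rotating the validation fold ten times.
cv = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(RandomForestRegressor(), X_train, y_train, scoring="r2", cv=cv)
print(scores.mean(), scores.std())   # mean R2 and its standard deviation over the folds
```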

Model testing

The test data consisted of the remaining data, which made up 20% of the total dataset. The predicted results from the preceding phase are compared with the actual outcomes. Several evaluation metrics were computed to determine each model's performance: mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R2). These measures were chosen because they are commonly used in studies on demand forecasting (Villarin & Rodriguez-Galiano 2019; Lee & Derrible 2020; Mu et al. 2020; Pesantez et al. 2020).

The difference between the actual and anticipated values for each sample was measured using MSE and MAE, as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2 \tag{3}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right| \tag{4}$$
where $\hat{y}_i$ represents the projected result of the i-th sample, $y_i$ is the actual value, and n is the total number of samples. Lower MSE and MAE imply that the model fits better.
R2 represents the goodness of fit. The proportion of variation explained by the explanatory variables employed in a model is calculated as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n}\left( y_i - \bar{y} \right)^2} \tag{5}$$
where $\hat{y}_i$ and $y_i$ are the projected and observed data for the i-th sample, respectively, and $\bar{y}$ is the mean of all observed data. The fraction of explained variation reveals how accurately unknown samples are anticipated. The best possible R2 value is 1.0.
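The three performance metrics can be computed as sketched below; the placeholder observed and predicted demands are assumptions.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_test = np.array([1.4, 0.9, 2.3, 1.1])   # placeholder observed demands
y_pred = np.array([1.3, 1.0, 2.2, 1.2])   # placeholder predicted demands

mae = mean_absolute_error(y_test, y_pred)   # Equation (4)
mse = mean_squared_error(y_test, y_pred)    # Equation (3)
rmse = np.sqrt(mse)                         # RMSE reported in Tables 1-4
r2 = r2_score(y_test, y_pred)               # Equation (5)
```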

The Peroorkada WDN considered in this study is operated on a regular 24-h basis. Resource planning and allocation are carried out every 24 h, largely to prevent data management problems. This long scheduling period for the WDS gives rise to a dynamic model. The inflows and outflows, heads, and flows of the system are computed and updated for the corresponding periods of time. At each consumer node, the outflow from the tank depends upon the tank head and the inflow from the main water distribution lines. The aforementioned demand dispatch circumstances cannot be accommodated by traditional steady-state models for WDSs, as they all consider demand-driven situations in which the demands are directly related to the WDS.

This gives the same response as extensive nonlinear modelling running over a longer time period. The performance of the proposed machine learning stacking methods may first be assessed by implementing the ML system on an artificial distribution test system. At a larger scale, the proposed methods are evaluated on a practical platform, a real-time water distribution system. The eastern water network segment of the town of Peroorkada in Trivandrum is considered as the real-time WDS. It primarily consists of commercial and residential nodes and includes 58 pipes connected to 55 nodes, 49 of which are observable. Figure 4 depicts the schematic diagram of the distribution network topology, with the complete network supplied by one source reservoir tank.
Figure 4

Pictorial representation of the Peroorkada residential WDS network.


Mathematical model of the water supply network

Forecasting the states and outcomes, for any estimation approach, requires a numerical model. The WDS implemented in this study operates on a 24-h basis. To minimize the problem of managing data over a very long and regular period, scheduling and resource planning occur every 24 h. The dynamic model of the system is a set of quasi-steady-state difference equations, a consequence of the wide time interval used for scheduling the WDS; this is a comparatively better depiction of the dynamics of the WDS. During each specific time period, the flow and head values are computed from the inflows and outflows and the available heads. The result is equivalent to the comprehensive nonlinear models over a longer time (Mohamed Hussain et al. 2023). A traditional pressure-dependent demand dispatch (PDDD) (Sankaranarayanan et al. 2018) is significant in this WDS context, in which each civil structure (consumer node) includes an outflow from a reservoir tank, determined by the inflow from the distribution lines and the head of the tank. Additional modelling of the reservoir at the consumer nodes is used to simulate PDDD, because standard steady-state models are unsuitable for this WDS: they consider direct demand-driven conditions for all demand dispatch settings.

The modelling of the WDN used four equivalent sets of equations: (i) the continuity equation (flow balance), (ii) the head balance equation in the WDS, (iii) the flow and head loss correlation, and (iv) the reservoir model and their related points.

Continuity equation (flow balance)

The continuity equation for any node in the WDS is given by Equation (6).
$$\sum_{i} Q_{ij} - \sum_{k} Q_{jk} = d_j, \qquad j = 1, 2, \ldots, n \tag{6}$$
where i denotes the index of the node from which the flow starts, j is the corresponding node where the flow terminates, Q denotes the flow in the pipe, $d_j$ is the demand at node j, and n is the total number of nodes for which measurements are acquired.

Head balance equation in the WDS

The head balance equation is generally used to evaluate the effect of pressure distribution in a loop of WDS. The head balance is given by Equation (7).
$$\sum_{(i,j)\,\in\,\mathrm{loop}} \Delta h_{ij} = 0, \qquad \Delta h_{ij} = h_i - h_j \tag{7}$$
where $\Delta h_{ij}$ denotes the deviation between the heads of nodes i and j, given by $h_i - h_j$, and h is the head of the corresponding node.

Flow and head loss correlation

The Hazen–Williams (H–W) equation provides the empirical correlation between the change in head with the flow through the pipe and is given by Equation (8).
$$\Delta h_{ij} = K \, Q_{ij}^{1.852} \tag{8}$$
where K is the H–W coefficient correlating the flow and the head loss, which is given by Equation (9).
$$K = \frac{10.67 \, L}{C^{1.852} \, D^{4.87}} \tag{9}$$
where C is the pipe roughness coefficient, D is the interior diameter of the flow pipeline (m), and L is the length of the pipe in the network (m).
The pressure drops across the valves are also considered, since the network utilizes valves as the final control elements to regulate the water through the network (Wong et al. 2010); the valve head loss is given by Equation (10).
$$h_v = \mathrm{SG}\left( \frac{Q}{C_v \, x_v} \right)^2 \tag{10}$$
where $h_v$ is the head loss across the valve, Q is the flow rate through the valve, $C_v$ is the valve flow coefficient (for a fully opened valve), $x_v$ is the valve span, and SG is the specific gravity of the water flowing through the valve.
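A small sketch of the Hazen–Williams head loss of Equations (8)–(9); the SI constant 10.67 and the example pipe values are standard-form assumptions rather than figures from the study.

```python
def hazen_williams_head_loss(Q, C, D, L):
    """Head loss (m) for flow Q (m^3/s) in a pipe of roughness coefficient C,
    inner diameter D (m) and length L (m), per Equations (8)-(9).
    The SI constant 10.67 is the standard Hazen-Williams value."""
    K = 10.67 * L / (C ** 1.852 * D ** 4.87)   # Equation (9)
    return K * Q ** 1.852                       # Equation (8)

# Illustrative pipe: 500 m long, 200 mm diameter, C = 130, carrying 0.02 m^3/s.
print(hazen_williams_head_loss(Q=0.02, C=130, D=0.2, L=500.0))
```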

Reservoir tank modelling

As previously indicated, reservoir tank levels must be simulated since they serve as an intermediary storage element between the consumer node and the distribution network. The reservoir tanks’ water level for the corresponding time step is considered to be a steady-state solution. The computed solution is used to get further updates of the solution. Thus, these differential equations are projected to the difference equations using the quasi-steady state model provided by Equation (11)
$$h_i^{t}(k+1) = h_i^{t}(k) + \frac{m}{A_i}\left[ Q_{\mathrm{in},i}(k) - Q_{\mathrm{out},i}(k) \right] \tag{11}$$
where $h_i^{t}$ is the water level in the storage tank, $Q_{\mathrm{in},i}$ is the water input from the WDS to the tank, $Q_{\mathrm{out},i}$ is the corresponding outflow, m is the time step between the sampled instances, and $A_i$ is the storage reservoir tank's cross-sectional area for node i. Some of the consumer node reservoir tanks interact with each other, and the corresponding difference equations for the head level are given by Equation (12).
(12)
where the valve opening coefficient, which lies within 0–1, relates the head levels of the interacting tanks.
(12a)
In general, the consumption rates of the consumer nodes vary strongly in pattern, which makes the supply sometimes either deficient or surplus compared with the consumption rate. The heads of the tanks are affected by the varying demand pattern: the actual demand flow can be obtained if the head of the tank is sufficient; otherwise, the demand is supplied with the available head in the tanks. The numerical illustration of the demand is given by Equation (13).
$$Q_{d,i}(k) = \min\!\left( d_i^{\mathrm{req}}(k),\; c_{v,i}\sqrt{h_i^{t}(k)} \right) \tag{13}$$
where $d_i^{\mathrm{req}}$ is the required demand flow if the tank's head and the inflow from the WDN are sufficient to supply the required demand set point, and $c_{v,i}$ is the outlet valve coefficient. The extended-period simulation of the abovementioned model (Equations (11)–(13)) captures the dynamic characteristics of the WDS and is used to estimate the missing data.
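A minimal sketch of the quasi-steady-state tank level update in the spirit of Equation (11); the function and variable names and the example numbers are assumptions.

```python
def update_tank_level(h, q_in, q_out, area, dt):
    """Quasi-steady-state update of the reservoir tank level, in the spirit of
    Equation (11): the level changes by the net inflow volume over the time
    step divided by the tank cross-sectional area. Names are illustrative."""
    return h + dt * (q_in - q_out) / area

# One 6-h step (21,600 s) for a 10 m^2 tank with a small net inflow.
h_next = update_tank_level(h=2.5, q_in=0.0101, q_out=0.010, area=10.0, dt=21600)
print(h_next)  # 2.5 + 21600 * 0.0001 / 10 = 2.716 m
```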

The considered WDS is the town's residential sector, with the majority of nodes being residential (individual houses and gated community complexes) or commercial. The demands are considered unknown quantities, and estimating them is essential for proper water management. If end-user demand is directly associated with a reservoir tank without an exit valve, the outflow is proportional to the demand and affects the tank head. If the consumer's water supply is linked to the tank through a regulating valve, the head of the tank is determined by the working position of the valve. Thus, exit valves are regarded as an important feature influencing the kinetics of the outflow. Since the rate of change of the exit valve coefficient is kept almost constant in nominal cases, the working conditions of the valves are often not measured (Sankaranarayanan et al. 2018). The reservoir tank model at demand nodes (non-interacting and interacting) is depicted in Figure A1. The structural complexity and the unobservable states lead to ambiguity about the operating condition of the valve coefficients. Furthermore, the diameter of the pipes is also considered an important parameter, since the WDS is vulnerable to sediment contamination through ageing and pipe leaks affect the dynamics of the flow. The considered section of the WDS consists of 55 PDDD consumer nodes, of which 6 nodes are directly connected to the demand points and the other 49 are connected through 49 valves. The parameters to be estimated are the coefficients of the 49 valves connected to the reservoir tanks, 6 demand patterns, and the diameters of the 55 pipes. The demand pattern is dynamic in nature, whereas the considered optimization is a static approach. In order to resolve this discrepancy, the dynamic demand profile is converted to a static parameter for the estimation.

Modelling of demand

The consumer demand profile varies depending on the customer type. In general, demand profiles are categorized into three broad types (Letting et al. 2017): (i) residential (individual houses and gated community flats), (ii) commercial establishments, and (iii) industrial needs. An individual profile is supplied for every 24 h, from the 0th hour to the 24th hour, and the profile can be divided into two components (Vafamand & Khayatian 2018). To calculate the real demand flow, the demand gain at each sample instant is multiplied by the base demand multiplier, as shown in Equation (14). Since the demand gain curve pattern is consistent for each customer, the static base demand multiplier is the sole component that must be calculated and optimized using the optimization technique.
$$d_i(k) = g_i(k) \, b_i \tag{14}$$
where $d_i(k)$ represents the demand flow of the i-th node at the k-th step, $g_i(k)$ represents the demand curve gain for the i-th node at the k-th sample instant, and $b_i$ represents the base demand multiplier for the i-th node.
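A short sketch of Equation (14), multiplying an assumed demand gain curve by a base demand multiplier; all numeric values are illustrative assumptions.

```python
import numpy as np

# Illustrative 24-h demand gain curve sampled every 6 h (values are assumptions).
gain_curve = np.array([0.6, 1.4, 1.1, 0.9, 0.6])   # 0, 6, 12, 18, 24 h

base_demand_multiplier = 1.8   # the static parameter to be estimated per node

# Equation (14): actual demand flow = demand gain at each sample instant
# multiplied by the node's base demand multiplier.
demand_flow = gain_curve * base_demand_multiplier
```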
The six demand node patterns, the diameters of the 55 pipelines, and the coefficients of the 49 nodes' outlet valves are the parameters required to be obtained, as previously stated. The WDS is an underdetermined system with infinitely many solutions, since the number of observed head level values (available measurements) is less than the number of parameters to estimate. Therefore, the estimation problem is converted into an optimization problem using an appropriate objective function in order to find the most optimal solution among the available possibilities. The optimal parameter vector can only be obtained by minimizing the considered objective function. The objective function is taken as the SSE between the measured and estimated head levels, as given by Equation (15).
$$J = \sum_{i}\sum_{k}\left[ h_i^{\mathrm{meas}}(k) - h_i^{\mathrm{est}}(k) \right]^2 \tag{15}$$
where i denotes the corresponding node and k denotes the corresponding time step (or, more precisely, the sample instant). The measured values are tainted with uncertainty, posing a challenge to all employed optimization methods in determining the best settings.

Model training of WDS

This experiment uses Python as the development language, selects data from the Peroorkada urban water distribution network as the training set, and uses Scikit-Learn to build an experimental model framework for training. The data on all nodes at the particular timings mentioned in Section 3.2 have been predicted, and the vulnerable nodes were determined by the sensor placement toolkit in MATLAB. The 24-h demand data were taken as training data for the six vulnerable nodes, and the prediction for the vulnerable nodes was carried out. The sensor placement toolkit results are shown in Figure 5. The results obtained, such as regression score plots, prediction plots (actual vs. predicted test data) (Figure 6), and residual plots (Figure 7), are presented (Mitchell 1997) in Section 5.2.
Figure 5

Determination of vulnerable nodes by the sensor placement toolkit.


Predictive results

Regression plots

From the R2 test plot, the performance of the models was determined. In an ideal model, the points should lie close to the diagonal line (Kohavi 1998). A total of 13 machine learning models, namely RF, gradient boosting regression (GBR), GPR, linear regression, Lasso regression, ridge regression, DT, support vector regression (SVR), ENR, PLS, Lasso-based stacking regressor, ridge-based stacking regressor, and linear-based stacking regressor, were modelled. The predicted vs. actual plot (R2 plot) was plotted for each node at 6-h intervals, at exactly 0, 6, 12, 18, and 24 h, and the predictions for nodes 106, 113, 123, 150, 1,041, and 1,057 were plotted. Figures 8 and 9 show the R2 plots for the abovementioned nodes and time intervals. For each node mentioned above, the goodness of fit (R2) is higher for the Lasso-based stacked regressor (Brentan et al. 2017; Anbananthen et al. 2021) than for the other regression models.
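A sketch of how a predicted vs. actual (R2) plot of this kind can be produced; the synthetic demands are assumptions used only to illustrate the diagonal reference line.

```python
import numpy as np
import matplotlib.pyplot as plt

y_test = np.random.rand(30)                      # placeholder observed demands
y_pred = y_test + np.random.normal(0, 0.05, 30)  # placeholder predictions

# Predicted vs. actual plot: a well-performing model keeps the points
# close to the 45-degree diagonal.
plt.scatter(y_test, y_pred)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
plt.plot(lims, lims, linestyle="--")
plt.xlabel("Actual demand")
plt.ylabel("Predicted demand")
plt.show()
```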

Train and test results

The blue and red bars in Figures 10 and 11 correspond to the training and test results of the 13 regression models, which include RF, GBR, GPR, linear regression (LR), Lasso, ridge, DT, SVR, ENR, PLS, the Lasso-based stacked regressor (Lasso SR), the ridge-based stacked regressor (Ridge SR), and the linear-based stacked regressor (Linear SR). Linear regression was used as the reference model for all numerical models (Guo et al. 2018).
Figure 6

Actual vs. predicted plots for Lasso-based stacking regressor.

Figure 7

Residual plots for Lasso-based stacking regressor.

Figure 8

R2 plots for all nodes at time 0:00 h to 24:00 h with 6-h intervals.

Figure 9

R2 plots for nodes 106, 113, 123, 150, 1,041, and 1,057 for 24 h.

Figure 10

Train and test bars for all the nodes at time 0:00 h to 24:00 h with 6-h intervals.

Figure 11

Train and test bars for nodes 106, 113, 123, 150, 1,041, and 1,057 for 24 h.

Figure 12

Regression R2 plots for Lasso-based stacking regressor.


In the training R2 results for all nodes at 0, 6, 12, 18, and 24 h (6-h interval), comparatively large standard deviations were noticed for DT and SVR, which indicates that the R2 results were widely distributed across the 10 folds of these methods, implying that the algorithms' performance in each fold deviated considerably from the mean value. Furthermore, with smaller standard deviations (Shuang & Zhao 2021), the R2 values of RF, GBR, and the three stacked regressor methods were in the top five. The test R2 findings for all nodes at 0, 6, 12, 18, and 24 h with a 6-h interval showed relatively large standard deviations for DT and SVR, indicating lower R2 values. In addition, higher test R2 values were observed for RF, Lasso, Linear, and the three stacked regressor models, which ranked in the top five. The best R2 values were obtained for the Lasso-based stacked regressor (Figure 12) (Lasso SR), with values of 0.789 for nodes at 0:00 h, 0.98 at 6:00 h, 0.967 at 12:00 h, 0.881 at 18:00 h, and 0.79 at 24:00 h.

In the training R2 results for nodes 106, 113, 123, 150, 1,041, and 1,057, SVR showed relatively large standard deviations, which indicates that its R2 results were widely distributed across the 10 folds, implying that the algorithm's performance in each fold deviated considerably from the mean value (Pacchin et al. 2019). Furthermore, with smaller standard deviations, the R2 values of RF, GBR, DT, and the three stacked regressor models were in the top six. In the test R2 results for nodes 106, 113, 123, 150, 1,041, and 1,057, relatively large standard deviations were observed for SVR, indicating lower R2 values. In addition, higher test R2 values (Herrera et al. 2010) were observed for GBR, DT, and the three stacked regressor models, which ranked in the top five. The best R2 values were obtained for the Lasso-based stacked regressor (Lasso SR), with values of 0.806 for node 106, 0.978 for node 113, 0.981 for node 123, 0.824 for node 150, 0.61 for node 1,041, and 0.708 for node 1,057. The tables for individual nodes are provided in the Supplementary File.

Tables 1 and 2 show the outcomes attained by the machine learning models over the training set and the test set, respectively, for all nodes at 6-h intervals. The optimal values for each measure are marked. These tables depict the mean R2 values for training and testing across all models. The separate R2 values for each node are discussed in Section 5.2.2.

Table 1

Comparison of performance criteria for training results of nodes at 6-h intervals

Regression models    MAE    MSE    Root mean squared error (RMSE)    R2
Trained results RF 1.256 1.206 1.098 0.962 
GBR 0.328 0.028 0.168 1.000 
GPR 5.808 8.844 2.974 0.890 
Linear 5.800 8.662 2.943 0.907 
Lasso 5.788 8.768 2.961 0.891 
Ridge 5.810 8.713 2.952 0.892 
DT 0.306 0.020 0.141 1.000 
SVR 5.754 18.364 4.285 0.784 
ENR 5.792 8.780 2.963 0.891 
PLS 6.274 9.908 3.148 0.879 
Lasso-based stacked regressor 0.322 0.025 0.159 0.996 
Ridge-based stacked regressor 0.304 0.027 0.164 1.000 
Linear-based stacked regressor 0.308 0.028 0.169 1.000 
Table 2

Comparison of performance criteria for test results of nodes at 6-h intervals

Regression models    MAE    MSE    RMSE    R2
Test results RF 4.124 8.596 2.932 0.821 
GBR 3.530 7.634 2.763 0.853 
GPR 7.012 9.930 3.151 0.818 
Linear 7.188 10.346 3.216 0.832 
Lasso 7.012 10.012 3.164 0.816 
Ridge 7.122 10.438 3.231 0.808 
DT 4.700 18.873 4.344 0.645 
SVR 7.078 15.269 3.908 0.732 
ENR 7.030 10.034 3.168 0.816 
PLS 7.290 10.111 3.180 0.815 
Lasso-based stacked regressor 3.522 7.592 2.755 0.881 
Ridge-based stacked regressor 3.528 7.608 2.758 0.855 
Linear-based stacked regressor 3.522 7.610 2.759 0.855 

Tables 3 and 4 present the outcomes attained by the machine learning models over the training set and the test set, respectively, for the nodes 106, 113, 123, 150, 1,041, and 1,057. The optimal values for each measure are marked. These tables depict the mean R2 values for training and testing across all models. The separate R2 values for each node are discussed in Section 5.2.2.

Table 3

Comparison of performance criteria for training results of abovementioned nodes

Regression models    MAE    MSE    RMSE    R2
Trained results RF 1.585 0.991 0.996 0.951 
GBR 0.073 0.005 0.068 1.000 
GPR 6.513 8.520 2.919 0.476 
Linear 4.800 5.164 2.272 0.615 
Lasso 5.175 5.967 2.443 0.562 
Ridge 5.107 5.942 2.438 0.581 
DT 0.168 0.011 0.105 1.000 
SVR 7.503 17.349 4.165 0.368 
ENR 5.175 6.074 2.465 0.570 
PLS 5.100 5.943 2.438 0.581 
Lasso-based stacked regressor 0.043 0.009 0.096 0.971 
Ridge-based stacked regressor 0.035 0.005 0.070 1.000 
Linear-based stacked regressor 0.172 0.007 0.083 0.994 
Table 4

Comparison of performance criteria for test results of abovementioned nodes

Regression models    MAE    MSE    RMSE    R2
Test results RF 2.438 12.888 3.242 0.575 
GBR 3.118 42.135 5.324 0.686 
GPR 6.535 86.913 7.276 0.481 
Linear 5.052 53.145 6.046 0.512 
Lasso 5.648 61.315 6.424 0.547 
Ridge 5.583 61.470 6.409 0.545 
DT 1.638 6.480 2.126 0.629 
SVR 8.845 202.965 11.324 0.246 
ENR 5.582 60.807 6.376 0.557 
PLS 5.588 61.568 6.416 0.544 
Lasso-based stacked regressor 3.323 41.798 5.373 0.818 
Ridge-based stacked regressor 3.137 42.092 5.338 0.681 
Linear-based stacked regressor 3.225 42.275 5.365 0.677 

Water shortages and waste are limited or eliminated through demand forecasting and the equitable supply of local water resources. The foremost goal of this research study is to use a prediction model to estimate water demand in the Peroorkada urban water distribution system. The prediction features included attributes linked to the water demand, such as head, pressure, and base demand. Python was used to create the prediction model. There are 99 nodes in the water distribution network. Demand at 6-h intervals was predicted and displayed for all of the nodes, and the sensor placement toolkit was used to identify the vulnerable nodes, whose demand was predicted over a 24-h interval.

A total of 13 machine learning algorithms were used in this work, three of which were hybrid/stacked regression methods. Demand prediction is best achieved with the Lasso-based stacking regressor model. Stacking regressor models outperformed single prediction models. This is because they are able to combine the effectiveness of different individual regressors. These models can improve predictive performance and handle more complex tasks than single models by accommodating a variety of data structures.

Two other areas with distinct topographies, climates, and economies were used to further verify the Lasso-based stacked regressor model. In most of the situations, the predictive accuracies of the Lasso-based stacked regressor exceeded 80%. This establishes the model's robustness. Other locations can utilize the same approach with the same explanatory attributes to forecast water demand. The study's findings will help municipalities and utilities optimize the association between water distribution and water demand and decrease uncertain water scheduling costs by merging their own databases with commercial and industrial demand data to maintain the water demand forecast. Further segmentation of the demand for water, with an examination of the key factors influencing variations in the demand for water, is required in the future. Data on water supply should be incorporated into the design of multi-scenario water demand forecasts. It is important in the future to take into account extensive information on water demand, including that of various Indian provinces and cities. Since it decreases overall demand, recovered wastewater and polluted water may be factored into the prediction model.

The authors express their thanks, in particular, to the ‘MeitY (Ministry of Electronics and Information Technology), Government of India’ for granting this project work, as well as our institute, ‘National Institute of Technology, Tiruchirappalli (NITT), Tamil Nadu, India, under the Ministry of Human Resources Development (MHRD), Government of India’, for assistance in accomplishing the above work.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ali S., Wu K., Weston K. & Marinakis D. (2016) A machine learning approach to meter placement for power quality estimation in smart grid, IEEE Transactions on Smart Grid, 7, 1552–1561.
Anbananthen K. S. M., Subbiah S., Chelliah D., Sivakumar P., Somasundaram V., Velshankar K. H. & Khan M. (2021) An intelligent decision support system for crop yield prediction using hybrid machine learning algorithms, F1000Res, 10, 1143.
Archibald S., Roy D. P., Van Wilgen B. W. & Scholes R. J. (2009) What limits fire? An examination of drivers of burnt area in Southern Africa, Global Change Biology, 15, 613–630.
Bakker M., Van Schagen K. & Timmer J. (2003) Flow control by prediction of water demand, Journal of Water Supply: Research and Technology – AQUA, 52 (6), 417–424.
Bakker M., Vreeburg J. H. G., Van Schagen K. M. & Rietveld L. C. (2013) A fully adaptive forecasting model for short-term drinking water demand, Environmental Modelling & Software, 48, 141–151.
Bakkes J. A., Bosch P. R., Bouwman A. F., Eerens H. C., Den Elzen M. G. J., Isaac M., Janssen P. H. M., Goldewijk K. K., Kram T., De Leeuw F. A. A. M. & Olivier J. G. J. (2008) Background Report to the OECD Environmental Outlook to 2030: Overviews, Details, and Methodology of Model-Based Analysis. The Hague, Netherlands: Netherlands Environmental Assessment Agency.
Baotić M., Vašak M. & Banjac G. (2015) Adaptable urban water demand prediction system, Water Supply, 15, 958–964.
Bishop C. (2006) Pattern Recognition and Machine Learning, Vol. 2. New York, NY: Springer, pp. 531–537.
Bougadis J., Adamowski K. & Diduch R. (2005) Short-term municipal water demand forecasting, Hydrological Processes: An International Journal, 19, 137–148.
Braun M., Bernard T., Piller O. & Sedehizade F. (2014) 24-hours demand forecasting based on SARIMA and support vector machines, Procedia Engineering, 89, 926–933.
Breiman L. (2001) Random forests, Machine Learning, 45, 5–32.
Brentan B. M., Luvizotto E. Jr, Herrera M., Izquierdo J. & Pérez-García R. (2017) Hybrid regression model for near real-time urban water demand forecasting, Journal of Computational and Applied Mathematics, 309, 532–541.
Buitinck L., Louppe G., Blondel M., Pedregosa F., Mueller A., Grisel O., Niculae V., Prettenhofer P., Gramfort A., Grobler J. & Layton R. (2013) API design for machine learning software: Experiences from the scikit-learn project. arXiv preprint arXiv:1309.0238.
Darling E. S., Alvarez-Filip L., Oliver T. A., Mcclanahan T. R., Cote I. M. & Bellwood D. (2012) Evaluating life-history strategies of reef corals from species traits, Ecology Letters, 15, 1378–1386.
Derrible S. (2016) Urban infrastructure is not a tree: Integrating and decentralizing urban infrastructure systems, Environment and Planning B: Urban Analytics and City Science, 44, 553–569.
Donkor E. A., Mazzuchi T. A., Soyer R. & Roberson J. A. (2014) Urban water demand forecasting: Review of methods and models, Journal of Water Resources Planning and Management, 140, 146–159.
Fricke K. (2013) Analysis and Modelling of Water Supply and Demand Under Climate Change, Land Use Transformation and Socio-Economic Development: The Water Resource Challenge and Adaptation Measures for Urumqi Region, Northwest China. Heidelberg, Germany: Springer Science & Business Media.
Friedman J. H. (2001) Greedy function approximation: A gradient boosting machine, Annals of Statistics, 29, 1189–1232.
Golshani N., Shabanpour R., Mahmoudifard S. M., Derrible S. & Mohammadian A. (2018) Modeling travel mode and timing decisions: Comparison of artificial neural networks and copula-based joint model, Travel Behaviour and Society, 10, 21–32.
Greve P., Kahil T., Mochizuki J., Schinko T., Satoh Y., Burek P., Fischer G., Tramberend S., Burtscher R., Langan S. & Wada Y. (2018) Global assessment of water challenges under uncertainty in water scarcity projections, Nature Sustainability, 1, 486–494.
Guo G., Liu S., Wu Y., Li J., Zhou R. & Zhu X. (2018) Short-term water demand forecast based on deep learning method, Journal of Water Resources Planning and Management, 144, 04018076.
Hanrahan G., Udeh F. & Patil D. G. (2005) Chemometrics and Statistics: Multivariate Calibration Techniques. Limerick, Ireland: Elsevier.
Hastie T., Tibshirani R. & Friedman J. (2009) Model assessment and selection, in: The Elements of Statistical Learning. New York, NY: Springer.
Herrera M., Torgo L., Izquierdo J. & Pérez-García R. (2010) Predictive models for forecasting hourly urban water demand, Journal of Hydrology, 387, 141–150.
Hoerl A. E. & Kennard R. W. (1970) Ridge regression: Applications to nonorthogonal problems, Technometrics, 12, 69–82.
House-Peters L. A. & Chang H. (2011) Urban water demand modeling: Review of concepts, methods, and organizing principles, Water Resources Research, 47, W05401.
House-Peters L., Pratt B. & Chang H. (2010) Effects of urban spatial structure, sociodemographics, and climate on residential water consumption in Hillsboro, Oregon, JAWRA Journal of the American Water Resources Association, 46, 461–472.
Khan M. H. R., Bhadra A. & Howlader T. (2019) Stability selection for Lasso, ridge and elastic net implemented with AFT models, Statistical Applications in Genetics and Molecular Biology, 18, 20170001.
Kohavi R. (1998) Glossary of terms, Machine Learning, 30, 271–274.
Lee D. & Derrible S. (2020) Predicting residential water demand with machine-based statistical learning, Journal of Water Resources Planning and Management, 146, 04019067.
Lee D., Derrible S. & Pereira F. C. (2018) Comparison of four types of artificial neural network and a multinomial logit model for travel mode choice modeling, Transportation Research Record: Journal of the Transportation Research Board, 2672, 101–112.
Lessmann S., Baesens B., Seow H.-V. & Thomas L. C. (2015) Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research, European Journal of Operational Research, 247, 124–136.
Li M., Finlayson B., Webber M., Barnett J., Webber S., Rogers S., Chen Z., Wei T., Chen J., Wu X. & Wang M. (2017) Estimating urban water demand under conditions of rapid growth: The case of Shanghai, Regional Environmental Change, 17, 1153–1161.
Makropoulos C., Butler D. & Rozos E. (2016) An integrated system dynamics – cellular automata model for distributed water-infrastructure planning, Water Supply, 16, 1519–1527.
Mitchell T. M. (1997) Machine Learning. New York, NY: McGraw-Hill.
Mohamed Hussain K., Sivakumaran N., Sankaranarayanan S., Radhakrishnan T. K. & Swaminathan G. (2023) An enhanced dynamic soft sensor–based online estimation of missing data for water distribution system with inherent disturbances, Transactions of the Institute of Measurement and Control, 45, 1579–1603.
Mouatadid S. & Adamowski J. (2016) Using extreme learning machines for short-term urban water demand forecasting, Urban Water Journal, 14, 630–638.
Mu L., Zheng F., Tao R., Zhang Q. & Kapelan Z. (2020) Hourly and daily urban water demand predictions using a long short-term memory based model, Journal of Water Resources Planning and Management, 146, 05020017.
Nascimento D. S. C., Coelho A. L. V. & Canuto A. M. P. (2014) Integrating complementary techniques for promoting diversity in classifier ensembles: A systematic study, Neurocomputing, 138, 347–357.
Pacchin E., Gagliardi F., Alvisi S. & Franchini M. (2019) A comparison of short-term water demand forecasting models, Water Resources Management, 33, 1481–1497.
Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V. & Vanderplas J. (2011) Scikit-learn: Machine learning in Python, The Journal of Machine Learning Research, 12, 2825–2830.
Pesantez J. E., Berglund E. Z. & Kaza N. (2020) Smart meters data for modeling and forecasting water demand at the user-level, Environmental Modelling & Software, 125, 104633.
Sankaranarayanan S., Sivakumaran N., Radhakrishnan T. K. & Swaminathan G. (2017) Soft sensor based estimation of process parameters and states with hybridized Grey Wolf Optimizer, American Control Conference (ACC), pp. 1892–1897.
Sankaranarayanan S., Sivakumaran N., Radhakrishnan T. K. & Swaminathan G. (2018) Metaheuristic-based approach for state and process parameter prediction using hybrid grey wolf optimization, Asia-Pacific Journal of Chemical Engineering, 13, e2215.
Sankaranarayanan S., Sivakumaran N., Radhakrishnan T. K. & Swaminathan G. (2019) Water hammer vibrations in pressurized water distribution system: Conceptual development of dynamic model and validation, Asia-Pacific Journal of Chemical Engineering, 14, e2345.
Sankaranarayanan S., Sivakumaran N., Radhakrishnan T. K. & Swaminathan G. (2020) Dynamic soft sensor based parameters and demand curve estimation for water distribution system: Theoretical and experimental cross validation, Control Engineering Practice, 102, 104544.
Sengupta A., Hawley R. J. & Stein E. D. (2018) Predicting hydromodification in streams using nonlinear memory-based algorithms in Southern California streams, Journal of Water Resources Planning and Management, 144, 04017079.
Sun Y., Liu N., Shang J. & Zhang J. (2017) Sustainable utilization of water resources in China: A system dynamics model, Journal of Cleaner Production, 142, 613–625.
Tian T. & Xue H. (2017) Prediction of annual water consumption in Guangdong Province based on Bayesian neural network, IOP Conference Series: Earth and Environmental Science, 69.
Tibshirani R. (1996) Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267–288.
Villarin M. C. & Rodriguez-Galiano V. F. (2019) Machine learning for modeling water demand, Journal of Water Resources Planning and Management, 145, 04019017.
Voyant C., Notton G., Kalogirou S., Nivet M.-L., Paoli C., Motte F. & Fouilloy A. (2017) Machine learning methods for solar radiation forecasting: A review, Renewable Energy, 105, 569–582.
Xia Y., Liu C., Li Y. & Liu N. (2017) A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring, Expert Systems with Applications, 78, 225–241.
Zhao Y. & Chen X. J. (2014) Prediction model on urban residential water based on resilient BP learning algorithm, Applied Mechanics and Materials, 543–547, 4086–4089.
Zhou S. L., Mcmahon T. A., Walton A. & Lewis J. (2000) Forecasting daily urban water demand: A case study of Melbourne, Journal of Hydrology, 236, 153–164.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data