Estimating the capacity of small reservoirs is of great significance for water resources management. However, many widely distributed small reservoirs lack capacity information because of the high cost of field measurements. This study proposes a novel approach to estimate small reservoir capacity in hilly areas by using remote sensing and a Digital Elevation Model (DEM). The basic idea is to explore the relationship between influential factors (i.e., topographic and geomorphic parameters) and the capacity of measured reservoirs, and to establish a machine learning model based on particle swarm optimization–extreme learning machine (PSO–ELM) to estimate the capacity. The Mihe River basin in northern China is selected as a case study, and 101 measured reservoirs and six candidate influential factors are used to develop and test the model. The results show that five influential factors (i.e., the area of sub-catchment, the water surface area, the longest flow path of sub-catchment, the average slope of sub-catchment, and the average slope of buffer area) form the optimal combination, with the lowest difference between the measured and the estimated reservoir capacities. The results demonstrate that the proposed approach is a robust tool for estimating the capacity of small reservoirs in hilly areas.

  • A total of 123 small reservoirs are identified in the Mihe River basin above the Tanjiafang station by remote sensing and DEM.

  • Five influential factors are selected from the spheres of topography and geomorphology to estimate the reservoir capacity.

  • The estimation model based on particle swarm optimization–extreme learning machine (PSO–ELM) is a robust tool for estimating the small reservoir capacity in the hilly area.

Graphical Abstract

Reservoirs, regardless of their size, are of great significance for the comprehensive management of water resources in river basins (Votruba & Broza 1989; Leemhuis et al. 2009). Large- and medium-sized reservoirs play a critical role in flood control and disaster reduction, water supply and irrigation, hydropower generation, etc. Small reservoirs, with large numbers, are also indispensable in ensuring the safety of drinking water and food (Eilander et al. 2014).

Over the past decades, more than 847,000 reservoirs have been built worldwide, of which about 95% are small reservoirs, whose heights are less than 15 m from foundation to crest (Rosenberg et al. 2000; World Commission on Dams 2000; Chen et al. 2018). In China, as of 2016, there were 98,460 reservoirs, including 93,850 small reservoirs, which account for about 95% of the national total (Ministry of Water Resources 2016). In China, a small reservoir is defined as a reservoir with a capacity between 0.1 and 10 million m3.

Unlike large reservoirs, small reservoirs lack storage capacity monitoring information, and some even lack geographic location records (Liebe et al. 2009; Ogilvie et al. 2018). However, capacity is one of the most important factors determining the performance of small reservoirs. The lack of capacity information not only restricts decision-making on flood control and drought relief but also hinders the sustainable utilization of limited water resources (Meigh 1995). Although some projects have been carried out to monitor the capacity of reservoirs, for example, the International Commission on Large Dams (ICOLD), the Global Lakes and Wetlands Database (GLWD), and the Global Reservoir and Dam Database (GRanD), most of them target large reservoirs, and the location information of some reservoirs is absent (Gao 2015; Chen et al. 2016; Langhorst et al. 2019).

There are currently three methods to estimate reservoir capacity. The first method is to estimate reservoir storage by constructing the water surface area–capacity curve. The water surface area is obtained by extracting remote sensing data (Guan et al. 2021), and water depth is measured in the field (Meigh 1995; Liebe et al. 2005; Sawunyama et al. 2006; Rodrigues et al. 2012). The second method is remote sensing. In situ or low-altitude surveys based on Sonar and LiDAR sensors or depth meters can measure reservoir bathymetric information (Avisse et al. 2017). The third method is to reconstruct the underwater topography and establish a reservoir storage estimation model by extrapolating and interpolating underwater topography based on DEM data during the period of low water (Zhang et al. 2016) and remote sensing images according to the similarity between the underwater topography and the surrounding topography (Tseng et al. 2016; Getirana et al. 2018; Liu et al. 2020). The first two methods are time-consuming and limited to shallow waters with favorable visibility, which makes them infeasible for broad-scale applications. The third method is suitable for wide and shallow reservoirs with a large water surface area, whose underwater topography is traceable and easy to generalize, but its applicability to elevated natural reservoirs remains to be verified (Liu et al. 2020).

However, small reservoirs in hilly areas, which are generally constructed according to the natural environment, have complex underwater topography. These small reservoirs have a low water level during most periods of the year, so it is difficult to obtain continuous remote sensing images to match the water surface area and elevation data, which form the basic dataset of the third method. Therefore, an applicable and fast method for estimating reservoir capacity that does not rely on topography reconstruction can help managers obtain capacity information for the many small reservoirs in the basin. In addition, based on the regulation mode of small reservoirs, information such as reservoir storage can be further estimated, thereby providing a data basis for the comprehensive utilization of water resources in the basin.

Apart from the water surface area, the surrounding topography is also a key factor that determines reservoir capacity (Mehran et al. 2019; Fassoni-Andrade et al. 2020), which is confirmed by the third method. The capacity of a small reservoir is closely correlated with the topography of the basin where the reservoir is located (Yang et al. 2021), which offers a new idea for estimating the capacity of small reservoirs, that is, to estimate the capacity of small reservoirs by exploring the relationship between the multi-dimensional factors of topography surrounding small reservoirs and the reservoir capacity.

With great advantages in dealing with the nonlinear relationship, machine learning algorithms have been widely used in various fields such as the hydrology field. Typical machine learning algorithms include artificial neural network (ANN) (Tanty & Desmukh 2015; Filipova et al. 2022), extreme learning machine (ELM) (Atiquzzaman & Kandasamy 2016, 2018; Mouatadid & Adamowski 2017), particle swarm optimization–ELM (PSO–ELM) (Anupam & Pani 2020; Li et al. 2020; Zhu et al. 2020; Pham et al. 2021), support vector machines (SVM) (Asefa et al. 2006; Deka 2014), and so on.

Both ANN and SVM can approximate complex nonlinear relationships, and their detailed principles were described by Zurada et al. (1997) and Zhang et al. (2009), but ANN is more suitable for multi-dimensional data than SVM. However, some of ANN's shortcomings, such as long training time and a tendency to fall into local minima, lead to unsatisfactory simulation results. ELM is an improved type of feedforward neural network. Compared with the traditional single hidden layer feedforward neural network, ELM has the advantages of fast learning speed and good generalization performance (Huang et al. 2006, 2015).

However, ELM still has its own shortcomings: the input weights and hidden layer biases are assigned randomly and remain unchanged during the calculation. As a result, ELM requires more hidden layer neurons to ensure the simulation accuracy, which weakens its generalization ability. PSO is a widely used optimization algorithm, which can be used to optimize the input weights and hidden layer biases of ELM. Therefore, compared with ELM, PSO–ELM has better generalization ability and simulation accuracy. However, few studies have used PSO–ELM to estimate the capacity of small reservoirs, especially in Northern China, where such reservoirs are widely distributed and most of them were constructed in early times with limited topographical data. To fill this gap, PSO–ELM is applied to the estimation of the storage capacities of small reservoirs. To compare the accuracy of estimation, estimation models based on ANN and ELM are also established. Without constructing the underwater topography, these machine learning methods can explore the nonlinear relationship between topography and capacity based on the characteristic parameters of reservoirs and the basins where they are located. By making use of remote sensing and DEM, these methods have low requirements for model input parameters, which makes them applicable to estimating the capacity of small reservoirs in similar watersheds in hilly areas.

The specific steps are as follows: (1) the locations of small reservoirs were extracted by setting an appropriate threshold based on remote sensing images and DEM data; (2) the influencing factors of reservoir capacity were extracted and screened; (3) the machine learning model was established to explore the relationship between influencing factors and reservoir capacity, and then the estimation model was trained and calibrated to estimate the capacity of small reservoirs without data.

Study area

The Mihe River basin is located in eastern China, in the middle of Shandong Province and to the south of Taiyi mountain, the main mountain in Shandong Province. The Tanjiafang hydrological station is the main control station in the Mihe River basin. The river length above the Tanjiafang station is 90 km, and the catchment, with an area of 2,153 km2, is dominated by hilly terrain.

The catchment controlled by the Tanjiafang station is selected as the study area because it is a typical hilly area with many reservoirs, which is important for flood control in the Mihe River basin. It contains one large reservoir, three medium reservoirs, and a large number of small reservoirs, most of which lack location and capacity information. In addition, to ensure a sufficient number of training samples, this paper selects an additional reference basin. The reference basin is also located in the Taiyi mountains and has hydrogeological characteristics similar to those of the study area. The study area and reference area are shown in Figure 1.

Figure 1

The location of the study area and the reference area (the framed area in the upper right corner is the study area and the lower right corner is the reference area).

Data sources and processing

Three datasets are collected to support our study, including the DEM and Landsat datasets and the reservoirs’ information dataset. The ALOS DEM and Landsat 7 ETM SLC are the basic sources of the elevation and land-use types, which are obtained from the open-access databases. The measured records of reservoirs are collected from the Hydrology Centre of Shandong Province, including the reservoir location and the designed reservoir capacity. A total of 12 measured reservoirs in the study area and 89 measured reservoirs in the reference area are collected (Supplementary Tables S1 and S2).

The DEM data are pre-processed by filling sinks and reconditioning using ArcHydro Tools, and the Landsat data are pre-processed by radiometric calibration and atmospheric correction using ENVI 5.3. The pre-processed remote sensing data are then used for supervised classification to obtain the land-use types of the study area and reference area, including water body, construction land, green space, cultivated land, and greenhouse land. Google Earth Engine (GEE) is then used to identify and remove mountain shadow patches and to verify the position and shape of the water bodies. The land-use types of the study area and the reference area are shown in Figure 2.

Figure 2

The land-use types in the study area and the reference area.

Based on the verified water body patches, a water body area threshold of small reservoirs is set to screen small reservoirs in the study basin. When the area threshold is 0.002 km2, the number of small reservoirs in the Mihe River basin is 123, which is basically consistent with the number of small reservoirs in the study area mentioned in other documents (Shouguang Water Resources Bureau 2021). Therefore, for the study area, there are 12 small reservoirs with known capacity information and 111 small reservoirs with unknown capacity, which still need to be estimated.
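
The screening step above can be illustrated with a minimal Python sketch. The patch identifiers and areas are hypothetical values made up for illustration, and treating the threshold as an inclusive lower bound is an assumption; only the 0.002 km2 value comes from the text.

```python
# Hypothetical verified water-body patches: (patch_id, area in km^2).
# These values are illustrative only and are not data from the study.
patches = [("p1", 0.0005), ("p2", 0.0030), ("p3", 0.0020), ("p4", 0.1500)]

AREA_THRESHOLD_KM2 = 0.002  # area threshold reported in the text


def screen_reservoirs(patches, threshold=AREA_THRESHOLD_KM2):
    """Keep patches whose area is at least the threshold.

    Assumption: the threshold acts as an inclusive lower bound.
    """
    return [pid for pid, area in patches if area >= threshold]


candidates = screen_reservoirs(patches)
print(candidates)
```

In practice, the retained patches would still need to be separated from the known large and medium reservoirs, which this sketch does not attempt.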

The spatial distribution of the identified 123 reservoirs and their controlled sub-catchments in the study area are shown in Figure 3.

Figure 3

Spatial distribution of small reservoirs and their controlling sub-catchments in the study area.

Basic idea

This paper aims to develop an estimation approach to estimate the capacity of small reservoirs in hilly areas. The basic idea is to explore the relationship between the reservoir capacity and the surrounding topography and geomorphology and to select some appropriate influential factors to establish a machine learning model for estimating the reservoir capacity. We implemented the above-described idea of the estimation model using ANN, ELM, and PSO–ELM by following the flowchart as illustrated in Figure 4.

Figure 4

Flowchart of the proposed approach.

Six factors representing the surrounding topography and geomorphology are selected as the influential factors, including the area of sub-catchment controlled by the reservoirs (Asc), the water surface area of the reservoir (Aws), the longest flow path of sub-catchment (Lsc), the average slope of sub-catchment (Ssc), the average slope of buffer area (Sb), and the degree of relief of sub-catchment (Rsc). Asc reflects the rain harvesting area of the reservoir. The larger the Asc, the larger the storage capacity of the reservoir. Lsc affects the time of flood transition to the reservoir and further influences the designed capacity of the reservoir. Generally, the shorter the longest drainage distance in the sub-basin, the faster the flood will converge to the small reservoir. Ssc reveals the overall topography of the catchment area of the reservoir. Sb refers to the micro-terrain around the reservoir, that is, the terrain above the water surface and below the top of the reservoir dam. Aws refers to the extracted water body area. In general, the larger the water surface area, the greater the reservoir storage, and the closer the reservoir is to its full capacity. Rsc is the difference between the altitude of the highest point and the lowest point in the catchment, which is a macroscopic index reflecting the topographic characteristics of the catchment.

Artificial neural network

As one of the most widely used machine learning algorithms, the ANN is composed of input layer neurons (or nodes, units), one or more hidden layer neurons, and output layer neurons. The nodes of the input layer, the hidden layer, and the output layer are connected by line segments. Each connection is associated with a numeric number called weight. The output function of neuron i in the hidden layer is expressed as follows:
$$h_i = f\left(\sum_{j=1}^{N} w_{ij} x_j + \theta_i\right) \qquad (1)$$
where $f(\cdot)$ is called the activation (or transfer) function, and commonly used activation functions include sigmoid-type functions (the logistic function and the tanh function); $N$ is the number of input neurons; $w_{ij}$ are the weights, which are calculated iteratively by the gradient descent method; $x_j$ are the inputs to the input neurons; and $\theta_i$ are the threshold terms of the hidden neurons. For the traditional ANN, the weights need to be obtained iteratively by the gradient descent method, which has the disadvantages of slow operation speed, a tendency to fall into local optimal solutions, and overfitting.
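
As a minimal illustration of the hidden-neuron output described above, the following Python sketch computes the weighted sum of the inputs plus the threshold term and passes it through an activation function. The function names and the numbers are hypothetical, and adding (rather than subtracting) the threshold is an assumption about the sign convention.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_neuron_output(weights, inputs, threshold, activation=sigmoid):
    """Weighted sum of the inputs plus a threshold term, passed through the activation.

    Assumption: the threshold is added to the weighted sum.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + threshold
    return activation(z)


# Illustrative values only: the weighted sum is zero, so the logistic output is 0.5.
out_logistic = hidden_neuron_output([0.5, -0.5], [1.0, 1.0], 0.0)
# The same neuron with a tanh activation.
out_tanh = hidden_neuron_output([0.5, -0.5], [1.0, 1.0], 0.0, activation=math.tanh)
print(out_logistic, out_tanh)
```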

Extreme learning machine

ELM is a novel prediction method based on feedforward neural networks (Huang et al. 2006). The input weights and hidden layer biases of ELM can be chosen randomly if the activation functions in the hidden layer are infinitely differentiable. Similar to general ANNs, an ELM structure consists of an input layer, a hidden layer, and an output layer. The only parameters that need to be set by users are the activation function and the number of nodes in the hidden layer. For the training data $(X_j, V_j)$, with $X_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T \in R^n$ and $V_j = [v_{j1}, v_{j2}, \ldots, v_{jm}]^T \in R^m$, if $k$ is the number of nodes in the hidden layer and $g(x)$ is the activation function, the standard feedforward neural network is described in the following equation:
$$\sum_{i=1}^{k} \beta_i \, g(w_i \cdot X_j + b_i) = V_j, \quad j = 1, 2, \ldots, N \qquad (2)$$
where $N$ is the number of training samples; $w_i$ is the connection input weight between the input layer and the ith neuron of the hidden layer; $\beta_i$ is the connection output weight between the output layer and the ith neuron of the hidden layer; and $b_i$ is the threshold of the ith neuron in the hidden layer. Compared with the traditional feedforward neural network, the input weights and the hidden layer biases of ELM are assigned randomly, and the output weight matrix is calculated by the Moore–Penrose (MP) generalized inverse (Ahila et al. 2015). ELM is not only thousands of times faster than traditional learning algorithms, but also avoids some problems caused by gradient-based learning methods, such as local minima, stopping criteria, and learning rate (Zhu et al. 2005; Cao et al. 2010).
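
The ELM training procedure described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch, not the authors' implementation: the function names `elm_fit` and `elm_predict`, the sigmoid activation, the uniform range for the random weights, and the synthetic data are all assumptions.

```python
import numpy as np


def elm_fit(X, V, k, rng):
    """Train a single-hidden-layer ELM with k hidden nodes."""
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, k))  # random input weights (kept fixed)
    b = rng.uniform(-1.0, 1.0, size=k)                # random hidden-layer biases (kept fixed)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # hidden-layer output matrix (sigmoid)
    beta = np.linalg.pinv(H) @ V                      # output weights via Moore-Penrose inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply a trained ELM to new inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


# Synthetic data for illustration only (not the reservoir dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 3))
V = (X[:, 0] + 2.0 * X[:, 1] ** 2)[:, None]

W, b, beta = elm_fit(X, V, k=20, rng=rng)
pred = elm_predict(X, W, b, beta)
train_rmse = float(np.sqrt(np.mean((pred - V) ** 2)))
print(train_rmse)
```

Because the output weights come from a single least-squares solve, no iterative gradient descent is needed, which is the source of the speed advantage mentioned above.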

PSO–ELM for reservoir capacity estimation

Particle swarm optimization

PSO is an iterative optimization algorithm (Xu & Shu 2006). The basic idea of PSO is to search for the optimal solution to the problem through cooperation and information sharing among individuals in the group. Suppose there is a community including n particles, denoted as Y = (Y1, Y2, …, Yn). The ith particle is expressed as a D-dimensional vector Yi = [yi1, yi2, …, yiD]T, which gives the position of the ith particle in the D-dimensional search space; each particle also stores its fitness and velocity (Mategaonkar et al. 2018; Swathi & Elwha 2018).

The fitness value of a particle is the value calculated by bringing the current position of the particle into the objective function. The velocity of the particles determines the next flight direction and distance of the particles themselves. The velocity of the ith particle is Vi= [vi1, vi2, …, viD]T. According to the objective function, the fitness value corresponding to the particle position Yi is calculated to judge whether the current position is good or bad. During each iteration update process, the particle updates its position by tracking two ‘extreme values’. One is the optimal solution found by the particle itself called the individual extremum pbest. The other extremum is the optimal solution currently found by the entire population called the global extremum gbest. In each iteration of PSO, the particle velocity and position are updated as follows:
$$v_i^{k+1} = \omega v_i^{k} + c_1 r_1 \left(pbest_i - y_i^{k}\right) + c_2 r_2 \left(gbest - y_i^{k}\right) \qquad (3)$$
$$y_i^{k+1} = y_i^{k} + v_i^{k+1} \qquad (4)$$
where $v_i$ is the velocity of the ith particle and $y_i$ is its position; $k$ is the current iteration number; $\omega$ is the inertia coefficient; $c_1$ and $c_2$ are acceleration factors; and $r_1$ and $r_2$ are random numbers in the interval (0,1). $pbest_i$ and $gbest$ represent the two extreme values used to update the position of the particles, which are the optimal solution found by the ith particle and the optimal solution found by the entire population, respectively. The position and velocity are usually limited to $[-y_{max}, y_{max}]$ and $[-v_{max}, v_{max}]$ to prevent blind searching of particles.
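
The velocity and position updates can be sketched as follows. This is a minimal, generic PSO in Python with NumPy; the function name `pso_minimize`, the parameter values (inertia 0.7, acceleration factors 1.5, the position and velocity limits), and the sphere test function are illustrative assumptions, not settings taken from the study.

```python
import numpy as np


def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, y_max=5.0, v_max=1.0, seed=0):
    """Minimize `objective` with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(-y_max, y_max, size=(n_particles, dim))  # positions
    v = rng.uniform(-v_max, v_max, size=(n_particles, dim))  # velocities
    pbest = y.copy()                                         # individual best positions
    pbest_fit = np.array([objective(p) for p in y])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])  # global best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull towards pbest + pull towards gbest.
        v = w * v + c1 * r1 * (pbest - y) + c2 * r2 * (gbest - y)
        v = np.clip(v, -v_max, v_max)
        # Position update, limited to the search range.
        y = np.clip(y + v, -y_max, y_max)
        fit = np.array([objective(p) for p in y])
        improved = fit < pbest_fit
        pbest[improved] = y[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    return gbest, gbest_fit


# Illustrative objective: the sphere function, whose minimum is 0 at the origin.
best_pos, best_fit = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=3)
print(best_pos, best_fit)
```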

Particle swarm optimization–extreme learning machine

Due to insufficient generalization of ELM (Mahmood et al. 2017), PSO is used to optimize the input layer weight and the hidden layer bias of ELM (Xu & Shu 2006). The input weight and hidden layer bias are regarded as PSO particles. The specific steps are as follows (as shown in Figure 5):

Step 1: Data sorting and preprocessing. Extraction of six influential factors representing the surrounding topography and geomorphology Xi and the reservoir capacity Vi.

Step 2: Establishing the ELM model. The ELM model is established by using the datasets of (Xi, Vi).

Step 3: Start using PSO to optimize two parameters of ELM: the weights of input layers and the bias of hidden layers. Generate the initial population, select the appropriate number of particles, and determine the appropriate acceleration factors and the maximum number of iterations.

Step 4: For each particle in the population, ELM is used to calculate the output weight, initial fitness value, pbest, and gbest and determine whether it meets the condition of stopping iteration (the maximum number of iterations). Then, formulae (3)–(4) are used to update the velocity and position of all particles until the condition is met.
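
Steps 1–4 can be sketched as a compact Python example with NumPy. It is a hedged illustration only: each particle is assumed to encode the flattened input weights and hidden biases, the fitness is assumed to be the training RMSE (the text does not state the fitness function), and the function names, parameter values, and synthetic data are hypothetical.

```python
import numpy as np


def hidden(X, W, b):
    """Sigmoid hidden-layer output matrix."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def decode(particle, n_features, k):
    """Split a particle into input weights W and hidden biases b."""
    W = particle[: n_features * k].reshape(n_features, k)
    b = particle[n_features * k:]
    return W, b


def fitness(particle, X, V, n_features, k):
    """Assumed fitness: training RMSE of the ELM defined by this particle."""
    W, b = decode(particle, n_features, k)
    H = hidden(X, W, b)
    beta = np.linalg.pinv(H) @ V  # output weights from the ELM solution
    rmse = float(np.sqrt(np.mean((H @ beta - V) ** 2)))
    return rmse, beta


def pso_elm(X, V, k=5, n_particles=15, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    dim = n_features * k + k
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, V, n_features, k)[0] for p in pos])
    history = [float(pbest_fit.min())]  # best fitness after each iteration
    for _ in range(n_iter):             # stop at the maximum number of iterations
        gbest = pbest[np.argmin(pbest_fit)]
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1.0, 1.0)
        fit = np.array([fitness(p, X, V, n_features, k)[0] for p in pos])
        better = fit < pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        history.append(float(pbest_fit.min()))
    best = pbest[np.argmin(pbest_fit)]
    W, b = decode(best, n_features, k)
    _, beta = fitness(best, X, V, n_features, k)
    return W, b, beta, history


# Synthetic stand-in for (Xi, Vi): 60 samples with five input factors.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 5))
V = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2])[:, None]
W, b, beta, history = pso_elm(X, V)
print(history[0], history[-1])
```

The recorded `history` never increases, because each particle's individual best is replaced only when a better fitness is found.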

Extraction and analysis of the influential factors

Six influential factors, namely Asc, Aws, Ssc, Lsc, Sb, and Rsc, are extracted and analyzed statistically, as shown in Figure 6. Figure 6 shows that most of the influential factors follow a gamma distribution, with the exponential distribution being the next most common.

Figure 5

Calibration process of parameters of PSO–ELM for the estimation of small reservoir capacity.

Figure 6

The influential factors xij and reservoir capacity yj of 101 small reservoirs.

To analyze the rationality of the selected six influential factors, the correlation among the 101 sets of (xij, yj) was tested by the Kendall, Spearman, and Pearson correlation analysis methods. The results are shown in Table 1.
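
For readers who want to reproduce this kind of analysis, the three coefficients can be computed as in the following pure-Python sketch. The helper names and the five sample values are hypothetical and are not taken from the reservoir dataset; the Kendall coefficient is computed here as tau-b, which is an assumption about the variant used.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b from concordant, discordant, and tied pairs."""
    conc = disc = tie_x = tie_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                tie_x += 1
            elif dy == 0:
                tie_y += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    return (conc - disc) / math.sqrt((conc + disc + tie_x) * (conc + disc + tie_y))


# Hypothetical values: one factor (e.g., a water surface area) and a capacity.
factor = [0.01, 0.03, 0.02, 0.08, 0.05]
capacity = [0.2, 0.5, 0.6, 2.0, 1.1]
print(pearson(factor, capacity), spearman(factor, capacity), kendall_tau_b(factor, capacity))
```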

Table 1

Correlation analysis of six influential factors and reservoir capacity

| Method | Variable | Vrc | Asc | Aws | Lsc | Ssc | Sb | Rsc |
|---|---|---|---|---|---|---|---|---|
| Kendall | Vrc | 1.000 | 0.491 | 0.561 | 0.464 | 0.079 | −0.016 | 0.226 |
| | Asc | 0.491 | 1.000 | 0.466 | 0.819 | 0.234 | 0.081 | 0.418 |
| | Aws | 0.561 | 0.466 | 1.000 | 0.452 | −0.071 | −0.198 | 0.139 |
| | Ssc | 0.079 | 0.234 | −0.071 | 0.211 | 1.000 | 0.498 | 0.644 |
| | Lsc | 0.464 | 0.819 | 0.452 | 1.000 | 0.211 | 0.053 | 0.409 |
| | Sb | −0.016 | 0.081 | −0.198 | 0.053 | 0.498 | 1.000 | 0.272 |
| | Rsc | 0.226 | 0.418 | 0.139 | 0.409 | 0.644 | 0.272 | 1.000 |
| Spearman | Vrc | 1.000 | 0.669 | 0.751 | 0.635 | 0.123 | −0.039 | 0.339 |
| | Asc | 0.669 | 1.000 | 0.627 | 0.950 | 0.381 | 0.111 | 0.601 |
| | Aws | 0.751 | 0.627 | 1.000 | 0.613 | −0.098 | −0.294 | 0.215 |
| | Ssc | 0.123 | 0.381 | −0.098 | 0.351 | 1.000 | 0.684 | 0.839 |
| | Lsc | 0.635 | 0.950 | 0.613 | 1.000 | 0.351 | 0.076 | 0.590 |
| | Sb | −0.039 | 0.111 | −0.294 | 0.076 | 0.684 | 1.000 | 0.394 |
| | Rsc | 0.339 | 0.601 | 0.215 | 0.590 | 0.839 | 0.394 | 1.000 |
| Pearson | Vrc | 1.000 | 0.697 | 0.838 | 0.656 | 0.105 | −0.050 | 0.360 |
| | Asc | 0.697 | 1.000 | 0.671 | 0.931 | 0.345 | 0.137 | 0.494 |
| | Aws | 0.838 | 0.671 | 1.000 | 0.668 | −0.002 | −0.153 | 0.308 |
| | Ssc | 0.105 | 0.345 | −0.002 | 0.335 | 1.000 | 0.662 | 0.811 |
| | Lsc | 0.656 | 0.931 | 0.668 | 1.000 | 0.335 | 0.113 | 0.525 |
| | Sb | −0.050 | 0.137 | −0.153 | 0.113 | 0.662 | 1.000 | 0.314 |
| | Rsc | 0.360 | 0.494 | 0.308 | 0.525 | 0.811 | 0.314 | 1.000 |

Note: Bold values indicate a significant correlation at the confidence level of 0.01.

To demonstrate the correlation between each influential factor and reservoir capacity, scatter plots and the correlation coefficient R2 are shown in Figure 7. Figure 7 shows that Asc, Aws, and Lsc are the three factors most strongly correlated with the reservoir capacity, with R2 values of 0.486, 0.705, and 0.43, respectively, all of which are positive correlations. In addition, nonlinear correlations between the reservoir capacity and Ssc, Sb, and Rsc are also tested.

Figure 7

Scatter plot of six factors and storage capacity, respectively.

Accuracy and sensitivity analysis of the estimation model

To verify the accuracy of the model and the sensitivity of the influential factors, three machine learning models and eight combinations of influential factors are used for experiments. Three models (i.e., ANN, ELM, and PSO–ELM) were used to illustrate the accuracy of the proposed model and eight combinations of factors were applied to demonstrate the sensitivity of influential factors. Accuracy refers to the differences between the reference reservoir capacity and the estimated reservoir capacity under the same combination of influential factors. Sensitivity refers to the differences between the reference reservoir capacity and the estimated reservoir capacity by the same model under different combinations of influential factors. The differences were tested by the mean absolute percentage error (MAPE) (Khair et al. 2017) and correlation coefficient R2.
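
The two evaluation measures can be written out as a short Python sketch. MAPE follows its standard definition; R2 is computed here as the coefficient of determination (one minus the ratio of residual to total sum of squares), which is an assumption, since the text does not state which definition of R2 was used. The sample values are hypothetical.

```python
def mape(measured, estimated):
    """Mean absolute percentage error, in percent."""
    n = len(measured)
    return 100.0 * sum(abs((m - e) / m) for m, e in zip(measured, estimated)) / n


def r_squared(measured, estimated):
    """Coefficient of determination (assumed definition of R2)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical capacities (million m3): each estimate is off by 10%.
measured = [1.0, 2.0, 4.0]
estimated = [1.1, 1.8, 4.4]
print(mape(measured, estimated), r_squared(measured, estimated))
```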

Asc, Lsc, and Aws were regarded as fixed factors because they were significantly correlated with the reservoir capacity. Ssc, Sb, and Rsc were chosen as optional factors because they are insignificantly correlated with the reservoir capacity. Different simulation scenarios are obtained by the combination of the fixed factors and the optional factors. The eight combination scenarios are shown in Table 2.

Table 2

Eight combination scenarios of the six factors

| Scenario | Fixed factors | Optional factors |
|---|---|---|
| 1 | Asc, Lsc, Aws | |
| 2 | Asc, Lsc, Aws | Ssc |
| 3 | Asc, Lsc, Aws | Sb |
| 4 | Asc, Lsc, Aws | Rsc |
| 5 | Asc, Lsc, Aws | Ssc, Sb |
| 6 | Asc, Lsc, Aws | Ssc, Rsc |
| 7 | Asc, Lsc, Aws | Sb, Rsc |
| 8 | Asc, Lsc, Aws | Ssc, Sb, Rsc |
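
The eight scenarios are simply the three fixed factors combined with every subset of the three optional factors. A minimal Python sketch of this enumeration follows; the variable names are illustrative.

```python
from itertools import combinations

FIXED = ("Asc", "Lsc", "Aws")     # factors kept in every scenario
OPTIONAL = ("Ssc", "Sb", "Rsc")   # factors added in all possible subsets

# Subsets of size 0, 1, 2, and 3 give 1 + 3 + 3 + 1 = 8 scenarios.
scenarios = [FIXED + extra
             for r in range(len(OPTIONAL) + 1)
             for extra in combinations(OPTIONAL, r)]

for number, factors in enumerate(scenarios, start=1):
    print(number, factors)
```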

ELM, ANN, and PSO–ELM models were established with different scenarios. The 101 small reservoirs with known information about their capacity were used to train and calibrate these models, of which 80% served as the training samples and 20% were the calibration samples. After calibration of the models, the calibration results of the three models are obtained under different scenarios, as shown in Table 3.
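
A minimal sketch of such an 80/20 split is shown below. The text does not say how the samples were assigned, so the random shuffling, the seed, and the rounding of 80.8 to 81 training samples are all assumptions, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and calibration sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


# 101 reservoirs with known capacity, as in the text.
train_idx, calib_idx = split_indices(101)
print(len(train_idx), len(calib_idx))
```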

Table 3

The calibration results of the three models under different combinations of influential factors

| Scenario | ANN MAPE | ANN R2 | ELM MAPE | ELM R2 | PSO–ELM MAPE | PSO–ELM R2 |
|---|---|---|---|---|---|---|
| 1 | 80.07% | 0.1853 | 29.12% | 0.7357 | 24.46% | 0.9082 |
| 2 | 39.56% | 0.5814 | 25.87% | 0.4473 | 13.22% | 0.9612 |
| 3 | 32.10% | 0.8305 | 32.76% | 0.7706 | 16.26% | 0.9506 |
| 4 | 48.53% | 0.5117 | 27.77% | 0.3621 | 16.02% | 0.9545 |
| 5 | 35.75% | 0.8051 | 19.27% | 0.7959 | 9.82% | 0.9685 |
| 6 | 38.74% | 0.6805 | 20.20% | 0.8521 | 10.69% | 0.9778 |
| 7 | 36.91% | 0.5470 | 31.39% | 0.2828 | 15.73% | 0.9659 |
| 8 | 32.06% | 0.6904 | 24.33% | 0.7917 | 10.59% | 0.9699 |

According to Table 3 and Figure 7, the MAPE of PSO–ELM is smaller than that of the other two models (ANN and ELM) for eight combination scenarios. The value of R2 of PSO–ELM is better than that of the other two models for all scenarios. Therefore, compared with the ANN and ELM, the PSO–ELM-based estimation model is more suitable for estimating the small reservoir capacity in hilly areas.

Furthermore, the sensitivity of the influential factors for the PSO–ELM is tested under the different scenarios. The MAPE of scenario 1 is the largest (24.46%), giving the worst estimation, which illustrates that the fixed factors alone cannot accurately estimate the reservoir capacity. Although there is no significant linear relationship between Ssc, Sb, or Rsc and reservoir capacity, these factors are very important for estimating the small reservoir capacity. After adding one optional factor (scenarios 2, 3, and 4), both the MAPE and R2 are better than in scenario 1. Likewise, after adding two optional factors (scenarios 5, 6, and 7), the MAPE and R2 are generally better than in scenarios 2, 3, and 4. However, when all three optional factors are added (scenario 8), the MAPE is better than in scenarios 6 and 7 but worse than in scenario 5, and the value of R2 is better than in scenarios 5 and 7 but worse than in scenario 6. This illustrates that blindly increasing the number of influential factors does not keep improving the accuracy of the model.

In fact, scenarios 5, 6, and 8 can all serve as good factor combinations; however, the MAPE of scenario 6 is slightly poorer, while scenario 8 requires an additional factor. These two can be used as backup schemes. Scenario 5 offers the best balance between the number of factors and the accuracy of the model. Therefore, the factors of scenario 5 (Asc, Lsc, Aws, Ssc, and Sb) are selected as the influential factors to establish the estimation model based on PSO–ELM to estimate the capacity of small reservoirs in hilly areas.

To demonstrate the improvements intuitively, the differences between the estimated capacity and the actual capacity for the three models under the eight combination scenarios of influential factors were plotted as scatter graphs, as shown in Figure 8.

Figure 8

Comparison of the test results for the ANN, ELM, and PSO–ELM under eight scenarios.

Validation of estimation results

To further validate the robustness of the reservoir capacity estimation model established by PSO–ELM, the datasets of 12 known small reservoirs in the Mihe River basin were deleted from the training samples. Only the data of the 89 small reservoirs in the reference basin are used for training and calibration, and the 12 known reservoirs in the Mihe River basin are used to estimate and test. The results are shown in Figure 9. Figure 9 shows that even if the training samples do not contain any known information about small reservoirs in the target basin (the Mihe River basin), the MAPE increased to 15.6% and the value of R2 is 0.6863, which are still acceptable for the estimation model. This implies the robustness of the reservoir capacity estimation model established based on the five influential factors and PSO–ELM in this paper. Hence, the capacity of small reservoirs in areas without sufficient data can be estimated by transplanting the estimation model of basins with similar topography, thereby enlarging the application range of the proposed model.

Figure 9

Validation results of known small reservoirs in watersheds.

Estimation results of the capacity of small reservoirs

A total of 123 small reservoirs were identified in the study area, as shown in Figure 3. Among them, the capacity of 12 reservoirs is known and the capacity of the remaining 111 small reservoirs needs to be estimated. The five influential factors xij of these 111 small reservoirs were substituted into the trained estimation model to calculate the reservoir capacity yj, as shown in Figure 10. The estimated capacity of the 111 small reservoirs ranges from 0.54 to 3.80 million m³, which falls within the range that defines a small reservoir in this paper (0.1–10 million m³), so the estimated results are plausible.
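The range check described above can be written as a short sketch. The 0.1–10 million m³ bounds come from the text; the function name and the list-based input are assumptions for illustration. Passing this check is a necessary sanity condition only and does not by itself confirm the accuracy of an individual estimate.

```python
# Capacity range that defines a small reservoir in this paper (million m3).
SMALL_MIN = 0.1
SMALL_MAX = 10.0


def out_of_range(capacities, low=SMALL_MIN, high=SMALL_MAX):
    """Return (index, value) pairs for estimates outside [low, high]."""
    return [(i, c) for i, c in enumerate(capacities) if not low <= c <= high]
```

An empty result for the 111 estimates would mean that every estimated capacity lies inside the small-reservoir range.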

Figure 10

The results of influential factors and the estimated reservoirs’ capacity in the study area.

The acquisition of hydrological data in areas without sufficient monitoring has always been a hot and difficult topic in hydrological research. Based on remote sensing images and DEM data, this paper estimated the capacity of small reservoirs without sufficient data in hilly areas.

Five topographic and geomorphic factors, namely the area of the sub-catchment controlled by the reservoir (Asc), the water surface area of the reservoir (Aws), the longest flow path of the sub-catchment (Lsc), the average slope of the sub-catchment (Ssc), and the average slope of the buffer area (Sb), are selected as influential factors to establish the estimation model of reservoir capacity in the hilly area. Among the factors that can currently be extracted, this combination yields the best simulation results.

Compared with ANN and ELM, PSO–ELM gives considerably better estimates of reservoir capacity in hilly areas. Moreover, because the PSO–ELM estimation model is robust, a model established in a reference basin with monitoring data can be transferred to a target basin with similar topography that lacks monitoring data.

In addition, the proposed method is applicable to estimating the capacity of small reservoirs in hilly areas with a water surface area greater than 0.002 km², while its performance for other types of reservoirs has not been studied. In the next step, field surveys will be carried out on small reservoirs without sufficient data, and the simulation results will be compared with the measured results.
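For readers who want a concrete picture of the PSO–ELM idea, the following is a minimal standard-library sketch and not the authors' implementation. It assumes a single sigmoid hidden layer, inputs scaled to [0, 1], a small ridge term for numerical stability, training RMSE as the PSO fitness, and common textbook PSO coefficients; all function names and parameter values are hypothetical. PSO searches the input weights and biases, while the output weights are solved analytically as in a basic ELM.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def hidden_layer(X, params, n_hidden):
    """Sigmoid hidden outputs; params holds input weights, then biases."""
    n_in = len(X[0])
    H = []
    for x in X:
        row = []
        for j in range(n_hidden):
            z = params[n_in * n_hidden + j]
            z += sum(x[i] * params[j * n_in + i] for i in range(n_in))
            row.append(1.0 / (1.0 + math.exp(-z)))
        H.append(row)
    return H


def elm_rmse(params, X, y, n_hidden, ridge=1e-6):
    """ELM step: solve (H^T H + ridge I) beta = H^T y, return training RMSE."""
    H = hidden_layer(X, params, n_hidden)
    A = [[sum(h[a] * h[c] for h in H) + (ridge if a == c else 0.0)
          for c in range(n_hidden)] for a in range(n_hidden)]
    rhs = [sum(h[a] * t for h, t in zip(H, y)) for a in range(n_hidden)]
    beta = solve_linear(A, rhs)
    pred = [sum(h[j] * beta[j] for j in range(n_hidden)) for h in H]
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))


def pso_elm(X, y, n_hidden=4, n_particles=10, n_iter=20, seed=0):
    """Search input weights and biases with PSO; return best params and history."""
    rng = random.Random(seed)
    dim = (len(X[0]) + 1) * n_hidden
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [elm_rmse(p, X, y, n_hidden) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Keep weights in [-1, 1] so the sigmoid argument stays bounded.
                pos[i][d] = max(-1.0, min(1.0, pos[i][d] + vel[i][d]))
            fit = elm_rmse(pos[i], X, y, n_hidden)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, history
```

Here `X` would hold one row of the five scaled factors per reservoir and `y` the measured capacities. The returned `history` records the best fitness after each iteration and never increases, because the global best is replaced only by a strictly better particle. A production version would more likely use NumPy and evaluate fitness on a validation subset rather than on the training data.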

Financial support for this work is provided by the National Natural Science Foundation of Shandong Province (No. ZR2021QE009) and the science and technology projects of the Hydrology Center of Shandong Province: Impact of Rainstorm on Water Resources Management (No. SDYD2020-425).

All relevant data are included in the paper and the Supplementary Material.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
