Rainfall has a dominant role in rainfall-runoff models, with the performance of these models depending on the data accuracy and on the way that rainfall is spatially allocated. The research proposes a methodological framework in which a genetic algorithm (GA)-based method responsible for the spatial distribution of gauge observations at the basin scale is coupled with the HEC-HMS hydrological model to produce simulated discharges of high accuracy. The custom-developed GA is used to divide a 2D space, adhering to specific criteria, into polygonal geometries that represent gauge zones of influence, similar to the Thiessen polygon method concept. A collection of vector polygonal areas, equal in number to the employed monitoring stations, is produced, and the resulting areal weights are used to distribute the rainfall across the case study basin and subsequently to force the hydrological simulations. The generated gauge weights are validated for a different precipitation event. The final outputs, expressed through a series of statistical measures, demonstrate the effectiveness of the methodology (e.g. R2 and Nash–Sutcliffe efficiency are larger than 0.83 and 0.73, respectively). The methodology can foster accurate hydrological simulations, especially in cases where there is a limited number of rainfall stations and corresponding observations.

  • Geometric-based rain distribution introduces uncertainties in hydrological models.

  • Genetic algorithms (GAs) can produce alternate rain distribution shapes.

  • Automated coupling of GA with the hydrological model at a well-defined case basin.

  • Validation of the GA gauge weight method performance at different temporal events.

  • New perspectives in watershed's hydrological modeling with limited gauges.

Emerging alternative data products, such as precipitation products, are a response to the lack of dense monitoring networks and derived observations (Kidd et al. 2017) that are consistently needed by the scientific community and competent authorities, such as UNESCO's Intergovernmental Hydrological Programme (IHP). The non-traditionally captured rainfall data are usually derivatives of (i) remote sensing and/or terrestrial observation-based gridded datasets, e.g. the Integrated Multi-satellite Retrievals (IMERG) for the Global Precipitation Measurement (GPM) mission (Pradhan et al. 2022), (ii) reanalysis datasets, e.g. the National Center for Environmental Prediction–National Center for Atmospheric Research reanalysis (NCEP/NCAR) (Kanamitsu et al. 2002), or the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 data (Hersbach et al. 2020), and (iii) reprocessed datasets, e.g. the European gridded dataset (E-OBS) (Hofstra et al. 2008), with the latter dataset being developed with the use of spatial interpolation techniques (Mavromatis & Voulanas 2021). These datasets are broadly used in hydrological modeling (e.g. Skoulikaris et al. 2019; Probst & Mauser 2022).

Regardless of the source, rainfall is the most significant input variable in rainfall-runoff models; it can thus be established that it decisively controls the accuracy of hydrological simulations (Renard et al. 2010). The impacts of rainfall data errors and derived uncertainties on hydrological modeling are thoroughly investigated in the literature (e.g. Bárdossy & Das 2008). In summary, in gauge-based rainfall monitoring cases, these errors are related to the quality of the data itself (La Barbera et al. 2002), and/or to the mean areal precipitation usually used in non-event based simulations (Moulin et al. 2009), and/or to the spatial and temporal discretization of the gauge stations, i.e. the gauge network density (Skoulikaris et al. 2019; Wang et al. 2023), and/or to the accurate representation of rainfall variability (Xu & Singh 1998). Moreover, the scarcity of gauge coverage at the global scale (Mishra & Coulibaly 2009) and the consequent use of interpolation techniques to spatially distribute point source information at watershed scales impose additional input-based uncertainties on the modeling process (Moulin et al. 2009).

The most widely applied precipitation interpolation schemes are the Arithmetic Mean, Thiessen polygons, Inverse Distance Weighting (IDW), different types of Kriging, and the Spline method (Hohmann et al. 2021; Skoulikaris et al. 2022), and the literature contains numerous comparative studies of precipitation interpolation methods. For example, Liu et al. (2020a) concluded that a mix of methods may perform better than any single method, while the IDW method performed well but not in all the case study sites. Yang et al. (2015) also concluded that daily rainfall in an Australian case study was better represented by the IDW method than by other methods. Nikolopoulos et al. (2015) showed that for the assessment of debris flow, which relies on rainfall derivatives, the Thiessen polygon method works as well as more complex methods. Guo et al. (2022) proposed the use of a variation of the spline method when coupled with hydrological models in the Chaohe River basin, China. Recent advancements propose supplementing these well-established methods with a copula-based probability model that represents a multivariate uniform distribution, which examines the dependence between multiple variables. In other words, as supported by Li & Babovic (2019), copulas facilitate the isolation of joint or marginal probabilities for a pair of variables that are enmeshed in a more complex multivariate system, such as the rainfall-runoff hydrosystem. However, little evidence exists on the optimal interpolation method since it depends on the examined variable, the physiography and extent of the case study area, the quality and quantity of the available data, and the number and spatial distribution of the available data points (Ohmer et al. 2017). Little consensus is also shown when the temporal, and not only the spatial, component of the rainfall data is considered. For example, Skoulikaris et al. (2022), in their analysis of whether bias correction or spatiotemporal interpolation should go first in hydrologic simulations, suggested the use of the spatiotemporal kriging method, while Hussain et al. (2010) highlighted the use of the transformed hierarchical Bayesian method during monsoon periods in Pakistan.

Focusing on the Thiessen polygon method, it belongs to the so-called gauge weights precipitation methods and makes use of geometry rules to allocate point information, i.e. stations’ rainfall, to an area, i.e. to a basin's extent. Particularly, in this method, as thoroughly described by Croley & Hartmann (1985), all points are connected two by two with straight-line segments and perpendicular bisectors are drawn to these segments. Then, the perpendicular bisectors intersect and create as many polygons as the number of stations, covering the watershed of interest. The bisector-based geometric approach determines that any location within a Thiessen polygon is closer to its associated point than to any other point (which is why it is also known as the Nearest Neighbor (NN) method); thus, observations near one another in space are assumed more likely to be similar than those that are far apart. The latter is also known as the spatial autocorrelation approach, broadly used in geostatistics. The Thiessen polygon method is widely used in hydrologic modeling applications (e.g. Skoulikaris & Ganoulis 2012; Duraisekaran et al. 2021). Nonetheless, distance alone cannot serve as the sole surrogate measure of spatial correlation between data (Teegavarapu 2012). For instance, the variability of precipitation may be subject to topographic characteristics, including local relief peculiarities (Formetta et al. 2022). Furthermore, the deterministic geometric rules of gauge weights precipitation methods lead to a sharp change in rainfall at the defined geometric boundaries, as well as to its uniform distribution within these boundaries, i.e. an arbitrary, non-physically based consideration (Zhao et al. 2022).
To that end, and due to the complexity and non-linearity inherent in rainfall patterns, emerging techniques such as genetic algorithms (GAs) or artificial intelligence (AI) are considered modern approaches in the spatial allocation, as well as the forecasting, of rainfall (Liu et al. 2020b; Fooladi et al. 2023).
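The nearest-gauge rule behind the Thiessen polygon method described above can be illustrated with a short sketch. It approximates the gauge weights by assigning the cells of a regular grid to their closest gauge instead of constructing the bisectors explicitly. The function name `thiessen_weights`, the square test basin, and the gauge names and coordinates are hypothetical and do not come from the study.

```python
import numpy as np

def thiessen_weights(gauges, inside_basin, bounds, step=0.5):
    """Approximate Thiessen gauge weights on a regular grid.

    gauges       : dict mapping gauge name -> (x, y) coordinates
    inside_basin : function (x, y) -> bool, True for points in the basin
    bounds       : (xmin, ymin, xmax, ymax) box enclosing the basin
    step         : grid cell size, in the same units as the coordinates
    """
    xmin, ymin, xmax, ymax = bounds
    # Cell centres of a regular grid covering the bounding box.
    xs = np.arange(xmin + step / 2, xmax, step)
    ys = np.arange(ymin + step / 2, ymax, step)
    gx, gy = np.meshgrid(xs, ys)
    cells = np.column_stack([gx.ravel(), gy.ravel()])
    # Keep only the cells that fall inside the basin.
    keep = np.array([inside_basin(x, y) for x, y in cells])
    cells = cells[keep]
    names = list(gauges)
    coords = np.array([gauges[n] for n in names], dtype=float)
    # Squared distance from every cell to every gauge; the closest gauge wins.
    d2 = ((cells[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(names))
    return dict(zip(names, counts / counts.sum()))

# Hypothetical 10 x 10 square basin with two symmetric gauges.
weights = thiessen_weights(
    {"A": (2.5, 5.0), "B": (7.5, 5.0)},
    inside_basin=lambda x, y: True,
    bounds=(0.0, 0.0, 10.0, 10.0),
)
print(weights)
```

A gauge may lie outside the basin and still receive a weight, as long as some basin cells are closer to it than to any other gauge.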

GAs are a meta-heuristic method inspired by evolution, with a variety of applications in water resources topics (Maier et al. 2014). The literature demonstrates that GAs can be directly integrated into the modeling procedure, such as being used for the calibration of surface and groundwater hydrological models (e.g. Shafii & De Smedt 2009; Del Giudice & Padulano 2016; Kirlas & Nagkoulis 2023). GAs are also employed as optimizers, e.g. for the selection of aquifer recharge locations (Kourakos et al. 2023) or for selecting the number and location of pressure sensors (Soroush & Abedini 2019). Scholars have investigated the usefulness of GAs in precipitation forecasting (Salih et al. 2020), while GAs have been partially employed for the optimization of techniques or numerical models dedicated to rainfall distribution. For instance, Bărbulescu et al. (2021) used GAs to optimize the parameter β (usually set to 2) corresponding to the power affecting the weight in IDW, while Chang et al. (2005) implemented fuzzy theory alongside IDW to interpolate the precipitation, with GAs used to determine the fuzzy-related parameters.

The objective of the research is to introduce a novel rainfall distribution method designed to enhance hydrological modeling performance. To accomplish this goal, GAs are engaged in the areal segmentation of a particular river basin in terms of precipitation coverage, with the outputs driving a widely recognized hydrological model, specifically the Hydrologic Engineering Center–Hydrologic Modeling System (HEC-HMS) model, within a well-established pilot basin. The optimal simulations, achieved by comparing observed discharges with their corresponding simulated values during a designated calibration period, define the rainfall distribution geometry responsible for this optimal state. Subsequently, the suitability of the selected geometries is validated across different time periods providing very promising results. The proposed methodology represents a modern, innovative, and fully automated approach that enhances hydrological simulation outputs by optimizing the input data spatial allocation. It can be readily applied to any case study basin with recorded discharge measurements.

Case study area and hydrological model

The investigation focuses on the behavior of an alternative rainfall distribution method, as opposed to well-established gauge weight methods such as the Thiessen polygon method, when integrated with hydrologic modeling processes at the basin scale. The research was conducted in the demonstration watershed featured in the online tutorial and guide of HEC-HMS for the following reasons: (i) to minimize biases and uncertainties arising from hydrological model parameterization and focus solely on the influence of rainfall distribution on discharge, (ii) to ensure that the input data, e.g. precipitation and observed runoff, are reliable and free of monitoring errors, and (iii) to compare the derived outputs with outputs of high accuracy. The selection was made to provide an optimal environment for the study; by selecting a well-established case study area and hydrological model and modifying only the rainfall distribution scheme in each set of simulations based on the GA outputs (please refer to the next section for detailed information), it is considered that these criteria were effectively met.

The HEC-HMS hydrological model has been developed by the US Army Corps of Engineers (Feldman 2000) and is a well-established and routinely applied conceptual and lumped model at various spatiotemporal scales. It is used, among others, to analyze urban flooding, assess flood frequency, and perform hydrologic and reservoir simulations (Frysali et al. 2023; Gelete et al. 2023), while it can also be combined with artificial neural networks (ANNs) for streamflow simulations (Gunathilake et al. 2021). In brief, the simulation of a watershed's runoff is conducted with specific predefined methods representing the various components and processes of the hydrological cycle at the sub-basin scale, with precipitation being the main input variable. In particular, the so-called loss methods are responsible for computing the infiltration and the resulting runoff volume, the transform methods are used to represent the direct runoff, including overland flow and interflow, the baseflow methods assess the subsurface flow, and the routing methods calculate open channel flow (HEC-HMS User Manual 2016). Some of the methods provided in HEC-HMS are designed for simulating events, while others support continuous simulation.

Within the HEC-HMS tutorial, the model is applied in the Mahoning Creek watershed, which is one of the sub-basins of the Allegheny River basin in Pennsylvania. The Allegheny River basin, in turn, is a sub-basin of the Ohio River basin and the Mississippi Basin in the United States. The Mahoning Creek watershed covers an area of 407.44 km2, with the river's streamflow gauged at the basin's outlet at a 15 min time step, while the rainfall is observed by three rainfall stations (namely, DUJP, MFFP, and PNXP; Figure 1) at an hourly time step. Within the tutorial, the rainfall distribution is conducted with the use of the Thiessen gauge weight method and the streamflow simulation covers a 6-day precipitation event, i.e. from 28 April 1996 to 3 May 1996. The methods chosen for simulating infiltration losses, direct runoff, and baseflow are the ‘Initial and Constant’ method, the ‘Clark Unit Hydrograph’ method, and the ‘Linear Reservoir’ method, respectively. These methods are listed in Table 1, while no routing method is employed. The comparison between the simulated and observed discharges shows the high quality of the output data, as indicated by the statistical metrics of the model (coefficient of determination (R2) = 0.9924, Nash–Sutcliffe efficiency (NSE) = 0.9914, root mean square error (RMSE) = 0.0026 m3/s).
Table 1

Watershed processes methods and parameters used for the simulation of the case study basin

Process | Method | Parameter | Value
Loss | Initial and Constant | Initial deficit (mm) | 12.7
Loss | Initial and Constant | Maximum deficit (mm) | 203.2
Loss | Initial and Constant | Constant rate (mm/h) | 0.2794
Loss | Initial and Constant | Impervious (%) |
Transform | Clark Unit Hydrograph | Time of concentration (h) | 11
Transform | Clark Unit Hydrograph | Storage coefficient (h) | 31
Transform | Clark Unit Hydrograph | Time–area method | Default
Baseflow | Linear Reservoir | Layers |
Baseflow | Linear Reservoir | Initial (m3/s/km2) |
Baseflow | Linear Reservoir | Fraction | 0.5
Baseflow | Linear Reservoir | Coefficient (h) | 330 and 1,100 for each layer

Figure 1

Illustration of the Mahoning Creek watershed, part of the Ohio River basin (red polygon in left frames), which is a sub-basin of the Mississippi basin (black polygon in the upper left frame). The location of the hydrometeorological gauges used for triggering and validating the HEC-HMS hydrological model simulation is shown with stars and green dotted symbols, respectively.

In the current study, the HEC-HMS parameters and coefficients representing the various hydrological processes remain constant across all conducted simulations, with their values matching those used in the HEC-HMS tutorial (Table 1). While the tutorial employs U.S. customary units, e.g. rainfall and discharges are presented in inches and cubic feet per second, the outputs in this paper are presented in the International System of Units (SI). The case study data repository also includes precipitation and discharge data for the period of 10 April to 14 April 1994. Although these data are not utilized in the tutorial, they are employed to validate the proposed methodology, as described in the following section. Finally, the validation of the proposed methodology is conducted through well-established goodness-of-fit measures, i.e. the RMSE, NSE, and R2 (Krause et al. 2005; Chadalawada & Babovic 2019), plus the percent bias (%) index.
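As an illustration of the goodness-of-fit measures named above, the following sketch computes RMSE, NSE, R2, and percent bias from paired observed and simulated discharge series using their textbook definitions. The function name `goodness_of_fit` is hypothetical, and the sign convention of the percent bias (positive when the model overestimates) is an assumption; HEC-HMS computes its own statistics internally, and its exact formulation may differ.

```python
import numpy as np

def goodness_of_fit(obs, sim):
    """Return RMSE, NSE, percent bias and R2 for paired discharge series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs
    # Root mean square error, in the units of the discharge series.
    rmse = np.sqrt(np.mean(err ** 2))
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean.
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    # Percent bias; assumed convention: positive means overestimation.
    pbias = 100.0 * np.sum(err) / np.sum(obs)
    # Coefficient of determination as the squared Pearson correlation.
    r2 = np.corrcoef(obs, sim)[0, 1] ** 2
    return {"rmse": rmse, "nse": nse, "pbias": pbias, "r2": r2}

# Hypothetical series: the simulation overestimates every value by 1 unit.
print(goodness_of_fit([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

The toy series shows why several metrics are reported together: a constant offset leaves R2 at 1 while NSE and percent bias both reveal the systematic error.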

Genetic algorithm and objective function

The development of the various geometries attributing the best possible rainfall distribution, namely the optimization problem at the basin scale, was accomplished with the use of R programming language and GAs (Scrucca 2013). The sequential implementation approach begins with initially creating a set of random binary vectors (first generation of chromosomes) that represent possible solutions to the optimization problem. The created vectors are validated using an objective function (presented in the following paragraph) that verifies the accuracy of the solution. Thereafter, new vectors are created (next generation of chromosomes) and are tested again for their accuracy. The implemented loop process continues until a threshold is reached, with the threshold either being an accuracy limit, or a number of created generations. In this study, we performed 700 generations containing 50 chromosomes each. The crossover probability has been chosen to be 0.85 and the mutation probability equals 0.2. Elitism is used by keeping the five best chromosomes of each generation to the next generation without any transformation.
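The generational loop described above can be sketched as follows, using the reported settings (50 chromosomes, 700 generations, crossover probability 0.85, mutation probability 0.2, five elites) as defaults. This is a minimal Python illustration and not the R implementation used in the study: the function name `run_binary_ga` is hypothetical, and the choice of tournament selection, single-point crossover, and single-bit mutation per chromosome is an assumption, since the source does not state which operators were used.

```python
import numpy as np

def run_binary_ga(fitness, n_bits, pop_size=50, generations=700,
                  p_crossover=0.85, p_mutation=0.2, elitism=5, seed=0):
    """Minimise `fitness` over binary vectors of length `n_bits`."""
    rng = np.random.default_rng(seed)
    # First generation: random binary chromosomes.
    pop = rng.integers(0, 2, size=(pop_size, n_bits))

    def ranked(population):
        # Evaluate every chromosome and sort so that the best comes first.
        scores = np.array([fitness(c) for c in population], dtype=float)
        order = np.argsort(scores)
        return population[order], scores[order]

    for _ in range(generations):
        pop, _scores = ranked(pop)
        # Elitism: the best chromosomes pass on unchanged.
        children = [pop[i].copy() for i in range(elitism)]
        while len(children) < pop_size:
            # Binary tournaments on the ranked population (assumed operator):
            # the lower index is the fitter of two random candidates.
            i, j = rng.integers(0, pop_size, size=2)
            k, l = rng.integers(0, pop_size, size=2)
            parent_a, parent_b = pop[min(i, j)], pop[min(k, l)]
            if rng.random() < p_crossover:
                # Single-point crossover (assumed operator).
                cut = rng.integers(1, n_bits)
                child = np.concatenate([parent_a[:cut], parent_b[cut:]])
            else:
                child = parent_a.copy()
            if rng.random() < p_mutation:
                # Flip one randomly chosen bit (assumed operator).
                pos = rng.integers(0, n_bits)
                child[pos] = 1 - child[pos]
            children.append(child)
        pop = np.array(children)

    pop, scores = ranked(pop)
    return pop[0], scores[0]

# Toy objective standing in for the model run: count the zero bits.
n = 20
best, score = run_binary_ga(lambda c: n - int(c.sum()), n_bits=n, generations=30)
print(best, score)
```

In the study the fitness call would wrap a full HEC-HMS run, so the cost per generation is dominated by the 50 model evaluations rather than by the GA operators.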

The vectors’ creation derives from the designation of arbitrary points within and around the case study basin, using their X and Y values as variables. Subsequently, the points are connected to each other with the use of K-Nearest Neighbor Graphs (K-NNGs) (Eppstein et al. 1997). K-NNGs are the graphs that result from connecting each node to its k closest neighbors. In this paper, k = 1 is used, so each node is connected to its closest neighbor, which in practice creates a set of lines that divide the space into polygons. The binary chromosomes are transformed to decimal sets of $(x, y)$ coordinates to create the points. In this way, each point is connected to at least one other point, whereas the number of connections can increase when multiple points lie close to each other. In terms of the objective function, the GA aims to minimize the RMSE between the simulated and observed discharge time series. To this end, the GA outputs are passed into the hydrological model to generate the corresponding simulated runoff for each set of vectors. Schematically, each set of initial points yields an RMSE for the obtained solution. Hence, the objective of the optimization is the following:
$$\min_{P}\ \mathrm{RMSE}(P) = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(Q_{\mathrm{obs},i} - Q_{\mathrm{sim},i}(P)\right)^{2}}$$
where $P = \{p_1, \ldots, p_n\}$ is a set of initial Euclidean points with $p_j = (x_j, y_j)$, and RMSE, which is calculated automatically from the HEC-HMS model, is based on a number (m) of measured values $Q_{\mathrm{obs},i}$ and the simulated ones $Q_{\mathrm{sim},i}$. In order to identify a threshold of vector points needed to implement the methodology, 8, 10, and 12 points are used as the base for polygon construction through the K-NNGs application. The chromosomes’ length is a result of the number of points and the discretization of the area (approximately 50 m × 50 m). For example, for the case of 8 points, the chromosomes’ length is 252 bits. It is important to point out that the created polygons are embedded in the HEC-HMS simulation through the gauge weight method. The ratio between the area of each polygon and the overall area is used to assign a weight to the station, similar to the Thiessen method.
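The two steps described above, decoding a binary chromosome into point coordinates and linking each point to its closest neighbor (K-NNG with k = 1), can be sketched as below. The function names, the fixed number of bits per coordinate, and the linear mapping onto the bounding box are illustrative assumptions; the source only states the resulting chromosome length and the approximate 50 m × 50 m discretization. How the resulting line segments are closed into polygons is not detailed in the text and is not attempted here.

```python
import numpy as np

def decode_points(chromosome, n_points, bounds, bits_per_coord):
    """Map a binary chromosome to n_points (x, y) pairs inside `bounds`.

    Assumed layout: for each point, bits_per_coord bits for x followed by
    bits_per_coord bits for y, most significant bit first.
    """
    xmin, ymin, xmax, ymax = bounds
    bits = np.asarray(chromosome).reshape(n_points, 2, bits_per_coord)
    powers = 2 ** np.arange(bits_per_coord - 1, -1, -1)
    ints = bits @ powers                      # shape (n_points, 2)
    frac = ints / (2 ** bits_per_coord - 1)   # scaled to [0, 1]
    x = xmin + frac[:, 0] * (xmax - xmin)
    y = ymin + frac[:, 1] * (ymax - ymin)
    return np.column_stack([x, y])

def nearest_neighbour_edges(points):
    """Return the undirected edges of the 1-nearest-neighbour graph."""
    pts = np.asarray(points, dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)              # a point is not its own neighbour
    nearest = d2.argmin(axis=1)
    # Mutual nearest neighbours collapse into a single edge.
    return {tuple(sorted((i, int(j)))) for i, j in enumerate(nearest)}

# Hypothetical example: two tight pairs of points give two edges.
edges = nearest_neighbour_edges([(0, 0), (1, 0), (5, 0), (5.5, 0)])
print(sorted(edges))
```

Because every point contributes one outgoing link, each point ends up with at least one connection, and clustered points can accumulate several, which matches the behavior described in the text.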
To sum up, the methodological steps are presented in the flowchart of Figure 2. By setting a basin's extent and the coordinates of existing meteorological gauges, GAs create a number of points which when fulfilling an areal proximity criterion are converted to polygons. The created polygons after satisfying specific criteria are used for calculating the gauge weights for the rainfall distribution. The latter is automatically inserted within the hydrological model and the simulated discharges are compared with the observed ones with the use of the RMSE statistical metric. When the number of generations is reached, 700 generations in our case, the optimum solutions are saved and the corresponding geometries are converted to shapefiles, which is the most common format used in geographic information systems tools. To guide the algorithm to the optimum solution, three penalties are introduced:
  • Penalty A: If at least one of the generated points falls outside the designated area, a high penalty is applied. The penalty increases further if multiple points are situated outside the designated area. This particular penalty encourages the GA to ensure that all points remain within the examined area, facilitating the implementation of K-NNGs and the subsequent creation of polygons.

  • Penalty B: If the number of created polygons does not match the number of stations, a moderate penalty is applied. A greater disparity between the polygon number and the station number leads to an increased penalty.

  • Penalty C: If a polygon intersects with more than one station or less than one station, a penalty is applied. Each polygon should encompass exactly one station, similar to the Thiessen polygons concept. In the case that this criterion is not met a low penalty is given. The penalty decreases as the number of polygons containing precisely one station increases.

Figure 2

Schematic representation of processes flow within the proposed methodological modeling approach for the assessment of geometrical shapes representing the best rainfall distribution used in hydrological modeling.

In all three cases, the fitness value guides GA in identifying distinct solutions that satisfy the problem's criteria. Once these criteria are met, the fitness value becomes equivalent to the RMSE, with the aim of minimizing it.
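The way the three penalties and the RMSE combine into a single fitness value can be sketched as a tiered function. The function name, the argument layout, and the penalty magnitudes (`high`, `moderate`, `low`) are hypothetical placeholders chosen only to respect the ordering described in the text (high for Penalty A, moderate for B, low for C); the actual values used in the study are not reported.

```python
def penalised_fitness(n_points_outside, n_polygons, stations_per_polygon,
                      n_stations, rmse_fn,
                      high=1e6, moderate=1e4, low=1e2):
    """Fitness to be minimised by the GA (illustrative magnitudes only).

    n_points_outside     : generated points outside the designated boundary
    n_polygons           : number of polygons created from the points
    stations_per_polygon : list with the station count inside each polygon
    n_stations           : number of rainfall stations
    rmse_fn              : callable that runs the model and returns the RMSE
    """
    # Penalty A: grows with the number of points outside the boundary.
    if n_points_outside > 0:
        return high * (1 + n_points_outside)
    # Penalty B: grows with the polygon/station count mismatch.
    if n_polygons != n_stations:
        return moderate * (1 + abs(n_polygons - n_stations))
    # Penalty C: shrinks as more polygons contain exactly one station.
    bad = sum(1 for k in stations_per_polygon if k != 1)
    if bad > 0:
        return low * (1 + bad)
    # All criteria met: the fitness is the RMSE itself.
    return rmse_fn()

# Hypothetical call: three stations, three polygons, one station each.
print(penalised_fitness(0, 3, [1, 1, 1], 3, rmse_fn=lambda: 0.5))
```

Evaluating the hydrological model only in the last branch is a design choice of this sketch: infeasible geometries are rejected cheaply, and the placeholder magnitudes must stay well above any attainable RMSE for the tiers to remain ordered.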

Rainfall datasets

The default simulation outputs, i.e. those from the tutorial, for the Mahoning Creek watershed concern a precipitation event that occurred from 28 April 1996 to 3 May 1996, which serves as the calibration period for the methodology; thus, the specific event, both in terms of duration and data, is used as the algorithm's training period. Equally, the validation period of the GA is a storm event that took place from 10 April 1994 to 14 April 1994. The hourly recorded rainfalls for both time periods and for the three gauge stations are presented in the charts of Figure 3.
Figure 3

Recorded rainfalls in all the gauge stations (namely, DUJP, MFFP, and PNXP) of the case basin used for the calibration (28 April 1996–3 May 1996) and validation (10 April 1994–14 April 1994) of the utilized GA.

Examining the data for the calibration period (Figure 3(a)), it is observed that the DUJP station recorded the maximum rainfall of 10.67 mm on 30 April 1996. On the same day, MFFP and PNXP also recorded substantial rainfall, with both stations measuring 9.1 mm of rainfall height. Regarding the MFFP station, its maximum rainfall height of 9.4 mm was recorded one day earlier. The total recorded rainfall during the calibration period is 146.1 mm, with the DUJP, MFFP, and PNXP stations recording 44.7, 52.3, and 49.1 mm, respectively. On the other hand, during the validation period of April 1994, the PNXP station recorded the highest rainfall of 14.98 mm on 12 April. On the same day, the DUJP and MFFP stations recorded rainfall amounts that were 52.6 and 75.9% smaller than the maximum, respectively. Over the 5-day storm event in this validation period, the DUJP, MFFP, and PNXP stations recorded rainfall totals of 66.0, 28.5, and 84.6 mm, respectively, summing up to 179.1 mm in total. Comparatively, the cumulative rainfall during the 5-day storm event in the validation period is 18.5% greater than that of the 6-day event in the calibration period.

From a spatial perspective, during the calibration period, the MFFP station recorded the highest total precipitation, with the DUJP and PNXP stations registering approximately 14.5 and 6.1% less rainfall than MFFP, respectively. In contrast, during the validation period, the PNXP station, which is the only one found within the catchment, recorded the highest total, with DUJP and MFFP registering approximately 21.9 and 66.3% less rainfall than PNXP, respectively. This highlights the randomness of the rainfall pattern in the area of interest and the non-stationarity of the utilized datasets.

Compliance to the penalties

The implementation of the custom-developed GA commences with generating an initial set of random point vectors, which set the base for the polygons’ designation. To confine the potential locations of these points in terms of distance from the case study basin, an artificial orthogonal boundary ranging from 10 to 25 km around the watershed's perimeter is drawn, as shown in Figure 4(a). Thereafter, the algorithm starts creating the point vectors. In the current research, we request the algorithm to initially generate 8, 10, and 12 vector points while considering the specified penalties, particularly Penalty A, which mandates that the created points must fall within the designated boundary. For instance, Figure 4(b) shows a case where two randomly generated points (indicated by red pins) fall outside the predetermined boundaries, while the points within the designated area are represented by black pins; consequently, this particular set of points gets a high penalty, making it very unlikely to be considered further. It should be mentioned that, for the sake of simplicity, Figure 4 contains only the outputs obtained with 8 points, and the further description of the algorithm likewise considers the use of 8 points.
Figure 4

Illustration of the case study basin where (a) an outer polygonal boundary (black dotted curve) is artificially designated for the spatial control of the GA application area, (b) a set of 8 random point vectors (black pins) has been generated with two of them (gray pins) being outside the predetermined boundaries (namely, Penalty A), (c) the automatically created polygons exceed the number of the rainfall stations (namely, Penalty B), and (d) a one-to-one correspondence between stations and polygons occurs, but one polygon does not intersect with a station (namely, Penalty C).

Once the point sets that fall within the orthogonal boundary of the case study area are defined, the algorithm proceeds to generate K-NNGs for dividing the orthogonal area into polygons. These generated polygons are then evaluated for compliance with the constraints outlined by the other penalties. Starting with Penalty B, where achieving a one-to-one correspondence between stations and polygons is the ideal scenario, the algorithm needs to explore numerous combinations to meet these requirements. Figure 4(c), for example, illustrates a case where the number of automatically drawn polygons is twice that of the stations, i.e. six polygons vs. three stations. The polygon sets that satisfy the one-to-one rule are then examined for conformity with Penalty C, which involves validating whether each station intersects with exactly one polygon. In alignment with Penalty C, Figure 4(d) shows a generated scenario where a low penalty is assigned since, despite having three stations and exactly three polygons, one polygon does not encompass any station.

Finally, all the sets of polygons that were produced from 8, 10, or 12 points and successfully met the custom criteria are further evaluated for their effectiveness in spatially distributing rainfall compared to the Thiessen polygon method, as detailed in the following sections.

GA spatial outputs

The spatial entities (polygons) are represented numerically through their gauge weights. The default weights used in the tutorial (namely, Thiessen polygons), as well as the best ones determined after implementing the GA starting with 8, 10, and 12 vector points (hereinafter named GA – 8 points, GA – 10 points, and GA – 12 points, respectively), are given in Table 2. As observed, in the case of the GA – 12 points method, the distribution of rainfall follows the spatial pattern of the Thiessen polygon method. This means that in both methods the PNXP station has the greatest influence within the catchment, covering 71.6 and 57.0% of the catchment area, respectively. In the case of the GA – 8 points method, the PNXP station has less influence, while each of the other two stations, DUJP and MFFP, covers approximately 36.8% of the watershed. Notably, in the case of the GA – 10 points method, the PNXP station dominates over the other two stations, with a spatial coverage of 99.5%; in other words, the rainfall from the other two stations has a negligible impact on the river hydrology and corresponding hydrological simulations.

Table 2

Gauge weights for rainfall distribution, categorized by each method, as used in the HEC-HMS model during the calibration period from 28 April 1996 to 3 May 1996

Gauge name | Thiessen polygons | GA – 8 points | GA – 10 points | GA – 12 points
DUJP | 0.12 | 0.369 | 0.004 | 0.125
MFFP | 0.31 | 0.368 | 0.001 | 0.159
PNXP | 0.57 | 0.263 | 0.995 | 0.716
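
To make the role of the gauge weights concrete, the sketch below applies the weights of Table 2 to the station totals reported for the calibration period (44.7, 52.3, and 49.1 mm for DUJP, MFFP, and PNXP) as a weighted sum. This is only an illustration of the gauge weight arithmetic; in the study the weights are applied by HEC-HMS to the hourly series, and the function name `basin_rainfall` is hypothetical.

```python
def basin_rainfall(weights, gauge_rainfall):
    """Basin-average rainfall as the weighted sum of gauge values."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"gauge weights must sum to 1, got {total}")
    return sum(weights[g] * gauge_rainfall[g] for g in weights)

# Gauge weights per method, copied from Table 2.
methods = {
    "Thiessen polygons": {"DUJP": 0.12, "MFFP": 0.31, "PNXP": 0.57},
    "GA - 8 points": {"DUJP": 0.369, "MFFP": 0.368, "PNXP": 0.263},
    "GA - 10 points": {"DUJP": 0.004, "MFFP": 0.001, "PNXP": 0.995},
    "GA - 12 points": {"DUJP": 0.125, "MFFP": 0.159, "PNXP": 0.716},
}
# Station totals (mm) for the calibration event, as reported in the text.
calibration_totals = {"DUJP": 44.7, "MFFP": 52.3, "PNXP": 49.1}

for name, weights in methods.items():
    print(name, round(basin_rainfall(weights, calibration_totals), 2))
```

Because the three station totals are close to each other in this event, very different weight sets yield similar basin totals; differences between methods arise mainly from the timing of the hourly rainfall at each gauge.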

The spatial influence of each rainfall station on the precipitation distribution over the case study catchment is illustrated for each implemented method in Figure 5. In the case of the Thiessen polygon method (Figure 5(a)), the coverage exhibits the expected and well-known spatial pattern associated with this specific method. Moving on to the GA – 8 points method (Figure 5(b)), the algorithm generates a shape comprising three polygons that are more peculiar, in terms of geometry, than the Thiessen polygons output. However, this new shape distributes the rainfall more evenly, i.e. there is not much difference between the coverage percentages. As far as the GA – 10 points method is concerned (Figure 5(c)), the algorithm predominantly relies on a single station (accounting for 99.5% of coverage), which is also the only station located within the catchment, while the rainfall data from the two other stations are utilized for very small areas along the catchment boundaries. In Figure 5(c), the areas corresponding to the less used stations, i.e. DUJP and MFFP, are highlighted with orange and red polygons, respectively. Finally, for the GA – 12 points method (Figure 5(d)), the algorithm's mapping output results in three distinct polygons, following a pattern where stations closer to the basin have a larger impact on coverage, as shown by the coverage percentages and the location of the stations.
Figure 5

Spatial coverage representation and relevant percentage (over the watershed's extent) of the rainfall measured in three stations for the (a) Thiessen polygons method, (b) GA – 8 points method, (c) GA – 10 points method, and (d) GA – 12 points method.

Calibration and validation of methodology, and sensitivity analysis of the number of utilized vector points

For the calibration period, after the gauge weight coefficients of Table 2 were integrated into the hydrological model, the model produced accurate results in all cases, as illustrated in Figure 6 and detailed in the statistical metrics of Table 3. In all simulation sets, the discharge starts to increase gradually from the baseflow of 8.9 m³/s almost 30 h after the beginning of the simulations. Subsequently, the simulated discharges of each method follow a pattern similar to the observed discharges (light blue curve of Figure 6), with all simulation sets successfully replicating the discharge peak of approximately 81.5 m³/s.
Table 3

Comparison between observed river discharges and the simulated discharges generated through various gauge weight methods used for rainfall distribution over the catchment area during the calibration period from 28 April 1996 to 3 May 1996

| Gauge weight method | RMSE (m³/s) | Nash–Sutcliffe | Percent bias (%) | R² |
| --- | --- | --- | --- | --- |
| Thiessen polygons | 0.00262 | 0.991 | 2.021 | 0.992 |
| GA – 8 points | 0.00321 | 0.987 | 0.583 | 0.987 |
| GA – 10 points | 0.00253 | 0.992 | 0.850 | 0.993 |
| GA – 12 points | 0.00253 | 0.992 | 0.846 | 0.993 |
Figure 6

Graphical comparison of the observed discharges (light blue curve) with the simulated ones obtained using a GA with 12 vector points (orange curve with orange markers), a GA of 10 vector points (thin gray curve), a GA of 8 vector points (black dotted curve), and discharges resulting from the application of the Thiessen polygons method (dark blue curve).

The high accuracy of all the implemented methods and the success of the model calibration are also clearly reflected in the goodness-of-fit measures presented in Table 3. When the Thiessen polygon method is considered as the reference simulation, the corresponding simulated discharges closely match the observed ones, as indicated by the statistical metrics (e.g. R² = 0.992 and Nash–Sutcliffe = 0.991). Similarly, the proposed GA methods meet the high standards set by the reference simulation. The hydrological simulations derived from the GA – 10 points and the GA – 12 points methods, in particular, are nearly identical, differing only negligibly in the percent bias (0.850 vs. 0.846%). For these two methods, the RMSE is slightly smaller than that of the reference method (0.00253 vs. 0.00262 m³/s), while the Nash–Sutcliffe and R² coefficients are slightly better than those of the Thiessen polygon method (0.992 vs. 0.991 and 0.993 vs. 0.992, respectively). Finally, the GA – 8 points method also provides excellent results, although its RMSE, Nash–Sutcliffe and R² values are marginally poorer than those of the reference method. Its R² and Nash–Sutcliffe coefficients are almost equal to unity, its percent bias is less than one (the smallest of all methods), and its RMSE is very close to zero.
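
For reference, the four goodness-of-fit measures reported in Table 3 can be computed as in the sketch below. The formulas are the commonly used definitions; the sign convention of the percent bias (negative when the simulation underestimates the observations) and the use of the squared Pearson correlation for R² are assumptions, and the discharge values in the example are hypothetical rather than the case study series.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def percent_bias(obs, sim):
    """Percent bias; negative when the simulation underestimates (assumed convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical discharges (m3/s): the simulation is exactly half the observation.
observed = [10.0, 20.0, 60.0, 80.0, 40.0, 15.0]
simulated = [0.5 * q for q in observed]

print(f"RMSE  = {rmse(observed, simulated):.3f} m3/s")
print(f"NSE   = {nash_sutcliffe(observed, simulated):.3f}")
print(f"PBIAS = {percent_bias(observed, simulated):.1f} %")
print(f"R2    = {r_squared(observed, simulated):.3f}")
```

The example is deliberately biased: because the simulated series is a scaled copy of the observed one, R² stays at unity while the Nash–Sutcliffe efficiency and the percent bias deteriorate. This is why R² alone is not sufficient to judge a simulation that systematically under- or overestimates the water volume.
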

Regarding the validation of the methods during the storm event from 10 April 1994 to 15 April 1994, the outputs show that all models fail to accurately simulate the initial baseflow, with the observed one (20.5 m³/s) being more than double the simulated one (8.9 m³/s). Consequently, while the curves representing the simulation outputs successfully replicate the streamflow pattern, they consistently underestimate the water volume, as clearly demonstrated in Figure 7. Specifically, the observed peak flow of 179.9 m³/s is best represented by the GA – 10 points method (155.0 m³/s), followed by the GA – 12 points method (133.8 m³/s) and the Thiessen polygon method (116.6 m³/s). Conversely, the GA – 8 points method provides the least satisfactory result of 85.1 m³/s. As can also be observed, the hydrograph's rising limb is better represented by the GA – 10 points method, followed by the GA – 12 points method, while the opposite holds for the falling limb, i.e. it is slightly better simulated by the GA – 12 points method than by the GA – 10 points method. Finally, in terms of timing, the observed peak flow occurs 4 h earlier than the simulated ones.
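
The peak flows quoted above can be expressed as relative errors with respect to the observed peak, which makes the ranking of the methods explicit. The short sketch below uses only the peak values stated in the text; the helper name is an illustrative assumption.

```python
OBSERVED_PEAK = 179.9  # observed peak flow of the validation event, m3/s

# Simulated peak flows (m3/s) as reported for the validation event.
SIMULATED_PEAKS = {
    "GA - 10 points": 155.0,
    "GA - 12 points": 133.8,
    "Thiessen polygons": 116.6,
    "GA - 8 points": 85.1,
}


def relative_peak_error(simulated, observed):
    """Relative peak error in percent; negative values mean underestimation."""
    return 100.0 * (simulated - observed) / observed


peak_errors = {
    method: relative_peak_error(peak, OBSERVED_PEAK)
    for method, peak in SIMULATED_PEAKS.items()
}

# List the methods from the smallest to the largest absolute peak error.
ranking = sorted(peak_errors, key=lambda method: abs(peak_errors[method]))
for method in ranking:
    print(f"{method}: {peak_errors[method]:+.1f} %")
```

All four errors are negative, which is consistent with the systematic underestimation of the water volume noted for the validation event.
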
Figure 7

Validation of the use of GAs for rainfall distribution: discharges simulated with the GA – 12 points (orange curve with orange markers), GA – 10 points (gray curve), and GA – 8 points (black dotted curve) methods compared against the discharge simulated with the Thiessen polygon method (dark blue curve) and the observed runoff (light blue curve).

The inability of the GA – 8 points method to accurately simulate the observed runoff is also evident in the statistical measures of Table 4. While the coefficient of determination (R²) is relatively high, at 0.710, the other measures are poor. For instance, the NSE is only 0.190 and the percent bias is −42.1%, indicating a rather poor performance of this particular point vector combination. On the other hand, the GA – 10 points and GA – 12 points methods not only yield relatively high R² coefficients (0.839 and 0.832, respectively) but also demonstrate very good NSE coefficients (0.799 and 0.722, respectively). These measures surpass those of the Thiessen polygon method, which serves as the reference method. For example, the NSE corresponding to the Thiessen polygon method is only 0.565.

Table 4

Validation of the methodology performance by comparing the simulated streamflow coming from the implemented rainfall distribution methods versus the observed streamflow during the validation period from 10 April 1994 to 14 April 1994

| Gauge weight method | RMSE (m³/s) | Nash–Sutcliffe | Percent bias (%) | R² |
| --- | --- | --- | --- | --- |
| Thiessen polygons | 0.018 | 0.565 | −30.152 | 0.822 |
| GA – 8 points | 0.026 | 0.190 | −42.143 | 0.710 |
| GA – 10 points | 0.013 | 0.799 | −6.113 | 0.839 |
| GA – 12 points | 0.015 | 0.722 | −19.807 | 0.832 |

In terms of computational resources, all simulations were conducted on a workstation with an i7 processor clocked at 2.8 GHz and 16 GB of memory. The total time required for the GA to create distinct polygons that adhere to the generation rules and overcome the penalties, and for executing the hydrologic simulations and generating the result matrices with the corresponding statistical measures, varied between 12 and 14 h per method.

Sensitivity analysis on vector points and model's performance

To assess the impact of the number of vector points used to delineate rainfall in the case basin, multiple tests with different numbers of starting point vectors were conducted to investigate the way the shape of the rainfall distribution affects the hydrological simulation outputs. After numerous iterations, it was observed that when the number of starting points is fewer than 5 or more than 13, the algorithm fails to provide solutions in the case study basin, i.e. it fails to satisfy the penalty criteria and create three polygons, one for each station. In particular, when the number of points is low (e.g. 5 points), the algorithm can rather easily satisfy the penalties and initiate the hydrological model with plenty of remaining generations, thereby achieving high performance; however, the limited number of points restricts the algorithm's freedom because fewer variables (points) are available for optimization. On the other hand, when the number of points is high (e.g. 13 points), the algorithm has ample freedom due to the availability of numerous variables (points). However, many generations are wasted as the algorithm attempts to create three polygons using 13 points and NNGs. Between the two thresholds, i.e. in the range from 5 to 13 points, there are solutions converging to similarly high performance levels, as illustrated in Figure 8 by the sensitivity analysis comparing the number of vector points with the statistical measures of observed and simulated discharges for both the calibration period (28 April 1996–3 May 1996) and the validation period (10 April 1994–14 April 1994).
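
The feasibility condition mentioned above, i.e. that the generated polygons are equal in number to the stations and that each polygon contains exactly one station, could be checked along the following lines. This is a minimal sketch and not the authors' implementation: the function names, the ray-casting point-in-polygon test, the penalty scoring, and all coordinates are illustrative assumptions.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def station_penalty(polygons, stations):
    """Return 0 for a feasible partition, a positive count of violations otherwise."""
    penalty = abs(len(polygons) - len(stations))
    for polygon in polygons:
        hits = sum(point_in_polygon(xy, polygon) for xy in stations.values())
        if hits != 1:
            penalty += 1
    return penalty


# Hypothetical geometry: three adjacent unit squares.
polygons = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(1, 0), (2, 0), (2, 1), (1, 1)],
    [(2, 0), (3, 0), (3, 1), (2, 1)],
]

# Hypothetical station coordinates (not the real gauge locations).
feasible = {"DUJP": (0.5, 0.5), "MFFP": (1.5, 0.5), "PNXP": (2.5, 0.5)}
infeasible = {"DUJP": (0.5, 0.5), "MFFP": (0.6, 0.4), "PNXP": (2.5, 0.5)}

print(station_penalty(polygons, feasible))
print(station_penalty(polygons, infeasible))
```

In a GA, a check of this kind would typically be evaluated before the hydrological model is run, so that infeasible geometries are penalized without spending a simulation on them.
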
Figure 8

Correlation between the vector points triggering the GA sequential process (horizontal axis) and the model's outputs expressed in Nash–Sutcliffe efficiency, coefficient of determination (R²) (left axis), and percent bias (right axis) for (a) the calibration (28 April 1996–3 May 1996) and (b) the validation (10 April 1994–14 April 1994) periods.

With regard to the calibration period, all the GA – XX points methods, with XX ranging from 5 to 13, achieve outputs of high accuracy, with the R² and NSE consistently exceeding 0.986 and 0.984, respectively (Figure 8(a)). From the GA – 9 points method onwards, the outputs stabilize around the same high values, while the fluctuations observed from the GA – 5 points method to the GA – 8 points method are negligible, as they concern only the third decimal place. Results of high accuracy are also obtained for the validation period (Figure 8(b)), where the R² consistently exceeds 0.7, except for the GA – 6 points method, and stabilizes at values greater than 0.8 as the number of utilized vector points increases. The NSE values also demonstrate a relatively good agreement (>0.6) between observed and simulated discharges when 9 or more vector points are used to force the simulation. Finally, the RMSE, which served as the objective function of the GA, was approximately 0.0027 m³/s during the calibration period (Figure 8(a)) and 0.019 m³/s during the validation period (Figure 8(b)), demonstrating the successful minimization of the objective function. The weights generated for each number of vector points, i.e. for each GA – XX points method, along with the resulting simulation outputs are given in the Supplementary Appendix.

Geometric deterministic approaches for analyzing the spatial distribution of rainfall data, which are used to force rainfall-runoff models, are commonly employed in hydrological simulation processes. These approaches often involve gauge weight techniques such as Thiessen polygons, which are routinely applied by various scholars, as exemplified in the introduction. The literature also demonstrates that GAs can be utilized to optimize techniques and numerical models dedicated to rainfall distribution (e.g. Bărbulescu et al. 2021), although their application remains somewhat limited. In the current research, GAs are used to determine rainfall spatial distribution geometries that optimize the performance of the HEC-HMS hydrological model by comparing the simulated runoff with the observed one. The optimization is therefore based on runoff information rather than on an a priori optimization of the rainfall distribution, which is typically conducted by selecting the method that responds best to independent rainfall gauges (Cheng et al. 2017; Lazoglou et al. 2019). The proposed methodology, along with the produced outputs, is considered the innovation of the research. The coupling of GAs with interpolation techniques for rainfall allocation and hydrological optimization is a contemporary theme that has been explored only to a limited extent. Additionally, the produced outputs highlight that there are multiple distinct areal combinations that yield results equal to or even better than those obtained with the Thiessen polygon method.

The selection of the specific pair ‘case study area’ (namely the Mahoning Creek watershed) and ‘hydrological model’ (namely HEC-HMS) is an intentional, strategic choice for the research. Firstly, HEC-HMS is a highly reputed model in the field of hydrological simulations, supported by a robust community of scholars and with numerous easy-to-access applications at various scales and places, as detailed in the Materials and Methods section (e.g. Frysali et al. 2023). Importantly, HEC-HMS is a freeware model, available for download and use by anyone. This ensures the applicability and transferability of the proposed methodology to any case study, since free access to the software is guaranteed. Moreover, while HEC-HMS is not an open-source model, meaning that users cannot access the core simulation processes, all the input parameters are stored in ASCII text files in specific repositories, which can be easily edited with a plain text editor to reflect the user's custom requirements. This allows the uninterrupted implementation of the proposed automated sequence, which begins with the gauge weight generation through GAs, continues with the adjustment of the batch files that store the gauge parameters for each simulation run and scenario, and ends with the automated export and analysis of the statistical measures of each simulation, all of which are stored in separate ASCII text files. Secondly, the case basin corresponds to the pilot basin used in the online-accessible HEC-HMS tutorial. This choice ensures that all figures, such as precipitation records, infiltration coefficients, and baseflow coefficients, have been rigorously examined for their consistency and accuracy. The possibility of introducing uncertainties that could bias the proposed methodology is thus minimal.

Regarding the obtained results, during the calibration period the polygonal shapes generated by the GA, when used with the HEC-HMS model, produce hydrological RMSE values that are comparable to, and in two of the three presented cases smaller than, those obtained using Thiessen polygons. It is worth mentioning that in the presentation of the results (see Table 3), five decimal places are used for the RMSE values. This precision is necessary because the RMSE value produced by the Thiessen method is exceptionally small, reflecting the excellent outputs of the reference simulation. In particular, in the GA – 10 points and the GA – 12 points methods, the RMSE equals 0.00253 m³/s, a figure slightly smaller than that of the Thiessen polygons (0.00262 m³/s), which is consistent with the aim of the objective function, i.e. the minimization of the RMSE. In contrast, in the case of the GA – 8 points method, the RMSE is marginally larger than that of the Thiessen polygons (0.00321 vs. 0.00262 m³/s, respectively). During the validation period, the outputs behave similarly to those of the calibration period, i.e. the GA – 10 points and GA – 12 points methods outperform the Thiessen polygon method in terms of RMSE. Additionally, while all three methods produce comparable R² coefficients (see Table 4), the GA – 10 points and GA – 12 points methods demonstrate significantly higher NSE coefficients (0.799 and 0.722, respectively) than the Thiessen polygon method (0.565). Supplementary Appendix Table A1 presents the optimum weights that were produced by each GA – XX method, where XX ranges from 5 to 13, for both the calibration and validation periods. Additionally, it includes the produced hydrological outputs expressed in statistical metrics.

What is particularly important to note is that high accuracy outputs can be obtained even when using only one of the case study rainfall stations instead of all three. This is the case with the GA – 10 points method, where the simulation is practically forced by the rainfall data of the PNXP station, as shown in Figure 5. This implies that the results obtained with the arbitrary spatial boundaries set by the Thiessen polygon method, which considers no physical characteristic of the rainfall distribution other than the vectorial proximity to a gauge, can be effectively reproduced using different geometric shapes, even when not all gauge stations are utilized. Similarly accurate outputs are also obtained when other disproportionate gauge weights are used, e.g. in the GA – 7 points or the GA – 13 points methods, as illustrated in Supplementary Appendix Table A1. Therefore, solutions such as the one introduced in this research, which rely on the use of GAs, offer an enhanced alternative to the conventional interpolation techniques used to fill data gaps. This is particularly valuable as very few stations on a global scale maintain continuous measurements (Estévez et al. 2022). Moreover, the proposed methodology mitigates the biases introduced by interpolation methods used to estimate missing precipitation, as these methods tend to understate high values and overstate low values (Teegavarapu 2014).

In the research, and regarding the optimization process, various combinations of vector points, which delineate the rainfall-influenced areas, give optimal outputs; however, the manuscript emphasizes the use of 8, 10, and 12 points. The sensitivity analysis demonstrated that increasing the number of points gives the algorithm greater flexibility to create complex shapes that can better divide the overall area into sub-areas (polygons), thereby enhancing the algorithm's performance. However, the use of more vector points requires more computational resources and longer simulation times. In particular, when attempting to start with more than 14 points, the algorithm required significantly more time to find acceptable solutions (polygons that each intersect with exactly one station). This was attributed to the complexity of the shapes created, which necessitates an increase in the number of generations to give the algorithm more ‘time’ to search for possible solutions. Similarly, radically decreasing the number of vector points, i.e. to fewer than 5 points, also makes it impossible to find an optimal solution. For example, using 4 points and K-NNGs may satisfy the penalties, but it is practically impossible to optimize such a complicated space into the best three areas with only 4 points. The results also confirm that somewhat poorer performance is obtained when fewer points are used, i.e. with the GA – 8 points method or fewer, as indicated in the sensitivity analysis section. At the same time, a finer spatial discretization (e.g. 10 × 10 m) produced a negligible increase in the accuracy of the solution, whereas the computational cost increased significantly.
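
The link between the spatial discretization and the gauge weights can be illustrated with a small sketch in which the weights are taken as the fraction of basin cells assigned to each station. This assumes, in line with the coverage percentages reported for Figure 5, that a gauge weight equals the share of the basin area covered by the corresponding polygon; the grid, the mask, and the function name are hypothetical and far coarser than a 10 × 10 m discretization.

```python
from collections import Counter


def gauge_weights_from_grid(zone_grid, basin_mask):
    """Gauge weights as the fraction of in-basin cells assigned to each station."""
    counts = Counter()
    for zone_row, mask_row in zip(zone_grid, basin_mask):
        for zone, in_basin in zip(zone_row, mask_row):
            if in_basin:
                counts[zone] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("basin mask contains no cells")
    return {zone: n / total for zone, n in counts.items()}


# Hypothetical rasterized polygons: each cell holds the station it is assigned to.
zone_grid = [
    ["DUJP", "DUJP", "PNXP", "PNXP", "MFFP"],
    ["DUJP", "PNXP", "PNXP", "PNXP", "MFFP"],
    ["PNXP", "PNXP", "PNXP", "MFFP", "MFFP"],
    ["PNXP", "PNXP", "PNXP", "MFFP", "MFFP"],
]

# Hypothetical basin mask: 1 marks cells inside the watershed.
basin_mask = [
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
]

weights = gauge_weights_from_grid(zone_grid, basin_mask)
print(weights)
```

Refining the grid changes the cell counts only along the polygon and basin boundaries, which is one plausible reason why a finer discretization affects the weights, and hence the accuracy, only marginally while the number of cells to process grows quadratically.
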

To sum up, the designation of weights by means other than the classical Thiessen and IDW gauge weight methods is a topic that has attracted the interest of scholars. For example, the Geographically Weighted Regression (GWR) model (Brunsdon et al. 1996) deals with data exhibiting ‘spatial non-stationarity’, i.e. data, such as precipitation, in which the relationships between variables change from one point to another; the model has been adopted by numerous scholars for assessing the spatial variability of precipitation and investigating its relationship with other factors (e.g. Brunsdon et al. 2002; Xu et al. 2015; Kara et al. 2016). Based on this model, Chao et al. (2018) proposed the use of topographic variables (elevation, slope, aspect, surface roughness, and distance to the coastline) and a meteorological variable (wind speed) to address the issue of artificial spatial autocorrelation in traditional interpolation methods when merging satellite with gauge precipitation data. In the present research, the use of GAs has yielded successful out-of-the-box rainfall distributions that optimize the agreement between simulated and observed river discharges regardless of how the rainfall is spatially allocated within the basin. It is, nevertheless, important to note that not considering the basin's physical parameters, such as elevation, is a limitation that requires further investigation. Forthcoming research is planned to address the physiography of the case basin and to assess the methodology's applicability to catchments of larger scales and different climatic conditions.

In conclusion, the research aims to demonstrate that GAs, together with other modern approaches such as AI, can effectively determine accurate rainfall patterns for hydrological simulations, relying solely on the coordinates of the rainfall network; the proposed methodology thus offers an engineering solution to accuracy issues in cases with limited data. AI models such as neural networks are data dependent and should be preferred in cases where large datasets are available. Conversely, the GAs used in this research require an objective function to evaluate potential solutions. Utilizing HEC-HMS facilitates the construction of such an objective function and enables the numerical evaluation of solutions until optimization is achieved.
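
The dependence of a GA on an objective function can be illustrated with the toy sketch below, in which candidate gauge weights are evaluated by the RMSE between ‘observed’ and simulated discharges. It is a deliberately simplified illustration under stated assumptions and not the methodology of the research: the GA here evolves the three weights directly rather than polygon geometries, a hypothetical linear function stands in for HEC-HMS, the data are synthetic, and all names and parameter values are assumptions.

```python
import math
import random


def toy_runoff(weights, rain, base=8.9, gain=2.0):
    """Hypothetical stand-in for the hydrological model (NOT HEC-HMS)."""
    return [base + gain * sum(w * r for w, r in zip(weights, step)) for step in rain]


def rmse(obs, sim):
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def normalize(weights):
    """Clip negative weights to zero and rescale so that the weights sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(clipped)] * len(clipped)
    return [w / total for w in clipped]


def run_ga(rain, observed, pop_size=30, generations=60, sigma=0.05, seed=0):
    """Minimal GA: truncation selection, averaging crossover, Gaussian mutation."""
    rng = random.Random(seed)

    def fitness(weights):
        # Objective function: RMSE between observed and simulated discharges.
        return rmse(observed, toy_runoff(weights, rain))

    population = [normalize([rng.random() for _ in range(3)]) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [(x + y) / 2 + rng.gauss(0.0, sigma) for x, y in zip(a, b)]
            children.append(normalize(child))
        population = parents + children
    best = min(population, key=fitness)
    return best, fitness(best)


# Synthetic rainfall at three gauges per time step and synthetic "observations".
rain = [(0, 1, 2), (3, 5, 9), (6, 8, 14), (2, 3, 5), (0, 0, 1)]
observed = toy_runoff([0.1, 0.2, 0.7], rain)

best_weights, best_rmse = run_ga(rain, observed)
print([round(w, 3) for w in best_weights], round(best_rmse, 4))
```

Because the best half of each generation is carried over unchanged, the best RMSE in the population cannot increase from one generation to the next. In the actual methodology, the call to the stand-in function would be replaced by a full hydrological simulation, which is what makes each evaluation of the objective function computationally expensive.
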

The research introduces a reverse-engineering approach for determining the spatial distribution of rainfall before it is applied in rainfall-runoff models. In practice, instead of calibrating and validating a hydrological model solely on the basis of observed streamflow measurements, the implemented approach uses GAs to reconfigure the rainfall spatial pattern and thereby increase the model performance. The effectiveness of these new configurations (geometries) is validated for a different simulation period than the one used for the model's calibration. The validation is conducted using performance indicators that compare the HEC-HMS simulated discharges with the observed ones. Additionally, the simulations are compared with those obtained using the Thiessen polygon method, which is embedded in the HEC-HMS model.

The results demonstrate that numerous geometries can be generated depending on the number of point vectors used to initiate the algorithm. Many of these geometries meet the customized criteria set within the methodology and are further employed to drive hydrologic simulations. Furthermore, the study yields highly accurate outputs (with R² ranging from 0.83 to 0.99 and Nash–Sutcliffe ranging from 0.73 to 0.99, where the high and low limits correspond to the calibration and validation periods, respectively) for both the GA – 10 points and GA – 12 points methods in the case study. In the validation period, the results outperform those obtained using the reference method. It is important to highlight that the geometries generated by the two methods are entirely distinct. This suggests that the simplification of abrupt changes in rainfall allocation at vectorial boundaries, as introduced by the Thiessen polygons, can be better addressed through more sophisticated approaches such as GAs. In conclusion, the proposed gauge weight method, which essentially mirrors the principles of the Thiessen polygon method, along with the implemented methodology can be applied in basins that are sparsely monitored (limited number of gauges) and in cases where observation datasets are subject to uncertainties and errors.

C.S.: Supervision, Conceptualization, Investigation, Methodology, Software, Data curation, Validation, Writing – Original draft preparation, Writing – Reviewing and Editing. N.N.: Conceptualization, Investigation, Methodology, Software, Data curation, Writing – Original draft preparation.

All relevant data are included in the paper or its Supplementary Information.

Bărbulescu
A.
,
Șerban
C.
&
Indrecan
M. L.
2021
Computing the beta parameter in IDW interpolation by using a genetic algorithm
.
Water
13
(
6
),
863
.
https://doi.org/10.3390/w13060863
.
Bárdossy
A.
&
Das
T.
2008
Influence of rainfall observation network on model calibration and application
.
Hydrology and Earth System Sciences
12
,
77
89
.
https://doi.org/10.5194/hess-12-77-2008
.
Brunsdon
C.
,
Fotheringham
A. S.
&
Charlton
M. E.
1996
Geographically weighted regression: A method for exploring spatial nonstationarity
.
Geographical Analysis
28
(
4
),
281
298
.
https://doi.org/10.1111/j.1538-4632.1996.tb00936.x
.
Brunsdon
C.
,
Fotheringham
A. S.
&
Charlton
M.
2002
Geographically weighted summary statistics – A framework for localised exploratory data analysis
.
Computers, Environment and Urban Systems
26
(
6
),
501
524
.
https://doi.org/10.1016/S0198-9715(01)00009-6
.
Chadalawada
J.
&
Babovic
V.
2019
Review and comparison of performance indices for automatic model induction
.
Journal of Hydroinformatics
21
(
1
),
13
31
.
https://doi.org/10.2166/hydro.2017.078
.
Chang
C. L.
,
Lo
S. L.
&
Yu
S. L.
2005
Applying fuzzy theory and genetic algorithm to interpolate precipitation
.
Journal of Hydrology
314
(
1–4
),
92
104
.
https://doi.org/10.1016/j.jhydrol.2005.03.034
.
Chao
L.
,
Zhang
K.
,
Li
Z.
,
Zhu
Y.
,
Wang
J.
&
Yu
Z.
2018
Geographically weighted regression based methods for merging satellite and gauge precipitation
.
Journal of Hydrology
558
,
275
289
.
https://doi.org/10.1016/j.jhydrol.2018.01.042
.
Cheng
M.
,
Wang
Y.
,
Engel
B.
,
Zhang
W.
,
Peng
H.
,
Chen
X.
&
Xia
H.
2017
Performance assessment of spatial interpolation of precipitation for hydrological process simulation in the Three Gorges Basin
.
Water
9
(
11
),
838
.
https://doi.org/10.3390/w9110838
.
Croley
T. E.
II
&
Hartmann
H. C.
1985
Resolving Thiessen polygons
.
Journal of Hydrology
76
(
3–4
),
363
379
.
https://doi.org/10.1016/0022-1694(85)90143-X
.
Del Giudice
G.
&
Padulano
R.
2016
Sensitivity analysis and calibration of a rainfall-runoff model with the combined use of EPA-SWMM and genetic algorithm
.
Acta Geophysica
64
,
1755
1778
.
https://doi.org/10.1515/acgeo-2016-0062
.
Duraisekaran
E.
,
Mohanraj
T.
,
Samuel
J. S. K.
,
Rajagopalan
S.
&
Govindasamy
R.
2021
Investigation of multiple flood mitigation strategies for an urban catchment using semi-distributed hydrological modelling
.
Arabian Journal of Geosciences
14
,
1
16
.
https://doi.org/10.1007/s12517-021-07619-w
.
Eppstein
D.
,
Paterson
M. S.
&
Yao
F. F.
1997
On nearest-neighbor graphs
.
Discrete & Computational Geometry
17
,
263
282
.
https://doi.org/10.1007/PL00009293
.
Estévez
J.
,
Llabrés-Brustenga
A.
,
Casas-Castillo
M. C.
,
García-Marín
A. P.
,
Kirchner
R.
&
Rodríguez-Solà
R.
2022
A quality control procedure for long-term series of daily precipitation data in a semiarid environment
.
Theoretical and Applied Climatology
149
,
1029
1041
.
https://doi.org/10.1007/s00704-022-04089-2
.
Feldman
A. D.
2000
Hydrologic Modeling System HEC-HMS, Technical Reference Manual
.
U.S. Army Corps of Engineers, Hydrologic Engineering Center, HEC
,
Davis, CA
.
Fooladi
M.
,
Golmohammadi
M. H.
,
Rahimi
I.
,
Safavi
H. R.
&
Nikoo
M. R.
2023
Assessing the changeability of precipitation patterns using multiple remote sensing data and an efficient uncertainty method over different climate regions of Iran
.
Expert Systems with Applications
221
,
119788
.
https://doi.org/10.1016/j.eswa.2023.119788
.
Formetta
G.
,
Marra
F.
,
Dallan
E.
,
Zaramella
M.
&
Borga
M.
2022
Differential orographic impact on sub-hourly, hourly, and daily extreme precipitation
.
Advances in Water Resources
159
,
104085
.
https://doi.org/10.1016/j.advwatres.2021.104085
.
Frysali
D.
,
Mallios
Z.
&
Theodossiou
N.
2023
Hydrologic modeling of the Aliakmon River in Greece using HEC–HMS and open data
.
Euro-Mediterranean Journal for Environmental Integration
8
,
539
555
.
https://doi.org/10.1007/s41207-023-00374-2
.
Gelete
G.
,
Nourani
V.
,
Gokcekus
H.
&
Gichamo
T.
2023
Ensemble physically based semi-distributed models for the rainfall-runoff process modeling in the data-scarce Katar catchment, Ethiopia
.
Journal of Hydroinformatics
25
(
2
),
567
592
.
https://doi.org/10.2166/hydro.2023.197
.
Gunathilake
M. B.
,
Karunanayake
C.
,
Gunathilake
A. S.
,
Marasingha
N.
,
Samarasinghe
J. T.
,
Bandara
I. M.
&
Rathnayake
U.
2021
Hydrological models and artificial neural networks (ANNs) to simulate streamflow in a tropical catchment of Sri Lanka
.
Applied Computational Intelligence and Soft Computing
2021
,
1
9
.
https://doi.org/10.1155/2021/6683389
.
Guo
B.
,
Zhang
J.
,
Xu
T.
,
Song
Y.
,
Liu
M.
&
Dai
Z.
2022
Assessment of multiple precipitation interpolation methods and uncertainty analysis of hydrological models in Chaohe River basin, China
.
Water SA
48
(
3
),
324
334
.
http://dx.doi.org/10.17159/wsa/2022.v48.i3.3884
.
HEC-HMS User Manual
2016
Hydrologic Modeling System (HEC-HMS) User Manual: Version 4.2.0
.
USACE (U.S. Army Corps of Engineers), Hydrologic Engineering Center
,
Davis, CA
.
Hersbach
H.
,
Bell
B.
,
Berrisford
P.
,
Hirahara
S.
,
Horányi
A.
,
Muñoz-Sabater
J.
,
Nicolas
J.
,
Peubey
C.
,
Radu
R.
,
Schepers
D.
,
Simmons
A.
,
Soci
C.
,
Abdalla
S.
,
Abellan
X.
,
Balsamo
G.
,
Bechtold
P.
,
Biavati
G.
,
Bidlot
J.
,
Bonavita
M.
,
De Chiara
G.
,
Dahlgren
P.
,
Dee
D.
,
Diamantakis
M.
,
Dragani
R.
,
Flemming
J.
,
Forbes
R.
,
Fuentes
M.
,
Geer
A.
,
Haimberger
L.
,
Healy
S.
,
Hogan
R. J.
,
Hólm
E.
,
Janisková
M.
,
Keeley
S.
,
Laloyaux
P.
,
Lopez
P.
,
Lupu
C.
,
Radnoti
G.
,
Rosnay
P.
,
Rozum
I.
,
Vamborg
F.
,
Villaume
S.
&
Thépaut
J. N.
2020
The ERA5 global reanalysis
.
Quarterly Journal of the Royal Meteorological Society
146
(
730
),
1999
2049
.
https://doi.org/10.1002/qj.3803
.
Hofstra
N.
,
Haylock
M.
,
New
M.
,
Jones
P.
&
Frei
C.
2008
Comparison of six methods for the interpolation of daily, European climate data
.
Journal of Geophysical Research: Atmospheres
113
,
D21
.
https://doi.org/10.1029/2008JD010100
.
Hussain
I.
,
Spöck
G.
,
Pilz
J.
&
Yu
H. L.
2010
Spatio-temporal interpolation of precipitation during monsoon periods in Pakistan
.
Advances in Water Resources
33
(
8
),
880
886
.
https://doi.org/10.1016/j.advwatres.2010.04.018
.
Kanamitsu
M.
,
Ebisuzaki
W.
,
Woollen
J.
,
Yang
S. K.
,
Hnilo
J. J.
,
Fiorino
M.
&
Potter
G. L.
2002
NCEP–Doe AMIP-II reanalysis (r-2)
.
Bulletin of the American Meteorological Society
83
(
11
),
1631
1644
.
https://doi.org/10.1175/BAMS-83-11-1631
.
Kara F., Yucel I. & Akyurek Z. 2016 Climate change impacts on extreme precipitation of water supply area in Istanbul: Use of ensemble climate modelling and geo-statistical downscaling. Hydrological Sciences Journal 61 (14), 2481–2495. https://doi.org/10.1080/02626667.2015.1133911.
Kidd C., Becker A., Huffman G. J., Muller C. L., Joe P., Skofronick-Jackson G. & Kirschbaum D. B. 2017 So, how much of the Earth's surface is covered by rain gauges? Bulletin of the American Meteorological Society 98 (1), 69–78. https://doi.org/10.1175/BAMS-D-14-00283.1.
Kirlas M. C. & Nagkoulis N. 2023 Effects of pumping flow rates on the estimation of hydrogeological parameters. Journal of Hydroinformatics 25 (3), 611–627. https://doi.org/10.2166/hydro.2023.059.
Kourakos G., Brunetti G., Bigelow D. P., Wallander S. & Dahlke H. E. 2023 Optimizing managed aquifer recharge locations in California's central valley using an evolutionary multi-objective genetic algorithm coupled with a hydrological simulation model. Water Resources Research e2022WR034129. https://doi.org/10.1029/2022WR034129.
Krause P., Boyle D. P. & Bäse F. 2005 Comparison of different efficiency criteria for hydrological model assessment. Advances in Geosciences 5, 89–97. https://doi.org/10.5194/adgeo-5-89-2005.
La Barbera P., Lanza L. G. & Stagi L. 2002 Influence of systematic mechanical errors of tipping-bucket rain gauges on the statistics of rainfall extremes. Water Science & Technology 45 (2), 1–9. https://doi.org/10.2166/wst.2002.0020.
Lazoglou G., Anagnostopoulou C., Skoulikaris C. & Tolika K. 2019 Bias correction of climate model's precipitation using the copula method and its application in river basin simulation. Water 11 (3), 600. https://doi.org/10.3390/w11030600.
Liu D., Zhao Q., Fu D., Guo S., Liu P. & Zeng Y. 2020a Comparison of spatial interpolation methods for the estimation of precipitation patterns at different time scales to improve the accuracy of discharge simulations. Hydrology Research 51 (4), 583–601. https://doi.org/10.2166/nh.2020.146.
Liu Y. Y., Li L., Liu Y. S., Chan P. W. & Zhang W. H. 2020b Dynamic spatial-temporal precipitation distribution models for short-duration rainstorms in Shenzhen, China based on machine learning. Atmospheric Research 237, 104861. https://doi.org/10.1016/j.atmosres.2020.104861.
Maier H. R., Kapelan Z., Kasprzyk J., Kollat J., Matott L. S., Cunha M. C., Dandy G. C., Gibbs M. S., Keedwell E., Marchi A., Ostfeld A., Savic D., Solomatine D. P., Vrugt J. A., Zecchin A. C., Minsker B. S., Barbour E. J., Kuczera G., Pasha F., Castelletti A., Giuliani M. & Reed P. M. 2014 Evolutionary algorithms and other metaheuristics in water resources: Current status, research challenges and future directions. Environmental Modelling & Software 62, 271–299. https://doi.org/10.1016/j.envsoft.2014.09.013.
Mishra A. K. & Coulibaly P. 2009 Developments in hydrometric network design: A review. Reviews of Geophysics 47 (2), 00000–00000. https://doi.org/10.1029/2007RG000243.
Moulin L., Gaume E. & Obled C. 2009 Uncertainties on mean areal precipitation: Assessment and impact on streamflow simulations. Hydrology and Earth System Sciences 13, 99–114. https://doi.org/10.5194/hess-13-99-2009.
Nikolopoulos E. I., Borga M., Creutin J. D. & Marra F. 2015 Estimation of debris flow triggering rainfall: Influence of rain gauge density and interpolation methods. Geomorphology 243, 40–50. https://doi.org/10.1016/j.geomorph.2015.04.028.
Ohmer M., Liesch T., Goeppert N. & Goldscheider N. 2017 On the optimal selection of interpolation methods for groundwater contouring: An example of propagation of uncertainty regarding inter-aquifer exchange. Advances in Water Resources 109, 121–132. https://doi.org/10.1016/j.advwatres.2017.08.016.
Pradhan R. K., Markonis Y., Godoy M. R. V., Villalba-Pradas A., Andreadis K. M., Nikolopoulos E. I., Papalexiou S. M., Rahim A., Tapiador F. J. & Hanel M. 2022 Review of GPM IMERG performance: A global perspective. Remote Sensing of Environment 268, 112754. https://doi.org/10.1016/j.rse.2021.112754.
Probst E. & Mauser W. 2022 Evaluation of ERA5 and WFDE5 forcing data for hydrological modelling and the impact of bias correction with regional climatologies: A case study in the Danube River Basin. Journal of Hydrology: Regional Studies 40, 101023. https://doi.org/10.1016/j.ejrh.2022.101023.
Renard B., Kavetski D., Kuczera G., Thyer M. & Franks S. W. 2010 Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resources Research 46 (5), 000000–000000. https://doi.org/10.1029/2009WR008328.
Salih S. Q., Sharafati A., Ebtehaj I., Sanikhani H., Siddique R., Deo R. C., Bonakdari H., Shahid S. & Yaseen Z. M. 2020 Integrative stochastic model standardization with genetic algorithm for rainfall pattern forecasting in tropical and semi-arid environments. Hydrological Sciences Journal 65 (7), 1145–1157. https://doi.org/10.1080/02626667.2020.1734813.
Scrucca L. 2013 GA: A package for genetic algorithms in R. Journal of Statistical Software 53, 1–37. https://doi.org/10.18637/jss.v053.i04.
Shafii M. & De Smedt F. 2009 Multi-objective calibration of a distributed hydrological model (WetSpa) using a genetic algorithm. Hydrology and Earth System Sciences 13 (11), 2137–2149. https://doi.org/10.5194/hess-13-2137-2009.
Skoulikaris C. & Ganoulis J. 2012 Climate change impacts on river catchment hydrology using dynamic downscaling of global climate models. In: National Security and Human Health Implications of Climate Change (Fernando H. J. S., Klaić Z. & McCulley J. L., eds). Springer, Dordrecht, The Netherlands, pp. 281–287.
Skoulikaris C., Anagnostopoulou C. & Lazoglou G. 2019 Hydrological modeling response to climate model spatial analysis of a South Eastern Europe international basin. Climate 8 (1), 1. https://doi.org/10.3390/cli8010001.
Skoulikaris C., Venetsanou P., Lazoglou G., Anagnostopoulou C. & Voudouris K. 2022 Spatio-temporal interpolation and bias correction ordering analysis for hydrological simulations: An assessment on a Mountainous River Basin. Water 14, 660. https://doi.org/10.3390/w14040660.
Soroush F. & Abedini M. J. 2019 Optimal selection of number and location of pressure sensors in water distribution systems using geostatistical tools coupled with genetic algorithm. Journal of Hydroinformatics 21 (6), 1030–1047. https://doi.org/10.2166/hydro.2019.023.
Teegavarapu R. S. 2012 Spatial interpolation using nonlinear mathematical programming models for estimation of missing precipitation records. Hydrological Sciences Journal 57 (3), 383–406. https://doi.org/10.1080/02626667.2012.665994.
Teegavarapu R. S. 2014 Statistical corrections of spatially interpolated missing precipitation data estimates. Hydrological Processes 28 (11), 3789–3808. https://doi.org/10.1002/hyp.9906.
Wang J., Zhuo L., Han D., Liu Y. & Rico-Ramirez M. A. 2023 Hydrological model adaptability to rainfall inputs of varied quality. Water Resources Research 59 (2), e2022WR032484. https://doi.org/10.1029/2022WR032484.
Xu C. Y. & Singh V. P. 1998 A review on monthly water balance models for water resources investigations. Water Resources Management 12, 20–50. https://doi.org/10.1023/A:1007916816469.
Xu S., Wu C., Wang L., Gonsamo A., Shen Y. & Niu Z. 2015 A new satellite-based monthly precipitation downscaling algorithm with non-stationary relationship between precipitation and land surface characteristics. Remote Sensing of Environment 162, 119–140. https://doi.org/10.1016/j.rse.2015.02.024.
Yang X., Xie X., Liu D. L., Ji F. & Wang L. 2015 Spatial interpolation of daily rainfall data for local climate impact assessment over greater Sydney region. Advances in Meteorology 2015, 1–12. https://doi.org/10.1155/2015/563629.
Zhao Y., Zhang X., Xiong F., Liu S., Wang Y. & Liang C. 2022 Acquisition of rainfall in ungauged basins: A study of rainfall distribution heterogeneity based on a new method. Natural Hazards 114, 1723–1739. https://doi.org/10.1007/s11069-022-05444-2.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data