Abstract
Relationships between peak discharges and catchment size (i.e., flood scaling) have the potential to support new river flood forecasting approaches but have not been tested in tropical regions. This study determined flood scaling relationships between peak discharge and nested drainage areas in the La Sierra catchment (Mexico). A statistical power law equation was applied to selected rainfall–runoff events that occurred between 2012 and 2015. Variations in flood scaling parameters were determined in relation to catchment descriptors and processes for peak downstream discharge estimation. Similar to studies in humid temperate regions, the results reveal the existence of log-linear relationships between the intercept (α) and exponent (θ) parameter values and log–log power–law relationships between α and the peak discharge observed from the smallest headwater catchments. The flood parameter values obtained were then factored into the scaling equation ($Q_P = \alpha A^{\theta}$) and successfully predicted downstream flood peaks, especially highly recurrent flood events. The findings contribute to a better understanding of the nature of flood wave generation and support the development of new flood forecasting approaches in unregulated catchments suitable for non-stationarity in hydrological processes with climate change.
HIGHLIGHTS
Peak discharge in the La Sierra catchment follows a power–law relationship, similar to humid temperate regions.
Log-linear and log–log relationships can be used to estimate flood parameters and peak downstream discharge, especially for frequent events.
INTRODUCTION
Flood prediction in ungauged regions is a global challenge. Most tropical regions with increasing flood risk lack the hydrometeorological records required to set up flood models, and, as a result, people are exposed to elevated and increasing flood risk. To address this challenge, the United States Geological Survey (USGS) pioneered a flood estimation method that employs regional quantile equations to build regression relationships between catchment descriptors and variables where no streamflow data are available (Gupta et al. 2007). The equations are statistical relationships that use observations of annual maximum discharge over large, homogeneous regions to estimate flood magnitudes (Mandapaka et al. 2009b). The use of homogeneous regions enables streamflow data from gauged basins to be pooled to address the lack of streamflow data in ungauged basins within the same region (Gupta et al. 2007). Regional flood frequency (RFF) analysis is still used to estimate annual peak flow quantiles in ungauged basins.
Many regions of the world, however, remain poorly gauged, resulting in limited data to apply the RFF method. Furthermore, the significant alterations to catchment hydrology caused by climate change and land use change make past observations less useful for flood estimation in many regions. Also, the current flood quantile equations do not account for the physical processes that cause flooding and make providing flood estimates at smaller basins challenging (Gupta et al. 2007). A physical solution has long been required to improve the accuracy of flood quantile estimates and compensate for the limited data records (Gupta 2017; Gupta et al. 2010; Furey et al. 2016). There has been a need for methodologies that estimate flood frequencies based on physical principles of water movement and general knowledge of the geographic and geomorphologic characteristics of upstream catchments for the location of interest (Perez Mesa 2019).
A solution is to estimate flood parameters and peak discharge at a rainfall–runoff event scale rather than current large-scale resolutions based on annual temporal scales and large homogeneous spatial regional scales. To switch from RFF approaches to physics-based flood frequency estimates, it is important to find the patterns between rainfall and runoff at increasing scales of sub-catchments within a river network, i.e., flood scaling (Perez Mesa 2019). In this regard, several studies have provided the theoretical and empirical basis for formulating a geophysical or a scaling theory of floods focused on scalable drainage areas and rainfall event scales (Gupta et al. 1996, 2010, 2015; Robinson & Sivapalan 1997; Menabde & Sivapalan 2001; Ogden & Dawdy 2003; Gupta 2004, 2017; Dawdy et al. 2012; Farmer et al. 2015; Medhi & Tripathi 2015; Furey et al. 2016; Lee & Huang 2016; Ayalew et al. 2018; Yang et al. 2020; Mantilla et al. 2011).
The intercept (α) and exponent (θ) of the peak discharge power–law relations are referred to as flood scaling parameters. An important discovery is that the values of these two scaling parameters change from one rainfall event to another, and this variation determines the magnitude of peak discharge (Ogden & Dawdy 2003; Ayalew et al. 2018). Thus, a better understanding of catchment factors and the processes governing the variation of these parameters is critical in developing scaling equations capable of estimating peak discharge, contributing towards solving the problem of flood prediction in ungauged basins (Sivapalan 2003; Hrachowitz et al. 2013; Yang et al. 2020).
The focus of most research studies has been on establishing and validating the scaling relationships between rainfall, peak discharge, and drainage areas, as well as other factors that control flood parameters (Ayalew et al. 2014b). However, most of these studies have been done in humid northern latitude regions, with several studies conducted in the United States and the United Kingdom, and it is not known whether the scaling flood theory is applicable in other climatic regions with different rainfall types and coverages (Wilkinson & Bathurst 2018). There is a need to generalise the scaling theory of floods to medium and large river basins spanning different climatic regions with a greater variety of rainfall event types (Gupta 2017).
This study aims to determine if flood scaling relationships are valid in large tropical regions to support flood forecasting. The three main objectives are (i) to investigate flood scaling relationships between peak discharge and nested drainage areas in an unregulated tropical catchment, (ii) to identify and explain factors influencing variations in the scaling parameters and the confidence levels, and (iii) to provide a unified framework for estimating flood parameter values and peak flood magnitudes across data-scarce tropical catchments.
METHODS
Study area and data
The La Sierra catchment area experiences two types of rainfall: large-scale rainfall events that affect the entire catchment and intense convective rainfall events that affect localised areas of the catchment. The three major sources of rainfall in the catchment area are tropical cyclones from the Caribbean Sea and the Atlantic Ocean; the Inter-Tropical Convergence Zone (ITCZ), which extends to higher latitudes during the summer and affects the upper Grijalva basin; and late summer tropical waves, which cause significant rainfall in the northern parts of the catchment (Arreguín-Cortés et al. 2014).
Catchment name (names derived from outlet flow GSs) . | Catchment area ×1,000,000 (m2) . | Length of the main channel ×1,000 m . | Minimum elevation of the main channel (m) . | Maximum elevation of the main channel (m) . | Average slope of the main channel (%) . | Standard average annual rainfall (mm) . | Standard runoff (%) . |
---|---|---|---|---|---|---|---|
Gaviotas (for La Sierra) | 6,743 | 173,615 | 6 | 2,216 | 1.27 | 1,730 | 20–30 |
Pueblo nuevo | 4,748 | 157,442 | 9 | 2,177 | 1.38 | 1,730 | 20–30 |
Tapijulapa | 3,698 | 131,192 | 19 | 2,216 | 10.67 | 1,602 | >30 |
Oxolotan | 2,416 | 115,064 | 109 | 2,216 | 20.08 | 1,602 | >30 |
El Puente | 1,787 | 97,219 | 7 | 902 | 1.00 | 2,101 | 20–30 |
Teapa | 438 | 42,027 | 35 | 1,106 | 20.55 | 2,513 | 20–30 |
Pichulcalco | 431 | 3,992 | 10 | 2,100 | 30.00 | 2,600 | 20–30 |
Puyacatengo | 229 | 15,370 | 59 | 644 | 30.81 | 2,561 | >30 |
Note: the catchment area is given in km2 and the channel length in km; they can be expressed in square metres and metres when multiplied by 1,000,000 and 1,000, respectively.
Hydrometeorological data
Rainfall data for the La Sierra catchment area were obtained from a total of 12 rain gauge stations. However, these rainfall data only encompassed the time period spanning from 2012 to 2015, as depicted in Table 2. The data were obtained from the Mexican rainfall database (CLIMCOM 2013) maintained by the Mexican Meteorological Service (Servicio Meteorologico Nacional, SMN) and the Mexican Water Commission (CONAGUA). The nested hydrometric network's discharge data were collected from seven GSs located at the outlets of each nested catchment (Figure 2), spanning from 2012 to 2015. The flow data were obtained from CONAGUA, the Mexico Surface Water Management and Rivers Engineering (GASIR) database, and the National Surface Water Data Bank (BANDAS) database. However, discharge data for El Puente and Tapijulapa nested catchment areas were limited, with missing values or having some gaps in their time series. As a result, the El Puente sub-catchment area was not included in the analysis (Table 2).
Investigation of scaling relationships
The analysis sought to investigate the scaling relationships between peak discharge and nested catchment drainage areas, as well as to identify underlying factors driving variations in the relationships, to enable flood prediction in the La Sierra catchment. The scaling relationships between selected rainfall–runoff events between 2012 and 2015 were investigated using a statistical power law equation (Equation (1)). The study investigated scaling relationships not only with drainage area data but also with other catchment descriptors and processes that contribute significantly to the mechanism of flood peak generation in the study area (Formetta et al. 2021).
Selection of observed peak discharge events
Travel time was estimated as $T_t = l / (3{,}600\,V)$, where $T_t$ is the travel time (h), $l$ is the flow length (m), $V$ is the average velocity (m/s), and 3,600 is the conversion factor from seconds to hours.
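The travel-time definition above can be written as a small helper. This is a minimal sketch: the function name and the example flow length and velocity are hypothetical and are not values reported by the study.

```python
def travel_time_hours(flow_length_m: float, velocity_m_s: float) -> float:
    """Travel time Tt (hours) for flow length l (m) at average velocity V (m/s)."""
    if velocity_m_s <= 0:
        raise ValueError("average velocity must be positive")
    # 3,600 converts seconds to hours.
    return flow_length_m / (3600.0 * velocity_m_s)


# Hypothetical example: a 15,370 m flow path at an assumed velocity of 1.0 m/s.
tt = travel_time_hours(15_370.0, 1.0)
```

A travel time of this kind would typically be used to match a rainfall event to the peak it produces at each downstream gauge.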
Peak scaling analysis
The following procedure was followed to establish significant scaling relationships for estimating flood parameter values for the La Sierra catchment area (Ayalew et al. 2015). The α, θ, and coefficient of determination (R2) values were determined using the power–law equation (Equation (1)) by establishing scaling relationships between the observed peak discharge for each rainfall event and its corresponding nested catchment drainage area plotted on a logarithmic scale scatter plot with a straight line fitted between data points. The R2 was used to evaluate the reliability of the scaling parameters (α and θ) and thus the accuracy of prediction of the scaling equation (Chen et al. 2020). Individual rainfall events were selected based on their R2 values; those with values greater than 0.5 were considered, while those with values less than 0.5, indicating that the scaling relationship or equation contained discrepancies, were discarded (Ayalew et al. 2015, 2018; Farmer et al. 2015).
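The fitting and screening step described above can be sketched as an ordinary least-squares fit in log–log space. The drainage areas are taken from the catchment table for illustration, but the peak discharges are synthetic (generated from a known power law with fixed noise), and the function name is an assumption rather than the authors' implementation.

```python
import numpy as np


def fit_power_law(area, peak_q):
    """Fit Q = alpha * A**theta by least squares on log-transformed data.

    Returns (alpha, theta, r2), with r2 computed in log-log space.
    """
    log_a = np.log(np.asarray(area, dtype=float))
    log_q = np.log(np.asarray(peak_q, dtype=float))
    theta, log_alpha = np.polyfit(log_a, log_q, 1)  # slope, intercept
    fitted = log_alpha + theta * log_a
    ss_res = np.sum((log_q - fitted) ** 2)
    ss_tot = np.sum((log_q - log_q.mean()) ** 2)
    return float(np.exp(log_alpha)), float(theta), float(1.0 - ss_res / ss_tot)


# Drainage areas (km2) from the catchment table, used for illustration only.
areas = np.array([6743.0, 4748.0, 3698.0, 2416.0, 438.0, 229.0])
# Synthetic peaks: a known power law with fixed log-space noise (not observed data).
noise = np.array([0.10, -0.05, 0.08, -0.12, 0.04, -0.05])
peaks = 6.72 * areas ** 0.46 * np.exp(noise)

alpha, theta, r2 = fit_power_law(areas, peaks)
keep_event = r2 > 0.5  # screening threshold stated in the text
```

Fitting in log space makes the exponent the slope and the logarithm of the intercept the constant of a straight line, which is why the relationship appears linear on a log–log scatter plot.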
Catchment descriptors and processes affecting parameters
A linear regression model was used to determine the relationships between estimated flood parameter values and other measurable catchment descriptors and processes that include rainfall location, soil moisture, rainfall accumulation, river levels, and peak discharge (Gupta et al. 1996; Mandapaka et al. 2009a; Ayalew et al. 2015, 2018).
Pearson's correlation analysis was conducted using IBM SPSS Statistics for Windows, Version 25, to assess both the strength and the direction of established relationships (IBM Corp. 2017). This study employed the Pearson correlation coefficient (CC) (r) to measure the relationship between scaling parameters and physical catchment characteristics. The resulting values were evaluated on a scale ranging from −1 to 1. When r exceeds 0.5 or falls below −0.5, it is inferred that the data points are in close proximity to the line of best fit.
Furthermore, linear and multiple regression analyses were conducted to investigate the dependency of the flood scaling parameters on multiple variables from the catchment's properties and processes. A forward stepwise multiple regression was employed to identify the strongest predictors of the flood scaling parameter values from the measurable catchment descriptors and processes identified (Formetta et al. 2021). The regression equations were further assessed based on the goodness-of-fit measure shown by the R2, the normal distribution of errors, the t-statistic, collinearity statistics, and the Shapiro–Wilk residual tests (Ayalew et al. 2015, 2018).
The Shapiro–Wilk test was used to check the normality of the data. The test produces a W value; low values indicate that the sample is not normally distributed. The null hypothesis of normally distributed data was rejected at a significance level of 0.05, that is, when the p-value was less than 0.05. The p-value indicates the probability that an observed difference between datasets is attributable to chance. Multicollinearity, which occurs when independent variables in a model are correlated, was also checked. Collinearity between predictors was regarded as acceptable when the tolerance was >0.1 and the variance inflation factor (VIF) was <10. The VIF detected and quantified collinearity in the regression models developed (Bruin 2006).
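The collinearity check can be illustrated with the standard definitions: tolerance is 1 − R² from regressing one predictor on the others, and VIF is its reciprocal. The sketch below uses hypothetical predictor values, and the function name is an assumption. The study itself ran these checks in SPSS.

```python
import numpy as np


def tolerance_and_vif(X):
    """Return a list of (tolerance, VIF) pairs, one per column of X.

    Each column is regressed on the remaining columns plus an intercept;
    tolerance = 1 - R^2 of that regression and VIF = 1 / tolerance.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    results = []
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        tol = 1.0 - r2
        results.append((tol, 1.0 / tol))
    return results


# Hypothetical predictor values (not data from the study).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
stats = tolerance_and_vif(np.column_stack([x1, x2]))
acceptable = all(tol > 0.1 and vif < 10 for tol, vif in stats)
```

With only two predictors, both tolerances are equal because each reduces to one minus the squared correlation between the pair.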
The optimal equation, which should not overfit the data and should perform well in new contexts, was selected using the XLSTAT software, version 2019.3.2, from a pool of several equations developed. The software employs the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to select optimal equations for predicting flood parameter values for the catchment (Ding et al. 2018). These two model selection methods were chosen for their ability to determine the optimal statistical model for fitting hydrological extremes. Di Baldassarre et al. (2009) evaluated how well the AIC, the BIC, and the Anderson–Darling criterion perform in statistical flood model selection. Their results revealed that the model selection methods were valuable tools for model selection and for reducing the uncertainty of design flood estimation.
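For readers unfamiliar with the two criteria, the sketch below uses one common least-squares form, AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n), which omits an additive constant. It is an assumption that this matches the exact variant computed by XLSTAT, and the candidate names and error sums are hypothetical.

```python
import math


def aic_bic_least_squares(sse: float, n: int, k: int):
    """AIC and BIC for a least-squares fit, up to an additive constant.

    sse: sum of squared errors, n: number of observations,
    k: number of fitted parameters (including the intercept).
    """
    base = n * math.log(sse / n)
    return base + 2 * k, base + k * math.log(n)


# Hypothetical candidates: (sum of squared errors, number of fitted parameters).
n_events = 59
candidates = {"one_predictor": (1.20, 2), "two_predictors": (1.10, 3)}
scores = {
    name: aic_bic_least_squares(sse, n_events, k)
    for name, (sse, k) in candidates.items()
}
# Lower values indicate the preferred model under each criterion.
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

Both criteria reward a lower error sum and penalise extra parameters; the BIC penalty grows with the number of observations, so it tends to favour simpler equations than the AIC.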
The best regression equations selected were then used in a flood prediction framework to estimate the scaling parameter values and peak discharge across nested catchments in the study area. The model results were validated by comparing them to observed values from different time periods and sub-catchments that were not included in the development of equations.
Framework for predicting flood parameters and peak discharge magnitudes
To validate the model results and confirm that the equations developed were sufficiently predictive, the simulated α and θ values were compared to observed values derived from scaling relationships from a different sub-catchment and time (2016–2018) than used in model development. The model was run with the Teapa nested catchment added to the flood scaling equation (Equation (1)), along with the estimated parameters. This was done to test the model's ability to predict in different spatial conditions and see if it can make simulations that are accurate enough. The Nash–Sutcliffe efficiency (NSE), the Pearson CC, and the percentage bias (PBIAS) were calculated to objectively compare the simulated and observed discharge obtained (Moriasi et al. 2007). The root mean square error (RMSE) was also used to estimate the errors between observed and simulated peak discharge values found. A flowchart to visually depict the statistical scaling modelling stages is shown in Figure 4.
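The goodness-of-fit measures named above can be written compactly. This sketch follows the usual definitions, with PBIAS taken as 100 × Σ(obs − sim) / Σ(obs), so that positive values indicate underestimation. The observed and simulated values are hypothetical.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def pbias(obs, sim):
    """Percentage bias: positive values mean the simulation underestimates."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(100.0 * np.sum(obs - sim) / np.sum(obs))


def rmse(obs, sim):
    """Root mean square error, in the units of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


# Hypothetical observed and simulated peak discharges (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 380.0]

scores = {
    "NSE": nse(observed, simulated),
    "PBIAS": pbias(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "r": float(np.corrcoef(observed, simulated)[0, 1]),
}
```

Reporting the four measures together is useful because each captures a different aspect of fit: NSE and RMSE respond to the size of errors, PBIAS to systematic over- or underestimation, and r to linear association only.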
Uncertainties in the estimated flood scaling parameters and peak discharge values were shown using confidence and prediction intervals. The confidence interval bounds of 5 and 95% were used to represent the range of frequently varying discharge and parameter values, illustrating the bounds of their natural variability, and thus expressing the uncertainty in the flood parameters and peak discharge estimates found (Wadsworth 1990; NIST 2012). The prediction interval was taken as the range likely to include the mean value of the estimated flood parameter and peak discharge values given the specific values of independent variables (Helsel & Hirsch 2002).
RESULTS AND DISCUSSION
Investigation of scaling relationships in the La Sierra catchment
The results of flood scaling analysis of peak discharge data based on hydrometric records of peak flow from 59 rainfall events in the La Sierra catchment from 2012 to 2015 demonstrate that peak discharge exhibits a power–law relationship with drainage areas, which is consistent with findings from humid temperate regions. The average R2 is 0.86, with R2 values greater than 0.7 for 60% of all 59 rainfall–runoff events under consideration (Table 3).
Rainfall event No. | Peak discharge date | α | θ | R2 | Rainfall event No. | Peak discharge date | α | θ | R2 |
---|---|---|---|---|---|---|---|---|---|
1 | 1 January 2012 | 6.72 | 0.46 | 0.76 | 31 | 24 May 2014 | 29.07 | 0.31 | 0.66 |
2 | 08 January 2012 | 3.98 | 0.51 | 0.97 | 32 | 28 May 2014 | 5.89 | 0.81 | 0.90 |
3 | 30 January 2012 | 12.32 | 0.34 | 0.88 | 33 | 03 June 2014 | 5.11 | 0.61 | 0.82 |
4 | 17 April 2012 | 37.38 | 0.14 | 0.51 | 34 | 05 July 2014 | 3.39 | 0.49 | 0.78 |
5 | 14 May 2012 | 30.40 | 0.22 | 0.52 | 35 | 14 July 2014 | 5.84 | 0.50 | 0.73 |
6 | 21 May 2012 | 80.85 | 0.20 | 0.53 | 36 | 03 August 2014 | 5.67 | 0.70 | 0.75 |
7 | 26 June 2012 | 10.76 | 0.36 | 0.59 | 37 | 15 August 2014 | 5.04 | 0.45 | 0.52 |
8 | 12 August 2012 | 50.40 | 0.22 | 0.57 | 38 | 26 August 2014 | 5.03 | 0.56 | 0.53 |
9 | 27 September 2015 | 17.46 | 0.35 | 0.59 | 39 | 31 August 2014 | 4.75 | 0.58 | 0.85 |
10 | 20 December 2012 | 10.76 | 0.36 | 0.59 | 40 | 12 September 2014 | 5.09 | 0.54 | 0.78 |
11 | 20 January 2013 | 33.01 | 0.24 | 0.56 | 41 | 19 September 2014 | 4.39 | 0.66 | 0.83 |
12 | 08 June 2013 | 5.41 | 0.84 | 0.58 | 42 | 23 September 2014 | 5.78 | 0.73 | 0.79 |
13 | 06 July 2013 | 5.25 | 0.59 | 0.81 | 43 | 14 October 2014 | 16.22 | 0.35 | 0.58 |
14 | 18 August 2013 | 28.08 | 0.24 | 0.60 | 44 | 26 October 2014 | 10.87 | 0.39 | 0.50 |
15 | 27 August 2013 | 11.57 | 0.38 | 0.59 | 45 | 29 October 2014 | 51.13 | 0.47 | 0.55 |
16 | 07 September 2013 | 10.75 | 0.44 | 0.53 | 46 | 08 November 2014 | 28.60 | 0.24 | 0.57 |
17 | 14 September 2013 | 48.45 | 0.17 | 0.59 | 47 | 13 November 2014 | 55.09 | 0.20 | 0.64 |
18 | 26 September 2013 | 46.15 | 0.18 | 0.51 | 48 | 27 November 2014 | 15.09 | 0.37 | 0.60 |
19 | 14 October 2013 | 12.14 | 0.41 | 0.71 | 49 | 13 June 2015 | 10.80 | 0.40 | 0.85 |
20 | 28 October 2013 | 1.72 | 0.73 | 0.79 | 50 | 29 August 2015 | 7.73 | 0.45 | 0.92 |
21 | 16 November 2013 | 12.32 | 0.44 | 0.87 | 51 | 15 September 2015 | 5.39 | 0.81 | 0.82 |
22 | 27 November 2013 | 1.15 | 0.77 | 0.86 | 52 | 23 September 2015 | 9.33 | 0.60 | 0.82 |
23 | 13 December 2013 | 0.61 | 0.95 | 0.80 | 53 | 01 October 2015 | 6.62 | 0.63 | 0.94 |
24 | 16 December 2013 | 19.91 | 0.39 | 0.89 | 54 | 20 October 2015 | 9.72 | 0.61 | 0.66 |
25 | 13 March 2014 | 2.30 | 0.61 | 0.97 | 55 | 26 October 2015 | 17.52 | 0.45 | 0.91 |
26 | 08 April 2014 | 10.12 | 0.88 | 0.57 | 56 | 14 November 2015 | 5.41 | 0.56 | 0.70 |
27 | 16 April 2014 | 27.30 | 0.25 | 0.50 | 57 | 23 November 2015 | 82.32 | 0.26 | 0.70 |
28 | 20 April 2014 | 5.60 | 0.69 | 0.88 | 58 | 08 December 2015 | 19.70 | 0.42 | 0.75 |
29 | 05 May 2014 | 5.37 | 0.70 | 0.78 | 59 | 20 December 2015 | 12.24 | 0.50 | 0.92 |
30 | 14 May 2014 | 4.64 | 0.60 | 0.76 |
R2 is the coefficient of determination.
Other catchment variables affecting flood scaling parameter estimation
(α) parameter estimation
The results indicate that the peak discharge observed at the Puyacatengo GS can be used to predict the α parameter values across the catchment area using a log–log regression equation with an intercept of −9.17 (95% confidence interval: −14.00 to −4.34) and an estimated mean slope of 0.34 (95% confidence interval: 0.25 to 0.42) (Table 4).
Source . | Value . | Standard error . | t . | Pr > |t| . | Lower bound (95%) . | Upper bound (95%) . |
---|---|---|---|---|---|---|
ln(α) | −9.170 | 2.399 | −3.822 | <0.003 | −14.002 | −4.338 |
ln(Puyacatengo) | 0.339 | 0.042 | 7.995 | <0.001 | 0.254 | 0.424 |
Notes: t is the Student's t-test statistic. The Pr(>t) relates to the probability of observing any value equal to or larger than t.
Student's t-test on the coefficients of Equation (4) indicates that they are statistically significant at a 95% confidence level. Checks on the Q–Q plot for residuals and the Shapiro–Wilk test for normality yield W = 0.96 and p= 0.007, indicating that the residuals are normally distributed.
θ parameter estimation
Analysis results (Table 5) show the robustness of the log-linear relationship between the scaling θ and α parameters, with coefficients significant at the 0.05 level. This shows that α values can significantly predict θ values using a log-linear equation with an intercept of 0.60 (95% confidence interval: 0.55 to 0.64) and an estimated mean slope of −0.007 (95% confidence interval: −0.009 to −0.006).
Source . | Value . | Standard error . | t . | Pr > |t| . | Lower bound (95%) . | Upper bound (95%) . |
---|---|---|---|---|---|---|
θ | 0.596 | 0.023 | 25.947 | <0.0012 | 0.550 | 0.641 |
α | −0.007 | 0.001 | −8.184 | <0.0014 | −0.009 | −0.006 |
Notes: t is the Student's t-test statistic. The Pr(>t) relates to the probability of observing any value equal to or larger than t.
Student's t-test on the equation coefficients (Equation (5)) reveals that they are statistically significant at a 95% confidence level. Checks on the Q–Q plot for residuals between α and θ values as well as the Shapiro–Wilk test for normality (W = 0.89 and p= 0.01) show that the residuals are normally distributed.
α values multi-regressed with other catchment variables
A multiple regression analysis was also used to estimate α and θ parameter values, this time using more than one explanatory variable. Analysis results show that α parameter values have the strongest relationship with the observed discharge at the Puyacatengo GS (r = −0.77, p = 0.01), followed by discharge at the Teapa GS (r = 0.38, p = 0.01) (Table 6). The natural logarithm of α values was regressed on the natural logarithms of peak discharges observed at the Puyacatengo and Teapa GSs. A simple equation with all coefficients significant at the 0.05 level was then obtained using a forward stepwise selection procedure.
Model | Unstandardised coefficient B* | Unstandardised Std. Error | Standardised coefficient Beta | t | Sig. | Zero-order correlation | Partial correlation | Part correlation | Collinearity tolerance | Collinearity VIF |
---|---|---|---|---|---|---|---|---|---|---|
ln(constant) | 0.667 | 0.04 | | 17.73 | 0.004 | | | | | |
ln(Puyacatengo) | −0.003 | 0.00 | −0.81 | −8.03 | 0.003 | −0.73 | −0.77 | −0.77 | 0.91 | 1.11 |
ln(Teapa) | 0.0001 | 0.00 | 0.27 | 2.65 | 0.011 | 0.02 | 0.38 | 0.26 | 0.91 | 1.11 |
Notes: B* is the unstandardised beta value representing the slope of the line between the predictor variable and the dependent variable. t is the Student's t-statistic. VIF is the factor for quantifying collinearity (see Section 2.3 for explanation).
Student's t-test on the equation coefficients reveals that Puyacatengo and Teapa peak discharge values can significantly predict α parameter values for the catchment area at the 0.05 significance level. Furthermore, checks for multicollinearity in the model give a tolerance of 0.91 and a VIF of 1.11 (Table 6), which are within the accepted limits of tolerance >0.1 and VIF <10.
Developing flood scaling equations
Using the methods described above, six scaling equations were developed for estimating flood parameter values for the entire La Sierra catchment. Four of the six equations were for θ parameter estimation, with a good average model fit (R2 = 0.57). The remaining two were found to be valid for estimating α parameter values, both with an overall good model fit and an R2 = 0.62.
Selection of the best scaling equations
Equation (1) was found to have the lowest AIC (−232.30) and BIC (−228.14) values and was thus chosen as the best log-linear regression model for estimating θ parameter values for the La Sierra catchment (Table 7). Equation (5) was found to be the best for estimating α parameter values, with the lowest AIC (−191.15) and BIC (−185.66) values.
Rank No. | Overall equations | MSE | Adj. R2 | AIC | BIC |
---|---|---|---|---|---|
1 | $\theta = 5.96 \times 10^{-1} - 7.43 \times 10^{-3} \ln\alpha$ | 0.02 | 0.53 | −232.30 | −228.14 |
2 | $\theta = 6.99 \times 10^{-1} - 3.01 \times 10^{-3} \ln(\text{Puyacatengo peak discharge})$ | 0.02 | 0.49 | −192.54 | −188.80 |
3 | $\theta = 3.17 \times 10^{-1} - 3.21 \times 10^{-4} \ln(\text{2 h Oxolotan peak discharge})$ | 0.01 | 0.56 | −34.16 | −34.00 |
4 | $\ln\alpha = 9.17 \times 10^{0} + 3.40 \times 10^{-1} \ln(\text{Puyacatengo peak discharge})$ | 78.19 | 0.74 | 211.19 | 214.94 |
5 | $\ln\alpha = 6.67 \times 10^{-1} - 5.92 \times 10^{-4} \ln(\text{Puyacatengo peak discharge} + 3.46 \times 10^{-3} \times \ln \text{Teapa peak discharge})$ | 0.015 | 0.58 | −191.15 | −185.66 |
6 | $\ln\theta = 4.93 \times 10^{-1} - 3.33 \times 10^{-3} \ln(\text{Puyacatengo peak discharge} + 4.99 \times 10^{-4} \times \text{Gaviota peak discharge})$ | 0.006 | 0.83 | −232.20 | −226.71 |
Note: Adj. R2 is a corrected goodness-of-fit measure for assessing models.
Flood prediction framework
The estimated values for the intercept and exponent parameters were incorporated in the scaling equation to estimate discharge values in the study area. By comparing the values that were predicted and those that were observed, it was possible to see how well the equations were able to predict flood parameters and peak discharge across the La Sierra catchment.
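The chaining of estimated parameters into the scaling equation can be sketched as follows. This is an illustration under stated assumptions: θ is taken to be linear in α using the coefficients reported in Table 5 (intercept 0.596, slope −0.007), the area is assumed to be in km2, and the function names are hypothetical. If the fitted form uses ln α instead, the first function would need to change accordingly.

```python
def theta_from_alpha(alpha: float, intercept: float = 0.596, slope: float = -0.007) -> float:
    """Estimate the exponent from the intercept parameter.

    Defaults are the Table 5 coefficients; linearity in alpha is an assumption.
    """
    return intercept + slope * alpha


def peak_discharge(alpha: float, theta: float, area: float) -> float:
    """Scaling equation Q_P = alpha * A**theta (area units assumed to be km2)."""
    return alpha * area ** theta


# Illustration: alpha of rainfall event 1 (Table 3) and the Gaviotas drainage area.
alpha_event = 6.72
theta_event = theta_from_alpha(alpha_event)
q_outlet = peak_discharge(alpha_event, theta_event, 6743.0)
```

Passing the regression coefficients as arguments keeps the sketch usable with any of the candidate equations, or with coefficients refitted for another catchment.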
α parameter comparison
θ parameter comparison results
Peak discharge comparison results
Summary of statistical significance
The α and θ parameter equations developed were found to be statistically significant in their relationship with catchment descriptors and processes in the La Sierra catchment and were validated against historical data. A series of statistical tests was conducted to check the robustness of the regression equations obtained. The test results show that the peak flood observed at the smallest sub-catchments in the basin significantly predicted the intercept (p < 0.05). The residuals were also found to be normally distributed through inspection of their Q–Q plots and the Shapiro–Wilk test, both for the α equation (W = 0.96, p = 0.12) and for the relationship between α and θ values (W = 0.89, p = 0.01). Similarly, the study checked that multicollinearity in the model data was within the accepted limits of tolerance >0.1 and VIF <10. Based on these findings, the regression models generated can be considered reliable, with uncertainties that were quantified and reflected in the results.
DISCUSSION
The results reveal that peak discharge in the La Sierra catchment exhibits power–law relationships. The study also demonstrates how the estimated scaling parameters are linked to catchment descriptors and processes, allowing statistical equations to be developed and used to estimate flood parameter values and the magnitude of peak discharge across the studied catchment area. Similarly, recent studies on scaling laws have demonstrated the potential of estimating flood parameters by analysing the physical characteristics of drainage areas (Menabde & Sivapalan 2001; Furey & Gupta 2005; Mantilla et al. 2006; Gupta et al. 2007; Mandapaka et al. 2009a; Ayalew et al. 2014a, 2014b, 2015; Farmer et al. 2015; Medhi & Tripathi 2015; Furey et al. 2016; Lee & Huang 2016; Wilkinson & Bathurst 2018; Perez Mesa 2019; Yang et al. 2020).
A noteworthy result from the present study is that peak discharge exhibits a power–law relationship with drainage area, similar to studies from humid temperate regions (Table 2). The scaling relationship analysis in the La Sierra sub-catchments confirmed that the flood parameters θ and α in the scaling equations vary with rainfall events, which is consistent with findings in the United States and the United Kingdom (Gupta et al. 1996, 2007; Ogden & Dawdy 2003; Gupta 2004; Furey & Gupta 2005; Mantilla 2007; Mantilla et al. 2011; Ayalew et al. 2014a, 2014b, 2015).
The study reveals the existence of a log-linear relationship between α and θ values (Equation (5)) and a log–log power–law relationship between α and the peak discharge observed from the smallest headwater catchments (Equation (4)). The results show that the log-linear and log–log relationships can be used to estimate the values of flood scaling parameters across the nested catchments, even though there are some quantifiable errors in the predictions. This finding is consistent with the research conducted by Ayalew et al. (2015, 2018) in the Iowa River basin, which revealed a log-linear relationship between the intercept and exponent as well as a log–log relationship between the intercept and the peak flood observed in the smallest gauged sub-catchments within the nested catchment. These relationships, when combined with the scaling law equation for peak floods, can be used to reasonably predict peak floods at any location within the nested catchments after a rainfall–runoff event (Ayalew et al. 2018).
Chen et al. (2020) recently showed that using log-transformation methods in peak discharge power–law analysis introduces errors and uncertainties compared with nonlinear regression methods, a limitation that is acknowledged in this study. They found that log–log linear regressions produced higher peak discharge prediction errors than nonlinear regressions, and that the logarithmic transformation gives more weight to data points with smaller peaks and drainage areas. As a result, Chen et al. (2020) recommend using nonlinear regression on the arithmetic scale to estimate the scaling parameters when performing peak discharge scaling analysis, particularly for prediction purposes. The present study was conducted before the study by Chen et al. (2020) and therefore did not incorporate their findings into its design. However, there is some evidence in this work that the log transformations give a satisfactory fit on the log–log scale, control the size-dependent variability present, and ensure a normal distribution of the residuals. The regression equations obtained from using multi-scale nested catchments, each less than 10,000 km² in size, were found to be adequately fitted, not overfitting the data, and performing well in new contexts. The errors and uncertainties generated by the log-transformation approach used in this study were quantified to demonstrate the variability in the estimated flood scaling parameters and peak discharge values. Quantifying these uncertainties strengthens the flood modelling approach and allows errors in the modelling process to be taken into account explicitly.
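The difference between the two fitting strategies can be illustrated with a short Python sketch. It is not the procedure used in this study or by Chen et al. (2020): the function names and sample values are hypothetical, and the nonlinear fit is implemented here as a simple grid search over θ with the least-squares α solved in closed form for each candidate θ.

```python
import numpy as np

def fit_loglog(area, q):
    """Ordinary least squares on ln(Q) versus ln(A); returns (alpha, theta)."""
    area = np.asarray(area, dtype=float)
    q = np.asarray(q, dtype=float)
    theta, ln_alpha = np.polyfit(np.log(area), np.log(q), 1)
    return float(np.exp(ln_alpha)), float(theta)

def fit_arithmetic(area, q, thetas=np.linspace(0.0, 2.0, 2001)):
    """Least squares on the arithmetic scale for Q = alpha * A**theta.

    For a fixed theta the best alpha is sum(Q * A**theta) / sum(A**(2*theta)),
    so only theta is searched (assumed search range: 0 to 2).
    """
    area = np.asarray(area, dtype=float)
    q = np.asarray(q, dtype=float)
    best = None
    for theta in thetas:
        basis = area ** theta
        alpha = (q @ basis) / (basis @ basis)
        sse = np.sum((q - alpha * basis) ** 2)
        if best is None or sse < best[0]:
            best = (sse, alpha, theta)
    return float(best[1]), float(best[2])

# Made-up drainage areas (km^2) and peak discharges (m^3/s), for illustration.
area_demo = [50.0, 200.0, 800.0, 1500.0, 4000.0]
q_demo = [40.0, 110.0, 230.0, 420.0, 600.0]
print("log-log fit:   ", fit_loglog(area_demo, q_demo))
print("arithmetic fit:", fit_arithmetic(area_demo, q_demo))
```

On noisy data the two fits generally differ, because the log–log fit minimises relative errors while the arithmetic-scale fit minimises absolute errors and is therefore dominated by the largest peaks.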
The ability of the scaling equations to predict small-scale flood events has been demonstrated with a degree of confidence (Figure 9); however, their ability to predict large-scale flood events is less certain. The results show that the simulated peak discharges have greater variability around their expected values, which is particularly noticeable in the estimation of high-magnitude flood events (Figure 9). These results could support the findings of Chen et al. (2020); however, the uncertainties in the scaling relationships can also be explained in part by the lack of long hydrometeorological records. This study was limited to a short window, between 2012 and 2015, as it was the only period with data across all gauging sites. Thus, in their current form, the equations are suitable for estimating low- to medium-peak discharges that recur frequently. Caution should therefore be exercised when extrapolating the findings of this study over time. Future research is recommended to repeat this work in locations with long hydrometeorological records, so that low-frequency, severe flood events are fully captured.
The findings of this study are significant because they quantify relationships between drainage size, peak discharge, and other catchment processes that cause flooding from small to larger downstream catchments in a tropical region. The results provide statistical relationships (Equations (4) and (5)) that shed light on the broader catchment flood response and flood generation across catchments. In other words, the findings contribute to establishing how flood scaling relationships are quantifiably linked to larger catchment-scale descriptors and processes to enable the estimation of flooding. The most important finding in this regard is how point-scale observations at the smallest nested catchment, such as observed discharge, are related to catchment-wide flooding, which enables new strategies in flood monitoring and prediction. The result contributes to the formulation of a flood risk strategy that promotes a network of flow GSs in smaller headwater streams for peak discharge monitoring and prediction for larger downstream catchment areas (Gupta 2017). This application is especially important for the La Sierra catchment, where floodwaters drain from smaller mountainous catchments and frequently flood larger downstream catchments in the Gaviotas district of Villahermosa City.
Although this research has established valid flood scaling relationships between small headwater catchments and larger downstream catchment areas, not all flooding factors were included in formulating the equations developed. Several other physical catchment properties and processes that directly or indirectly control flooding were not considered due to the lack of relevant data. For example, this study did not consider channel geometry and roughness, key factors that strongly influence river flooding. Although some additional flood-generating variables were not considered, the results obtained were statistically significant and were validated through comparison with historical data.
Thus, this study supports the potential use of scaling equations to estimate discharge without using hydrological models, an approach that is particularly relevant under current conditions of climate change. Model calibration has always been a major challenge in the hydrological community (Zheng et al. 2021). Under changing climatic conditions, historical data may no longer be representative, and parameters re-calibrated against historical data may not reflect future conditions. The scaling equations provide a flood estimation technique that does not require model calibration against historical observations to forecast peak discharge in the future (Gupta 2017). As a result, this study has important implications for the long-term use of statistical equations for flood monitoring and prediction in the face of climate change. Further research is also needed to establish what roles climate variability and change play in the scaling of peak floods.
CONCLUSIONS
The significance of this study lies in extending the scaling theory of floods to tropical regions and in investigating the catchment descriptors and processes that influence the flood scaling parameters, so that these parameter values and peak discharge magnitudes can be estimated. This study has shown that peak discharge in the La Sierra catchment area exhibits power–law relationships with drainage areas similar to findings from humid temperate regions. The findings substantiate the log-linear relationship identified by Ayalew et al. (2015) between the θ and the α of the power–law relationships, which is not often used in other studies, as well as the significance of the log–log relationship between the α and peak discharge observed from the smallest nested catchment. In this regard, this study contributes to the ongoing application of the scaling theory of floods and to efforts in predicting floods in data-scarce regions. However, the question of whether the relationship between the intercept and exponent is universal remains unanswered. It is hoped that this question may spark further research.
Nonetheless, the current study has provided a framework for estimating flood parameter values and peak discharge magnitudes that enables flood prediction in data-scarce tropical catchments. The results of the study have led to the identification of some influential factors and provide the foundation for operational flood prediction tools. The study has demonstrated the potential of scaling relationships to estimate the magnitude of peak floods by leveraging the intrinsic relationship between peak discharge and geophysical characteristics and processes within nested catchment areas. The relationships obtained offer valuable insights into the wider catchment flood response and the generation of floods linked to headwater flood peaks (Gupta 2017). One implication of this finding is the potential to develop a flood risk strategy that advocates setting up a network of flow GSs in smaller headwater streams to monitor peak discharge and estimate flood levels in larger downstream catchment areas.
The self-similarity of river networks and the flood scaling parameters are not expected to change as the climate changes. The scaling laws and statistical models developed will remain unchanged for some time (Gupta 2017). Therefore, there is potential for the long-term use of statistical models for flood monitoring and prediction. This study provides a new research direction to make flood predictions at multiple spatial and temporal scales under a changing global hydroclimate. However, there is a need to investigate the role of climate variability and change using the scaling framework for floods discussed.
Thus, the persistent challenge that remains is to ‘bridge the scaling gap’ by extending the scaling theory of floods (scaling relationships) from the smallest rivers on the mainland to large regional and continental rivers in different climate regions. There is still a shortage of scaling relationship studies in multi-scale catchments across climatic regions. In several catchments, the process by which floods are generated from small headwater catchments to larger downstream catchments is still not quantified. Therefore, there is a need for further studies that apply the scaling theory of floods and establish patterns of scaling relationships across regional and continental river catchments.
ACKNOWLEDGEMENTS
This work was supported by funding from the UK Natural Environment Research Council (NE/M009009/1). The authors would like to acknowledge the guidance and technical advice of Tim Brewer and thank CONAGUA for providing the data for the analyses. The authors also thank two anonymous reviewers who provided constructive feedback on an earlier draft of the manuscript.
DATA AVAILABILITY STATEMENT
The data used in this study have been uploaded to Cranfield University's open access data repository (CORD - DOI: 10.17862/cranfield.rd.23530956).
CONFLICT OF INTEREST
The authors declare there is no conflict.