## ABSTRACT

Instantaneous peak flows (IPFs) are often required to derive design values for sizing various hydraulic structures, such as culverts, bridges, and small dams/levees, in addition to informing several water resources management-related activities. Compared to mean daily flows (MDFs), which represent averaged flows over a period of 24 h, information on IPFs is often missing or unavailable in instrumental records. In this study, conventional methods for estimating IPFs from MDFs are evaluated and new methods based on the nonlinear regression framework and machine learning architectures are proposed and evaluated using streamflow records from all Canadian hydrometric stations with natural and regulated flow regimes. Based on a robust model selection criterion, it was found that multiple methods are suitable for estimating IPFs from MDFs, which precludes the idea of a single universal method. The performance of machine learning-based methods was also found reasonable compared to conventional and regression-based methods. To build on the strengths of individual methods, the fusion modeling concept from the machine learning area was invoked to synthesize outputs of multiple methods. The study findings are expected to be useful to the climate change adaptation community, which currently heavily relies on MDFs simulated by hydrologic models.

## HIGHLIGHTS

New methods for estimating instantaneous peak flows from mean daily flows.

Data completion by filling in missing values of instantaneous peak flows.

Reliable estimation of design flood magnitudes.

Machine learning inspired fusion modeling to synthesize outputs of multiple methods.

Fulfilled a critical need of the climate change impact analysis and adaptation community.

## INTRODUCTION

Instantaneous peak flows (IPFs) are often required to derive design values for sizing various hydraulic structures, such as culverts, bridges, small dams, and levees, in addition to informing hydraulic procedures of floodplain delineations, erosion risk management strategies, and water resources management-related activities. Statistical frequency analysis approaches, which are commonly used for deriving targeted design flow magnitudes, rely considerably on long sequences of observed IPFs. However, compared to IPFs, information on mean daily flows (MDFs) is more frequently available at hydrometric stations, both in Canada and other regions of the world. For a given year or season, IPF is the maximum recorded flow at the crest of the hydrograph at the point of interest within a river and is generally associated with a fine temporal scale such as 5- or 10-min interval. In comparison, MDF represents an averaged aggregated quantity corresponding to a continuous period of 24 h (or a calendar day) and is generally expected to be smaller than the IPF due to the averaging involved. Consequently, design flow values derived from extreme observations of MDFs need to be adjusted upward based on established techniques or using region- or watershed-specific empirical experience.

In cases where long sequences of IPFs are available, individual missing values and continuous gaps of missing values are often encountered in observational records. To overcome such situations, investigators have designed different approaches to estimate IPFs from MDFs (e.g., Sangal & Kallio 1977; Sangal 1981; Chen *et al.* 2017). This estimation using a reliable approach has become important recently for evaluating climate resilience of various flood control and mitigation structures in the face of changing climatic conditions. According to the International Panel on Climate Change (IPCC) (IPCC 2013), it is very likely that the frequency and magnitude of floods in certain regions of the world, including Canada, will increase due to the anticipated climate change. In order to evaluate such impacts of climate change and outline/plan adaptation strategies, climate model outputs of temperature, precipitation, and other relevant variables are integrated with hydrological models to simulate streamflow sequences for a selected reference period and many future time horizons. These simulated sequences from hydrological models mostly become available in the form of MDFs, and very rarely at smaller sub-hourly timescales to support estimation of IPFs. For this purpose, appropriate methods must be used to estimate IPFs from sequences of MDFs. Currently, this information is lacking in the literature. Therefore, the main objective of this paper is to develop new methods for estimating IPFs from MDFs, using data from a vast network of hundreds of hydrometric stations located across Canada, so that they can reliably be used in climate change adaptation-related studies.

The research on estimating IPFs from MDFs started decades ago. Early researchers assumed IPFs as functions of associated MDFs occurring over a short time window, which was taken as three consecutive days. In these methods, it was explicitly assumed that the IPF is a function of the maximum MDF of the sequence and select characteristics of the associated hydrograph shape, reflective of rising and falling portions of the hydrograph. Langbein (1944) developed such a graphical approach which was tested by Ellis & Gray (1966) for Canadian Prairie watersheds. Inspired by Langbein's work, Sangal (1981) introduced a quantitative approach, which was tested on observed floods from 387 natural and regulated stations from Ontario, Canada. About 70% of these stations had drainage areas less than 1,000 km^{2} and about 50% of floods were snowmelt dominated, which is a common feature in Canada. Fill & Steiner (2003) also introduced an MDF-driven IPF estimation method, which was applied to 117 floods from 14 stations with drainage area ranging from 84 to 687 km^{2}. Along these lines, Chen *et al.* (2017) recently introduced a slope-based method which takes into account slopes of the rising and falling portions of the hydrograph to describe the shape of the MDF hydrograph. As no parameter is required to be estimated from data, this method is much simpler than the methods of Sangal (1981) and Fill & Steiner (2003). This method was evaluated on roughly 3,800 floods observed at 144 watersheds from Iowa, with drainage area ranging from 70 to 221,700 km^{2}. Among other methods evaluated for estimating IPFs include machine learning-based approaches (e.g., Dastorani *et al.* 2013; Jimeno-Sáez *et al.* 2017) and deterministic type conceptual and distributed hydrologic modeling (e.g., Singh 1995; Singh & Frevert 2002). The hydrologic modeling approach requires much more data and deterministic modeling expertise than the other methods mentioned above.

In this paper, Sangal's method (Sangal 1981) and three new nonlinear generalizations of the same concept as used in this method, along with an Artificial Neural Network (ANN)-based approach are evaluated using IPF and MDF data from 1,938 available natural and regulated flow measuring stations, covering nearly all provinces and northern territories of Canada. This level of country-wide study has never been attempted before. A country-wide study is useful in uncovering relative suitability of various methods in different regions as floods can originate from many different generating mechanisms across the Canadian landscape (Teufel & Sushama 2021). This combined with different geophysical and other factors can significantly impact the relationship between IPFs and MDFs. Consideration of regulated stations is useful in validating the applicability of observations made from natural flows to regulated flows, particularly during high flow conditions when the influence of regulation weakens. Performance of the studied methods is evaluated using multiple assessments and an objective model selection criterion that accounts for the structural complexity of candidate models is also introduced to aid decision-making, especially for the conventional and regression-based methods. An ANN-based fusion modeling approach is also introduced to synthesize outputs from multiple methods. The work reported in this paper is a natural extension of the pilot study conducted recently (Khaliq 2023), where the performance of the aforementioned methods was evaluated using data from natural flow measuring stations from the four Canadian Atlantic provinces, i.e., New Brunswick, Newfoundland and Labrador, Nova Scotia, and Prince Edward Island. Most of the outputs of that preliminary study are integrated in this country-wide study and for this reason, the reader may find some resemblances for the Atlantic provinces, which are also covered in this study. In addition, theoretical considerations as described in Section 2 also bear resemblance with those provided in Khaliq (2023) because most of the theoretical descriptions and principles cannot be changed.

This paper is organized as follows. Information on theoretical underpinning of the conventional and newly introduced nonlinear regression-based IPF estimation methods is described in Section 2, along with the parameter estimation and performance evaluation strategy employed in the study. Information on hydrometric data and the study area is provided in Section 3, followed by detailed results of the study in Section 4. Overall conclusions and future directions are summarized in Section 5.

## METHODS

### Methodological background

*et al.*2013; Chen

*et al.*2017) only Equation (2) was considered, instead of evaluating the full potential of the method. It is likely to confuse the annual or seasonal maximum MDF, which is often considered for statistical frequency analyses, with the maximum MDF . These two are different quantities, however, both will coincide in the majority of the cases, especially where hydrological regimes are dominated by snowmelt spring floods. Fill & Steiner (2003) introduced an IPF estimation method by assuming IPF as a linear combination of MDFs of three consecutive days and a regression-based correction factor. The methodological framework of this method lacks theoretical robustness and therefore extreme caution should be exercised in using this method. No further consideration is given to this method in the present study. Chen

*et al.*(2017) proposed a slope-based method, by assuming a triangular hydrograph. Following their proposal, the IPF as a function of MDFs is given by:

Equations (4) and (6) are of multiplicative nature, while Equation (5) is of power form. The latter equation assumes that the IPF is a direct nonlinear function of the maximum MDF of the sequence of three consecutive MDFs occurring in the vicinity of the IPF. In these equations, , , , and are the flows as explained above; and *a*, *b*, *c*, and *d* are the parameters of the functional relationships which are to be estimated by iterative means or following an optimization routine subject to applicable constraints or via nonlinear least squares approach. In a watershed, IPFs are also impacted by many physical factors, such as hydrological and physiographical controls, water storage features, soil moisture state, and meteorological inputs. These factors were not considered in this study as the focus was kept on explicit functional relationships and machine learning architectures.

Similar to Khaliq (2023), Sangal's method (Equation (1)) and the above mentioned nonlinear functional relationships (i.e., Equations (4)–(6)), were evaluated using data from the Canadian hydrometric network, to be discussed in Section 3. Henceforth, these methods are respectively referred to as M1, M2, M3, and M4 for presentation convenience. In addition to these four methods, a feedforward multilayer perception (MLP) ANN (Maier *et al.* 2010; Lantz 2015; Chattopadhyaya *et al.* 2019) was also considered. This method is referred to as M5. Complexity of different ANN structures and their relative performance are discussed in Section 4.2.

### Parameter estimation and optimization of ANN structure

*a*,

*b*,

*c*, and

*d*of Equations (4)–(6) were estimated by minimizing root mean square error (RMSE), which is equivalent to minimizing mean squared differences between observed and estimated IPFs. The RMSE is defined as:where is the observed and is the estimated IPF and

*n*is the number of sample pairs used in the optimization of parameters. Alternatively, these parameters can also be estimated following ready to use least squares software tools by linearizing Equations (4)–(6) using logarithmic transformation, which is relatively a straightforward procedure.

The MLP ANN was trained using the resilient backpropagation algorithm (Anastasiadis *et al.* 2005) as implemented in the ‘neuralnet’ package of R computing platform (R Core Team 2023) with ReLU (rectified linear unit) as the activation function, an input layer, an output layer, and one or more hidden layers. The number of hidden layers and the number of neurons in each of the hidden layers is generally determined by trial and error after evaluating the performance of different network structures (Maier *et al.* 2010; Wolfs & Willems 2014; Lantz 2015). In this study, the optimal network structure and all controlling parameters were determined using 70% of data for training and reserving 30% for testing purposes, and experimenting with multiple hidden layers containing multiple neurons.

### Model selection criterion

In this equation, SSE is the sum of squares of errors, which is embedded in Equation (7), *k* is the number of estimated parameters and is the natural logarithm. The second and the third term on the right penalize AIC for the number of estimated parameters and sample size. A method or a model with a smaller value of AICc is the most preferred option from a given set of modeling options. In situations where competing modeling options exhibit varying levels of structural complexities reflected in terms of controlling model parameters and small sample problems, the AICc is generally favored due to the corrections introduced (Brewer *et al.* 2016).

### Performance measures

*R*

^{2}measure) were also considered. The MAPE is given by Equation (9), while

*R*

^{2}is given by Equation (10):

In Equation (10), and are, respectively, the observed and estimated mean values. The *R*^{2} measure is included here due to its common usage in the literature, but it must be carefully interpreted with guidance from the published literature (Helsel & Hirsch 2002; Walpole *et al.* 2012). This measure can be impacted by outliers and systematic patterns in data and therefore it should be accompanied by visual comparisons as is done in this study and also demonstrated in Helsel & Hirsch (2002).

## STUDY AREA AND STREAMFLOW DATA

P&T-wise station counts that resulted following the above criteria are given in Table 1. The station count for PEI was small and hence these stations were merged with those for NS and the same was done for the case of NU, where stations were merged with those for NT. The number of flood events or station years considered for the analysis are also shown in this table.

Province and territory (P&T) . | Natural stations . | Regulated stations . | ||
---|---|---|---|---|

Number of stations . | Number of flood events/station years . | Number of stations . | Number of flood events/station years . | |

AB | 86 | 3,030 | 44 | 1,972 |

BC | 435 | 12,593 | 143 | 4,390 |

MB | 43 | 1,193 | 24 | 740 |

NB | 67 | 2,024 | 14 | 517 |

NL | 117 | 3,137 | 12 | 280 |

NS + PEI | 57 | 1,590 | 26 | 749 |

NT + NU | 104 | 1,921 | 6 | 147 |

ON | 355 | 9,080 | 204 | 6,914 |

SK | 55 | 1,912 | 70 | 2,941 |

YT | 72 | 1,876 | 4 | 116 |

Province and territory (P&T) . | Natural stations . | Regulated stations . | ||
---|---|---|---|---|

Number of stations . | Number of flood events/station years . | Number of stations . | Number of flood events/station years . | |

AB | 86 | 3,030 | 44 | 1,972 |

BC | 435 | 12,593 | 143 | 4,390 |

MB | 43 | 1,193 | 24 | 740 |

NB | 67 | 2,024 | 14 | 517 |

NL | 117 | 3,137 | 12 | 280 |

NS + PEI | 57 | 1,590 | 26 | 749 |

NT + NU | 104 | 1,921 | 6 | 147 |

ON | 355 | 9,080 | 204 | 6,914 |

SK | 55 | 1,912 | 70 | 2,941 |

YT | 72 | 1,876 | 4 | 116 |

^{2}and ∼80% of them lie between 20 and 10,000 km

^{2}. For the same percentage points, the regulated watershed limits tend to be larger than those of natural watersheds.

## RESULTS AND DISCUSSION

For each provincial and territorial region, two sets of analyses were conducted. One set pertains to the period of record (POR) highest IPF and associated MDFs for each of the considered stations together; the outcomes of this analysis can help develop flood envelop curves and Creager type diagrams (Creager *et al.* 1945; Neill 1986; Watt 1989) by infilling observational records of IPFs. The second set pertains to station-based analysis, conducted independently for each of the considered stations within each of the provincial and territorial region. The results of these two sets of analyses are presented and discussed below in an independent fashion.

*R*

^{2}, Sangal's parameter , and AICc are also provided. It can be seen that most of the estimated and observed IPFs align well along the diagonal line, suggesting a close agreement between observed and estimated IPFs. Performance metrics also suggest quite reasonable performance of all methods as is reflected by above 0.9

*R*

^{2}values for all P&Ts. Based on the values of AICc, method M2 outperformed other three methods. Although four parameters are involved in this method to provide additional flexibility in fitting, their effect is taken into account via the corrected form of AIC. The performance of method M3 is quite remarkable given that IPF is considered as a nonlinear function of the maximum MDF from the sequence of three consecutive day's MDFs. Thus, the maximum MDF alone is able to explain a considerable portion of the observed variability in IPF values. With respect to estimated number of parameters, this method is a sort of parsimonious method compared to M2 and M3 methods. The performance of method M1, the Sangal's original method, is also reasonable, except for a few stations for certain P&Ts (e.g., NB, NS, and SK). It was also noticed that the performance of method M1 was generally better than the simplified method (i.e., Equation (2)).

*R*

^{2}and MAPE for each natural and regulated station separately within all P&Ts. The results of this evaluation based on

*R*

^{2}, as a function of drainage area, for natural stations are summarized in Figure 5 and those based on MAPE in Figure 6. For the case of regulated stations, similar results are provided respectively in Figures S3 and S4 (Supplementary material). Overall, scatterplots of

*R*

^{2}suggest values larger than 0.7 for the majority of watersheds, with generally higher degree of spread for smaller watersheds than larger ones. For AB, SK, MB, NT, and YT, the percentage of watersheds with larger than 0.8

*R*

^{2}is significantly higher than other P&Ts. For a small number of watersheds, especially for BC, ON, and NS, very low

*R*

^{2}values (<0.2) can also be seen in Figure 5. For such cases, among other elements, data reliability or influences of local factors could be potential reasons. However, no causative analysis was conducted to investigate these issues. Side-by-side box and whisker plots of

*R*

^{2}for each of the four methods (see Figure S5) suggest better performance of the M2 method for most P&Ts.

Scatterplots of MAPE suggest below 20% values for most of the watersheds for all methods and P&Ts, with decreasing pattern for larger watersheds, which can clearly be seen in Figure 6 for BC, NB, SK, and ON. This means IPFs are better predicted for larger watersheds compared to smaller ones. Additional analyses based on box and whisker plots of MAPE values (Figure S6, Supplementary material) confirm these findings and suggest that the median values of MAPE are relatively smaller for method M2 than the other methods.

### Machine learning approach

As mentioned before, an ANN-based method (M5) was also considered apart from the conventional and newly introduced nonlinear regression type methods. It is well known that machine learning methods require larger data sets for proper training purposes. Therefore, these methods were tested considering three data sets, i.e., IPFs and MDFs from all natural stations from BC, ON, and four Atlantic provinces (NB, NL, NS, and PEI) together in order to carry out relatively an unbiased assessment.

*et al.*2010). For the analyzed data sets, it was found that ANN structures with several hidden layers and neurons did not result in better performing models compared to simpler options. For ON, a structure with two hidden layers respectively containing nine and three neurons led to a better performing model. For BC and Atlantic provinces together, a structure consisting of two hidden layers respectively containing seven and five neurons resulted in a better model. After conducting training and validation, the optimal model was applied to the entire data set and the comparison of observed and estimated IPFs is provided in Figure 8. For BC and ON, the performance metrics are quite comparable to those provided in Figure 4 for methods M1–M4. For Atlantic provinces, the simpler structure performs reasonably well, with RMSE (

*R*

^{2}) of 79 (0.988), compared to 73.9 (0.984), 62.3 (0.988), 89.2 (0.975), and 69.0 (0.985), respectively, for models M1–M4.

### Machine learning-based fusion modeling

To synthesize outputs from multiple methods, one can exploit an ensemble modeling framework and can combine outputs using several different approaches, e.g., simple averaging, weighted averaging, optimal combination, or Bayesian averaging. In this study, machine learning-based fusion modeling was used to synthesize outputs of multiple methods for the same three cases as considered in the above section, i.e., POR highest IPFs for BC, ON, and Atlantic provinces. In the machine learning area, this is also known as super learning. An ANN-based fusion model was developed using M1–M5 estimates as input and observed IPFs as target. After experimenting with several shallow and deep ANN structures and multiple training strategies, a network with a single hidden layer and 10 hidden neurons was adopted for Atlantic provinces. This led to RMSE = 54.8, *R*^{2} = 0.991, and MAPE = 5.33, which are slightly better than those of individual methods discussed in the above section. In the case of ON, a network with two hidden layers with respectively nine and three hidden neurons was adopted. This resulted in RMSE = 58.3, *R*^{2} = 0.979, and MAPE = 3.69. These measures are similar to those presented in Figure 4 for methods M1–M4 and in the above section for method M5. In the case of BC, the fusion modeling approach did not lead to any better results than those of methods M2 and M5.

## CONCLUSIONS

Information on IPFs is often required to derive design values for sizing various hydraulic structures, such as culverts, bridges, small dams, and levees, in addition to informing hydraulic procedures of floodplain delineations, erosion risk management strategies, and water resources management-related activities. In this paper, most of the previous work on the estimation of IPFs from MDFs is reviewed in the historical context. In particular, Sangal's method (M1), which was proposed for Ontario watersheds, is given special attention. Three new methods (M2–M4) are proposed by generalizing Sangal's method in terms of statistical functional relationships. These methods and a machine learning-based method are evaluated using IPFs and MDFs from Canadian watersheds, with both natural and regulated flow regimes, and multiple performance measures. To select a best performing method from the group of four competing methods, a corrected form of the AIC-based model selection criterion, which is well-established in the literature, is used. The following main conclusions can be drawn from the results presented and discussed in this paper.

Through a detailed evaluation of Sangal's original method (M1), it was found that the parameter , which varies between , tends to become negative with increasing drainage area for most watersheds in most of the P&Ts, with the majority of the values found below zero. This suggests that it is not reasonable to assume and therefore full form of the method should be used in future studies. The simplified method that results by assuming is applicable only for a small number of watersheds. Through statistical simulation based inferential statistics, it is possible to define a range where this parameter can be assumed equal to zero. This is a reasonable research topic for future studies.

By modeling POR highest IPF and associated MDFs, it was found that the method M2 performs relatively better than the other three methods for most of the P&Ts. For example, M2 was selected as the most suitable method based on AICc for AB, ON, NL, MB, NB, NS, NT, and YT, while M1 was selected for BC and SK for the case of natural stations. For regulated stations, the same conclusion holds. In contrast, when a similar evaluation was performed for individual watersheds within each P&T, other methods also emerged as the best performing ones based on the AICc. Overall, the performance of various methods appears to be linked with the size of the watershed, i.e., the spread of performance measures *R*^{2} and MAPE was found to be relatively large for smaller watersheds than for larger ones. In any given situation, it is also important to consider additional constraints. For example, the estimated IPF must be greater than or equal to the maximum value of the sequence of MDFs used in its estimation.

In addition to conventional methods and regression-based relationships, machine learning approaches were also evaluated for estimating IPFs from MDFs. For this evaluation, POR highest IPFs for BC, ON, and Atlantic provinces together were considered. These provinces have the greatest number of flood events, which was necessary for an unbiased evaluation. It was found that a carefully trained and validated machine learning model can also be used to estimate IPFs from associated MDFs. By investigating several ANN structures, it was also found that simpler structures can perform much better than their complex counterparts for the data sets considered in this study.

Since this study represents a multiple modeling situation, the fusion modeling technique from the machine learning area was invoked in order to synthesize a single output by combining outputs of multiple methods. This was explored by considering POR highest IPF for BC, ON, and Atlantic provinces together and training ANN-based fusion models. These models led to slightly improved estimates of IPFs for Atlantic provinces and comparable estimates for ON. In the case of BC, fusion modeling results were inconclusive. Conceptually, fusion modeling is similar to ensemble modeling, which is commonly used in hydrology and meteorology to combine and synthesize multiple outputs from an array of different candidate models or simulations of the same model. Given the importance of this technique, additional investigations are required, especially to combine outputs of multiple machine learning models, statistical functional relationships, and other modeling approaches.

## ACKNOWLEDGEMENTS

This work was completed within the framework of National Research Council Canada's Climate Resilient Built Environment (CRBE) initiative, which is funded through Infrastructure Canada (INFC). The financial support of INFC and the leadership of CRBE are gratefully acknowledged. All data analyses were performed on the R computing platform. The helpful comments of three anonymous referees and editors are very much appreciated.

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories. https://wateroffice.ec.gc.ca/mainmenu/historical_data_index_e.html

## CONFLICT OF INTEREST

The authors declare there is no conflict.