Flood routing via a copula-based approach

Floods are among the most common natural disasters and, if not controlled, may cause severe damage and high costs. Flood control and management can rely on structural measures, which should be designed on the basis of design-flood studies. Simulating the outflow hydrograph from the inflow hydrograph provides useful information for such studies. In this study, a copula-based approach was applied to simulate the outflow hydrographs of three floods: the Wilson River flood, the River Wye flood and the Karun River flood. To this end, two-dimensional copula functions and their conditional density were used. The dependence between the studied variables (inflow and outflow hydrographs), evaluated using Kendall's tau, confirmed the applicability of copula functions for bivariate modeling of inflow and outflow hydrographs. The simulation results were evaluated using the root-mean-square error, the sum of squared errors and the Nash–Sutcliffe efficiency coefficient (NSE). In general, the copula-based approach simulated the peak flow and the rising and falling limbs of the outflow hydrographs well. Also, all simulated data lie within the 95% confidence interval. The NSE values for the copula-based approach are 0.99 for all three case studies. According to the NSE values and the violin plots, the performance of the copula-based approach in simulating the outflow hydrograph is acceptable in all three case studies.


INTRODUCTION
Copula functions describe the dependence structure independently from the marginal distribution functions and create multivariate distributions with different margins and dependence structures.
In recent years, copula functions have been used in the simulation and modeling of meteorological and hydrological variables, and their efficiency and accuracy have been confirmed. For example, Tahroudi et al. (2020a) introduced a new approach to simulate the occurrence of related variables based on the conditional density of copulas. The proposed approach was adopted to investigate the dynamics of hydrological and meteorological droughts in the Zarinehroud basin, Lake Urmia, Iran. Tahroudi et al. (2020b) used the conditional behavior of two signatures to analyze rainfall deficiency and groundwater-level deficiency in the Naqadeh sub-basin, Lake Urmia Basin, Iran, based on copulas. They showed that the presented conditional density function was an alternative to the conditional return period. Another study used copula functions for the frequency analysis of the suspended sediment load of the Zarinehrood basin, Lake Urmia, Iran, given the peak flow at the Chalekhmaz hydrometric station. The results showed that, in the bivariate analysis, the simulated suspended load is closer to the suspended load measured at the Chalekhmaz station. Copula-based models have recently become very popular because they use the joint distribution and involve the effective parameters in the equations. Copula-based simulation and modeling are also important because of the high accuracy of the simulations, and such models have shown a high ability to simulate and model meteorological and hydrological parameters (Tahroudi et al. 2020a, 2020b). In this study, the accuracy of the copula-based approach in flood routing and flood hydrograph simulation was investigated and compared with previous research.
The proposed copula-based approach, by combining the conditional density of copula functions and the diagonal section of copula functions, attempts to simulate flood hydrographs such as the Wilson River flood (WRF) in 1974 (USA), the River Wye flood (RWF) in 1960 (England) and the Karun River flood (KRF) in 2008 (Iran). Flood routing in these rivers has already been studied by various researchers using various methods including the Muskingum method, meta-heuristic methods and optimization algorithms. The main objective of this study is to investigate the accuracy of the copula-based approach in flood routing based on the conditional density of the copula functions.

Studied floods
In this study, three floods examined in previous studies, namely the WRF in 1974 (USA), the RWF in 1960 (England) and the KRF in 2008 (Iran), were considered to investigate the accuracy of the proposed copula-based approach and its conditional density. These data sets have been extensively studied by others (Gill 1978; Tung 1985; Yoon & Padmanabhan 1993; Mohan 1997; Kim et al. 2001; Das 2004; Geem 2006, 2011; Luo & Xie 2010; Xu et al. 2011; Karahan et al. 2013; Vafakhah et al. 2015; Zeinali & Pourreza-Bilondi 2018). These case studies also present a pronounced nonlinear relationship between weighted flow and storage volume. The 69.75-km stretch of the River Wye from Erwood to Belmont has no tributaries and a very small lateral inflow. It is, thus, an excellent case study to demonstrate the use of flood-routing techniques (Natural Environment Research Council (NERC) 1975; Bajracharya & Barry 1997; Karahan et al. 2013). The Karun River basin is located in the southern part of Iran between longitudes 48°15′ and 52°30′ E and latitudes 30°17′ and 33°49′ N, with a basin area of 67,000 km². Flood data from 30 November 2008 to 3 December 2008 were considered for this case (Vafakhah et al. 2015). Using the proposed approach, the outflow hydrographs of the studied floods were simulated. The hydrographs of the studied floods are presented in Figure 1.

Copula functions
The introduction of copulas is attributed to Sklar (1959), who described in a theorem how 1-D distribution functions can be combined in the form of multivariate distributions. For 2-D continuous random variables $X_1$ and $X_2$ with marginal distribution functions $F_{X_1}(x_1)$ and $F_{X_2}(x_2)$, the joint distribution of the variables can be expressed as follows:

$$H_{X_1,X_2}(x_1, x_2) = P[X_1 \le x_1,\ X_2 \le x_2]$$

A copula is a function that joins the univariate marginal distribution functions to create a bivariate or multivariate distribution function. Thus, Sklar (1959) showed that the multivariate probability distribution $H$ can be represented, using the marginal distribution functions and the dependence structure, by the copula function $C$:

$$H_{X_1,X_2}(x_1, x_2) = C\left(F_{X_1}(x_1),\ F_{X_2}(x_2)\right)$$

Since the cumulative marginal distribution functions of continuous random variables are non-decreasing from zero to one, the copula $C$ can be considered as a transform of $H_{X_1,X_2}$ from $(-\infty, +\infty)^2$ to $[0, 1]^2$. This transformation separates out the marginal distribution functions. Therefore, the copula function $C$ relates only to the relationship between the variables, and a complete description of the dependence structure is achieved (Nelsen 2006).

For 2-D copulas, Sklar's theorem is as follows. Suppose $H$ is the joint distribution of variables $X_1$ and $X_2$ with cumulative distributions $u = F_{X_1}(x_1)$ and $v = F_{X_2}(x_2)$. Then there exists a 2-D copula $C$ such that, for all real $x_1$ and $x_2$:

$$H(x_1, x_2) = C(u, v)$$

The 2-D copula function has the following properties: A. For each $u$ and $v$ in $[0, 1]$:

$$C(u, 0) = 0 = C(0, v), \qquad C(u, 1) = u, \qquad C(1, v) = v$$

This feature is called the boundary condition of the 2-D copula. Considering these boundary conditions, it can be concluded that if one of the marginal distribution functions has a value of zero, then the value of the copula function is also zero.
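The boundary conditions can be checked numerically for a concrete family. The following is a minimal sketch, assuming the bivariate Clayton copula in its standard form with a positive parameter; the function name `clayton_cdf` and the evaluation grid are illustrative choices, and the parameter value 4.90 is simply the one reported later for the KRF.

```python
def clayton_cdf(u: float, v: float, theta: float) -> float:
    """Bivariate Clayton copula C(u, v) for theta > 0 (standard form)."""
    if u == 0.0 or v == 0.0:
        return 0.0  # grounded: C(u, 0) = C(0, v) = 0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


theta = 4.90  # illustrative value (reported in the text for the KRF)
grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

# Check C(t, 0) = C(0, t) = 0 and C(t, 1) = C(1, t) = t on the grid.
boundary_ok = all(
    clayton_cdf(t, 0.0, theta) == 0.0
    and clayton_cdf(0.0, t, theta) == 0.0
    and abs(clayton_cdf(t, 1.0, theta) - t) < 1e-9
    and abs(clayton_cdf(1.0, t, theta) - t) < 1e-9
    for t in grid
)
print("boundary conditions hold:", boundary_ok)
```

Given fitted margins, the joint distribution would then follow from Sklar's theorem as `clayton_cdf(F1(x1), F2(x2), theta)`, where `F1` and `F2` stand for whatever marginal distribution functions are chosen.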
Before applying the copulas, the dependence between the variables must be investigated. Several coefficients are available for evaluating the dependence structure, such as the Pearson coefficient, Kendall's tau (τ) and Spearman's rank correlation (r).
The Pearson coefficient has two main disadvantages: (1) it measures only the linear association between variables and (2) it exists only if the standard deviations of the two variables exist and are finite. To overcome these problems, non-parametric measures such as Kendall's τ and Spearman's rank correlation (r) have been applied. The main advantage of Kendall's τ over the Spearman and Pearson coefficients is that its value can be interpreted directly in terms of concordant and discordant pairs. In addition, these pairwise coefficients cannot be used to diagnose dependence when more than two variables are involved (see also Gauthier 2001; De Michele et al. 2005; Nazeri Tahroudi et al. 2021). In this study, Kendall's τ is applied to assess the dependence between the two variables. For more information about copula functions, see De Michele et al. (2005), Nelsen (2006), Mirabbasi et al. (2012), Ramezani et al. (2019) and Khozeymehnezhad & Nazeri-Tahroudi (2020).
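The interpretation of Kendall's τ in terms of concordant and discordant pairs can be made concrete with a short sketch. It computes the tau-a variant (no tie correction), which is an assumption, since the text does not state which variant was used; the inflow and outflow values are made-up illustrative numbers, not data from the studied floods.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical hydrograph ordinates (m^3/s), for illustration only.
inflow = [10, 14, 30, 55, 48, 36, 25, 18]
outflow = [10, 11, 16, 29, 41, 40, 33, 26]
tau = kendall_tau(inflow, outflow)
print(f"Kendall's tau = {tau:.3f}")
```

Because the outflow peak lags the inflow peak, pairs taken from opposite limbs of the hydrograph tend to be discordant, which is one reason a routed flood can show only moderate τ even though the two series are physically linked.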

Copula-based simulation
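Copula-based simulations were first discussed in Bedford & Cooke (2001, 2002). To obtain a sample $u_1, \ldots, u_d$ from a $d$-dimensional copula, the following steps are performed. First, independent uniform samples $w_j \sim U(0, 1)$, $j = 1, \ldots, d$, are drawn. Then,

$$u_1 = w_1, \qquad u_2 = C^{-1}_{2|1}(w_2 \mid u_1), \qquad \ldots, \qquad u_d = C^{-1}_{d|d-1,\ldots,1}(w_d \mid u_{d-1}, \ldots, u_1)$$

To determine the conditional distribution functions $C_{j|j-1,\ldots,1}$, $j = 1, \ldots, d$, required for the pair-copula structure, a recursive relation for the conditional distribution function based on the h function is used. For a bivariate copula $C_{ij}$ with parameter $\theta_{ij}$, the h functions are defined as follows:

$$h_{i|j}(u_i \mid u_j; \theta_{ij}) = \frac{\partial C_{ij}(u_i, u_j; \theta_{ij})}{\partial u_j}, \qquad h_{j|i}(u_j \mid u_i; \theta_{ij}) = \frac{\partial C_{ij}(u_i, u_j; \theta_{ij})}{\partial u_i}$$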
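For the bivariate case (d = 2), the sampling steps reduce to drawing two independent uniforms and inverting one h function. The sketch below illustrates this for the Clayton copula, whose h function has a closed-form inverse; it is an illustration under that assumption, not the authors' implementation, and the function names and random seed are arbitrary.

```python
import random


def clayton_h(v, u, theta):
    """h(v | u) = dC(u, v)/du for the Clayton copula, theta > 0."""
    return u ** (-theta - 1.0) * (u ** -theta + v ** -theta - 1.0) ** (-1.0 - 1.0 / theta)


def clayton_h_inv(w, u, theta):
    """Solve h(v | u) = w for v (closed form available for Clayton)."""
    return ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)


def sample_pair(theta, rng):
    """Draw (u1, u2) from the copula: u1 = w1, u2 = h^{-1}(w2 | u1)."""
    w1 = 1.0 - rng.random()  # in (0, 1], avoids an exact zero
    w2 = 1.0 - rng.random()
    u1 = w1
    u2 = clayton_h_inv(w2, u1, theta)
    return u1, u2


rng = random.Random(0)          # fixed seed, arbitrary choice
theta = 4.90                    # illustrative value (reported for the KRF)
pairs = [sample_pair(theta, rng) for _ in range(1000)]
print(pairs[:3])
```

For families without a closed-form inverse, the same step would typically be carried out by numerical root finding on the h function.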

Model performance
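To evaluate the performance of the approach, the root-mean-square error (RMSE), the sum of squared errors (SSE) and the Nash–Sutcliffe efficiency coefficient (NSE) were used:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_i - \hat{Q}_i\right)^2}$$

$$\mathrm{SSE} = \sum_{i=1}^{n}\left(Q_i - \hat{Q}_i\right)^2$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(Q_i - \hat{Q}_i\right)^2}{\sum_{i=1}^{n}\left(Q_i - \bar{Q}\right)^2}$$

Lower RMSE and SSE and higher NSE indicate higher accuracy of the model. In the above relations, $Q_i$, $\hat{Q}_i$ and $\bar{Q}$ are the measured, simulated and mean measured discharges of the outflow hydrograph, respectively, and $n$ is the number of data (Akbarpour et al. 2020; Shahidi et al. 2020).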
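These three criteria can be computed in a few lines. The sketch below uses their standard definitions; the measured and simulated series are hypothetical values chosen only to exercise the functions.

```python
import math


def sse(measured, simulated):
    """Sum of squared errors between measured and simulated discharges."""
    return sum((q - q_hat) ** 2 for q, q_hat in zip(measured, simulated))


def rmse(measured, simulated):
    """Root-mean-square error; equals sqrt(SSE / n)."""
    return math.sqrt(sse(measured, simulated) / len(measured))


def nse(measured, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance sum of the measured series."""
    mean_q = sum(measured) / len(measured)
    denom = sum((q - mean_q) ** 2 for q in measured)
    return 1.0 - sse(measured, simulated) / denom


# Hypothetical outflow ordinates (m^3/s), for illustration only.
measured = [10.0, 12.0, 20.0, 35.0, 42.0, 38.0, 30.0, 22.0]
simulated = [10.5, 11.0, 21.0, 33.5, 43.0, 39.0, 29.0, 23.0]
print(sse(measured, simulated), rmse(measured, simulated), nse(measured, simulated))
```

Note that SSE grows with the number of ordinates and with the magnitude of the flows, so it is comparable only between models applied to the same flood, whereas NSE is dimensionless.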

RESULTS AND DISCUSSION
Here, the dependence between the inflow and outflow hydrographs of the case studies was investigated using Kendall's τ. The dependence results for the WRF, RWF and KRF, along with the data scatter, are presented in Figure 2.
In Figure 2, I is the inflow hydrograph and O is the outflow hydrograph. The upper left and lower right panels show the data histograms, the upper right panel gives the value of Kendall's τ, and the lower left panel gives the empirical contour lines. The highest dependence is found for the KRF (τ = 0.82) and the lowest for the WRF (τ = 0.34). These values are all statistically significant according to the independence test at a confidence level of 95%. According to these results, copula functions can be used for the joint analysis and simulation of outflow hydrographs. Each copula function is capable of modeling a particular range of dependence: some are suitable only for weak dependence, and some can model the whole range. For example, the Gumbel-Hougaard copula can be used only for positive dependence. The Ali-Mikhail-Haq copula is suitable only for weak dependence (−0.1817 ≤ τ ≤ 0.3333), and the Farlie-Gumbel-Morgenstern copula is suitable for −2/9 ≤ τ ≤ 2/9, while the Clayton and Frank copulas are suitable for both positive and negative dependence (Nelsen 2006).
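The admissible ranges of Kendall's τ can be used to screen candidate families before fitting. The sketch below encodes the standard ranges for the families mentioned above (Nelsen 2006); treating the Clayton and Frank ranges as the full interval from −1 to 1 is a simplifying assumption, and the function name is hypothetical.

```python
import math

# Attainable Kendall's tau per family (lower, upper).
TAU_RANGES = {
    "Gumbel-Hougaard": (0.0, 1.0),
    "Ali-Mikhail-Haq": ((5.0 - 8.0 * math.log(2.0)) / 3.0, 1.0 / 3.0),
    "Farlie-Gumbel-Morgenstern": (-2.0 / 9.0, 2.0 / 9.0),
    "Clayton": (-1.0, 1.0),   # simplification: extended Clayton family
    "Frank": (-1.0, 1.0),     # simplification: limits are not attained
}


def admissible_families(tau):
    """Return the families whose tau range contains the observed tau."""
    return [name for name, (lo, hi) in TAU_RANGES.items() if lo <= tau <= hi]


# Observed values reported in the text for the WRF and the KRF.
for label, tau in (("WRF", 0.34), ("KRF", 0.82)):
    print(label, admissible_families(tau))
```

With the reported values, both the weakest (0.34) and the strongest (0.82) dependence fall outside the Ali-Mikhail-Haq and Farlie-Gumbel-Morgenstern ranges, so those two families could be discarded for these floods without fitting them.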

Selection of the copula function
In this study, copula functions were applied to the simulation and modeling of the outflow hydrographs. Different copula functions (Clayton, Ali-Mikhail-Haq, Farlie-Gumbel-Morgenstern, Frank, Gumbel, Gumbel-Hougaard, Plackett, Gaussian and Joe) and their rotated forms were examined. For this purpose, the conditional density of bivariate copulas was combined with the diagonal section of the copula. The best copula was selected based on the Bayesian information criterion (BIC), the Akaike information criterion (AIC) and the log-likelihood (Log-Like). For the joint analysis of the KRF (inflow and outflow hydrographs), the Clayton copula with a dependence parameter of 4.90 and AIC, BIC and Log-Like values of −86.1, −84.2 and 44, respectively, was selected. For the RWF, the Gaussian copula with a dependence parameter of 0.85 and AIC, BIC and Log-Like values of −35.8, −34.2 and 18.9, respectively, was selected, and for the WRF, the Gaussian copula with a dependence parameter of 0.61 and corresponding values of −5.27, −4.18 and 3.64 was selected. According to the results, it can be concluded that the Clayton and Gaussian copulas have the best performance for the studied floods. This is due to differences in the marginal functions. In general, the type of selected copula function depends on the dependence structure of the river flood, and for different rivers, different copula functions may be identified as the best fit.
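The selection criteria are linked to the log-likelihood by simple formulas, which makes the reported values easy to cross-check. The sketch below assumes the usual definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, with k = 1 parameter for each bivariate copula considered; the sample size passed to `bic` is a hypothetical value, since the number of ordinates is not stated in this section.

```python
import math


def aic(log_like, k=1):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * log_like


def bic(log_like, n, k=1):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_like


# Log-likelihoods as reported in the text (rounded there).
reported_log_like = {"KRF": 44.0, "RWF": 18.9, "WRF": 3.64}
for flood, ll in reported_log_like.items():
    print(flood, round(aic(ll), 2))

n_assumed = 34  # hypothetical sample size, for illustration only
print("RWF BIC with n =", n_assumed, "->", round(bic(18.9, n_assumed), 2))
```

Under these definitions, a log-likelihood of 3.64 gives an AIC of about −5.28, and small differences from the tabulated criteria are to be expected because the log-likelihoods are quoted to only a few digits.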
After selecting the best-fitted copulas to describe the dependence structure between the inflow and outflow hydrographs, these functions can be used to estimate the conditional density of the copulas and thus to evaluate the conditional state of the considered variables. The graphs of the conditional density c(u, v) were obtained from the second-order mixed derivative of the copula C(u, v), together with the diagonal section of the copulas for the studied variables.
In this case, u represents the inflow hydrograph (I) and v represents the outflow hydrograph (O) on the copula scale. For a given value of u, the conditional density is plotted against different values of v. The value of v at which this graph reaches its maximum is taken as the simulated O on the copula scale. The diagonal section of the copula was used to reduce the computational complexity. These steps were implemented for the WRF, RWF and KRF.
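The idea of reading the simulated outflow off the maximum of the conditional density can be sketched as follows. This is a simplified illustration, assuming a Clayton copula and a plain grid search over v; it omits the diagonal-section step and the back-transformation from the copula scale to discharge, and the function names and grid resolution are arbitrary choices.

```python
def clayton_density(u, v, theta):
    """Clayton copula density c(u, v) = d^2 C / (du dv), theta > 0."""
    return (
        (1.0 + theta)
        * (u * v) ** (-1.0 - theta)
        * (u ** -theta + v ** -theta - 1.0) ** (-2.0 - 1.0 / theta)
    )


def most_likely_v(u, theta, steps=999):
    """Grid value of v in (0, 1) that maximizes c(u, v) for a fixed u."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda v: clayton_density(u, v, theta))


theta = 4.90  # illustrative value (reported in the text for the KRF)
for u in (0.2, 0.5, 0.8):
    print(u, most_likely_v(u, theta))
```

To obtain a discharge, the resulting v would still have to be mapped through the inverse of the fitted marginal distribution of the outflow, which is not shown here.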

Simulation of the outflow hydrograph of the WRF
The inflow and outflow hydrographs of the WRF were analyzed according to the conditional density of the copula functions, and the outflow hydrograph of the WRF was simulated using the selected copula. For each inflow value, the corresponding outflow value was simulated using the copula-based approach and its conditional density. The results of the simulation of the outflow hydrograph of the WRF are presented in Figure 4.
The results of the simulation indicated that the simulated outflow hydrograph of the WRF lies within the 95% confidence interval. These results indicate the acceptable accuracy of the copula-based approach in the simulation of the outflow hydrograph. Also, as can be seen in Figure 4, the simulated outflow hydrograph of the WRF fits the measured data well. The proposed copula-based approach has a higher accuracy than the different univariate and multivariate models due to the use of the marginal distributions of the studied variables. Linking the statistical distributions of the studied variables increases the certainty of the simulations. The accuracy of the proposed approach in simulating and modeling different variables has been confirmed in various studies such as Tahroudi et al. (2020a, 2020b). To compare and evaluate the performance of the copula-based approach, the results obtained in this study were compared with the results of other researchers for the studied floods. In the case of the WRF, various researchers have examined different methods for estimating the outflow hydrograph. Most studies in this field have used different optimization methods to estimate the coefficients of the Muskingum method, all of which have led to the simulation and modeling of the outflow hydrographs. The results of comparing the performance of the proposed copula-based approach with previous studies are presented in Table 1. Abbreviations are introduced in Supplementary Material, Appendix A. As can be seen from Table 1, with the exception of the LSM, NL-LSM and LMM methods, the SSE of the other methods varies between 36.3 and 45.6. According to Table 1, the SSE of the proposed copula-based approach for the outflow hydrograph of the WRF is 145.57.
According to the SSE, the copula-based approach has higher accuracy than the LSM and NL-LSM models (listed in Table 1), but lower accuracy than the other models considered. However, with NSE = 0.99 and RMSE = 2.57 m³/s, the performance of the copula-based approach is acceptable. As can be seen from Figure 4(a), the copula-based approach simulated the peak flow and the rising and falling limbs of the outflow hydrograph of the WRF well. Also, all simulated data lie within the 95% confidence interval (Figure 4(b)), which indicates the high performance of the proposed approach.

Simulation of the outflow hydrograph of the RWF
The outflow hydrograph of the RWF was also simulated using the proposed copula-based approach. The simulation results are shown in Figure 5, which indicates visually that the proposed copula-based approach is able to simulate the outflow hydrograph of the RWF. The peak flow of the outflow hydrograph is well simulated. With one exception, the simulated data lie within the 95% confidence interval. The proposed copula-based approach established a good relationship between the inflow and outflow hydrographs of the RWF and simulated the outflow hydrograph. The RWF has also been studied previously using different methods. The results of comparing the performance of the proposed copula-based approach with other Muskingum-based models are presented in Table 1.
The two studies presented in Table 1 used the Muskingum-based method and the optimization algorithms. The SSE for the simulation of the outflow hydrograph of the RWF using the copula-based approach is obtained as 12,968.34, which shows better performance than the two studies with SSE of 37,944.

Simulation of the outflow hydrograph of the KRF
The last flood considered is the KRF. As with the other studied floods, the copula-based approach and its conditional density were implemented. The simulation results of the outflow hydrograph of the KRF are presented in Figure 6. The proposed copula-based approach simulates the outflow hydrograph of the KRF and its peak flow well. All simulated points lie within the 95% confidence interval (Figure 6(b)). Figure 6 thus confirms the accuracy of the proposed approach in the simulation of the outflow hydrograph of the KRF. The results of comparing the proposed approach with previous studies of the outflow hydrograph of the KRF are presented in Table 1.
The SSE for the simulation of the outflow hydrograph of the KRF using the copula-based approach is 10,655.65, which shows much better performance than the two previous studies (Vafakhah et al. (2015) with an SSE of 177,161.40 and Zeinali & Pourreza-Bilondi (2018) with an SSE of 144,691.73). The RMSE values of the WRF, RWF and KRF are 2.57, 19.53 and 14.75 m³/s, respectively. Therefore, the accuracy of the simulation in all three case studies is confirmed. In this study, the violin plot was used to evaluate the certainty of the proposed approach. The violin plots related to the simulation of the outflow hydrograph of the studied floods are presented in Figure 7.
According to the violin plots, the copula-based approach was able to cover the range of the data as well as the first and third quartiles. As can be seen from Figure 7, the range of the simulated values and their 5th and 95th percentiles are close to those of the measured values, which confirms the reliability of the approach. Since the simulated peak flow is very important in flood warning systems, the simulated peak flow of flood routing should not differ much from the measured peak flow. On the other hand, there is always a possibility of over-estimation or under-estimation in the simulation of the outflow hydrograph; therefore, it is necessary to choose the model with the highest accuracy. According to the results presented in this study and the comparison with other studies using different methods, it can be concluded that the proposed approach based on the conditional density and diagonal section of copulas is a suitable approach to simulate the outflow hydrograph of a flood.
In addition, to evaluate the accuracy of the copula-based approach, the histograms of the measured and simulated values were examined, and the results are presented in Figure 8. Figure 8 shows that the histograms of the simulated and measured values are similar and that the contour lines and scatter of these values agree. Figure 8 also shows a high correlation between the simulated and measured data.
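As a cross-check on the reported error metrics, note that RMSE and SSE are related by RMSE = sqrt(SSE / n), so the number of hydrograph ordinates implied by each reported pair can be recovered as n = SSE / RMSE². The short sketch below does this arithmetic with the SSE and RMSE values quoted in this section; the interpretation of the result as the number of ordinates is an inference, not a figure stated in the text.

```python
# (SSE, RMSE in m^3/s) as reported in the text for the copula-based approach.
reported = {
    "WRF": (145.57, 2.57),
    "RWF": (12968.34, 19.53),
    "KRF": (10655.65, 14.75),
}

# RMSE = sqrt(SSE / n)  =>  n = SSE / RMSE^2
implied_n = {flood: sse / rmse ** 2 for flood, (sse, rmse) in reported.items()}

for flood, n in implied_n.items():
    print(f"{flood}: implied n = {n:.2f}")
```

Each ratio lands very close to a whole number, which suggests that the reported SSE and RMSE values are mutually consistent.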

CONCLUSION
In this study, the accuracy of the proposed copula-based approach and its conditional density in flood routing was investigated. The performance of the proposed approach was evaluated using three floods from the literature: the WRF, RWF and KRF. The simulation of the outflow hydrograph for the three case studies showed that the accuracy and performance of the proposed copula approach are acceptable according to the 95% confidence interval and the comparison of violin plots. The simulated outflow hydrographs fit the measured outflow hydrographs well, and the hydrograph peak flows, which are the most important parameters in the design of hydraulic structures, were well simulated. The performance of the copula-based approach is satisfactory according to the NSE, which reached 0.99 in all three case studies. This is also confirmed by the comparison of violin plots of measured and simulated outflows. These results give confidence in the application of the proposed approach to other case studies. Data-driven models, such as artificial intelligence, rely only on data, whereas the copula-based approach considers the dependence structure of the data and uses their joint and marginal distributions; it is therefore considered more reliable than data-driven models. The results also showed that there is no limitation on the number of data used in this approach, and the same sample size is not necessary for comparison. In addition to the presented results, the correlation and histogram analysis of the simulated and measured data shows that the histograms of the paired variables are similar in all three case studies.