In this paper, a methodology is suggested to evaluate the goodness-of-fit between observed and numerically predicted variables in curved open channel flows. The mathematical model uses the continuity and the Reynolds averaged Navier–Stokes (RANS) equations with the RNG κ-ɛ turbulence model, numerically solved with the commercial software Ansys Inc. The concept of design of experiments (DOE) together with some criteria of goodness-of-fit was used in a new coherent suggested methodology for the calibration and validation of the numerical model. The goodness-of-fit criterion was applied to secondary flow variables like helicity, intensity of the secondary current and the vortex index instead of the primary variables like the velocity field and the free surface. This methodology guarantees that the numerical model reproduces values of the response variables very near to the observed values and it is valid for the mesh independence analysis and the setup of other numerical parameters, including those related to the boundary conditions.
Flow in curved channels is characterized by flow separation, by the presence of secondary currents along the curve and perpendicular to the channel axis generated by the deflexion of the flow stream lines and by free surface variations (Van Balen 2010). These flow structures generate complex tridimensional flow patterns that have been studied by numerical modelling and laboratory experiments, the numerical approach being the one that provides better methodological flexibility to pursue a detailed study despite the need to pursue a model calibration and validation process (Han et al. 2011; Song et al. 2012). Most of the numerical models used to simulate these types of flows are based on the Reynolds averaged mass and momentum conservation laws coupled with a turbulence closure model and in some cases, the ‘rigid lid’ approach is used to simplify the numerical solution (Bai et al. 2014).
No standard methodology has been suggested in the literature to calibrate and validate a numerical model. However, most cases reported in the literature use primary variables like the velocity field at different cross sections of the curve to compare the numerical results to laboratory experiments. Additionally, there is no general consensus on the most suitable variables to represent either the flow patterns or the statistical indicators to be used to evaluate the quality of the model calibration and validation processes. Some authors (Berger & Field 1984; Moffatt & Tsinober 1992; Stranden 2007; Ghobadian & Mohammadi 2011; Han et al. 2011; Ansys Inc. 2016) have successfully used secondary variables like the helicity of the flow, the intensity of the secondary currents and the swirl number for the calibration–validation process. To compare the model results to the laboratory experiments, two statistical methods have been traditionally used: (a) the one factor at a time (OFAT) method, which varies one model parameter at a time to analyse its influence on the results (Lye 2002; Wahid & Nadir 2013) and (b) the design of experiments (DOE) method, which is a systematic method to determine the relationship between factors (variables) of a process affecting the response variables, which has been used mainly for laboratory experiments or observed data Montgomery (2012).
The prediction capability of a model can be evaluated by using different statistical parameters which quantify the goodness-of-fit between the model results and some reference values (laboratory measurements, for example): the determination coefficient (R2); the mean absolute error (MAE); the mean absolute error in percentage (MAPE); the index of agreement (d); the efficiency coefficient of Nash and Sutcliffe (NSE); the root mean square error (RMSE) and the variation coefficient, among others. A detailed discussion of different goodness-of-fit indexes can be found in the following studies (Hsu & Kuo 1999; Legates & McCabe Jr 1999; Toledo & Koutsopoulos 2004; Alexandris et al. 2008; Jara-Rojas et al. 2009; Willmott et al. 2012; Ritter & Muñoz-Carpena 2013).
In this paper, a methodology to calibrate and validate a numerical model is presented and applied to the case of a velocity field flow in a curved channel. We use the DOE method, as a new alterative to the existent methods, to set up the numerical experiments and evaluate the goodness-of-fit between the model results and observed values by using several statistical indexes for the secondary flow variables.
Channel geometry and velocity field experimental data
The local and global coordinate systems for the velocity field are shown in Figure 1(a), where the global system is defined by the x, y, z coordinates and the local system, whose origin is located at the centre of the circle of the channel curve, is defined by t, r, z coordinates representing the tangential, radial and vertical directions, respectively, whose corresponding velocity components are Vt, Vr and Vz. The relationship between the local and global coordinate systems is obtained from Figure 1(a). The meridional velocity (Vm) is computed from the Vr and Vz components as Vm = (Vr2 + Vz2)0.5 and the direction in the r-z plane is obtained as θP = tan−1(Vz/Vr).
The channel was built with transparent acrylic plates, in a horizontal bottom with a total deflection angle ϕ of 180°, a channel width W = 0.40 metre (m), a radius of curvature Rm = 1.2 m and two upstream and downstream straight channels, with Le=Ls = 2.0 m. The velocity measurements were obtained for a flowrate Q = 0.01008 m3/s and a water depth at the channel inlet he = 0.15 m (see Figure 1). For these values, a Froude number of 0.148 and a Reynolds number of 68,967 were obtained, where DH is the hydraulic diameter (DH = 4RH), V is the mean velocity at the channel inlet, RH is the hydraulic radius, g is the acceleration due to gravity and ν is the water kinematic viscosity at 298.15 °K. Velocity profiles were measured at seven transverse sections located at 0°, 30°, 60°, 90°, 120°, 150° and 180° from the curve entrance. At every cross section, the velocities were measured, using a micro acoustic Doppler velocimeter (ADV) profiler and reported by Bai et al. (2014), at grid locations shown in Figure 1(b) by Bai et al. (2014).
ADV measurements like those of Bai et al. (2014) are commonly reported in the literature. Although they have some uncertainties, they are reliable as they are consistent with other techniques of measurement such as particle image velocimetry (PIV) (Craig et al. 2011). Among the possible uncertainties in the measurements with ADV are those of random or epistemic origin. The most common uncertainties detected in measurements with ADV are: the Doppler noise, digital filtrations made by the instrument, the presence of anomalous values and incorrect positioning of the instrument. The operator can also introduce errors in the measurements when defining the value of the processing software's input parameters such as the sampling rate, ping algorithm, power level, velocity range and collection time. There are several methods to diminish these errors (Craig et al. 2011). The data collected closer than 5 mm from the bottom were disregarded as the ADV has difficulties measuring close to a solid surface, so 539 data out of the 588 reported by Bai et al. (2014) were used here.
A given physical phenomenon, in a domain Ω, can be represented by a group of input parameters (Π) and a group of output (answer) parameters (Y) in a relationship of the form Π → Ω → Y. In the DOE procedure, those independent parameters are known as factors, the values they can take in the experiment are known as levels and the dependent parameters are known as answer variables (Gómez-Zambrano 2017). The purpose of a DOE is to find a proper combination of the Π parameters, the levels they can take in the experiment and the relevance that any of these Π parameters can take over the Y parameters (Montgomery 2012). For a numerical model calibration, a similar approach can be followed where the equivalent Π parameters may be related to the characteristics of the computational grid (grid size, elements form), the implementation of the boundary conditions, and the size of the computational domain compatible with the boundary conditions, among others. The Y parameters, in this case, are those results of the optimal combination of the Π parameters that make a minimum difference between the numerical results and a set of reference answer variables transformed into statistical indices data (Montgomery 2012).
Any methodology used to compare observed and simulated values must include at least: (i) a graphic procedure that illustrates the comparison between the calculated values and the observed; (ii) a dimensionless index (or relative error indicator) quantifier of the goodness-of-fit like the NSE; and (iii) an error indicator like the RMSE (Legates & McCabe Jr 1999; Ritter & Muñoz-Carpena 2013).
A flow chart of the suggested methodology presented in this work is shown in Figure 2. A critical point of this methodology is whether the goodness-of-fit indexes between the numerical model results and the reference answer variables are accepted or rejected. To resolve this critical point, we chose to follow the recommendations given by Legates & McCabe Jr (1999), and to include in the methodology the goodness-of-fit quantifier index (NSE) and the (RMSE) error indicator.
Within the DOE, the central statistical technique is the analysis of variance (ANOVA). Therefore, in this methodology, the ANOVA is used for selecting the significant factors (these are the independent variables) together with the standardized Pareto plot (this chart shows the absolute values of the standardized effects from the largest effect to the smallest effect; it also plots a reference line to indicate which effects are statistically significant to 95%) and half normal probability plot (this plot is a graphical tool that uses these ordered estimated effects to help assess which factors are important and which are unimportant). The ANOVA technique consists of separating the total variation in the variability due to the treatments (a treatment is a factor at a specified level; in our case, one treatment is equal to one simulation of the mathematical model) and the variability due to the error. When the former dominates the latter, it is concluded that the treatments have an effect, or in other words, the population means of the treatments are different. When the treatments do not dominate they contribute equal to or less than the error; it is concluded that the means are equal. In this work, the ANOVA applied to the design of factorial experiments allows determination of whether the simulations show significant differences with respect to the observed data. It also provides a safe criterion for the hydraulic engineer to make decisions about which of the factors has the greater effect on the response variable; in our case, the response variable is the RMSE. Additionally, it allows finding the optimal combination of the input data for the configuration of the numerical model so that the results produce a smaller difference with respect to the observed data. The meanings of these statistical concepts are defined with major profundity in the classical literature of the DOE, which is recommended to be studied for greater understanding of this work (Montgomery 2012).
Definition of the input parameters and the answer variables
Equation (1) includes two groups of factors, namely, ND factors that are related to the number of grid divisions in every edge (Type and Number of Divisions in meshing of Ansys Inc.) and Se factors that are related to parameters for the grid cells' distribution (Bias Type and Bias Factor in meshing of Ansys Inc.). Each group of factors has a sub-index representing the channel width (W), the water depth in the channel curve (H), the straight channel upstream of the curve (U), the straight channel downstream of the curve (D) and the factors of the curve with sub-index (C). The skewness or distribution factor (Se) represents the ratio Lmax/Lmin, where Lmax is the maximum element length and Lmin is the minimum element length in a given direction. A unit value for SeH and SeW indicates a uniform grid of identical cell size in height and width, respectively, as shown in Figure 3(a), and a value of SeH larger than one indicates that the cells near the solid walls are finer than those away from it, as is observed in Figure 3(b). The NPI factor represents the number of vertical coordinates used to prescribe the logarithmic velocity profile at the channel entrance (Schlichting & Gersten 2000). The answer variable Y is any of the statistical indexes computed for the three velocity components (Vt, Vr and Vz) and for angle
Statistical indexes used to evaluate good fitting for the velocity field
Definition of secondary flow variables
In order to calibrate and validate the numerical model in a more general context, in addition to the basic variables (velocity field and water depths), two secondary variables were computed from the basic variables, looking for the model to reproduce the general structure of the flow. Some of the secondary variables used in this work are presented below.
Intensity of secondary currents
The numerical solutions of Equations (11) and (12) are obtained with the Ansys Inc. software using four types of boundaries: (a) velocity inlet as inlet boundary at entrance, (b) pressure-outlet as open boundary at exit, (c) free surface on the top as symmetry and (d) close boundaries (solid walls). For the open boundary at the entrance, the logarithmic velocity profile (Schlichting & Gersten 2000) was prescribed given the 2 m straight branch of the channel downstream of the curve. For the open boundary at the exit, the zero pressure (atmospheric pressure) condition was used. For the solid walls (channel bottom and lateral walls) the non-slip condition with standard wall functions was prescribed and the free surface was modelled by the ‘rigid lid’ approximation, which is valid for free surface changes smaller than 10% of the still water depth upstream of the curve entrance, according to Bai et al. (2014). The numerical schemes used for the approximate solution of the governing equations, include: (a) the PISO algorithm for coupling velocity and pressure, (b) second order UPWIND scheme for the convective terms in the momentum equations and (c) the PRESTO algorithm for the pressure interpolation.
RESULTS AND DISCUSSION
Factors' detailed selection and optimal model configuration
Following the suggested methodology, we began by selecting the factors that show significant effect on the answer variable. The numerical experiment is designed for a fractional factorial of 210-5 with a IV resolution, allowing evaluation of the most important effects on the answer variable without confused effects or double interactions between factors. The DOE obtained give 32 model runs. After computing the Xrms as the answer variable for each numerical experiment, an ANOVA analysis was performed showing that, for a 95% confidence level, SeH and SeW, in their low level, were the factors showing significant effect on the answer variable (Xrms). This result shows that small variations at their level have a large impact on the answer variable, and produce a minimum value for a uniform size element distribution. This result indicates that a refinement of the elements near the channel walls is not needed as it does not improve the solution. Next in order of significance are the NDH, NDE and NDW factors at their low levels. The standardized Pareto plot (Figure 4(a)), for a 95% significance level (a standardized effect of nearly 2 in Figure 4(a), intermittent vertical line) shows that only the SeH and SeW factors show a significance level higher than 95%. As confirmation of this result, the half-normal probability plot, Figure 4(b) shows the SeH and SeW factors as the more relevant factors by being further away from the origin of the plot.
Those factors that have no significant effect on the answer variable can be set to a convenient level in order to minimize the computational effort. In particular, the NPI factor needs to be specified to a small number of points along the vertical to adequately describe the logarithmic velocity profile at the channel entrance.
The model optimal configuration obtained through the ANOVA analysis provides the factor levels that minimized the answer variable (Xrms). Then, those factors related to the computational grid were taken to a non-dimensional factor by scaling them by a characteristic length of the channel, which in our case was the hydraulics radius (RH) at the entrance section in the curve of the channel. Defining Te as the size of the mesh element, the results are: TeW = 0.1166 RH, TeH = 0.0486 RH, TeU = TeD = 0.2333 RH, and NDC = 2° (the subscripts are the same as described above). For those factors related to the element distribution in the domain, it was found that a skewness larger than 1 has no effect on Xrms, a result that indicates that elements are uniformly distributed in the whole domain. When the elements' aspect ratio (in the mesh metric of Ansys Inc.) is checked, a 4.8 value was obtained, which can be used later as a reference value for other cases (computational grids). For this optimal configuration, the model results will be compared to the reference data provided by the laboratory measurements.
Good fitting for the calibration stage
Figure 5 shows the observed (Bai et al. 2014) and modelled structure of the swirling flow in the curved channel. The helicity (Figure 5(a)) at several cross sections along the curve is well reproduced by our numerical model (S1), following closely the experimental data and the simulations reported by Bai et al. (2014), although our results slightly over-predict the observations. Notice how the helicity obtained by the non-optimal grid clearly underestimates the observations, including some differences in the shape of the plot. Regarding the secondary current intensity (Figure 5(b)), our model and the simulations by Bai et al. (2014) are very close. Both models over-predict the experimental data although the shape of the curve is maintained. For both variables, the simulations obtained by our model with the optimal grid correctly predict the location of the peaks (cross section 60° downstream of the entrance), an important feature of this type of flow reported by the literature (Stranden 2007; Bai et al. 2014).
An additional qualitative analysis was performed by comparing Figure 6(a), the streamlines at the cross section 120° from the entrance to the curve, simulation with the optimal grid, simulation by Bai et al. (2014) and simulation with a non-optimal grid. These figures show good agreement among the observed data and S1, S2 simulation, showing a strong and large counter-clockwise vortex with its eye located in the middle of the cross section and a smaller anti-clockwise vortex near the free surface at the inner wall. This last vortex is clearly observable for the simulations (S1 and S2 cases) and not so clearly for the observed data. In contrast, one simulation (S3 case) shows only one large and strong anti-clockwise vortex with its eye located much closer to the inner wall.
The next challenge is to quantify the good-fitting behaviour of the simulations considering the characteristics of the vortex structure, characteristics like the location of the vortex eye and the vortex intensity.
Measure of the absolute error
Based on the obtained optimal factor configuration and the results from the ANOVA analysis, the model results are compared, in terms of the goodness-of-fit, to those reported in the specialized literature. The comparison is performed with the data reported by Bai et al. (2014) and Patil et al. (2014) and by using Equations (2), (4)–(6), and whose results are presented in Figure 7(a). It is clear that the magnitude of the RMSE error for the three velocity components and for the angle θ for all cases and for all reference data are of the same order of magnitude. Patil et al. (2014) did not report error values for Vt because they considered a 2D simulation. Note that the RMSE values calculated with the poor mesh simulation (S3) and the observed data (Figure 7(a)) are very large compared to the other options; this indicates that the proposed methodology is able to reduce errors between predicted and observed values in an efficient manner. In Figure 7(b), we compare the RMSE2D and Xrms, obtained with our simulation in optimal mesh conditions (S1) and those obtained with the simulation performed by Bai et al. (2014); note in this figure the minimum difference between them, which indicates that the configuration of the numerical model resulting from applying to the proposed methodology points in the right direction.
Measure of the relative error
As a measurement of the relative error, the NSE coefficient for the axial velocity component is computed for the Bai et al. (2014) simulations (Figure 8(a)) and for our simulation (Figure 8(b)). The comparison was calculated using the FITEVAL software developed by Ritter & Muñoz-Carpena (2013). We compared 539 data of axial velocity to the observed data for the seven cross sections of the curved open channel. For both simulations, the NSE coefficient is of the same order of magnitude: 0.721 for Bai et al. (2014) and 0.765 for our simulation. The goodness-of-fit of the observed data to the simulated data by Bai et al. (2014), Figure 8(a), achieves 62.8% probability of obtaining values of the efficiency (NSE) coefficient greater than 65% (acceptable). For our simulated data, we achieve 81% probability of obtaining values of the efficiency (NSE) coefficient greater than 65% (Figure 8(b)) (acceptable). The NSE criterion is stricter than the RMSE index and none of the simulations (including those reported in the literature) reach a value of 95% corresponding to a qualification of ‘very good agreement’, this fact becoming a challenge in hydrodynamic simulations.
It is relevant to consider that in order to obtain a ‘very good agreement’ between the measurements and the results of a numerical model, it is necessary to maintain a coordinated procedure between the observed and the numerical simulations given the difficulties of applying exactly the same conditions for both procedures. It is not always possible to specify the same conditions for the numerical model, for example, irregularities in the geometry of the experimental channel presented after the channel construction (walls along the channel curve, junctions between the straight branches of the channel and the channel curve, etc.), irregularities that may induce some flow conditions (flow bursts) that cannot be captured by the numerical model because the irregularities are not introduced in the model geometry.
Meaning and use of the results
The literature does not report calibration of mathematical models to simulate vortical flow by using representative variables of this flow, variables such as helicity and intensity of secondary currents, showing that they have practical applicability despite being a theoretical concept. In open channel flows the meaning of these results is important for the hydraulic engineer because in the methodology presented here, it is possible to incorporate several factors that, together with DOE, allow study of secondary flow in curved channels.
The proposed methodology contributes to saving time in the calibration of a hydrodynamic numerical model since it follows a systematic procedure, which allows a significant reduction in the number of simulations compared, for example, with the classic OFAT methodology.
This methodology has applicability in projects of hydraulic engineering where it is necessary to calibrate hydrodynamic models, whose comparison variables correspond to the velocity field of a flow, to study the spatial distribution of velocities, vortices within the flow, transport of solid particles and sediments, and vortex hydrodynamic separators in particle removal.
A new methodology for the calibration of numerical models of flow in curved channels is presented and validated. The implemented numerical model is based on the mass conservation, the RANS equations and the RNG κ-ɛ model for turbulence closure. The numerical experiments were designed by following the factorial fractional DOE procedure to study the effects of the computational grid and other factors on the model results. The quality of the numerical results compared to the observed data was studied using goodness-of-fit criteria.
It was shown that the suggested methodology was successfully used to study the computational grid characteristics and the configuration of several factors related to the model boundary conditions. The results of this study showed that, after the proper factor selection, it is possible to define the optimal model configuration. The study showed also that the grid factors related to the longitudinal characteristics of the grid (NDE, NDS, NDC) have no significance regarding the answer variable. On the contrary, those factors related to the grid element distribution (SeE, SeS, SeW, SeH) have relevance on the model results in such a way that the optimal grid resulted in a uniformly distributed grid. This somehow unexpected result can be explained by the use of the RNG κ-ɛ turbulence closure model because near the solid walls a wall function law is used, avoiding the need for a finer grid in these regions.
The values of the statistical indexes RMSE, Xrms and RMSE2D, computed for the velocity field during the calibration stage, were of the same order of magnitude of those reported in the literature. This result was possible because of the numerical model optimal grid configuration obtained by applying the suggested methodology. Results were shown for the model non-optimal configuration, and its corresponding statistical error indexes were far inferior to those corresponding to the model optimal grid configuration.
Given the complexity of the secondary flow in the channel curve due to its tridimensional behaviour, and given the simulation conditions (steady state) used in this work to calibrate the model, the flow structure (helicity and intensity of the secondary flow) was used instead of the simulated velocity field. In some sense, the comparisons with measurements have an average meaning. Non-instant values were considered.
The methodology presented in this article provides a new tool for the calibration of numerical models, which facilitates the selection of the significant factors with safe statistical criteria to find the optimal combination of these factors, so that the results of the numerical simulation are similar to the measurements.
Despite the reasonable results obtained with this model configuration, more challenges need to be tackled in further research. When the numerical simulation is focused on the flow hydrodynamic characteristics like vortices, their size, their intensity, their rotation frequency, the suggested methodology may be improved with more sophisticated measuring and calibration techniques.