ABSTRACT
We developed and analyzed the performance of an ensemble forecasting system for the Madeira River basin, the largest sub-basin of the Amazon, with forecasts up to 30 days under different hydrometeorological conditions. We used outputs from the regional Eta model of precipitation and global climatological data as inputs to a large-scale hydrological model. Bias correction of precipitation through quantile mapping significantly improved the results, achieving a hit rate >70%. The system demonstrated the ability to discriminate between high, medium, and low flow conditions. Forecast performance is better for larger catchment areas. This system is expected to increase decision-making efficiency for flood and drought situations in the largest Amazon tributary.
HIGHLIGHTS
RCM-Eta + MGB satisfactorily reproduce the seasonal variability of hydroclimatic variables.
The dynamical and statistical downscaling produces skillful predictions of discharge, river level, and flooded area.
The statistical–dynamical downscaling substantially improves forecast skill relative to climatology and persistence.
INTRODUCTION
Floods and droughts are part of the hydroclimatic variability in the Amazon basin (Villar et al. 2009). In the last few decades, significant hydrological and climatological variabilities have been observed in the region (Marengo & Espinoza 2016; Espinoza et al. 2019), leading to the concept of an intensification of the hydrological cycle since 1980 (Gloor et al. 2013). Droughts and floods are among the most common risks worldwide, affecting millions of people every year and causing economic losses worth billions of dollars (Hirabayashi et al. 2013). According to the International Disasters Database (EM-DAT), Brazil experienced 65 major flood events during the period 2000–2018, representing 71% of all natural disasters in the country. This type of disaster is the most lethal in Brazil, with 2,435 deaths recorded over the 19-year period. Preventing the risk of inundation remains extremely challenging as the events are highly variable in space, and their impacts strongly depend on the preparation and resilience of society (Alfieri et al. 2018). Furthermore, under changing climatic conditions, the alterations in the occurrence and magnitude of extremes introduce new uncertainties (Arnell & Gosling 2016).
Since the beginning of the 21st century, the Amazon basin has experienced four significant droughts, in 2005, 2010, 2015/2016, and 2023, with a periodicity of ∼5 years. For instance, Marengo et al. (2008) reported that the 2005 drought affected the western and southwestern portions of the basin (Frappart et al. 2012), reducing water levels in the Madeira and Upper Solimões basins to the extent that navigation in those regions was suspended. However, this drought was not felt in the central and eastern portions of the basin. Conversely, over the same period, four significant flood events, in 2009, 2012, 2014, and 2021, were also reported, affecting thousands of people.
More specifically, in the Madeira River, the largest sub-basin of the Amazon, severe floods associated with anomalous warming of the tropical Atlantic occurred in 1992, 1993, 1997, 2007, and 2008, and historical record floods occurred recently in 2014 and 2021. Espinoza et al. (2014) attributed the historical 2014 flood to warmer conditions in the Western Pacific and Indian Oceans combined with anomalous warming in the subtropical parts of the South Atlantic. This flood episode produced inundations and landslides in many cities of southwestern Amazonia, blocking highway transport, isolating communities, and causing economic and social losses, including propagation of diseases through water contamination. Severe droughts also occurred in the region in 2005, 2010, and 2015/2016 (Bourrel et al. 2009; Ovando et al. 2016; Molina-Carpio et al. 2017; Ronchail et al. 2018; Espinoza et al. 2022).
In light of the negative effects of the variability of extreme hydrological conditions in the Madeira River basin, flood and drought forecasts on intraseasonal timescales are of fundamental importance to society (Silva et al. 2006; Silva et al. 2007), for example, for the mitigation of disaster impacts. Hydroclimatic forecasts are also crucial for water resource management, hydropower generation, and fluvial transport (Tucci et al. 2003; Collischonn et al. 2005; Collischonn et al. 2007; Meller 2012).
Ensemble forecasting provides a range of possible evolutions by adding information about forecasting uncertainties. The ensemble forecast system has thus become the paradigm of hydrometeorological forecasts (Fan et al. 2014; Schwanenberg et al. 2015; Siddique & Mejia 2017). River discharge forecasts are obtained by forcing a hydrological model with different precipitation data sets, thus producing an ensemble of predictions. Precipitation data sets are obtained through atmospheric Numerical Weather Prediction (NWP) or Numerical Climate Prediction (NCP) models by slightly changing the initial conditions or by introducing small and random perturbations to the initial conditions (Meller 2012; Fan et al. 2014). This kind of probabilistic approach is widely utilized operationally in many developed nations (Fernández Bou et al. 2015), such as the European flood alert system (Bartholmes et al. 2009; Thielen et al. 2009). In Brazil, similar techniques have been tested for timescales of 3–15 days to support early warning systems (Siqueira et al. 2016; Casagrande et al. 2017; Tomasella et al. 2019; Siqueira et al. 2020; Siqueira et al. 2021). At the Center for Weather Prediction and Climate Studies (CPTEC), regional meteorological forecasts have been operationally issued for two decades, utilizing the regional Eta model (REM) (Chou et al. 2005, 2012; Bustamante et al. 2006, 2012). These forecasts have proven important for reducing and mitigating the adverse effects of hydroclimatic variability, including in the Amazon basin.
Despite the importance and usefulness of intraseasonal forecasts in the ensemble forecast system, evaluating their systematic errors, inherent to any forecast system, is a necessary first step for improving a warning system. There are still many challenges associated with ensemble forecast systems (Cloke & Pappenberger 2009; White et al. 2017), such as: (a) improving the performance of meteorological forecasts; (b) understanding prediction system uncertainties, including those related to initial data and the hydrological model; (c) verifying the performance of the ensemble technique with as many case studies as possible, especially in regions where severe flood and drought events are infrequent; and (d) reporting uncertainties in ensemble predictions region-wise.
In light of the challenges and limitations associated with ensemble predictions, the objective of the current study is to evaluate the performance and uncertainties in intraseasonal hydroclimatic forecasts for the Madeira River basin during the period 2002–2010. To accomplish this, we employ a statistical approach and probabilistic analysis to quantify the degree of uncertainty in the intraseasonal forecasts generated by a hydrological model driven by outputs from a global climate model downscaled by a regional climate model. Furthermore, there is a pressing need to assess the potential of ensemble hydroclimate forecasting systems at a regional scale, particularly in tropical regions. According to Wu et al. (2020), the majority of existing hydrological forecasting systems is concentrated in river basins of the Northern Hemisphere. In South America, this has primarily been approached from a global perspective (e.g., Alfieri et al. 2013) and through a few regional studies focusing on specific basins where monitoring is a priority or data are available from hydropower companies (e.g., Fan et al. 2014; Siqueira et al. 2016; Tomasella et al. 2019). However, significant knowledge gaps persist regarding how forecasting ability varies across different geographic locations, catchments, catchment areas, and climates. In this context, the novelty of this research lies in the methodological choices of testing different combinations of ensemble meteorological forecasts with multiple statistical preprocessing techniques in the Madeira River basin region, renowned for its complex and remote hydrologic landscape.
Section 2 describes the study area, the selected case study basins, the data sets, and the methods used, including the distributed hydrological model, statistical preprocessing, and verification strategy. The main results are summarized in Section 3. Section 4 discusses the study results, some limitations, and the main conclusions.
MATERIALS AND METHODS
Description of the study area
The Madeira River basin is located east of the Andes Cordillera in the southwestern part of the Amazon basin, covering 51% of its area in Bolivia, 42% in Brazil, and 7% in Peru. It encompasses 23% of the entire Amazon basin. With a drainage area of 1,324,727 km2 (Molina-Carpio et al. 2017), the Madeira River delivers an estimated annual mean discharge of 31,200 m3/s at its confluence with the Amazon (Molinier et al. 1995). Three of the Madeira's four principal tributaries, Beni, Madre de Dios, and Mamoré, originate in the Andes and significantly influence the hydrology, morphology, biogeochemistry, and ecology of the basin (Guyot et al. 1996; McClain & Naiman 2008). The fourth major tributary is the Iténez River (known as Guaporé in Brazil), which originates in the Brazilian uplands at elevations lower than 800 m asl. Stretching ∼3,300 km, the Madeira River is the longest tributary to the Amazon River and accounts for 15% of the total discharge of the Amazon River into the Atlantic Ocean (Molinier et al. 1995).
The basin's precipitation regime is shaped by a combination of large-scale physical and dynamical processes and regional-scale physical characteristics. With an average annual rainfall of 1,900 mm/year, the basin experiences a rainy season in DJF (December, January, February) associated with the South American monsoon circulation. During this period, a low-level jet east of the Andes and a low-pressure area known as the Chaco Low in northwestern Argentina, along with the South Atlantic Convergence Zone (SACZ), influence the basin. This basin hosts crucial waterways for transporting grain production, such as soybean and corn for international markets, which are essential for the region's social and economic development.
Location of the Madeira River basin (central panel) with mean annual precipitation data. River gauge stations are marked in black. The panels around the central panel show discharge seasonality for different gauges. Interval between 10 and 90 percentiles is shaded in blue. Maxima and minima daily discharges are shown in yellow.
Eta regional climate model
The RCM-Eta regional climate model from the Center for Weather Prediction and Climate Studies (CPTEC/INPE) generates meteorological forecasts operationally (Chou et al. 2005, 2012, 2020). RCM-Eta contains a set of physical parameterizations for different physical processes. The scheme for radiative fluxes in the atmosphere was developed by the Geophysical Fluid Dynamics Laboratory (GFDL); shortwave radiation is based on the Lacis & Hansen (1974) scheme and longwave radiation on Fels & Schwarzkopf (1975). The CO2, O3, and initial albedo distributions are specified from climatology. The hydrology in Eta is based on the NOAH scheme (Chen et al. 1997). The model has four layers in the soil and 12 types of land cover. The Eta vegetation map includes changes due to accelerated human activities in the Amazon biome in the last few decades (Sestini et al. 2002). For convective rain formation, the model uses the Betts–Miller–Janjić (Janjić 1994) and Kain–Fritsch (Kain 2004) schemes. Stratiform rain formation is represented by the cloud microphysics of Ferrier et al. (2002) and Zhao et al. (1997), which address all types of hydrometeors.
Ensemble intraseasonal forecast strategy
In the current study, an ensemble of intraseasonal (up to 30 days) forecasts is produced by varying (i) the convection scheme (two different schemes), (ii) the microphysics scheme (two schemes), and (iii) the global model that provides the initial conditions (two different models). In all, eight meteorological forecasts are produced to form an ensemble. The two global models are MCGA (Global Atmospheric General Circulation Model) and MCGOA (Global Coupled Ocean-Atmosphere Model), both from CPTEC/INPE. The eight ensemble forecasts are generated over the Madeira basin using all possible combinations of global models, convection schemes, and microphysics schemes. Each member, one at a time, is then used to force the MGB hydrological model, described below, to produce an ensemble of hydrological forecasts. The intraseasonal Eta–MGB data-based forecasts have been initialized with Climate Prediction Center (CPC) data from 1982 until the forecast initial date.
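The 2 × 2 × 2 design described above can be enumerated with a short Python sketch. The labels are illustrative only: the scheme names are taken from the model description earlier in the text, and it is an assumption here that these are the two convection and two microphysics options being varied.

```python
from itertools import product

# Illustrative labels (assumed, not identifiers from the forecasting system).
GLOBAL_MODELS = ("MCGA", "MCGOA")
CONVECTION_SCHEMES = ("Betts-Miller-Janjic", "Kain-Fritsch")
MICROPHYSICS_SCHEMES = ("Ferrier", "Zhao")


def ensemble_members():
    """Return every (global model, convection, microphysics) combination."""
    return [
        {"global_model": g, "convection": c, "microphysics": m}
        for g, c, m in product(GLOBAL_MODELS, CONVECTION_SCHEMES, MICROPHYSICS_SCHEMES)
    ]


members = ensemble_members()
for i, member in enumerate(members, start=1):
    print(i, member)
```

Each dictionary would correspond to one meteorological forecast that is passed, one at a time, to the hydrological model.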
Bias correction
To eliminate or substantially reduce the systematic errors in the RCM-Eta precipitation estimates, three different methods of bias correction (BC) were adopted, namely, linear scaling (LS) by Lenderink et al. (2007), Empirical Quantile Mapping (EQM), and Parametric (Gamma) Quantile Mapping (PQM) by Piani et al. (2010). Each method was applied to every individual member of the ensemble on a daily basis. These methods are briefly described in Gomes et al. (2022). Hereafter, the bias correction of the RCM-Eta precipitation estimates is also referred to as preprocessing.
The estimates of water level, river discharge, and inundated area for the intraseasonal range (up to 30 days) are provided by the MGB model. For details regarding ensemble hydrological predictions, we refer to Gomes et al. (2022). In summary, four ensemble forecasts are processed: (i) with raw precipitation data as input, called Eta–MGB–RAW, (ii) with bias-corrected precipitation data using EQM, called Eta–MGB–EQM, (iii) with bias correction using PQM, called Eta–MGB–PQM, and (iv) using precipitation preprocessed by LS, called Eta–MGB–LS.
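As a rough illustration of two of the corrections named above, the following Python sketch implements a multiplicative linear scaling and an empirical quantile mapping. It is a minimal sketch under stated assumptions, not the configuration used in the study: the function names, the number of quantiles, and the clamping of values outside the reference range are all choices made here for illustration.

```python
import numpy as np


def linear_scaling(fcst, obs_ref, mod_ref):
    """Multiplicative linear scaling: rescale so the reference-period means match."""
    mod_mean = np.mean(mod_ref)
    factor = np.mean(obs_ref) / mod_mean if mod_mean > 0 else 1.0
    return np.asarray(fcst, dtype=float) * factor


def empirical_quantile_mapping(fcst, obs_ref, mod_ref, n_quantiles=99):
    """Map forecast values from the model's empirical quantiles onto the observed ones."""
    probs = np.linspace(0.01, 0.99, n_quantiles)  # assumed quantile grid
    mod_q = np.quantile(mod_ref, probs)
    obs_q = np.quantile(obs_ref, probs)
    # Drop repeated model quantiles (e.g. many dry days) so the x-grid is strictly increasing.
    mod_q, first_idx = np.unique(mod_q, return_index=True)
    obs_q = obs_q[first_idx]
    # Values outside the reference range are clamped to the end quantiles (an assumption).
    corrected = np.interp(np.asarray(fcst, dtype=float), mod_q, obs_q)
    return np.maximum(corrected, 0.0)


# Synthetic example: a "model" that is systematically twice as wet as the observations.
rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=5.0, size=5000)
mod = 2.0 * obs
ls_corrected = linear_scaling(mod, obs, mod)
eqm_corrected = empirical_quantile_mapping(mod, obs, mod)
```

In practice the reference statistics would be computed separately for each ensemble member, and typically per calendar month or season, before correcting the daily forecast values.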
Large-scale hydrological modeling: MGB model
Forecasts of river discharges, water levels, and inundated areas are performed with the MGB (Collischonn et al. 2007; Pontes et al. 2017) semi-distributed, large-scale hydrological model. It is a process-based model that represents multiple components of the water cycle (Pontes et al. 2017; Siqueira et al. 2018). Within MGB, the basin is discretized into unit-catchments, which represent the local drainage area of a given river reach. Unit-catchments are subdivided into Hydrological Response Units (HRUs) which are areas with homogeneous characteristics defined by a combination of soil and vegetation types (Kouwen et al. 1993). A great majority of parameters in the model is related to the physical characteristics of the basin, such as topography, vegetation, and soil types, which are obtained from satellite data, and digital elevation model data (DEM) (Getirana 2010). Other parameters are calibrated with the help of the MOCOM-UA algorithm (Yapo et al. 1998).
The current study utilized MGB with the one-dimensional hydrodynamic method (Pontes et al. 2017), which is based on a simplification of the Saint-Venant equations that neglects the advective acceleration term in the momentum equation. This method allows the representation of hydrodynamic processes such as flood attenuation along floodplains and river backwater effects. Refer to Pontes et al. (2017) and Siqueira et al. (2018) for further details about the model.
The model has two main types of parameters, associated with soil and vegetation. Vegetation parameters are leaf area index, albedo, surface resistance, and tree-top height. The soil parameters are more conceptual in nature and are thus subject to the calibration process. They consist of soil storage capacity, relationship between storage and saturation, dry period discharge, residual storage, and base runoff, which can be modified for each sub-basin and HRU (Collischonn et al. 2001; Collischonn & Tucci 2003). Here, MGB is calibrated at a daily time step for the period 1982–2001 and validated for a period of 9 years, 2002–2010. The first 2 years of the calibration and validation periods are excluded from the analyses (spin-up period).
For calibration and validation of MGB for the Madeira basin, precipitation data for the period 1982–2010 are obtained from the CPC – Global Daily Unified Gauge-Based Analysis of Precipitation with a spatial resolution of 0.5° (Xie et al. 2010). We used monthly long-term meteorological data of wind speed, solar radiation, relative humidity, air temperature, and air pressure from the Climate Research Unit (CRU) data set (Harris et al. 2014). The data were interpolated according to the inverse distance weighting (IDW) method using the distance from the unit-catchment centroid to nearby gauges. Daily observations of river discharges and water levels are obtained from the Sistema de Informações Hidrológicas (http://www.snirh.gov.br/hidroweb) of the Agência Nacional de Águas (ANA) of Brazil, and are used to validate MGB. Monthly time series of observed inundated areas are obtained from the Global Inundation Extent from Multi-Satellites (GIEMS-2) data set, which captures the variability of the extent of surface water bodies over the whole globe (Prigent et al. 2007; Papa et al. 2010) utilizing observations of multiple satellites for the period 1992–2015 (Prigent et al. 2019).
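The IDW step mentioned above can be sketched in Python as follows. This is a minimal illustration: the function name, the exponent of 2, and the use of planar distances are assumptions, since the text does not specify them.

```python
import numpy as np


def idw_interpolate(centroid_xy, gauge_xy, gauge_values, power=2.0):
    """Inverse-distance-weighted value at a unit-catchment centroid.

    centroid_xy: (x, y) of the centroid; gauge_xy: sequence of (x, y) data points;
    gauge_values: values at those points. power=2 is an assumed exponent.
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    dist = np.linalg.norm(gauge_xy - np.asarray(centroid_xy, dtype=float), axis=1)
    # A data point located exactly at the centroid takes precedence.
    if np.any(dist == 0.0):
        return float(gauge_values[dist == 0.0].mean())
    weights = 1.0 / dist**power
    return float(np.sum(weights * gauge_values) / np.sum(weights))


# Two equidistant points contribute equally to the interpolated value.
value = idw_interpolate((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0)], [10.0, 20.0])
```

For a basin of this size, great-circle distances would be a more careful choice than the planar distances used in this sketch.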
Evaluation of ensemble forecasts
Model performance is evaluated for river discharge, water level, and inundated area forecasts through several statistical metrics, following Brown et al. (2010), covering deterministic, categorical, and probabilistic measures of forecast skill. Besides the Nash–Sutcliffe Efficiency (NSE) index (Nash & Sutcliffe 1970), which is widely utilized in hydrological model performance evaluation, and its logarithmic version (NSE log), which is more sensitive to low discharges (Krause et al. 2005), we also used the mean deviation of the ensemble members (BIAS) and the mean absolute error (MAE). These metrics are capable of describing the mean characteristics of ensemble predictions and adequately assessing the performance of the predictions in intraseasonal timescales in particular (Kumar et al. 2014). Moreover, the continuous ranked probability skill score (CRPSS), with an extension proposed by Ferro (2014), is used to characterize the performance of the ensemble (Hersbach 2000; Müller et al. 2005). In addition, we used the Brier skill score (BSS) and the performance diagram as indicators of the reliability of the forecasts (Hopson 2014). Here, both the BSS and CRPSS are formulated as skill scores with the climatology of streamflow observations as the reference. The relative operating characteristic (ROC) curves are used for examining the ability of the predictions to discriminate events according to their intensity. These metrics have been widely used and are well explained in several earlier studies on ensemble forecast systems (Thielen et al. 2009; Velázquez et al. 2009; Zalachori et al. 2012), and are documented in the articles and books by Brown et al. (2010), Bradley & Schwartz (2011), Jolliffe & Stephenson (2012), and Wilks (2019). Therefore, for brevity their mathematical formulation is not presented here.
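For readers who prefer a computational reference, the textbook definitions of several of these metrics can be sketched in Python. This is a hedged sketch of the standard formulas, not the verification code used in the study: the function names and the small offset in the logarithmic NSE are assumptions, and the CRPS below is the standard ensemble estimator, not the fair-score extension of Ferro (2014).

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def nse_log(obs, sim, eps=1e-6):
    """NSE on log-transformed flows; eps (assumed) guards against log(0)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return nse(np.log(obs + eps), np.log(sim + eps))


def percent_bias(obs, sim):
    """Relative bias in percent; positive values indicate overestimation."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)


def mae(obs, sim):
    """Mean absolute error."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.mean(np.abs(sim - obs)))


def crps_ensemble(members, obs_value):
    """Standard ensemble CRPS for one forecast: E|X - y| - 0.5 * E|X - X'|."""
    x = np.asarray(members, dtype=float)
    spread = np.mean(np.abs(x[:, None] - x[None, :]))
    return float(np.mean(np.abs(x - obs_value)) - 0.5 * spread)


def skill_score(score, score_ref):
    """Generic skill score for negatively oriented scores (e.g. CRPSS, BSS)."""
    return 1.0 - score / score_ref


obs = np.array([100.0, 150.0, 300.0, 250.0, 120.0])
sim = 1.1 * obs  # a simulation that overestimates every value by 10%
print(nse(obs, sim), nse_log(obs, sim), percent_bias(obs, sim), mae(obs, sim))
```

A CRPSS against climatology would then be `skill_score(mean_crps_forecast, mean_crps_climatology)`, with both means taken over the same forecast dates.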
Illustrative diagram of the adopted methodology, presenting the steps of preprocessing, ensemble hydrological simulations, and postprocessing.
RESULTS AND DISCUSSION
Hydrological model calibration and validation
MGB is able to satisfactorily simulate the magnitude and seasonality of river discharges at the three stations, for both calibration and validation periods (see Table S1 and Figure S1, Supplementary Material). Table S1 (Supplementary Material) summarizes the following performance metrics for assessing discharges: MAE, Bias, Anomaly Correlation Coefficient (ACC), Coefficient of Determination (r2), NSE, NSE log, and the Kling–Gupta efficiency (KGE) index. Except for Ariquemes, the r2 values are >0.6, with many stations presenting values over 0.8. The four efficiency indices (ACC, NSE, NSE log, and KGE) also show values around 0.8 or better. The indices in the Lower Madeira basin, where the contribution from all the tributaries is integrated, are better than those in the Upper Madeira basin. The biases are around 10% and, overall, the obtained metrics are considered encouraging. According to Moriasi et al. (2012), the results for the three stations shown in the table can be classified as ‘good’ and ‘very good’ (good when 2.5% < |BIAS| < 15% and 0.70 ≤ NSE ≤ 0.80; very good when |BIAS| < 2.5% and NSE > 0.80). The calibration and the validation of discharges are also considered satisfactory when compared with those obtained by Siqueira et al. (2018) and Wongchuig et al. (2019) for other MGB applications for the whole Amazon basin.
Verification of ensemble predictions of discharges with raw and preprocessed precipitation data input
Daily river discharges for selected stations based on all forecasts (30 day period). Observed: gray; ensemble mean with Eta–MGB–RAW: black; and ensemble mean with bias-corrected or preprocessed precipitation data by three methods: Eta–MGB–EQM: red, Eta–MGB–PQM: blue, and Eta–MGB–LS: green.
Figure S2 (Supplementary Material) presents the BIAS (%) of the ensemble mean discharge forced with raw and preprocessed precipitation relative to the discharge simulated using observed precipitation data obtained from the CPC. All the bias-corrected simulations reduced the BIAS in the discharge estimates when compared with the predictions with raw precipitation. The predictions using EQM (Eta–MGB–EQM) have smaller bias over practically the whole basin. However, some biases remain even in the simulations with preprocessed precipitation data.
Figure S3 (Supplementary Material) shows the ensemble mean MAE of hydrological predictions considering all the forecasts issued in the period 2002–2010. For the Upper Madeira sub-basin (Figure S3, Supplementary Material, top row), we observe improvements of 53% in MAE after the bias correction of precipitation with EQM and PQM in relation to the simulations of Eta–MGB–RAW. For the middle (Figure S3, Supplementary Material, middle row) and Lower Madeira basins (Figure S3, Supplementary Material, bottom row), while errors increased with lead time in the simulations with raw data (Eta–MGB–RAW), errors in Eta–MGB–EQM simulations are reduced for lead times >15 days. In general, for the stations in the upper and lower regions of the basin, corrections with EQM and PQM show higher improvements than the LS method. At the same time, for the middle basin, the PQM method presents deterioration of performance, i.e., PQM does not improve the quality of simulations in the areas associated with large floodplain systems.
Figure S4 (Supplementary Material) presents the CRPSS metric for the whole distribution of the ensemble with Eta–MGB–RAW (black line), Eta–MGB–EQM (red line), Eta–MGB–PQM (blue line), and Eta–MGB–LS (green line) for some selected stations. CRPSS values are calculated with reference to climatological discharge values. In general, preprocessing improves the skill of discharge predictions in all sub-basins. Some sub-basins present improvements for shorter lead times that decrease gradually for longer lead times, especially for stations in the upper portions of the basin (Mato Grosso and Pimenteiras). For other sub-basins, the improvements are larger for longer lead times (stations in the Middle and Lower Madeira basin). Although larger sub-basins seem to have better predictions than the smaller ones, the differences are small and not fully consistent.
CRPSS is probably very low for the Mato Grosso sub-basin because it is a small basin and the hydrological prediction skill reduces considerably with reduction in the catchment area, according to Sharma et al. (2017) and Siddique et al. (2015). Overall, the streamflow forecasts with Eta–MGB–EQM demonstrate larger improvements in the ability of predictions than the Eta–MGB–LS, Eta–MGB–PQM, and Eta–MGB–RAW schemes. However, the positive values for all the schemes with bias correction or without bias correction (raw) indicate that the prediction abilities are better than those of the climatology.
Taylor diagram displaying CC, centered mean square error, and the ratio of SDs in simulated and observed data sets at 13 stations across the Madeira basin (station locations depicted in Figure 1). Point colors denote forecasts of Eta–MGB–RAW (black), Eta–MGB–EQM (red), Eta–MGB–PQM (blue), and Eta–MGB–LS (green). The linear distance from the green square symbol corresponds to the centered root mean square error (CRMSE).
In addition to the general performance of the model described by the metrics, we analyzed the model's forecasting ability for extreme conditions. This analysis is necessary for risk assessment of potential adverse situations and their impact on society. For this, we considered three categories of discharges, namely, low, medium, and high values, to verify the performance of predictions from the ensemble forecast system driven by bias-corrected meteorological data. The metric is the probability of non-exceedance (Pr). Pr = 0.10 and Pr = 0.50, respectively, represent low and medium discharges. The high discharge category is represented by Pr > 0.90, i.e., discharges exceeding the 90th percentile. These limits are selected to represent the base flow conditions. Discharge climatology is obtained from the available observational data at each station shown in Figure 1.
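The category thresholds described above can be derived from a station's observed climatology as in the Python sketch below. It is an illustration under stated assumptions: the function names are hypothetical, and treating the low and medium categories as non-exceedance of the 10th and 50th percentiles and the high category as exceedance of the 90th percentile is one reading of the text.

```python
import numpy as np


def flow_thresholds(obs_climatology, probs=(0.10, 0.50, 0.90)):
    """Discharge values at the given non-exceedance probabilities."""
    return {p: float(np.quantile(obs_climatology, p)) for p in probs}


def event_probability(ensemble, threshold, exceed):
    """Fraction of members beyond the threshold at each forecast time.

    ensemble has shape (n_members, n_times). exceed=True counts members above
    the threshold (high flows); exceed=False counts members at or below it.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    in_event = ensemble > threshold if exceed else ensemble <= threshold
    return in_event.mean(axis=0)


# Synthetic climatology (1..100) and a four-member forecast for two dates.
climatology = np.arange(1.0, 101.0)
thresholds = flow_thresholds(climatology)
forecast = np.array([[5.0, 95.0], [15.0, 80.0], [8.0, 99.0], [9.0, 91.0]])
p_low = event_probability(forecast, thresholds[0.10], exceed=False)
p_high = event_probability(forecast, thresholds[0.90], exceed=True)
```

The resulting probabilities are the inputs that categorical and probabilistic verification scores operate on.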
ROC curves for the ensemble mean forecast from Eta–MGB without bias correction and after bias correction by the three methods, EQM, PQM, and LS, for stations located in the Upper Madeira basin.
ROC curves for the ensemble mean forecast from Eta–MGB without bias correction and after bias correction by the three methods, EQM, PQM, and LS, for stations located in the Middle Madeira basin.
ROC curves for the ensemble mean forecast from Eta–MGB without bias correction and after bias correction by the three methods, EQM, PQM, and LS, for stations located in the Lower Madeira basin.
The ROC diagram provides end-users with information on the reliability of the forecasts. As the points on the plot indicate the hit rates and false-alarm rates (FARs) associated with drought and flood conditions at different probability intervals, they can be used to make informed decisions by selecting a probability limit for issuing a bulletin about the event considered. For example, if a forecaster opts to issue a moderate drought advisory with a probability level of 10% and a 30 day lead time (Figure 5), Eta–MGB–RAW for Cachuela Esperanza can be expected to reach a hit rate higher than double the FAR (∼100% hit rate and ≤50% FAR). The simulations with bias correction perform very well, with hit rates above 80% and FARs below 10%. For medium streamflow, a probability limit of 50% (Pr = 0.5) for Eta–MGB–EQM reaches hit rates of at least 70% with FARs below 20% in all the sub-basins. In general, lower probability limits produce a better compromise between hits and false alarms. The economic cost of false alarms and the benefits of correct predictions have to be weighed when using predictions of extreme events (Roulin 2007).
The Middle and Lower Madeira sub-basins (Figures 6 and 7) present a better ability for discriminating the events in all the probability intervals, with hit rates higher than 80% and FARs lower than 20% for all predictions, with or without bias correction of the input precipitation data. However, at stations with larger catchment area such as Abuña, Porto Velho, and Manicoré (whose catchment area is >100 km2, according to Table S1, Supplementary Material), the predictions that utilized PQM and LS showed a tendency of reduced ability to discriminate peaks of streamflow.
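The way ROC points follow from a choice of probability limit can be sketched in Python. This is a generic illustration of the standard contingency-table definitions, with hypothetical names and made-up numbers; here the false-alarm rate is the fraction of non-events for which a warning was issued.

```python
import numpy as np


def roc_points(prob_fcst, event_obs, prob_levels):
    """Return (probability limit, hit rate, false-alarm rate) for each limit."""
    prob_fcst = np.asarray(prob_fcst, dtype=float)
    event_obs = np.asarray(event_obs, dtype=bool)
    points = []
    for p in prob_levels:
        warn = prob_fcst >= p  # a warning is issued when the probability reaches p
        hits = np.sum(warn & event_obs)
        misses = np.sum(~warn & event_obs)
        false_alarms = np.sum(warn & ~event_obs)
        correct_negatives = np.sum(~warn & ~event_obs)
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        hit_rate = hits / n_events if n_events else float("nan")
        far = false_alarms / n_non_events if n_non_events else float("nan")
        points.append((p, float(hit_rate), float(far)))
    return points


# Made-up forecast probabilities and observed event occurrences.
probs = [0.9, 0.6, 0.2, 0.1, 0.7, 0.0]
observed = [1, 1, 0, 0, 0, 1]
points = roc_points(probs, observed, prob_levels=[0.1, 0.5, 0.9])
```

Lowering the probability limit moves a point toward the upper right of the diagram: more events are caught, at the price of more false alarms.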
BSS for the ensemble mean forecast from Eta–MGB–RAW, Eta–MGB–EQM, Eta–MGB–PQM, and Eta–MGB–LS for stations in the Upper Madeira basin.
BSS for the ensemble mean forecast from Eta–MGB–RAW, Eta–MGB–EQM, Eta–MGB–PQM, and Eta–MGB–LS, for stations in the Middle Madeira basin.
BSS for the ensemble mean forecast from Eta–MGB–RAW, Eta–MGB–EQM, Eta–MGB–PQM, and Eta–MGB–LS, for stations in the Lower Madeira basin.
Figure 8 shows the reduction of relative ability with lead time for predictions without and with bias correction. The forecasts show higher degree of ability than the observed climatology. Forecasts with Eta–MGB–EQM and Eta–MGB–LS present higher ability than both the forecasts with Eta–MGB–PQM and Eta–MGB–RAW.
In general, the ability of forecasts tends to diminish with lead time in all discharge intervals and with decreasing sub-basin size. It also decreases in the regions closer to the Andes. Discharge in smaller basins is a direct and immediate response to precipitation, while in larger basins it is affected by sub-surface flow and flood propagation processes. Therefore, in smaller basins the ability of discharge prediction is more dependent on the ability of rainfall prediction in the basin.
The BSSs for the stations in the middle and lower basin are shown in Figures 9 and 10, respectively. Their behaviors are similar in the sense that basins with larger catchment areas (Abuña, Porto Velho, and Manicoré) present better performance in the simulations with Eta–MGB–EQM and Eta–MGB–LS. However, in contrast to the results for stations in the Upper Madeira, the forecasts with Eta–MGB–PQM performed worse than even the forecasts with no bias correction (Eta–MGB–RAW) for all lead times. This shows the deficiency of PQM in correcting the precipitation data that are fed to MGB for regions with large catchment areas. On the other hand, for regions with smaller catchment areas, the PQM correction shows the same ability as EQM.
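For reference, the Brier skill score against a climatological reference can be sketched in Python as follows. This uses the standard definition with hypothetical function names; taking the climatological probability as the observed event frequency of the sample is an assumption made here for illustration.

```python
import numpy as np


def brier_score(prob_fcst, event_obs):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    p = np.asarray(prob_fcst, dtype=float)
    o = np.asarray(event_obs, dtype=float)
    return float(np.mean((p - o) ** 2))


def brier_skill_score(prob_fcst, event_obs, clim_prob=None):
    """BSS = 1 - BS / BS_ref, where the reference always forecasts clim_prob."""
    o = np.asarray(event_obs, dtype=float)
    if clim_prob is None:
        clim_prob = o.mean()  # sample event frequency (assumed reference)
    bs_ref = brier_score(np.full_like(o, clim_prob), o)
    return 1.0 - brier_score(prob_fcst, o) / bs_ref


observed = [1, 0, 0, 0]
bss_perfect = brier_skill_score([1.0, 0.0, 0.0, 0.0], observed)
bss_climatology = brier_skill_score([0.25, 0.25, 0.25, 0.25], observed)
```

Positive values indicate forecasts that outperform climatology, zero matches it, and negative values are worse than always forecasting the climatological frequency.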
Performance diagram for the ensemble mean forecast from RCM-Eta without bias correction and after bias correction by the three methods, EQM, PQM, and LS. The x-axis shows the SR, i.e., one minus the false-alarm rate (1 − FAR). The y-axis shows the POD. Solid lines indicate the frequency bias and dashed curves show the CSI. Color designations of the points are shown in the inset in the top-left panel. There are 30 different symbols, one for each lead time from 1 to 30 days, for stations in the Upper Madeira basin. Please refer to the online version of this paper to see this figure in colour.
The points representing the bias-corrected forecasts cluster nearer to the upper right corner than those representing the predictions without bias correction. In general, the Eta–MGB–EQM ensemble, marked with red circles, is closer to the line of frequency bias equal to 1.
CSI for the stations in the upper basin (Figure 11) shows better performance for Eta–MGB–EQM, followed by Eta–MGB–PQM and Eta–MGB–LS, for all the thresholds analyzed. In terms of SR, the results for Pr = 0.1 and Pr = 0.5 are similar among the three bias correction methods, but for higher Pr the BIAS values for Eta–MGB–PQM and Eta–MGB–LS were lower than 1.0, indicating that these methods underpredict the frequency of high streamflow situations in this region. A relatively higher FAR value for the Cachuela Esperanza station, combined with BIAS larger than 1.1, indicates overestimation of discharge events with Pr > 0.9.
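The categorical scores discussed above (POD, FAR, SR, BIAS, and CSI) all derive from a 2 × 2 contingency table of threshold exceedances. The sketch below uses the standard definitions, with FAR taken as the false-alarm ratio so that SR = 1 − FAR. The function names, the threshold convention (event when the value is at or above the threshold), and the counts are illustrative assumptions, not values from this study.

```python
import numpy as np

def contingency_counts(forecast, observed, threshold):
    """Count hits, misses and false alarms for exceedance of a threshold."""
    f = np.asarray(forecast, dtype=float) >= threshold
    o = np.asarray(observed, dtype=float) >= threshold
    hits = int(np.sum(f & o))
    misses = int(np.sum(~f & o))
    false_alarms = int(np.sum(f & ~o))
    return hits, misses, false_alarms

def categorical_scores(hits, misses, false_alarms):
    """Scores plotted on a performance diagram (assumes non-zero denominators)."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false-alarm ratio
    sr = 1.0 - far                                  # success ratio
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias
    csi = hits / (hits + misses + false_alarms)     # critical success index
    return {"POD": pod, "FAR": far, "SR": sr, "BIAS": bias, "CSI": csi}

# Made-up counts, for illustration only.
scores = categorical_scores(hits=42, misses=18, false_alarms=12)
```

With these definitions, a BIAS below 1 means the event is forecast less often than it is observed, and a BIAS above 1 means it is forecast more often.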
Performance diagram for the ensemble mean forecast from RCM-Eta without bias correction and after bias correction by the three methods, EQM, PQM, and LS. The x-axis shows the SR, i.e., one minus the false-alarm rate (1 − FAR). The y-axis shows the POD. Solid lines indicate the frequency bias and dashed curves show the CSI. Color designations of the points are shown in the inset in the top-left panel. There are 30 different symbols, one for each lead time from 1 to 30 days, for stations in the Middle Madeira basin. Please refer to the online version of this paper to see this figure in colour.
Performance diagram for the ensemble mean forecast from RCM-Eta without bias correction and after bias correction by the three methods, EQM, PQM, and LS. The x-axis shows the SR, i.e., one minus the false-alarm rate (1 − FAR). The y-axis shows the POD. Solid lines indicate the frequency bias and dashed curves show the CSI. Color designations of the points are shown in the inset in the top-left panel. There are 30 different symbols, one for each lead time from 1 to 30 days, for stations in the Lower Madeira basin. Please refer to the online version of this paper to see this figure in colour.
Evaluation of water level ensemble predictions
Water (river) level anomaly. Gray: observation data; black: raw simulation, Eta–MGB–RAW; red: bias-corrected Eta–MGB–EQM; blue: bias-corrected Eta–MGB–PQM; green: bias-corrected Eta–MGB–LS.
Taylor diagram showing the correlation, the centered root mean square error (CRMSE), and the ratio between the SDs of the simulated and observed river levels at nine stations distributed in the basin. The station locations are shown in Figure 1. Ensemble simulation without bias correction: black points. After the three methods of bias correction: Eta–MGB–EQM in red, Eta–MGB–PQM in blue, and Eta–MGB–LS in green. The linear distance from the green square mark is proportional to CRMSE. Please refer to the online version of this paper to see this figure in colour.
Overall, the model performance is satisfactory for all analyzed sub-basins. The river level graphs presented in Figure 14 show the progression of the flood peaks and drought minima along the basin. The Eta–MGB–RAW simulation overestimated the maxima and underestimated the minima of the water level variability, mainly in the sub-basins with larger catchment areas such as Cachuela Esperanza, Porto Velho, Humaitá, and Manicoré. However, the predictions with bias-corrected precipitation data show improvements in the forecasts of the peaks and minima. Performance metrics of the predictions are satisfactory, with SD ratio and CC values near unity and small CRMSE values (Figure 15). The EQM and PQM bias correction methods yield predictions closer to the observational data.
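The three statistics summarized on a Taylor diagram can be computed directly from paired simulated and observed series. The sketch below follows the usual definitions (population standard deviations, anomalies taken about each series' own mean). The function name and the sample values are illustrative assumptions.

```python
import numpy as np

def taylor_statistics(simulated, observed):
    """Return (CC, SD ratio, CRMSE) for two equally long series."""
    s = np.asarray(simulated, dtype=float)
    o = np.asarray(observed, dtype=float)
    cc = float(np.corrcoef(s, o)[0, 1])      # correlation coefficient
    sd_ratio = float(s.std() / o.std())      # simulated SD over observed SD
    # Centered RMSE: the mean of each series is removed before differencing,
    # so a constant offset between the series does not contribute.
    crmse = float(np.sqrt(np.mean(((s - s.mean()) - (o - o.mean())) ** 2)))
    return cc, sd_ratio, crmse

# Hypothetical river levels: the simulation doubles the amplitude and adds
# a constant offset of 3.
obs_levels = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
sim_levels = 2.0 * obs_levels + 3.0
cc, sd_ratio, crmse = taylor_statistics(sim_levels, obs_levels)
```

In this example the correlation is perfect and the offset is ignored, but the doubled amplitude shows up as an SD ratio of 2 and a non-zero CRMSE, which is why the diagram needs all three quantities together.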
The performance of MGB forced by the outputs from RCM-Eta, with and without bias correction, in representing the extent of the inundated area in the Madeira basin is shown in Figure 16. The simulations are compared with GIEMS-2 observational data. The maximum inundated area of >39,000 km² occurred in the period February through April, and the minimum occurred in the period August through October. The MGB represents the seasonality of the inundated area satisfactorily. The simulations with bias-corrected data improved on the simulations with raw data. The predictions from Eta–MGB–RAW underestimate the inundated area for the whole period, while the predictions from Eta–MGB–EQM are closer to the GIEMS-2 data.
SUMMARY AND CONCLUSIONS
This study presents an estimate of the potential skill of intraseasonal discharge forecasts for different probability categories over the largest sub-basin of the Amazon. Raw and bias-corrected precipitation from RCM-Eta are used as inputs to the coupled MGB hydrological–hydrodynamic model to produce an ensemble of discharge forecasts. The model is calibrated and validated with observed data at selected stations along the Madeira River and its tributaries, a region prone to floods and droughts. Forecasts obtained with the raw precipitation data were able to capture the seasonality of high and low discharges, as well as of inundated areas, with high temporal correlations. However, these forecasts showed large uncertainties.
The performance of the hydrometeorological forecasting system varied in space, in time, and with lead time. The consistency of consecutive days with accurate forecasts, as well as the system's capacity to differentiate between low, moderate, and high discharges on an intraseasonal scale, varied across locations. All performance metrics and quality criteria indicated that forecasts, both before and after bias correction, remained reliable up to a 30-day lead time. However, preprocessing the meteorological data fed into the hydrological model can enhance forecasts of discharge, river level, and inundated areas. These enhancements are particularly notable in scenarios of low and moderate river flow and for shorter lead times. Bias correction of precipitation data is recommended, as the preprocessing procedure not only reduces bias but also facilitates the quantification of forecast quality improvements. Of all the methods tested, EQM performed best in terms of transferability and robustness for projections of hydrological extremes.
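For readers unfamiliar with the best-performing correction, the following is a minimal sketch of empirical quantile mapping in its generic textbook form. It is not the implementation used in this study. The function name, the choice of 99 quantile nodes, and the synthetic gamma-distributed data are illustrative assumptions.

```python
import numpy as np

def empirical_quantile_mapping(model_hist, obs_hist, model_new, n_quantiles=99):
    """Map model values onto the observed distribution, quantile by quantile.

    Assumption: quantiles 0.01 to 0.99 of the historical model and observed
    series define a piecewise-linear transfer function. Values outside that
    range are clamped to the end nodes by np.interp.
    """
    q = np.linspace(0.01, 0.99, n_quantiles)
    model_q = np.quantile(model_hist, q)
    obs_q = np.quantile(obs_hist, q)
    return np.interp(model_new, model_q, obs_q)

# Synthetic example: the "model" is a biased, rescaled copy of the "observations".
rng = np.random.default_rng(0)
obs_precip = rng.gamma(shape=2.0, scale=5.0, size=2000)
model_precip = 1.5 * obs_precip + 2.0
corrected = empirical_quantile_mapping(model_precip, obs_precip, model_precip)
```

A practical application would also need to handle ties at zero precipitation and values beyond the calibration range, which this sketch deliberately leaves out.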
We show that the ability of the forecasts to discriminate low, medium, and high discharge episodes depends on the size of the sub-basin. For some sub-basins the discriminating ability diminished with catchment area, even for longer lead times. The hit rate from the ROC curves is >70%, which is considered adequate for decision-makers. The forecast system developed here, i.e., the models, their coupling, the bias correction schemes, and the combined statistical evaluations, shows potential for producing reliable forecasts in the Madeira basin. Furthermore, the system showed a robust ability to discriminate between different flow situations: high, medium, and low. The model's performance is superior in larger catchment areas compared with smaller basins. This differential performance may be related to the spatial and temporal variability of precipitation inputs and the complexity of hydrological processes in smaller basins. Coughlan de Perez et al. (2015) suggested that an ensemble system that produces <50% false alarms helps decision-makers in terms of economic consequences. In all conditions of low, moderate, and high discharge, the CRPSS values for smaller basins are lower than those for larger basins. This is perhaps an indication that the uncertainties in the meteorological parameters are larger than the uncertainties in the hydrological parameters. Our results suggest that large-basin hydrology is affected more by the initial conditions than by the predicted rain, which becomes important only after a sufficiently long time. The propagation of inundation is slow in basins with gentle slopes, as is observed in the Madeira basin.
Our findings also emphasize the usefulness of the dynamic downscaling and statistical procedures used in this study for forecasts along an important tributary in the Amazon basin, producing skillful forecasts of discharge, river level, and inundated area for lead times up to 30 days. In a basin where observations and forecasts are scant, intraseasonal forecasts can improve the decision-making process in many spheres, such as flood and drought warning. Useful forecasts of imminent floods or inundations should leave enough time for the warning to reach the people who will eventually be affected. This study emphasizes the necessity not only of evaluating performance against observations but also of quantifying the uncertainties in the forecasts. The results obtained, together with their explanations and interpretations, provide sufficient evidence for building confidence in the forecast system for floods and droughts as a guiding tool for decision-making, and the system can be integrated into existing operational systems.
The main results and conclusions of the study are: (1) the Eta + MGB ensemble predictions of the hydrological parameters for the Madeira basin, in general, show good performance; (2) the model performance is not uniform in the basin, showing differences in performance in the upper, middle, and lower portions of the basin; (3) removal or reduction of systematic errors or biases in the meteorological data before using them to force a hydrological model greatly improves the performance, as seen from statistical metrics; (4) among the three bias correction schemes used, EQM performs best; (5) the model has the capacity to discriminate between high, medium, and low streamflow situations; (6) the model performance is better for sub-basins with larger catchment areas; and (7) the hit rates exceed the false alarms in all probability ranges, approximately in the ratio of 70:20. In summary, we conclude that the ensemble prediction system developed here, coupling the regional atmospheric model RCM-Eta with the large-scale MGB hydrological model, can potentially be used for operational forecasts of discharge, river level, and inundated area in the Madeira River basin.
ACKNOWLEDGEMENTS
This study was financed in part by the Fundação de Amparo à Pesquisa do Estado do Amazonas (FAPEAM) – Finance Code: 01.02.016301.00268/2021. This work was developed in the Postgraduate Program in Climate and Environment (CLIAMB), jointly coordinated by the Amazon State University (UEA) and the National Institute of Amazonian Research (INPA). The first author is grateful to FAPEAM for the doctoral grant. Prakki Satyamurty is supported by PVNS Grant No. 2308.019802/2018-7 from CAPES, Brazil and Research Productivity Grant No. 306486/2021-0 of CNPq, Brazil. All authors are thankful to the Center for Weather Forecasting and Climate Studies – National Institute of Space Research (CPTEC/INPE) for making available the numerical integration data sets and to the Laboratory of Terrestrial Climate System Modeling (LABCLIM/UEA) for providing the computational infrastructure – TAMBAQUI Cluster.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.