Abstract
Preparing response measures for climate-induced extreme droughts requires consideration of various weather conditions. This study generated extreme drought weather data using the Weather Research and Forecasting (WRF) model and applied them to Long Short-Term Memory (LSTM), a deep learning artificial intelligence model, to produce runoff instead of using conventional rainfall–runoff models. The standardized streamflow index (SSFI), a hydrological drought index, was then calculated from the generated runoff to predict extreme droughts. The sensitivity test of meteorological data to runoff showed that adding similar types of meteorological data did not improve runoff simulations, with a maximum difference of 0.02 in Nash–Sutcliffe efficiency. For the drought year of 2015, the runoff generated by WRF and LSTM exhibited lower monthly runoff and more severe SSFI values (below −2) than the observed data. This shows the significance of WRF-generated meteorological data in simulating potential extreme droughts based on possible physical atmospheric conditions using numerical representations. Furthermore, LSTM can simulate runoff without requiring specific physical data of the target catchment; therefore, it can simulate runoff in any catchment, including those in developing countries with limited data.
HIGHLIGHTS
Applicability of the combination of a weather model (WRF) with a deep learning artificial intelligence model (LSTM) was tested for extreme droughts.
Meteorological data during severe drought periods can be simulated using a physics-based weather model based on diverse physical atmospheric conditions.
In the applications of LSTM, similar types of meteorological data are not conducive to the improvement of LSTM runoff.
LSTM with WRF can generate diverse severe droughts that may occur.
INTRODUCTION
Owing to the impact of climate variability, severe weather conditions (e.g. floods and droughts) have become more frequent and serious worldwide (Sheffield & Wood 2008; Ali et al. 2020). In the future, severe droughts are expected to occur worldwide due to climate change (Dai 2013). Aridity is a permanent climatic characteristic of a specific region, such as Iran, that results from long-term low precipitation and high temperature (Araghi et al. 2018). Drought, on the other hand, is a temporary but recurring anomaly caused by departures from normal weather conditions, and it can occur in any region regardless of climate (Karamouz et al. 2013). Korea experienced a severe drought in 2015, which caused critical agricultural and economic damage. In the near future, more serious droughts can occur because of deteriorating climatic conditions (Lee et al. 2019). To respond to such future extreme droughts, it is necessary to generate meteorological data for various atmospheric conditions that may occur in the future and to calculate the runoff during the drought period using hydrological models. The generated runoff data can be used to determine the intensity and duration of hydrological droughts and can be applied in policy making.
Methods for estimating future droughts caused by climate change vary from far-future predictions using general circulation models (GCMs) (predicting weather data up to 100 years) to near-future predictions using artificial intelligence models (Burke et al. 2006; Dikshit et al. 2021; Zhang et al. 2021). Based on the future CO2 reduction scenarios, GCMs predict future meteorological data but have various uncertainties because of the uncertainties in GCMs and scenarios (Woldemeskel et al. 2014). GCMs are incapable of presenting the regional behavior of droughts because of their coarse resolution (Vicente-Serrano et al. 2004; Maule et al. 2013). As an alternative, regional climate models (RCMs) can provide a finer resolution (∼30 km) but also have diverse uncertainties because of various parameter configurations, nesting procedures, and boundary conditions (PaiMazumder & Done 2014; Ojeda et al. 2017). Therefore, with consideration of uncertainties, the drought prediction results obtained using GCMs and hydrological models can be used to set general far-future policy directions (e.g. Shin et al. 2016). For near-future predictions, the drought index can be estimated by artificial intelligence applications using various weather data (Dikshit et al. 2021); however, the predicted drought may not be the extreme local drought that can occur.
To estimate the possible extreme regional droughts in the future, a physics-based weather model can be used to generate possible meteorological data. The meteorological data for the drought period can then be applied to a rainfall–runoff model to calculate the possible runoff, which is used to estimate the drought index. In particular, precipitation and temperature data are the major components for representing drought conditions, and both can be determined using a regional-scale weather model (Barrera-Escoda et al. 2014; Bowden et al. 2016). Accordingly, some researchers have applied the Weather Research and Forecasting (WRF) model for drought prediction. Ojeda et al. (2017) computed two drought indices (the standardized precipitation index (SPI) and the standardized precipitation evapotranspiration index (SPEI)) from WRF outputs and compared them with observations. The WRF outputs gave better results at longer timescales. Lee et al. (2022) used the WRF-Hydro model, a process-based hydrologic model, to compare three different drought concepts, namely hydrological, agricultural, and meteorological droughts. Bowden et al. (2016) used the WRF model as an RCM for the dynamical downscaling of reanalysis fields in SPI comparisons. In homogeneous terrain, WRF improved the timing and intensity of moderate to wet and dry periods compared with larger-scale reanalysis data. In another study, Ahmadi et al. (2015) developed a support vector machine and predicted precipitation using sea level pressure, differences in sea level pressure, and sea surface temperature data. However, few studies have attempted to generate various weather conditions (rainfall, wind, and temperature) using different physical representations of the WRF model and to combine them with a recurrent neural network for predicting severe droughts. Runoff can be simulated and drought assessed using the Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998), a physically based semi-distributed model.
Liang et al. (2021) used the SWAT model to simulate the hydrological variables needed to calculate three univariate drought indices for the historical data period. That is, they used the SWAT model as a basic hydrological model to evaluate drought in a watershed located in northwest China. Yousefi & Moridi (2022) combined the SWAT model with a nondominated sorting differential evolution algorithm in a simulation–optimization model to develop climate change mitigation and adaptation strategies, and performed optimal reservoir operation. Unlike physically based (semi-)distributed hydrological models (e.g. SWAT) or conceptual rainfall–runoff models, recurrent neural network models have the great advantage that they can simulate the rainfall–runoff process for any target catchment because there are no restrictions on input data selection for runoff simulation (Ditthakit et al. 2023; Sayed et al. 2023). Therefore, simulating runoff with a now widely used artificial intelligence model rather than a general hydrological model can be appropriate for various regions, including developing countries.
The purpose of this study is to prepare for future water resource shortages by predicting extreme droughts that can occur due to diverse physical atmospheric conditions. The novelty of this study is that it generates extreme drought weather data using the WRF model and applies them to Long Short-Term Memory (LSTM), a deep learning artificial intelligence model applicable to catchments with limited data, to produce runoff instead of using a general rainfall–runoff model. First, various possible meteorological data for a drought period were generated using WRF as a physics-based regional meteorological generator. Then, possible extreme low flows were generated by the LSTM model using the generated meteorological data. Finally, the hydrological drought index was calculated from the observed and generated low flows to estimate how severe a drought is likely to be. This study therefore demonstrates a novel method to calculate possible extreme hydrological droughts using LSTM and WRF with various weather conditions.
MATERIALS AND METHODS
Study area and data
The Weather Research and Forecasting model
To generate plausible atmospheric conditions, the WRF model was applied in this study. In general, WRF is used for mesoscale weather prediction and regional weather simulation using diverse parameterizations (Skamarock et al. 2008). Physical representations and parameterizations produce spurious oscillations and numerical smoothing of the mass, momentum, and entropy of the atmosphere using prognostic equations (Pattanayak & Mohanty 2008). For drought evaluation, quantitative precipitation simulation is a substantial component. Precipitation is represented by the atmospheric heat and moisture flux status in the microphysics and cumulus parameterizations.
The microphysics parameterization provides cloud properties and structures on the meso- and cloud-scale using numerical representations. The precipitation development procedures (e.g. generation, growth, and fall) use diverse states of ice in the cloud, such as cloud ice, graupel/hail, and snow. The microphysics parameterizations represent ice-phase hydrometeors in mid-latitude deep convective clouds and in simulations with smaller grids (Jung & Lin 2016). In the present study, three different microphysics schemes, namely the Purdue Lin scheme (also called the Lin–Farley–Orville scheme; Lin et al. 1983; Chen & Sun 2002), the Ferrier (New Eta) microphysics scheme (Rogers et al. 2001), and the Single-Moment 6-Class microphysics scheme (WSM6; Hong & Lim 2006), were adopted to generate plausible atmospheric conditions. For the cumulus parameterization, which represents the temperature and humidity of the atmosphere, cloud tendency profiles, and convective rainfall, the Betts–Miller–Janjić (BMJ) scheme (Betts & Miller 1993; Janjić 1994) was selected. Based on the given parameters, we used three combinations of microphysics and a cumulus parameterization for the WRF simulations (Table 1). Our study had a final domain grid size smaller than 4 km; thus, cumulus parameterization could be ignored because the smaller grid size contained mesoscale dynamics and local sub-grids with physical functions (Gilliland & Rowe 2007). However, we followed the recommendation of Gerard (2007) to use cumulus parameterization together with microphysics parameterizations.
Combination numbering | Microphysics | Cumulus parameterization |
---|---|---|
0202 | Purdue Lin scheme (02) | Betts–Miller–Janjić (BMJ) scheme (02) |
0502 | Ferrier (New Eta) microphysics scheme (05) | Betts–Miller–Janjić (BMJ) scheme (02) |
0702 | Single-Moment 6-Class microphysics scheme (07) | Betts–Miller–Janjić (BMJ) scheme (02) |
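The combination numbering in Table 1 can be read as a two-digit microphysics code followed by a two-digit cumulus code. The following is a minimal Python sketch of that reading; the dictionaries simply copy the codes printed in Table 1, and the function names `decode_combination` and `all_combinations` are hypothetical helpers introduced here for illustration, not part of WRF.

```python
# Codes are copied from Table 1. Interpreting the four-digit label as
# "microphysics code + cumulus code" is an assumption about the numbering.
MICROPHYSICS = {
    "02": "Purdue Lin scheme",
    "05": "Ferrier (New Eta) microphysics scheme",
    "07": "Single-Moment 6-Class microphysics scheme",
}
CUMULUS = {"02": "Betts-Miller-Janjic (BMJ) scheme"}


def decode_combination(label):
    """Split a label such as '0502' into (microphysics name, cumulus name)."""
    if len(label) != 4:
        raise ValueError(f"expected a four-digit label, got {label!r}")
    return MICROPHYSICS[label[:2]], CUMULUS[label[2:]]


def all_combinations():
    """List every microphysics/cumulus pairing that appears in Table 1."""
    return [mp + cu for mp in sorted(MICROPHYSICS) for cu in sorted(CUMULUS)]


if __name__ == "__main__":
    for label in all_combinations():
        print(label, decode_combination(label))
```

Keeping the scheme codes in one lookup makes it straightforward to add further microphysics or cumulus options if more weather scenarios are wanted.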
To initialize the WRF simulation and update its boundary conditions for three years (2014–2016), the National Centers for Environmental Prediction (NCEP) Final (FNL) Operational Global Analysis data, downloaded from the NCEP data archive, were applied. The FNL data are produced every six hours on a global 1° × 1° grid using the Global Forecast System (GFS) with observational data updates (https://www.mmm.ucar.edu/models/wrf).
Long Short-Term Memory
To train the LSTM, hyperparameters must be set; however, there is no clear method for setting them (Chollet & Allaire 2018), so we set them through trial and error. The number of hidden units in the LSTM layer was set to 100 to provide a sufficient number for learning. If the number of hidden units is too large, the model may overfit the data of the learning period, lowering its performance for the validation or testing period. To prevent overfitting, we used the callback and dropout functions. The callback terminates the training process early if the validation result has not improved after an arbitrarily set number of repetitions (here, 10), and dropout randomly removes output features of a layer at an arbitrarily set ratio (here, 0.5) during the learning process (Srivastava et al. 2014). The batch size, i.e. the number of data in each of the small sample groups into which the entire data set is divided for efficient learning, was set to 10. For parameter optimization, we applied Adam (Kingma & Ba 2014), which is widely used in the deep learning field (Le et al. 2019), with the learning rate set to the default value of 0.001 and the mean absolute error as the objective function. In addition, the predictive ability of LSTM was evaluated using a testing period independent from the training and validation periods. To perform LSTM modeling, the R package Keras (Falbel et al. 2019) was applied.
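The early-termination rule and the objective function described above can be illustrated without any deep learning library. The following is a minimal plain-Python sketch, not the R Keras code used in this study: the `SETTINGS` dictionary only restates the values given in the text, and `early_stopping_epoch` and `mean_absolute_error` are hypothetical helper names. The exact counting convention of the Keras callback may differ slightly from this sketch.

```python
# Hyperparameter values as stated in the text (key names are illustrative).
SETTINGS = {
    "hidden_units": 100,
    "dropout_rate": 0.5,
    "batch_size": 10,
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "loss": "mean absolute error",
    "patience": 10,
}


def mean_absolute_error(observed, simulated):
    """Average absolute difference between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def early_stopping_epoch(val_losses, patience=SETTINGS["patience"]):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving on the best value seen so far. If that never
    happens, the final epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)
```

For example, if the validation loss reaches its best value at epoch 3 and never improves afterwards, training with a patience of 10 stops at epoch 13.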
Hydrological drought index
Classification | SSFI value | Probability (%) |
---|---|---|
Extremely wet | SSFI > 2 | 2.3 |
Very wet | 1.50 < SSFI < 1.99 | 4.4 |
Moderately wet | 1.00 < SSFI < 1.49 | 9.2 |
Near normal | −0.99 < SSFI < 0.99 | 68.2 |
Moderately dry | −1.49 < SSFI < −1.00 | 9.2 |
Severely dry | −1.99 < SSFI < −1.50 | 4.4 |
Extremely dry | SSFI < −2.0 | 2.3 |
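The classification in the table above can be applied programmatically. The following is a minimal Python sketch with a hypothetical function name, `classify_ssfi`. The table leaves small gaps between classes (e.g. between 1.49 and 1.50), so the treatment of boundary values here is an assumption rather than part of the original classification.

```python
def classify_ssfi(ssfi):
    """Map an SSFI value to its drought/wetness class.

    Boundary handling is an assumption: each class boundary is assigned to
    the class further from zero on the wet side and closer to zero on the
    dry side, except that exactly -2.0 is treated as extremely dry.
    """
    if ssfi >= 2.0:
        return "Extremely wet"
    if ssfi >= 1.5:
        return "Very wet"
    if ssfi >= 1.0:
        return "Moderately wet"
    if ssfi > -1.0:
        return "Near normal"
    if ssfi > -1.5:
        return "Moderately dry"
    if ssfi > -2.0:
        return "Severely dry"
    return "Extremely dry"
```

With this rule, an SSFI of −2.3 falls in the extremely dry class, while an SSFI of −1.2 is classified as moderately dry.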
To calculate the SSFI, determining a probability distribution suitable for streamflow is essential. After applying the observed runoff from the Andong dam catchment to the Weibull, gamma, log-normal, and log-logistic distributions, the log-logistic distribution was found to be the most appropriate. Vicente-Serrano et al. (2012) conducted a study on the accurate calculation of SSFI and suggested that the log-logistic distribution is a more suitable probability distribution than the log-normal distribution, which has been widely used to fit monthly streamflow. Therefore, in this study, the SSFI of the observed runoff was calculated using a log-logistic distribution. In addition, the SSFI of the generated runoff for each weather scenario was calculated using the estimated log-logistic distribution coefficient for the observed runoff.
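The transformation from runoff to SSFI can be sketched as follows: the runoff is passed through the fitted log-logistic cumulative distribution function, and the resulting probability is converted to a standard normal quantile. This is a minimal Python sketch and not the 'SPEI' package implementation. The three-parameter form of the distribution, the single parameter set, and the numerical parameter values in the usage example are assumptions for illustration only.

```python
from statistics import NormalDist


def log_logistic_cdf(x, alpha, beta, gamma=0.0):
    """CDF of the three-parameter log-logistic distribution.

    alpha is the scale, beta the shape, and gamma the origin (location).
    """
    if x <= gamma:
        return 0.0
    return 1.0 / (1.0 + ((x - gamma) / alpha) ** (-beta))


def ssfi_from_flow(x, alpha, beta, gamma=0.0, eps=1e-6):
    """Convert an accumulated runoff value to a standardized index.

    The probability is clipped to (eps, 1 - eps) so that the inverse
    normal CDF stays finite for very low or very high flows.
    """
    p = log_logistic_cdf(x, alpha, beta, gamma)
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


if __name__ == "__main__":
    # Hypothetical coefficients, standing in for those fitted to observations.
    alpha, beta = 400.0, 3.0
    for flow in (100.0, 300.0, 400.0, 600.0):
        print(flow, round(ssfi_from_flow(flow, alpha, beta), 2))
```

Because the same coefficients are used for the observed and generated runoff, a generated flow lower than any observed flow maps directly to a more negative index, which is what allows the scenarios to be compared on a common scale.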
When calculating the SSFI, the time period can be set as short (3 or 6 months) or long (12 or 24 months). If the time period is short, the SSFI frequently moves above and below the zero value; thus, the frequency of drought events increases (McKee et al. 1993). If the time period is too short, it cannot properly reflect the actual drought. As a result of analyzing the observed flow in the Andong dam catchment, the drought phenomenon that occurred in 2015 could be properly reflected when the duration was 12 months rather than 3 or 6 months. Therefore, a duration of 12 months was used for the drought analysis. In this study, the R package ‘SPEI’ (Beguería & Vicente-Serrano 2015) was used to estimate the hydrological drought index.
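The effect of the chosen duration can be seen in the accumulation step that precedes the index calculation. The following is a minimal Python sketch that assumes a simple unweighted trailing sum of monthly runoff; the aggregation actually used by the 'SPEI' package may differ, and `rolling_sum` is a hypothetical helper name.

```python
def rolling_sum(monthly_values, window=12):
    """Return trailing sums over `window` months.

    The first window - 1 entries are None because a full window of data
    is not yet available for them.
    """
    out = []
    running = 0.0
    for i, value in enumerate(monthly_values):
        running += value
        if i >= window:
            # Drop the month that has just left the trailing window.
            running -= monthly_values[i - window]
        out.append(running if i >= window - 1 else None)
    return out
```

A longer window smooths month-to-month fluctuations, so the resulting index crosses zero less often than one based on a 3- or 6-month window.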
Method
RESULTS
LSTM preparation – sensitivity analysis and optimal data set selection
A sensitivity analysis of monthly weather data was performed for the following three cases. Case 1 used five types of data (precipitation, maximum temperature, minimum temperature, wind speed, and sunshine duration), Case 2 used four types of data (precipitation, maximum temperature, minimum temperature, and wind speed), and Case 3 used three types of data (precipitation, average temperature, and wind speed).
Case | NSE (training) | NSE (validation) | NSE (testing) | RMSE in mm (training) | RMSE in mm (validation) | RMSE in mm (testing) |
---|---|---|---|---|---|---|
Case 1 (five inputs: precipitation, maximum and minimum temperature, wind speed, and sunshine duration) | 0.83 | 0.91 | 0.84 | 26.5 | 27.5 | 24.2 |
Case 2 (four inputs: precipitation, maximum and minimum temperature, and wind speed) | 0.81 | 0.91 | 0.84 | 28.0 | 27.8 | 24.3 |
Case 3 (three inputs: precipitation, average temperature, and wind speed) | 0.81 | 0.90 | 0.84 | 28.2 | 29.2 | 24.5 |
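For reference, the two performance measures reported in the table can be computed as in the following minimal Python sketch, which uses the standard definitions of the Nash–Sutcliffe efficiency (NSE) and the root mean square error (RMSE). The function names are illustrative, and the inputs are assumed to be equal-length sequences of observed and simulated monthly runoff.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, and 0 means the
    simulation is no better than the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def rmse(observed, simulated):
    """Root mean square error, in the same units as the runoff (here, mm)."""
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))
```

Because NSE is normalized by the variance of the observations while RMSE is not, the two measures can rank periods differently, which is why both are reported for the training, validation, and testing periods.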
Applications of generated WRF data
Comparative analysis of hydrological drought index
DISCUSSION
As mentioned in Section 2, WRF was used to generate possible weather data sets using physically represented parameters. In the field of meteorology, WRF applications have mostly focused on weather prediction and other atmospheric phenomena with various parameter adaptations. As Powers et al. (2017) pointed out, several studies have focused on hurricanes (Chen et al. 2011; Moon & Nolan 2015), organized convection (Meng et al. 2012; Morrison et al. 2012; Akter 2015; Xu et al. 2015), mesoscale weather events (Brewer et al. 2013; DuVivier & Cassano 2013; Mass et al. 2014; Parish et al. 2015), and synoptic and mesoscale developments related to cyclones, fronts, and jets (Schultz & Sienkiewicz 2013; Thompson & Eidhammer 2014; Ganetis & Colle 2015; Lu & Deng 2015; Rostom & Lin 2015). In particular, parameter selection is critical for predicting heavy storms, cyclones, and diverse weather events (Baki et al. 2022). Most studies have searched for the optimal parameter sets that best match certain events in selected regions (Jung & Lin 2016; Baisya et al. 2017; Chakraborty et al. 2021; Chinta et al. 2021; Baki et al. 2022). In contrast, this study used diverse parameters to generate plausible weather data, as in Choi & Jung (2022). A similar concept underlies meteorological prediction, in which weather forecasters provide the percentage chance of rainfall and other events based on diverse simulations with different parameter sets.
LSTM was able to reasonably simulate the observed runoff using only three meteorological variables (see Case 3 in Table 3), and the result was equivalent to the simulated runoff obtained using five meteorological variables (Case 1). This means that the type of input data is more important than the number of input variables, as argued by Le et al. (2019). The difference between Cases 3 and 1 was the use of average temperature in Case 3 and of maximum temperature, minimum temperature, and sunshine hours in Case 1. Sunshine hours, maximum temperature, and minimum temperature are variables of the same type because sunshine hours are closely related to temperature. The three temperature-related variables in Case 1 were thus of the same type as the average temperature variable in Case 3, and there was no significant difference in the simulation results of the two cases. In addition, with a large amount of data of the same type, the total amount of measurement error (e.g. McMillan et al. 2011) in the data can also increase, which in turn can increase the uncertainty in the simulation results. Therefore, it is necessary to reduce uncertainty by preventing the duplication of data types.
This study suggested the possible occurrence of extreme droughts in the future using various WRF scenarios and LSTM. Previously, a meteorological drought in Spain (Ojeda et al. 2017) and a hydrological drought in Korea (Lee et al. 2022) were investigated using WRF. However, these studies did not investigate the possible droughts caused by various weather conditions. The advantage of WRF is that it can generate physically probable meteorological data sets using a variety of meteorological conditions. This study differs from previous studies in that it exploited this advantage.
One limitation of this study is that the WRF simulation grid size was constrained by simulation time, because simulating on a smaller grid size requires more time. For this study, a 4 km grid size was used for the final domain to ensure the most accurate simulation. If long simulation time is a constraint, approximate drought simulation results can be obtained by increasing the grid size in WRF. It is also possible to use WRF on catchments that are smaller than the one used in this study. Another limitation is that this study analyzed drought in only one catchment in Korea as a pilot study. However, the research method can be applied to other catchments in the future, and useful drought prediction results can be derived. Additionally, stakeholders and policymakers in the regions of interest can use the prediction results of possible future droughts to develop policies for managing extreme droughts.
SUMMARY AND CONCLUSIONS
Severe weather conditions (e.g. floods and droughts) induced by climate change require predictions so that countermeasures can be prepared using available science and technologies. Korea was no exception to climate change and experienced a severe drought in 2015, which caused considerable agricultural and economic damage. In this study, physically possible meteorological data for drought simulations were generated using a physics-based weather model, WRF, and applied to a deep learning artificial intelligence model (LSTM) to produce runoff data in the Andong dam catchment during the drought period. LSTM was employed because, unlike conceptual or physically based rainfall–runoff models, it has no restrictions on input data selection for runoff simulation. In addition, the SSFI, a hydrological drought index, was calculated from the generated runoff data sets for each meteorological scenario to predict extreme droughts that may occur in the future.
The sensitivity test of meteorological data to runoff shows that the average temperature can replace three types of temperature-related data (i.e. maximum and minimum temperatures, and sunshine hours) with a maximum difference of 0.02 in NSE. Therefore, it is appropriate to simplify the meteorological input data by using only the representative average temperature. When extreme droughts were generated using physically possible weather conditions from WRF, the generated droughts for 2015 were more severe than the observed drought. Very severe droughts with SSFIs less than −2 can occur in the future; therefore, countermeasures against such extreme droughts are needed. WRF and LSTM can be used to predict possible severe droughts in the near future. Additionally, LSTM can simulate runoff without requiring specific physical data of the target catchment; therefore, it can simulate runoff in any catchment, including those in developing countries. Hence, policymakers and stakeholders can apply WRF and LSTM to prepare alternatives for future extreme droughts.
ACKNOWLEDGEMENTS
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (NRF-2022R1A2C1092215).
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.