Flooding in cold regions, particularly when driven by snowmelt and climate variability, presents complex challenges for accurate prediction and effective risk management. This study addresses these challenges by integrating three components: cold region river systems, long short-term memory (LSTM) modeling, and a broader set of climate variables. Specifically, we incorporate additional climate parameters, including air temperature, humidity, solar radiation, and wind speed, into the LSTM model. Using 12 years of hourly water level data (2007–2019) from three USGS stations along the Red River – Pembina, Drayton, and Grand Forks – the model predicts water levels at lead times of 6 h, 12 h, 1 day, 3 days, and 1 week. The incorporation of climate variables significantly improved short-term prediction accuracy, achieving R² values of 0.999 for 6-h forecasts across all stations, demonstrating the potential for real-time flood warning systems. For 1-week predictions, R² values were 0.778 for Grand Forks, 0.816 for Drayton, and 0.864 for Pembina, reflecting a decrease in accuracy with longer prediction horizons. These findings highlight the effectiveness of LSTM models for short-term flood forecasting in snowmelt-prone regions and underscore the need for further refinement to address long-term hydrological forecasting challenges, particularly under variable climatic conditions.

  • Long short-term memory excels in forecasting water levels and predicting floods in snowmelt-prone cold regions.

  • The analysis is strengthened by a comprehensive dataset spanning 2007–2019 from the Grand Forks, Drayton, and Pembina stations.

  • Highlights the importance of selecting appropriate climate parameters.

  • The study evaluates flood predictions from 6 h to 1 week in advance.

Forecasting water levels in rivers and lakes is vital for flood prediction and water resource management. Time-series hydrological prediction models analyze past data from hydrological stations to accurately forecast future water levels, aiding in disaster prevention and efficient water resource management. The Red River discharge varies annually and seasonally due to factors like economic development, population growth, and climate change (de Loë 2009). Floods occur when water levels rise over riverbanks, influenced by prolonged precipitation, spring snowmelt, ice jams, flat topography, and severe winters, making the region particularly vulnerable to spring-melt floods.

Hirsch & Ryberg (2012) and Rice et al. (2015) indicate that the frequency of floods in the Red River basin is increasing dramatically. Early flood forecasting is crucial for giving communities enough warning to protect homes and land and to mitigate flood impacts. There are three main approaches to forecasting streamflow: physically based models, mathematical models, and data-driven models.

Physically based models, such as Soil and Water Assessment Tool and HEC–HMS, rely on detailed hydrological processes but require extensive datasets and significant expertise, limiting their utility for short-term predictions (Nayak et al. 2005; Costabile & Macchione 2015; Kim et al. 2015). Similarly, mathematical models such as IHACRES and HSPF depend on complex parameterization and field observations, making them less suitable for large-scale regional flood assessments (Rajib et al. 2020; Ji et al. 2021; Kovalenko et al. 2022; Nadeem et al. 2022; El-Bagoury & Gad 2024). In contrast, data-driven approaches, including machine learning and deep learning models, provide an effective solution for flood prediction by capturing complex hydrological patterns without requiring extensive parameterization or detailed physical knowledge (Mosavi et al. 2018; Sankaranarayanan et al. 2020; Motta et al. 2021).

Recent research by Belvederesi et al. (2022) reviewed river flow forecasting models for cold climates, highlighting challenges due to significant seasonal variations and limited data availability. The study emphasized the accuracy of empirical models, including machine learning, in forecasting using observed data while also recognizing the importance of process-based models to infer unmeasured hydrological processes. The authors underscored the need for advancements in model structure, user-friendly interfaces, and integration of empirical and process-based approaches to improve forecasting in cold regions (Belvederesi et al. 2022).

Goldberg et al. (2020) investigated flood mapping, monitoring, and prediction from ice jams and snowmelt using operational weather satellites, specifically VIIRS and GOES-R data. The study utilized satellite-derived flood maps, addressing detection challenges by deriving water fraction differences pre- and post-flooding. The authors highlighted the effectiveness of VIIRS and ABI flood products for dynamic monitoring, emphasizing their utility in high-latitude regions due to extensive spatial coverage, frequent revisits, and near-real-time data availability. Integrating satellite imagery with temperature data was suggested to enhance quantitative predictions of flood timing and locations (Goldberg et al. 2020).

Parisouj et al. (2020) used support vector regression (SVR), extreme learning machines, and artificial neural networks for streamflow prediction in various climatic zones. SVR outperformed the other models for monthly and daily streamflow predictions. However, all models struggled with the Carson River, a snowmelt-dominated basin, because standard climate parameters, such as precipitation (P), maximum temperature (Tmax), and minimum temperature (Tmin), and their lags, did not capture snowmelt–runoff relationships accurately.

Sarafanov et al. (2021) introduced a novel composite model approach for short-term flood forecasting, focused on the Lena River basin. The study found that the composite models, validated over 2 years with ten hydro gauges, achieved a Nash–Sutcliffe efficiency value of 0.80, outperforming traditional models such as AR, ARIMA, and stand-alone SRM. The proposed ensemble is particularly effective for snowmelt-induced flooding, demonstrating potential applicability to other rivers with similar regimes. This research highlights the role of automated machine learning in enhancing flood prediction accuracy in cold regions (Sarafanov et al. 2021). Belvederesi et al. (2020) proposed a simple yet effective flow difference model (FDM) for short-term river flow forecasting in cold regions. The study demonstrated that FDM achieved promising accuracy, comparable to more complex mechanistic models, emphasizing the potential of data-driven approaches even in challenging environments (Belvederesi et al. 2020).

Overall, these studies highlight the importance of accurate flood prediction methodologies in cold regions. Using a range of techniques – from empirical models to deep learning and composite modeling – their common goal has been to enhance forecasting accuracy through diverse data integration and advanced modeling. While these studies have made notable progress in predicting floods in cold and harsh environments, there is still a need for new approaches to further improve accuracy.

To address the recent developments in model-free, uncertain, deep artificial intelligence algorithms, we have referenced new studies that illustrate the evolution of these techniques. Tutsoy & Sumbul (2024) presented a deep machine learning algorithm incorporating dimensionality and size reduction, effectively diagnosing thyroid cancer despite challenges from randomly missing multi-dimensional data (Tutsoy & Sumbul 2024). Similarly, Tutsoy & Koç (2024) employed adaptive feature elimination and self-supervised feature weighting for health risk classification using blood test data, significantly enhancing the model's efficiency and accuracy (Tutsoy & Koç 2024). These advancements highlight the applicability of deep AI models across various fields, including the current study's focus on integrating sophisticated algorithms for flood prediction.

Recently, researchers have increasingly adopted deep learning techniques to advance flood prediction and management. These methods have shown promising results in enhancing our understanding of floods and supporting more effective management strategies (Gude et al. 2020; Chen et al. 2022; Kumar et al. 2023), although none of them have yet applied these techniques in cold regions with harsh winters. The following studies illustrate how deep learning and modeling strategies are being used to forecast river flows and manage flood events.

Bentivoglio et al. (2022) review existing applications of deep learning methods in flood mapping and identify future research directions. They highlight the advantages of deep learning models, such as those built on convolutional layers, in accurately and rapidly mapping flood susceptibility, inundation, and hazard. The study emphasizes the need for further work in real-time flood warning and risk estimation, as well as the potential of graph neural networks and probabilistic models to improve flood mapping accuracy.

Building on this foundation, Kumar et al. (2023) provide a comprehensive review of deep learning applications in flood forecasting and management. While the paper does not detail the methodology used, it outlines current challenges and potential research directions. Kumar et al. emphasize the importance of accurate flood forecasts and control measures, and they identify opportunities for enhancing deep learning models to improve flood prediction precision and effectiveness. Recent studies also emphasize the growing significance of long short-term memory (LSTM) in advancing flood prediction and management practices.

Across diverse geographical locations, including Vietnam, China, and the United States, LSTM models have consistently demonstrated their efficacy in enhancing flood forecasting accuracy and reliability. Gude et al. (2020) laid a crucial foundation by showcasing the strong performance of LSTM models in predicting gauge height on the Meramec River at Valley Park, St. Louis County, Missouri, offering real-time insights with reduced uncertainty intervals. Their LSTM model captured all data values within the 95% confidence interval using the data sub-selection method, underscoring the potential of data-driven approaches in flood forecasting. In addition, in a study of flood forecasting in Vietnam, Le et al. (2019a) found that the LSTM model showed remarkable stability and accuracy, particularly in predicting flood flow at the Hoa Binh Station. Similarly, Chen et al. (2022), working in Xi County, China, demonstrated the effectiveness of LSTM-based spatial deep learning networks in improving flood prediction accuracy, highlighting the model's ability to incorporate spatial features into time-series forecasting.

Building on these previous studies, our current research seeks to further advance flood prediction by implementing a comprehensive approach. Specifically, we aim to integrate LSTM modeling with diverse climate parameters to enhance flood forecasting accuracy in cold regions, with a particular focus on the Red River basin.

Research on flood prediction in cold regions, such as the Red River of the North, has highlighted unique challenges posed by snowmelt-induced flooding and harsh winter conditions. While these studies have made notable progress, they often rely on traditional methods, which constrains their ability to accurately capture the complex hydrological processes needed for improved flood prediction in these challenging environments.

In parallel, other research efforts have demonstrated the effectiveness of advanced methods such as LSTM models in flood prediction. Studies including Zhao et al. (2022) have shown that LSTM models are highly effective across diverse geographical locations and flood scenarios, such as large-scale flash floods in China. However, to date, LSTM models have not been extensively applied in cold regions with challenging winter climates.

Our previous research on the Red River of the North addressed some of these gaps by applying LSTM models to predict water levels, demonstrating superior performance compared to traditional methods such as SARIMA and RF (Atashi et al. 2022, 2023a). However, that work was limited to a single input variable – water level or discharge – and did not include climate variables.

The current study aims to bridge these gaps by integrating all three components – cold region river systems, LSTM modeling, and a broader set of climate variables. Specifically, we incorporate additional climate parameters, including air temperature, humidity, solar radiation, and wind speed, into the LSTM model. By expanding the scope of input variables, this study enhances the robustness and accuracy of flood predictions for cold regions like the Red River basin, where snowmelt-induced flooding is a significant concern. The Red River's unique south-to-north flow, which complicates spring thaw processes, underscores the need for this comprehensive modeling approach.

The present study utilizes a dataset spanning from 2007 to 2019, aligning with prior research to facilitate robust comparative analysis and trend evaluation. Specifically, data from three key USGS stations – Grand Forks, Drayton, and Pembina – were examined, and selected for their extensive and reliable water level records. Additionally, climate data were sourced from three proximate stations: Grand Forks, Grafton, and Humboldt, chosen for their comprehensive climate observations. This study evaluates water levels at various lead times, including 6, 12 h, 1 day, 3 days, and 1 week in advance. The objective is to enhance the accuracy of flood predictions and provide critical insights for managing snowmelt-induced flooding in cold climate regions, particularly the Red River basin.

Study area

The Red River of the North flows through the Red River Valley, which is a relic of glacial Lake Agassiz (Figure 1(a)). Tributary streams surrounding the valley contribute water to the former lake bed, resulting in persistent flood conveyance challenges during periods of heavy snowmelt or intense regional rainfall. Spring snowmelt often triggers annual flooding, with the severity influenced by various factors such as fall precipitation, frost depth, snow accumulation, snowmelt rate, river ice conditions, and spring rain-on-snow events (Todhunter 2001).
Figure 1

(a) Location of Red River study area (b) location of USGS stations on Red River in Grand Forks, Drayton, and Pembina.


The Red River of the North, a vital vessel within the basin, meanders for approximately 545 river miles from Wahpeton, North Dakota (ND), to its final destination at Lake Winnipeg (Figure 1(b)). This sinuous watercourse serves as a defining geographical and political boundary between ND and Minnesota (MN) (Lim & Voeller 2009). The hydrological rhythm of the basin follows a seasonal pattern, with peak streamflow typically occurring during spring and early summer. This surge in water volume is primarily driven by snowmelt, rainfall, or a combination of both. Consequently, the region is prone to flooding during these periods, particularly during wet seasons when precipitation is abundant (Board 2000). The flat terrain of the basin, coupled with its climatic predispositions, amplifies the risk of major flood events in the Red River and its tributaries. As such, understanding and managing these hydrological dynamics are paramount for mitigating the impacts of flooding on communities and ecosystems within the basin.

The Pembina River, a significant tributary of the Red River of the North, is the major source of water in south-central Manitoba. It joins the Red River from the west just south of Pembina, ND, approximately 2 miles (3 km) south of the US–Canadian border. At Pembina, the water level of the Red River is recorded by a stream gauge maintained by the USGS ND Water Science Center. This gauge, identified as ‘RED RIVER OF THE NORTH AT PEMBINA, ND’ (USGS-05102490), records the water level data used for hydrological analysis. With a drainage area of 40,200 square miles, this site provides valuable information for understanding the flow dynamics of the region.

The streamflow records at the Drayton station have been continuous since 1942, with specific conductance measurements starting in 1970. This long-term data allows for the examination of trends in streamflow and water quality. Located in Pembina County, ND, the site, identified as ‘RED RIVER OF THE NORTH AT DRAYTON, ND’ (USGS-05092000), covers a drainage area of 34,800 square miles. Its coordinates are 48.57° latitude and −97.147° longitude (NAD83 datum), with an elevation of 756.06 feet.

The upstream gauge station on the Red River of the North in Grand Forks was established in 1882 by the US Engineers, now the US Army Corps of Engineers. In 1901, Charles M. Hall, a geology professor at ND Agricultural College, added a station above the original gauge with the aim of exploring floodwater storage possibilities for various needs (Dakota Water Science Center). Today, this gauge, identified as ‘RED RIVER OF N AB RED LAKE R AT GRAND FORKS, ND’ (USGS-05071500), maintained by the USGS North Dakota Water Science Center provides continuous records of stream gauge height, discharge, velocity, and water quality parameters, along with real-time web data.

Grand Forks, ND, has a history of enduring flooding events, with notable occurrences documented in the literature. The winter and spring of 1996–1997 witnessed a catastrophic regional snowmelt flood, estimated to have a return period of 100–200 years (Macek-Rowland et al. 2001). Given the area's susceptibility to flooding and the severity of past events, the Grand Forks gaging station serves as an ideal location for examining the assumptions underlying the use of deep learning and machine learning methods for flood prediction.

An earlier investigation of this cold region river (Atashi et al. 2023b) adopted a comprehensive approach and revealed the susceptibility of the study area to flooding. Notably, recurrent spring flooding, especially in wet years, is concentrated at the heart of the Red River basin near Grand Forks, ND, and Emerson, Manitoba. This area's vulnerability stems from its low-lying topography and close proximity to the Red River, which make it prone to inundation during periods of heightened water levels.

Beyond the reasons above, the region is well suited to this study because: (1) its sparse distribution of flow-control structures provides an optimal environment for applying machine learning methods to predict floods; (2) the river's narrow main stem channel poses unique challenges for accurately estimating stage levels using satellite altimetry; (3) the extensive network of well-established USGS gaging stations along its main tributaries offers invaluable field-based data for verifying river flow and stage measurements; and (4) despite significant property losses during years of heavy snowfall, there remains a notable gap in the understanding of the basin's hydrologic response to climate variability. Frequent flooding has been an issue for the Red River of the North at Grand Forks, ND, most notably the major floods of 1882, 1897, 1950, 1996, 1997, 2006, 2009, and 2011, which is why Grand Forks stream gage data are essential to flood protection for the cities of Grand Forks, ND, and East Grand Forks, MN. We therefore selected the Grand Forks station together with two stations downstream of it to support a comprehensive analysis of flood dynamics within the region.

Data representation and preprocessing

Water level data for these stations were obtained from USGS hourly gauge height records spanning from 1 November 2007 to 31 December 2019. During data preprocessing, missing values were addressed using linear interpolation for periods of less than eight consecutive hours. Periods with more than eight consecutive hours of missing data were excluded from the dataset. Among the three selected stations, the Pembina station on the Red River, situated downstream, lacks river flow discharge data. The characteristics of the three datasets are summarized in Table 1.
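The gap-handling rule described above can be sketched as follows. This is a minimal pure-Python illustration rather than the authors' preprocessing code: the function name `fill_short_gaps`, the `max_gap` parameter, and the use of `None` for a missing hourly reading are assumptions made for the example. The text does not say how gaps of exactly eight hours were treated; this sketch leaves them unfilled.

```python
from typing import List, Optional


def fill_short_gaps(values: List[Optional[float]], max_gap: int = 8) -> List[Optional[float]]:
    """Linearly interpolate runs of missing readings shorter than max_gap.

    Longer runs, and runs touching either end of the series, stay as None
    so that they can be excluded downstream.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                          # first missing index of this run
        while i < n and filled[i] is None:
            i += 1
        end = i                            # first valid index after the run (or n)
        gap = end - start
        has_neighbours = start > 0 and end < n
        if has_neighbours and gap < max_gap:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                filled[start + k] = left + step * (k + 1)
    return filled


# Hypothetical gauge heights in feet: a 2-hour gap followed by a 9-hour gap.
series = [10.0, None, None, 13.0] + [None] * 9 + [20.0]
cleaned = fill_short_gaps(series)
```

In this example the two-hour gap is filled by interpolation, while the nine-hour gap is left as missing so the corresponding period can be dropped.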

Table 1

Characteristics of the water level time series at three hydrology stations of the Red River

Station name | Period | No. of samples | Frequency
Pembina | 2007–2019 | 104,616 | Hourly
Drayton | 2007–2019 | 100,140 | Hourly
Grand Forks | 2007–2019 | 105,117 | Hourly

In addition to water level data collected from each USGS station, climate parameters, including air temperature, relative humidity, wind speed, and solar radiation, were sourced from the North Dakota Agricultural Weather Network (NDAWN), which consists of 155 stations distributed across ND and border regions of surrounding states (https://ndawn.ndsu.nodak.edu/, accessed on 29 February 2024) (North Dakota Agricultural Weather Network (NDAWN); Macek-Rowland et al. 2001; Atashi et al. 2023b). Specifically, data from the Grand Forks NDAWN station (47.836, −97.07), located approximately 7.6 miles from the Grand Forks USGS station, were utilized. Similarly, the NDAWN station in Grafton (48.41, −97.186), situated roughly 13 miles away, provided climate data for Drayton. Lastly, climate data for the Pembina USGS station were obtained from the NDAWN station in Humboldt (48.88, −97.15), located approximately 8 miles distant.

Air temperature plays a critical role in determining the rate of snowmelt, particularly during the spring thaw. In cold regions such as the Red River Valley, rising temperatures during the transition from winter to spring lead to rapid melting of accumulated snow, resulting in increased runoff into the river system. This surge in water significantly raises river levels, heightening the risk of flooding (Todhunter 2001). Relative humidity directly influences both precipitation and evaporation processes. High humidity levels are often correlated with heavy rainfall events, which can substantially increase river flow and the risk of flooding (Belvederesi et al. 2022). Wind speed impacts the redistribution of snow and affects the melting dynamics of snowpacks. Strong winds can accelerate the melting process by increasing the surface area of snow exposed to air, thereby enhancing both sublimation and evaporation rates. This increased exposure leads to a more rapid reduction of snowpacks, influencing runoff patterns (Goldberg et al. 2020). Solar radiation is a primary driver of snowmelt by providing the energy needed for the phase change from ice to water. During late winter and early spring, high solar radiation levels accelerate snowpack melting, which significantly contributes to river discharge and increases the potential for flooding (Le et al. 2019b).

Figure 2 shows the climatic and hydrological patterns over the 12-year period from the end of 2007 to the end of 2019, highlighting the interactions between temperature, relative humidity, wind speed, solar radiation, and water levels. The data show a climate with distinct seasonal patterns and reasonably consistent long-term behavior.
Figure 2

Climatic and hydrological patterns from 2007 to 2019: temperature, humidity, wind speed, solar radiation, and water levels.


The average air temperature exhibits pronounced seasonal fluctuations, with peaks in summer and troughs in winter. The temperature range spans approximately −37 to 96.8 °F, showcasing the area's considerable seasonal temperature variation. Notably, there is no discernible long-term warming or cooling trend over this period, suggesting relative climate stability in terms of temperature. Relative humidity in the region fluctuates widely, typically ranging between 8.6 and 100%. A seasonal pattern is evident, with higher humidity levels in winter months and lower levels in summer. This inverse relationship with temperature is a common characteristic in numerous climates. The fact that humidity rarely drops below 20% indicates a generally moist climate throughout the year. Wind speeds in the area mostly range from 0 to about 34 mph, with occasional spikes reaching up to 35 mph. Unlike temperature and humidity, wind speeds show no clear seasonal pattern and appear relatively consistent across the years, without any obvious long-term trends. Solar radiation demonstrates strong seasonal patterns, closely following the temperature trends. It peaks in summer, with maximum values reaching about 84 Langleys, likely corresponding to clear summer days. In winter, values approach zero, reflecting shorter days and potentially increased cloud cover. The water level data presents an interesting contrast to the more regular patterns of the atmospheric variables. It shows significant variability, with sporadic high peaks reaching about 50 ft and long periods of lower levels around 15–20 ft. The absence of a clear seasonal pattern in water levels suggests that they are influenced by factors beyond regular seasonal changes, such as specific precipitation events or water management practices.

Interrelationships between these factors are noticeable. The inverse correlation between temperature and relative humidity is a classic climate relationship. The close alignment of solar radiation with temperature patterns underscores the strong influence of seasonal solar input on local climate. Over the 12-year period, all variables show consistent patterns without significant shifts, indicating a relatively stable climate system. However, the data does capture several extreme events, including winter temperatures dropping well below −20 °F and significant water level spikes that could indicate flood events or sudden releases from water management structures. The consistent patterns observed over this extended period provide a solid baseline for future comparisons and trend analyses.

Long short-term memory

The objective functions used for training the LSTM model are the mean squared error (MSE) and the mean absolute error (MAE), both of which quantify the error between predicted and actual values. MSE applies a quadratic penalty, making it highly sensitive to large deviations, while MAE measures the average magnitude of errors and penalizes them linearly.

The LSTM model uses gradient descent with backpropagation through time to learn parameters, involving iterative optimization to adjust the weights and biases to minimize the loss function. The Adam optimizer, which is a variant of gradient descent, was selected for its ability to efficiently handle large datasets and adaptive learning rates.
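To illustrate the kind of update Adam performs, the following minimal sketch applies it to a one-parameter toy regression with an MSE loss. It is not the training code used in this study: the toy data, the function names, and the values of `beta1`, `beta2`, and `eps` (the commonly used Adam defaults) are assumptions; only the learning rate of 0.0001 comes from the study.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (theta, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def mse_grad(w, xs, ys):
    """Gradient of the MSE loss of the toy model y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # toy data generated with w = 2
w, m, v = 0.0, 0.0, 0.0
history = [w]
for t in range(1, 1001):
    w, m, v = adam_step(w, mse_grad(w, xs, ys), m, v, t)
    history.append(w)
```

Because Adam rescales each gradient by a running estimate of its magnitude, the first step moves the parameter by roughly the learning rate regardless of how large the raw gradient is, which is one reason small learning rates such as 0.0001 give stable but gradual training.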

Motivated by the success of non-parametric deep learning models such as LSTM (Le et al. 2019a; Gude et al. 2020; Chen et al. 2022; Zhao et al. 2022), the LSTM approach was used to capture the components of the time series separately. This method was tested on real datasets from the Red River for hourly water level forecasting and is discussed in the following section. Unlike conventional recurrent neural networks (RNNs), LSTM networks excel at capturing long-term temporal dependencies, making them well suited to modeling nonlinear time-series data such as hydrological phenomena. More specifically, an RNN includes a memory cell that retains information until the training data sequence is completed. With memory cells regulating information flow through input, forget, and output gates, LSTM networks maintain both short-term and long-term memory.

It is important to note that two types of activation functions are commonly used in the LSTM network: the sigmoid function (σ) and the tanh function. The sigmoid function is used to determine scalars for amplification or reduction of input values (as a gating mechanism), while the tanh function normalizes data into a consistent encoding, enabling more effective feature learning. In this research, the tanh function was primarily used for input data normalization.

The model iteratively applies the same process to all elements in the sequence. However, gradient problems arise when training over long time lags, which are required for predicting hydrological time series (Akar & Güngör 2012). Let $i_t$, $o_t$, and $f_t$ denote the input, output, and forget gates at time $t$.

Figure 3 depicts the LSTM architecture, adapted from the work of Le et al. (2019b). In this diagram, variables xt and ht represent the input and state at time t, respectively, while Ct and ht represent the long-term and short-term (hidden) memory within the cell. The diagram helps visualize the sequence of operations within the network, facilitating long-term learning.
Figure 3

Memory block with the memory cell Ct.

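Each input is multiplied by its respective weight matrix ($W$ or $U$) and then summed, followed by adding a bias. The result is passed through a sigmoid function, squashing it between 0 and 1, making it suitable as a scalar for amplification or reduction. This value is then used in point-wise multiplication to adjust the information flow. The subsequent equations illustrate the computation of $C_t$ and $h_t$ at the $t$th step in this process. Equations for LSTM gate operations are:
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{1}$$
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{2}$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{3}$$
$$\tilde{C}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{4}$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{5}$$
$$h_t = o_t \odot \tanh\left(C_t\right) \tag{6}$$

where $W_i$, $W_f$, $W_o$, and $W_c$ are the weight matrices associated with the input $x_t$; $U_i$, $U_f$, $U_o$, and $U_c$ are the weight matrices associated with the previous hidden state $h_{t-1}$; $b_i$, $b_f$, $b_o$, and $b_c$ are the bias terms for the input, forget, and output gates and the candidate cell state; $\sigma$ is the sigmoid activation function; $\tilde{C}_t$ is the candidate for the cell state value; and $\odot$ denotes point-wise multiplication.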
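The gate operations can be made concrete with a single LSTM time step written in plain Python. This is an illustrative sketch of the standard LSTM formulation (sigmoid input, forget, and output gates; a tanh candidate state; point-wise updates of the cell and hidden states), not the Keras implementation used in the study. The helper `constant_params`, the hidden size of 2, and the example input values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Logistic function, squashing z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Return W x + U h + b, one output element per row of W and U."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new hidden state h_t and cell state C_t."""
    i_t = [sigmoid(z) for z in affine(p["W_i"], x_t, p["U_i"], h_prev, p["b_i"])]      # input gate
    f_t = [sigmoid(z) for z in affine(p["W_f"], x_t, p["U_f"], h_prev, p["b_f"])]      # forget gate
    o_t = [sigmoid(z) for z in affine(p["W_o"], x_t, p["U_o"], h_prev, p["b_o"])]      # output gate
    c_hat = [math.tanh(z) for z in affine(p["W_c"], x_t, p["U_c"], h_prev, p["b_c"])]  # candidate state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]               # new cell state
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                                 # new hidden state
    return h_t, c_t


def constant_params(n_in: int, n_hidden: int, value: float) -> Dict[str, list]:
    """Hypothetical helper: every weight and bias set to the same value."""
    p: Dict[str, list] = {}
    for gate in "ifoc":
        p["W_" + gate] = [[value] * n_in for _ in range(n_hidden)]
        p["U_" + gate] = [[value] * n_hidden for _ in range(n_hidden)]
        p["b_" + gate] = [value] * n_hidden
    return p


# Five inputs (water level, air temperature, relative humidity, wind speed,
# solar radiation), shown here as made-up values already scaled to [0, 1].
x_t = [0.4, 0.6, 0.7, 0.2, 0.5]
params = constant_params(n_in=5, n_hidden=2, value=0.1)
h_t, c_t = lstm_step(x_t, h_prev=[0.0, 0.0], c_prev=[0.0, 0.0], p=params)
```

Running `lstm_step` repeatedly over a window of hourly inputs, feeding each returned `h_t` and `c_t` into the next call, reproduces how the cell carries long-term memory (the cell state) and short-term memory (the hidden state) along the sequence.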

Research suggests that LSTMs are more effective at capturing long-term temporal dependencies (Fukuoka et al. 2018; Lu et al. 2020). This study was conducted in five stages, following the methodology shown in Figure 4. The time delay model was implemented using ‘Keras: The Python Deep Learning library’. Following established protocols, we partitioned the dataset into three subsets, dedicating 70% to training, 15% to validation, and the remaining 15% to testing.
Figure 4

Workflow diagram for flood prediction.
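The 70/15/15 partition and the forecast lead times used in this study (6 h, 12 h, 1 day, 3 days, and 1 week on hourly data) can be sketched as below. This is an assumed reconstruction rather than the authors' pipeline: the split is taken to be chronological, the 24-hour lookback window is a hypothetical choice because the text does not state the window length, and the function names are illustrative.

```python
LEAD_TIMES_H = {"6 h": 6, "12 h": 12, "1 day": 24, "3 days": 72, "1 week": 168}


def chronological_split(rows, train_pct=70, val_pct=15):
    """Split time-ordered rows into train, validation, and test without shuffling."""
    n = len(rows)
    i_train = n * train_pct // 100
    i_val = n * (train_pct + val_pct) // 100
    return rows[:i_train], rows[i_train:i_val], rows[i_val:]


def make_windows(rows, target_index, lookback, lead):
    """Pair each lookback-hour window with the target observed `lead` hours
    after the last row of that window."""
    samples = []
    for end in range(lookback, len(rows) - lead + 1):
        window = rows[end - lookback:end]            # rows end-lookback .. end-1
        target = rows[end - 1 + lead][target_index]  # value `lead` hours later
        samples.append((window, target))
    return samples


# Synthetic hourly rows: [water level, air temperature, relative humidity,
# wind speed, solar radiation]. The water level equals the hour index so the
# alignment of windows and targets is easy to inspect.
rows = [[float(t), 0.0, 0.0, 0.0, 0.0] for t in range(1000)]
train, val, test = chronological_split(rows)
train_6h = make_windows(train, target_index=0, lookback=24, lead=LEAD_TIMES_H["6 h"])
```

Windows are built inside each subset separately, so no training window reaches into the validation or test period.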


Designing a deep learning architecture involves choosing a model, activation functions, an optimizer, and the number of layers and neurons in each layer. In this phase of the model training process, candidate optimizers and loss functions were investigated and the selected ones were specified during compilation. During model fitting, the batch size and epoch count were set to control the training process. Adam was selected as the optimizer with a learning rate of 0.0001, and MSE was selected as the loss function.

MSE, MAE, and root mean square error (RMSE), which are frequently used in time-series forecasting, were used to assess the prediction accuracy of the models (Chatfield 2000). MSE measures the average squared difference between the actual and predicted values, RMSE is the square root of that average squared difference, and MAE measures the average absolute difference between the actual and predicted values. Lower values of these indicators indicate better predictive performance. The formulas are as follows:
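$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{7}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{8}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{9}$$
where $N$ is the number of observations, $y_i$ is the observed/actual data, and $\hat{y}_i$ is the predicted data.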
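The three error measures can be computed directly from paired observed and predicted series. The sketch below is a minimal plain-Python version of the definitions given in the text; the function names and the toy values are illustrative only.

```python
import math


def mse(actual, predicted):
    """Mean of the squared differences between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE; expressed in the units of the data (here, ft)."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean of the absolute differences between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy water levels in ft (not taken from the study)
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Because MSE squares each error, the single 2-ft miss dominates it, whereas MAE weights all errors linearly. This is why RMSE and MAE can rank models differently when errors are concentrated in a few events.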

In this section, the performance of LSTM networks for predicting water level data is analyzed. The input features include water level, air temperature, relative humidity, wind speed, and solar radiation, covering a period from December 2007 to December 2019. To ensure robust model training and evaluation, the dataset was divided into three parts: 70% for training, 15% for validation, and 15% for testing. The testing phase focuses on predictions for December 2017 to December 2019. The LSTM model was trained to predict water levels at five lead times: 6 h, 12 h, 1 day, 3 days, and 1 week ahead. These prediction horizons enable a comprehensive evaluation of the model's performance for both short- and long-term forecasts. Table 2 summarizes the prediction accuracy across these time intervals using metrics such as RMSE, MSE, MAE, and R2.
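One way to turn the hourly multivariate record into training samples for each lead time is sketched below. The look-back length, the column order (water level first), and the function name `make_samples` are assumptions made for illustration; the text does not specify how the input windows were constructed.

```python
# Lead times expressed in hourly steps, since the data are hourly
LEAD_HOURS = {"6 h": 6, "12 h": 12, "1 day": 24, "3 days": 72, "1 week": 168}


def make_samples(rows, lookback, lead):
    """Pair each window of `lookback` hourly rows with the water level
    observed `lead` hours after the last row of the window.

    Each row is a sequence of features; column 0 is assumed to hold the
    water level, followed by the climate variables.
    """
    inputs, targets = [], []
    for end in range(lookback, len(rows) - lead + 1):
        inputs.append(rows[end - lookback:end])    # rows end-lookback .. end-1
        targets.append(rows[end + lead - 1][0])    # level `lead` hours later
    return inputs, targets


# Toy record: water level equals the hour index, one dummy climate feature
rows = [[float(t), 0.0] for t in range(20)]
X, y = make_samples(rows, lookback=4, lead=LEAD_HOURS["6 h"])
print(len(X), y[0], y[-1])
```

Longer lead times leave fewer usable samples at the end of the record and ask the model to extrapolate further from the same inputs.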

Table 2

Evaluation of the performance of LSTM models at three USGS stations (RMSE, MSE, MAE, and R2 between the predicted and observed water level data in the testing phase)

| Station | Metric | 6 h | 12 h | 1 day | 3 days | 1 week |
| --- | --- | --- | --- | --- | --- | --- |
| Grand Forks | RMSE (ft) | 0.15 | 0.221 | 0.401 | 1.412 | 2.842 |
| Grand Forks | MSE (ft²) | 0.022 | 0.049 | 0.168 | 1.994 | 8.082 |
| Grand Forks | MAE (ft) | 0.103 | 0.136 | 0.211 | 0.700 | 1.342 |
| Grand Forks | R2 | 0.999 | 0.999 | 0.995 | 0.925 | 0.778 |
| Drayton | RMSE (ft) | 0.168 | 0.275 | 0.480 | 1.365 | 3.267 |
| Drayton | MSE (ft²) | 0.028 | 0.076 | 0.230 | 0.186 | 10.67 |
| Drayton | MAE (ft) | 0.139 | 0.194 | 0.264 | 0.695 | 1.714 |
| Drayton | R2 | 0.999 | 0.999 | 0.996 | 0.968 | 0.816 |
| Pembina | RMSE (ft) | 0.180 | 0.267 | 0.457 | 1.611 | 3.552 |
| Pembina | MSE (ft²) | 0.033 | 0.071 | 0.209 | 2.595 | 12.619 |
| Pembina | MAE (ft) | 0.142 | 0.205 | 0.304 | 0.943 | 2.106 |
| Pembina | R2 | 0.999 | 0.999 | 0.998 | 0.972 | 0.864 |

Figures 5–7 compare predicted and observed water levels for different time horizons at three locations along the Red River: Grand Forks (Figure 5), Drayton (Figure 6), and Pembina (Figure 7), ordered from upstream to downstream. In each figure, panels (a), (b), (c), and (d) show predictions 6 h, 1 day, 3 days, and 1 week ahead, respectively. Across all three locations, a clear trend emerges: prediction accuracy decreases as the forecast period extends. This is evidenced by the increasing RMSE, MSE, and MAE values and the decreasing R2 values from the 6-h to the 1-week predictions.
Figure 5

Visual comparison of (a) 6 h, (b) 1 day, (c) 3 days, and (d) 1 week-ahead predicted values using LSTM forecasting methods with true values on the Grand Forks series.

Figure 6

Visual comparison of (a) 6 h, (b) 1 day, (c) 3 days, and (d) 1 week-ahead predicted values using LSTM forecasting methods with true values on the Drayton series.

Figure 7

Visual comparison of (a) 6 h, (b) 1 day, (c) 3 days, and (d) 1 week-ahead predicted values using LSTM forecasting methods with true values on the Pembina series.

For short-term predictions (6 h), shown in panel (a) of Figures 5–7, all three locations show remarkably high accuracy. The R2 values are consistently 0.999, indicating that the models explain 99.9% of the variability in water levels. The RMSE, MSE, and MAE values are correspondingly low (e.g., an RMSE of 0.15 ft at Grand Forks), confirming the high precision of these short-term forecasts. These results suggest that the model is highly reliable for immediate operational decision-making at all three locations.

For 1-day predictions, accuracy slightly declines, with R2 values ranging from 0.995 to 0.998 across the three stations. Despite a minor increase in error metrics, the model remains highly reliable for day-ahead planning. In contrast, 3-day and 1-week predictions show more pronounced reductions in accuracy. R2 values for 3-day predictions range from 0.925 to 0.972, while 1-week predictions show R2 values between 0.778 and 0.864. The RMSE also increases significantly over time; for example, the 1-week RMSE is 2.842 ft at Grand Forks, nearly 19 times the 6-h RMSE of 0.15 ft. The visual comparisons in Figures 5–7 corroborate the statistical findings. For all locations, the 6-h and 1-day predictions closely align with actual values, while the 3-day and 1-week predictions show increasing divergence, particularly during periods of rapid water level changes or extreme events.
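The growth in error between the shortest and longest horizons can be checked for each station from the RMSE values reported in Table 2. The short sketch below only repeats that arithmetic; the dictionary layout and function name are illustrative.

```python
# RMSE values in ft, copied from Table 2
RMSE_FT = {
    "Grand Forks": {"6 h": 0.15, "1 week": 2.842},
    "Drayton": {"6 h": 0.168, "1 week": 3.267},
    "Pembina": {"6 h": 0.180, "1 week": 3.552},
}


def growth_factor(station):
    """Ratio of the 1-week RMSE to the 6-h RMSE for one station."""
    return RMSE_FT[station]["1 week"] / RMSE_FT[station]["6 h"]


for station in RMSE_FT:
    print(f"{station}: {growth_factor(station):.1f}x")
# Grand Forks: 18.9x
# Drayton: 19.4x
# Pembina: 19.7x
```

The ratio is similar at all three stations, which suggests that the loss of accuracy with lead time is a property of the forecasting task rather than of one site.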

Notably, Grand Forks has the lowest RMSE at every time horizon except 3 days, where Drayton's is marginally lower (1.365 vs. 1.412 ft), indicating smaller absolute errors overall than at Drayton and Pembina. However, Pembina shows higher R2 values for longer-term predictions, suggesting the model captures overall trends more effectively at this location. For example, Grand Forks' 1-week RMSE is 19.99% lower than Pembina's (2.842 vs. 3.552 ft), but its R2 value is lower (0.778 vs. 0.864). This discrepancy highlights the difference between absolute error magnitude (RMSE) and the share of variability explained (R2), illustrating how different metrics provide complementary insights into model performance.
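The coexistence of a higher RMSE and a higher R2 at Pembina can be reconciled with a short calculation. The sketch below assumes that R2 is the coefficient of determination, R2 = 1 - MSE / variance of the observations, evaluated on the test period; the text does not define R2, so this is an assumption, and the function names are hypothetical.

```python
import math


def relative_reduction(a, b):
    """Fraction by which a is lower than b."""
    return (b - a) / b


def implied_std(rmse, r2):
    """Standard deviation of the observations implied by
    R2 = 1 - RMSE**2 / variance (assumed definition of R2)."""
    return math.sqrt(rmse ** 2 / (1.0 - r2))


# 1-week values from Table 2 (RMSE in ft)
grand_forks = {"rmse": 2.842, "r2": 0.778}
pembina = {"rmse": 3.552, "r2": 0.864}

pct_lower = 100 * relative_reduction(grand_forks["rmse"], pembina["rmse"])
std_gf = implied_std(**grand_forks)
std_pb = implied_std(**pembina)
print(f"{pct_lower:.2f}% lower RMSE at Grand Forks")
print(f"implied std: {std_gf:.2f} ft (Grand Forks), {std_pb:.2f} ft (Pembina)")
```

Under this assumption, the observed water levels in the test period vary considerably more at Pembina than at Grand Forks, so a larger absolute error there still leaves a smaller share of the variance unexplained.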

The differences in model performance across locations can be attributed to their geographical and hydrological contexts. Grand Forks, located upstream, benefits from simpler hydrological conditions, as it is less influenced by downstream tributaries or cumulative uncertainties. In contrast, Drayton and Pembina experience added complexities from tributary inputs, flood wave propagation, and the flat terrain of the Red River Valley, contributing to higher RMSE values. Pembina's higher R2 values for long-term predictions suggest that while the model captures trends well downstream, the increased variability in water levels makes absolute error metrics (e.g., RMSE) higher. This pattern underscores the challenges of forecasting for downstream locations, where hydrological complexity increases due to cumulative flows and the river's broader floodplain.

Seasonal dynamics, particularly spring snowmelt, further complicate predictions. Factors such as fall precipitation, frost depth, snow accumulation, snowmelt rate, river ice conditions, and spring rain-on-snow events exacerbate hydrological uncertainties, especially for longer-term forecasts. These complexities align with the observed increases in RMSE values for 3-day and 1-week predictions. Additionally, Pembina's downstream location and its proximity to international boundaries may introduce cross-border hydrological factors, such as tributary contributions from Canadian watersheds, which are not fully captured by the model.

Historical flooding events, particularly in Grand Forks, underscore the importance of accurate predictions and the challenges inherent in modeling such a complex hydrological system. The 1997 Red River flood, for instance, demonstrated how upstream forecasts directly influence preparedness for downstream communities. While Grand Forks benefits from its upstream position and simpler flow conditions, the downstream complexities faced by Drayton and Pembina emphasize the need for region-specific calibration to enhance predictive accuracy.

In summary, the LSTM model demonstrates exceptional accuracy for short-term predictions and robust performance for medium-term forecasts. However, accuracy declines for longer-term predictions due to increasing hydrological uncertainties and regional complexities. While Grand Forks generally yields the lowest absolute errors, which may reflect its upstream location, Pembina's higher R2 values for long-term forecasts highlight the model's capacity to capture downstream patterns despite larger errors. These findings emphasize the value of LSTM networks for real-time flood forecasting and underscore the importance of refining models to better address long-term prediction challenges and the unique characteristics of different river segments.

From Figure 8, which depicts water level predictions for Pembina during a one-month period from mid-August to mid-September 2017, notable differences between the 6-h and 3-day ahead predictions can be observed.
Figure 8

Visual comparison of (a) 3 days and (b) 1 week-ahead predicted values using LSTM forecasting methods with true values on the Pembina series for a 1-month period.

The 6 h ahead predictions (Figure 8(a)) show a close alignment with the actual water levels, accurately capturing both the overall trend and smaller fluctuations. The predicted values follow the observed data closely, with minimal deviations, highlighting the model's strong performance for short-term forecasts.

In contrast, the 3-day-ahead predictions (Figure 8(b)) exhibit more significant discrepancies. While the general trends are captured, differences in both the magnitude and timing of water level changes are evident. The predicted values display a smoother pattern, occasionally lagging behind or anticipating changes in water levels compared to the observed data.

Two episodes illustrate the contrast. During the 25 August decline (from 12.5 to 12.2 ft), the 6-h predictions track the drop precisely, while the 3-day forecasts show a delayed and less pronounced response. Similarly, during the 5–10 September rise (from 11.7 to 12.2 ft), the 6-h predictions accurately reflect the increase, whereas the 3-day forecasts lag and underestimate the magnitude.

Despite these differences, the 3-day predictions still provide valuable insights into overall water level trends. For instance, both prediction timeframes accurately capture the general pattern of rising and falling water levels, even if the 3-day predictions are less precise.

This study demonstrates the efficacy of LSTM networks in improving flood prediction accuracy for the Red River of the North, USA, by integrating multiple climate parameters – air temperature, relative humidity, wind speed, and solar radiation – alongside water level data spanning from 2007 to 2019. The LSTM model proved highly effective for short-term forecasts, achieving R2 values of 0.999 for 6-h predictions across all stations, with RMSE values between 0.15 and 0.18 ft. These results highlight the model's suitability for real-time flood warning systems, enabling timely decision-making to mitigate flood damage and enhance community safety.

The integration of diverse climate parameters represents a significant advancement over traditional methods that often rely on single-variable inputs. By capturing the complex interplay of climatic factors, this study enhances the understanding of snowmelt-driven flood dynamics, particularly in cold regions. The findings indicate that incorporating these inputs leads to more reliable and accurate flood forecasts, which is crucial for regions vulnerable to snowmelt-induced flooding. For medium-term predictions, the model achieved R2 values of at least 0.995 for 1-day forecasts and at least 0.925 for 3-day forecasts across the three stations, demonstrating strong performance even as prediction horizons extend.

However, challenges emerged for long-term forecasts: for 1-week predictions, the R2 value fell to as low as 0.778 (Grand Forks), and the RMSE rose to between 2.842 ft (Grand Forks) and 3.552 ft (Pembina). These findings underscore the need for further refinement of LSTM models to address hydrological uncertainties associated with extended time horizons, particularly under variable climatic conditions. Despite these limitations, the model consistently captured overall flood trends, indicating its robustness for operational use in early warning systems and general flood prediction tasks.

Notably, the station-specific results differed markedly. Grand Forks generally showed lower RMSE values but also lower R2 values than Drayton and Pembina, especially for medium- to long-term predictions. This suggests that although absolute errors were smaller at Grand Forks, the model explained a smaller share of the variability there than at the downstream locations. Pembina's higher R2 values for longer-term forecasts indicate stronger trend-capturing ability despite its higher RMSE, which likely reflects the cumulative effects of tributary inputs and downstream hydrological complexities. These findings highlight the importance of localized calibration to account for site-specific factors and improve prediction accuracy uniformly across river segments.

This research advances flood prediction science by pioneering an LSTM-based approach that integrates multifaceted climate data for cold region hydrological modeling. By synthesizing advanced deep-learning techniques with intricate climate parameters, the study transcends traditional forecasting limitations, offering a robust, data-driven methodology for predicting snowmelt-induced flooding. The proposed framework improves prediction accuracy and provides actionable insights for infrastructure resilience, emergency preparedness, and risk mitigation strategies, bridging critical gaps between machine learning techniques and practical climate adaptation needs.

A key limitation of the study is the increased model complexity due to the inclusion of multiple climate variables, which made achieving optimal long-term prediction performance challenging. Furthermore, while the model performed well for the Red River of the North, its applicability to other regions remains untested. Future research should aim to refine LSTM models to enhance long-term prediction accuracy, explore real-time data integration for dynamic forecasting, and validate the model's effectiveness across diverse geographical regions and river systems.

Future studies will focus on integrating snowpack and soil moisture data obtained from satellite sources, such as NASA's Soil Moisture Active Passive and MODIS, to enhance the accuracy of flood predictions during snowmelt events. Additionally, hybrid models will be implemented by merging physically based models, such as the Soil and Water Assessment Tool, with LSTM-based deep learning approaches to improve both short-term and long-term prediction accuracy. Ensemble learning methods, such as stacking LSTM with gradient boosting or random forest, will also be investigated to increase model robustness across different climatic and geographic conditions. Finally, the methodology will be expanded to river systems beyond the Red River, such as the Missouri River, to gain insights into the adaptability of the LSTM approach to other snowmelt-dominated hydrological environments.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Akar Ö. & Güngör O. (2012) Classification of multispectral images using random forest algorithm, Journal of Geodesy and Geoinformation, 1 (2), 105–112.
Atashi V., Gorji H. T., Shahabi S. M., Kardan R. & Lim Y. H. (2022) Water level forecasting using deep learning time-series analysis: a case study of Red River of the North, Water, 14 (12), 1971.
Atashi V., Kardan R., Gorji H. T. & Lim Y. H. (2023a) 'Comparative study of deep learning LSTM and 1D-CNN models for real-time flood prediction in Red River of the North, USA', 2023 IEEE International Conference on Electro Information Technology (eIT). IEEE.
Atashi V., Mahmood T. H. & Rasouli K. (2023b) Impacts of Climatic Variability on Surface Water Area Observed by Remotely Sensed Imageries in the Red River Basin. Geocarto International, Lewis University, Romeoville, IL, pp. 1–18.
Belvederesi C., Dominic J. A., Hassan Q. K., Gupta A. & Achari G. (2020) Short-term river flow forecasting framework and its application in cold climatic regions, Water, 12 (11), 3049.
Belvederesi C., Zaghloul M. S., Achari G., Gupta A. & Hassan Q. K. (2022) Modelling river flow in cold and ungauged regions: a review of the purposes, methods, and challenges, Environmental Reviews, 30 (1), 159–173.
Bentivoglio R., Isufi E., Jonkman S. N. & Taormina R. (2022) Deep learning methods for flood mapping: a review of existing applications and future research directions, Hydrology and Earth System Sciences, 26 (16), 4345–4378.
Board R. R. B. (2000) Inventory Team Report: Hydrology. Moorhead, MN, USA: Red River Basin Board.
Chatfield C. (2000) Time-Series Forecasting. Chapman and Hall/CRC, Boca Raton, Florida.
Chen C., Jiang J., Liao Z., Zhou Y., Wang H. & Pei Q. (2022) A short-term flood prediction based on spatial deep learning network: a case study for Xi County, China, Journal of Hydrology, 607, 127535.
Costabile P. & Macchione F. (2015) Enhancing river model set-up for 2-D dynamic flood modelling, Environmental Modelling & Software, 67, 89–107.
Dakota Water Science Center Red River of the North at Grand Forks, North Dakota - 129 Years.
de Loë R. (2009) Sharing the Waters of the Red River Basin: A Review of Options for Transboundary Water Governance. Guelph, Canada: Prepared for International Red River Board, International Joint Commission. Rob de Loë Consulting Services.
Fukuoka R., Suzuki H., Kitajima T., Kuwahara A. & Yasuno T. (2018) Wind speed prediction model using LSTM and 1D-CNN, Journal of Signal Processing, 22 (4), 207–210.
Goldberg M. D., Li S., Lindsey D. T., Sjoberg W., Zhou L. & Sun D. (2020) Mapping, monitoring, and prediction of floods due to ice jam and snowmelt with operational weather satellites, Remote Sensing, 12 (11), 1865.
Hirsch R. M. & Ryberg K. R. (2012) Has the magnitude of floods across the USA changed with global CO2 levels?, Hydrological Sciences Journal, 57 (1), 1–9.
Le X.-H., Ho H. V. & Lee G. (2019b) River streamflow prediction using a deep neural network: a case study on the Red River, Vietnam, Korean Journal of Agricultural Science, 46 (4), 843–856.
Lim Y. H. & Voeller D. L. (2009) Regional flood estimations in Red river using L-moment-based index-flood and bulletin 17B procedures, Journal of Hydrologic Engineering, 14 (9), 1002–1016.
Lu W., Li J., Li Y., Sun A. & Wang J. (2020) A CNN-LSTM-based model to forecast stock prices, Complexity, 2020 (1), 1–10.
Macek-Rowland K. M., Burr M. J. & Mitton G. B. (2001) Peak Discharges and Flow Volumes for Streams in the Northern Plains, 1996–97. US Geological Survey, Denver, CO.
Mosavi A., Ozturk P. & Chau K.-w. (2018) Flood prediction using machine learning models: literature review, Water, 10 (11), 1536.
Motta M., de Castro Neto M. & Sarmento P. (2021) A mixed approach for urban flood prediction using machine learning and GIS, International Journal of Disaster Risk Reduction, 56, 102154.
Nadeem M., Waheed Z., Ghaffar A., Javaid M., Hamza A., Ayub Z., Nawaz M., Waseem W., Hameed M. & Zeeshan A. (2022) Application of HEC-HMS for flood forecasting in hazara catchment Pakistan, south Asia, International Journal of Hydrology, 6 (1), 7–12.
Nayak P., Sudheer K., Rangan D. & Ramasastri K. (2005) Short-term flood forecasting with a neurofuzzy model, Water Resources Research, 41 (4), W04004.
North Dakota Agricultural Weather Network (NDAWN)
Rajib A., Liu Z., Merwade V., Tavakoly A. A. & Follum M. L. (2020) Towards a large-scale locally relevant flood inundation modeling framework using SWAT and LISFLOOD-FP, Journal of Hydrology, 581, 124406.
Rice J. S., Emanuel R. E., Vose J. M. & Nelson S. A. (2015) Continental US streamflow trends from 1940 to 2009 and their relationships with watershed spatial characteristics, Water Resources Research, 51 (8), 6262–6275.
Sankaranarayanan S., Prabhakar M., Satish S., Jain P., Ramprasad A. & Krishnan A. (2020) Flood prediction based on weather parameters using deep learning, Journal of Water and Climate Change, 11 (4), 1766–1783.
Sarafanov M., Borisova Y., Maslyaev M., Revin I., Maximov G. & Nikitin N. O. (2021) Short-term river flood forecasting using composite models and automated machine learning: the case study of Lena River, Water, 13 (24), 3482.
Todhunter P. E. (2001) A hydroclimatological analysis of the Red River of the North snowmelt flood catastrophe of 1997, Journal of the American Water Resources Association, 37 (5), 1263–1278.
Zhao G., Liu R., Yang M., Tu T., Ma M., Hong Y. & Wang X. (2022) Large-scale flash flood warning in China using deep learning, Journal of Hydrology, 604, 127222.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).