ABSTRACT
This study introduces an improved version of the harmonic oscillator seasonal trend (HOST) model framework to accurately simulate medium- and long-term changes in extreme events, focusing on streamflow droughts in the Mobile River catchment. The updated model incorporates new mathematical formulas and waveform synthesis, resulting in improved performance. The updated framework successfully captures long-term and seasonal patterns with a Kling–Gupta efficiency exceeding 0.5 for seasonal fluctuations and over 0.9 for trends. The best-fit model explains around 98% of long-term and approximately 55% of seasonal variance. Test sets show slightly lower accuracies, with about 20% of nodes underperforming due to the absence of drought during the test phase, resulting in false-positive model forecasts. The newly developed weighted occurrence classification outperforms the binary classification occurrence model. In addition, application of an automatic period multiplier for decomposition using the seasonal trend decomposition using LOESS (locally estimated scatterplot smoothing) method improves test dataset performance and reduces false-positive forecasts. The improved framework provides valuable insights for extreme flow distribution, offering potential for improved water management planning, and the combination of the HOST model with physical models can address short-term drivers of extreme events, enhancing drought occurrence forecasting and water resource management strategies.
HIGHLIGHTS
An improved harmonic oscillator seasonal trend model framework is introduced.
The updated framework successfully captures long-term and seasonal patterns of streamflow.
The best-fit model explains around 98% of long-term and 55% of seasonal variance.
The model can be used for simulation of other processes characterized by repeatability.
INTRODUCTION
The occurrence of extreme phenomena is considered a random process with underlying factors stemming from other processes such as precipitation, temperature, and humidity, which affect evaporation (Apurv et al. 2017; Zhao et al. 2019). The water balance, expressed as the ratio of precipitation to evaporation, allows for estimating the risk of occurrence of hydrological extremes like floods and droughts (Rateb et al. 2021; Jang et al. 2022). While floods or river water level surges are usually associated with an excess of water supply, whether in the form of precipitation, snowmelt, or surface runoff (Wehner et al. 2017), droughts occur due to a deficit in water supply when evaporation exceeds precipitation, such as when there is no rainfall for an extended period.
In recent years, more studies have focused on modeling droughts due to their increasing magnitudes and importance in shaping water availability and impacting human activities, risk management, and ecosystem stability (Mishra & Singh 2011). Hydrological extreme events such as droughts are inherently stochastic phenomena (Mishra & Desai 2005), influenced by transient factors such as precipitation and evaporation. Droughts are characterized by their long-term development process, related to large-scale atmospheric flow and climate patterns (Brunner et al. 2021); however, in recent years, the term ‘flash drought’ has been adopted to define fast-developing temporal droughts, which are becoming more common in an era of changing climate (Mo & Lettenmaier 2015; Otkin et al. 2018). This term describes rapidly occurring processes in the environment leading to the sudden onset of drought; therefore, the dynamics of the phenomenon are variable, and increasing attention is paid to the rate of process occurrence and the factors that condition them (Blauhut et al. 2016; Huang et al. 2017).
The occurrence of droughts exhibits periods with varying probabilities, attributed to larger environmental processes, all of which impact subsequent water availability (Chiew & McMahon 2002; Morid et al. 2006). These processes can be categorized into three scales: temporary, seasonal, and long-term, reflecting different event time frames and influencing factors. For this reason, scientists have been trying to understand the patterns of drought development and estimate the impact of individual factors on their intensity for several decades, leading to the recognition of several patterns in their development (WMO & GWP 2016; Wehner et al. 2017). The most widely recognized factor is the lack of precipitation; for a long time it was believed that droughts developed as a result of below-average atmospheric water supply. This type of drought, which starts with meteorological drought and subsequently progresses to agricultural and hydrological drought, is considered the primary type of drought (Wilhite & Glantz 1985; Wang et al. 2016; Minea et al. 2022). In addition to this classical pattern of drought development, increasing attention is being given to other characteristics such as geologic conditions. In recent years, more evidence has emerged suggesting that the geology of an area, at least in some regions of the world, plays a more significant role than precipitation. Worldwide, there is an emerging phenomenon in which hydrological drought, associated with low surface water levels, is observed despite atmospheric conditions with adequate precipitation (Staudinger et al. 2014; Raczyński & Dyer 2020; Zhao et al. 2020; Peterson et al. 2021).
These observations have prompted scientists to take a closer look at the repeatability of hydrological extreme occurrences, and, although research in this area is only available to a limited extent, more and more studies are recognizing the cyclicality of the phenomenon (McMahon & Finlayson 2003; Rad et al. 2016; Marx et al. 2018; Forootan et al. 2019). This fact has been confirmed to such an extent that some definitions of drought include this characteristic, stating that drought is a recurring process of water deficits occurring in the environment (Mahandrakumar 2022; Singh et al. 2023).
Research conducted on climate change suggests that extreme events will become more common in the coming years (Fang et al. 2018; Grillakis 2019; Cook et al. 2020; IPCC 2022); however, it is unclear where and in which direction these changes will occur. Studies available for the same regions show that, depending on the adopted rules and parameters, drought progression directions and forecasts are ambiguous (Beniston et al. 2007; Orlowsky & Seneviratne 2012; Piniewski et al. 2017). The discrepancies in results not only introduce uncertainty regarding the future but also hinder efficient water resource management by administrators and decision-makers.
The fundamental assumption in all emerging works related to hydrology, until the first decade of the 21st century, was stationarity. However, in the era of changing climate, increasing human influence on flow regimes, and irregular water supply, this characteristic is diminishing (Milly et al. 2008; Obeysekera & Salas 2016). Trend analyses performed to assess changes in the distributions of extreme events increasingly rely on methods associated with change-point detection (Lee & Yeh 2019; de Freitas 2020; Rolim & de Souza Filho 2020; Chelu et al. 2022). These points indicate moments in time series when a reversal of trends occurs, which also explains the discrepancies in results obtained from different studies. Naturally, the results of these methods are highly dependent on the sensitivity of the method itself. In some studies, time series are divided into two periods based on such points, while in others, the division consists of multiple periods. Interestingly, different regions on Earth at a certain point indicate a similar convergence of these changing periods. For example, research conducted in Central Europe indicates the existence of regions with an approximately 10-year cycle of recurring low-flow conditions (Raczyński & Dyer 2020), and similar long-term recurrence patterns have been identified in other regions (e.g. Moreira et al. 2015; Yerdelen et al. 2021). In many cases, authors emphasize the significance of regional factors like geology or land cover in shaping long-term changes in ongoing process trends (Bormann & Pinter 2017; Cervi et al. 2017; Guzha et al. 2018; Van Loon et al. 2019; Wang et al. 2019; Raczyński & Dyer 2021) or major oscillation impacts (Chiew & McMahon 2002; Ryu et al. 2010; Forootan et al. 2019; Abdelkader & Yerdelen 2022).
The reoccurrence and, above all, parameterization of the phenomenon of extreme events are essential for several reasons. First, it allows for determining the general direction of trends occurring in a given period. Second, it supports decision-making processes by considering long-term factors in water resource models or the occurrence of extremes such as droughts and their potential scales, where determining the frequency and duration of recurring periods facilitates proper management of water resource strategies and the development of regional demand systems (Tabari 2021). Unfortunately, most current studies in the field of extreme events do not discuss the issue of frequency (included in 12% of papers) or duration (included in 26%) (McPhillips et al. 2018).
Different approaches are used to address the changing characteristics and different scales of extreme hydrologic events, including a wide range of solutions such as physical modeling dealing with land surface hydrological models, statistical models focused on retrospective forcing datasets to support hydrologic reanalyses, and near-real-time satellite-based monitoring and analyses related to vegetation and evapotranspiration (Wood et al. 2015; Hao et al. 2018).
Prediction of temporal conditions falls within the domain of physical modeling, which can help assess the magnitude of temporal change or the triggering (or ending) of the phenomenon in the shortest time-step. This is due to the direct impact of meteorological factors like precipitation on streamflow. Physical models, however, struggle to address long-term processes and variability resulting from general water scarcity changes due to complex influences from both natural and anthropogenic factors. These processes are of key importance when modeling and forecasting the future of droughts, especially in perspectives that would allow for more precise water management planning (Mishra & Singh 2011). These limitations can be addressed with the application of wider, mainly statistical approaches.
Based on these assumptions, the harmonic oscillator seasonal trend (HOST) model was developed that uses harmonic functions for parameterizing recurring periods of extremes, presented as consecutive periods of the harmonic function, while also enabling the prediction of future patterns (Raczyński & Dyer 2023). The primary limitation of the initial version of the model was the introduction of only basic wave functions, which cannot capture more complex relationships such as variable function periods, changing intensity, or attenuation. In addition, in its initial form the model allows for the analysis of flow values and drought occurrence in a binary classification form, which does not allow for the consideration of the distribution of the phenomenon's intensity. As a result, the aim of this study is to address these limitations and expand the functionality of the HOST model by introducing new forms of functions and changing the flow of information within the model through the introduction of waveform synthesis into the workflow methodology. The study presents the performance assessment of the improved model in the context of streamflow drought modeling; however, the model application can be directly transferred to other processes characterized by repeatability, such as floods, precipitation extremes, volcanic or seismic activity, and others.
MATERIALS AND METHODS
Streamflow datasets
Along with the simulated NWM data, observed daily streamflow values from 58 US Geological Survey (USGS) gauging stations located in the Mobile River basin were used to perform model efficiency verification (Figure 1). The maximal available timeframe for all gauges was January 1995 to December 2018; however, as these data were only used for comparison against the NWM data, this timeframe was considered sufficient for inclusion. Similarly, only series with less than 5% of missing data were used, with missing values supplemented with the use of a moving average between the previous and next day's values.
For both the simulated and observed data, the series were split into training and testing sets using an 80/20 split, meaning 80% of the series (series beginning) were used for model training purposes and 20% (series end) for testing.
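The preprocessing described above can be illustrated with a minimal sketch. It assumes that daily flows are held in a plain list with None marking a missing day; the function names are hypothetical and are not taken from the HOST framework code.

```python
from typing import List, Optional, Tuple


def missing_fraction(flows: List[Optional[float]]) -> float:
    """Share of days without an observation (None marks a missing day)."""
    return sum(v is None for v in flows) / len(flows)


def fill_isolated_gaps(flows: List[Optional[float]]) -> List[Optional[float]]:
    """Replace a missing day with the mean of the previous and next day's values.

    Only gaps with valid neighbours on both sides are filled; longer gaps are
    left untouched (an assumption of this sketch).
    """
    filled = list(flows)
    for i in range(1, len(flows) - 1):
        prev_v, next_v = flows[i - 1], flows[i + 1]
        if flows[i] is None and prev_v is not None and next_v is not None:
            filled[i] = (prev_v + next_v) / 2.0
    return filled


def chronological_split(series: list, train_share: float = 0.8) -> Tuple[list, list]:
    """Use the beginning of the series for training and the end for testing."""
    cut = round(len(series) * train_share)
    return series[:cut], series[cut:]


# Synthetic 40-day recession with one missing day (2.5% missing).
daily = [float(v) for v in range(40, 0, -1)]
daily[5] = None

usable = missing_fraction(daily) < 0.05   # 5% criterion from the text
filled = fill_isolated_gaps(daily)
train, test = chronological_split(filled)  # 80/20 split, no shuffling
```

The split is deliberately not shuffled, so the testing set always follows the training set in time.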
Defining low streamflow events
A period of flows below a specified threshold level is considered a streamflow drought, according to the threshold level method that is widely applied in hydrologic sciences (Yevjevich 1967; Van Loon 2015). To avoid subjective determination of a threshold, an objective threshold method was applied such that the flow duration curve breakpoint located in the lower part (35% of lowest values) of the distribution indicates the optimal level for the threshold value. This point indicates an environmentally driven decision point that relates to the moment of the change of river supplementation from precipitation (normal conditions) to groundwater (dry conditions) (Raczyński & Dyer 2022). The breakpoint is determined by means of the Fisher–Jenks natural breakpoint algorithm.
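As an illustration of the objective threshold idea, the sketch below finds a two-class natural break within the lowest 35% of flow values. It is a simplified stand-in rather than the procedure of the cited method: it assumes the break is searched directly on the sorted flow values, that two classes are sufficient, and that the threshold is taken as the largest value of the lower class. All function names are hypothetical.

```python
from typing import List, Sequence


def _sse(values: Sequence[float]) -> float:
    """Sum of squared deviations of `values` from their mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def two_class_natural_break(values: Sequence[float]) -> float:
    """Upper bound of the lower class in a two-class natural-breaks split.

    The split minimises the total within-class sum of squared deviations,
    which is the criterion optimised by Fisher-Jenks for two classes.
    Requires at least two values so that both classes are non-empty.
    """
    ordered = sorted(values)
    best_cut, best_cost = 1, float("inf")
    for cut in range(1, len(ordered)):
        cost = _sse(ordered[:cut]) + _sse(ordered[cut:])
        if cost < best_cost:
            best_cut, best_cost = cut, cost
    return ordered[best_cut - 1]


def drought_threshold(flows: Sequence[float], lower_share: float = 0.35) -> float:
    """Breakpoint within the lowest `lower_share` of the flow distribution."""
    ordered = sorted(flows)
    n_low = max(2, round(len(ordered) * lower_share))
    return two_class_natural_break(ordered[:n_low])


# Synthetic flows: a cluster of very low values, a second low cluster, then normal flows.
flows: List[float] = [1.0, 1.1, 1.2, 1.3, 5.0, 5.2, 5.4] + [20.0 + i for i in range(13)]
threshold = drought_threshold(flows)
```

For the synthetic series above, the break separates the cluster around 1 from the cluster around 5, so the threshold is placed at the top of the lowest cluster.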
HOST model workflow
Decomposing the time series using the STL method enables effective capture and separation of the different components of the data, providing a robust foundation for accurate analysis of the underlying patterns and trends (Apaydin et al. 2021; He et al. 2022). Another advantage of using STL is its ability to capture the nonlinearity of the trend component.
After performing the decomposition on the time series describing short-term and long-term changes, a harmonic function is fitted using Equation (2). The modification introduced in this study expands the range of functions to include modified waves, which are described by the following equations:
Amplitude and frequency modulated, period changing wave function:
Each of the presented functions (Equations (3)–(10)) includes an additional term that describes the linear variability of the values. This modification is important because it allows for capturing a linear trend that may be present in the decomposed series representing the long-term changes. In such cases, the function considers the rate of change of the values based on the identified slope α. Since the seasonal data are detrended, this term is zeroed for models describing this component.
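The role of the linear term can be illustrated with a minimal sketch. The functional form used here, y = A·sin(2πt/P + φ) + α·t + c with a fixed period P, is an assumption chosen for illustration and is not a reproduction of Equations (3)–(10). For a fixed period, the model is linear in its coefficients and can be fitted with ordinary least squares; all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_sine_with_slope(t: Sequence[float], y: Sequence[float], period: float) -> Dict[str, float]:
    """Least-squares fit of y = a·sin(wt) + b·cos(wt) + alpha·t + c for a fixed period."""
    w = 2.0 * math.pi / period
    rows = [[math.sin(w * ti), math.cos(w * ti), float(ti), 1.0] for ti in t]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    a, b, alpha, c = solve_linear_system(xtx, xty)
    # a·sin(wt) + b·cos(wt) equals A·sin(wt + phi) with A = hypot(a, b), phi = atan2(b, a)
    return {"amplitude": math.hypot(a, b), "phase": math.atan2(b, a),
            "alpha": alpha, "intercept": c}


# Synthetic monthly series: 12-step wave on top of a slow linear increase.
t = list(range(120))
y = [3.0 * math.sin(2.0 * math.pi * ti / 12.0 + 0.5) + 0.02 * ti + 10.0 for ti in t]
params = fit_sine_with_slope(t, y, period=12.0)
```

Dropping the t column from the design matrix corresponds to zeroing the slope term, as is done for the detrended seasonal component.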
The fitting of each model variant is performed using the least squares method. Among all the models describing both components, the one with the highest percentage of explained variance or the highest efficiency index (according to the user's choice) is selected. In the latter case, the two implemented efficiency indices are the Kling–Gupta efficiency (kge) and the Nash–Sutcliffe efficiency. These indices provide a quantitative measure of how well the model fits the observed data and guide the selection of the best-fitting model.
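The two efficiency indices can be sketched as follows. The code assumes the standard formulation of the Kling–Gupta efficiency (correlation, variability ratio, and bias ratio combined as a Euclidean distance from the ideal point); whether the HOST implementation uses this or a modified variant is not stated here. The function and variable names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def pearson_r(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Linear correlation between observed and simulated values."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations around their mean."""
    mo = mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def select_best(obs: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Name of the candidate simulation with the highest KGE."""
    return max(candidates, key=lambda name: kge(obs, candidates[name]))


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "biased": [1.5, 2.5, 3.5, 4.5, 5.5],   # right shape, constant offset
    "exact": [1.0, 2.0, 3.0, 4.0, 5.0],
}
best = select_best(obs, candidates)
```

Both indices equal 1 for a perfect fit; the constant offset in the "biased" candidate lowers the KGE only through the bias ratio.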
The subsequent computational steps depend on the analyzed event parameter: minimal (or maximal) values, occurrence (binary), or magnitude. In the case of analyzing minimum flow values, the model's efficiency is evaluated because the direct function values represent simulations of those flows. For the parameter representing streamflow drought occurrence, the direct function values do not provide explanatory information; therefore, a topological analysis is performed where successive function values are treated as decision thresholds. Values of the function above this threshold indicate event occurrence while values below it indicate its absence, after which the goodness-of-fit of the model to the original data is evaluated. The best-fit threshold level, represented by the highest f1 score value, is taken as the final decision threshold.
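The threshold search for the occurrence parameter can be sketched as below. The sketch assumes that a function value equal to the candidate threshold counts as an event and that ties in the f1 score are resolved in favor of the lowest threshold; neither detail is specified in the text, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_score(observed: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall; 0 when there are no true positives."""
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def best_decision_threshold(function_values: Sequence[float],
                            observed: Sequence[bool]) -> Tuple[float, float]:
    """Treat each function value as a candidate threshold and keep the best f1."""
    best_threshold, best_f1 = float("nan"), -1.0
    for threshold in sorted(set(function_values)):
        # Assumption: values at or above the threshold indicate an event.
        predicted = [v >= threshold for v in function_values]
        score = f1_score(observed, predicted)
        if score > best_f1:
            best_threshold, best_f1 = threshold, score
    return best_threshold, best_f1


# Fitted function values per aggregation step and the observed drought flags.
function_values: List[float] = [0.1, 0.4, 0.9, 0.8, 0.3, 0.2, 0.7, 0.95]
observed = [False, False, True, True, False, False, True, True]
threshold, score = best_decision_threshold(function_values, observed)
```

In this synthetic case the events are perfectly separable, so the search returns the lowest function value among the event steps.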
RESULTS
Model performance was estimated for all calculation variants: flow analysis, where direct values of the minimum flows were modeled in aggregation steps (monthly); occurrence of streamflow drought, as a binary classifier, using the objective threshold approach; and magnitude of streamflow drought.
For the seasonal component, only three models achieve a median kge (and r2) around 0.5 (0.45), including the sine, damped, and frequency modified models. Although generally indicating results above the acceptance level, the remaining models show much lower goodness-of-fit measures (Figure 4). All underperforming models show a similar characteristic of varying amplitude, suggesting that the seasonal component is characterized by recurring, constant magnitude in the region. This behavior is observed in all three studied parameters.
Model performance is dependent on input data quality. Studies by Jachens et al. (2018), Johnson et al. (2019), and Wan et al. (2022) show that lower order (<4) streams relate to low model performance, and higher order streams should be used due to their better representation of real observations. In the studied sample, data for stream orders 3 and 4 performed similarly, and while for each higher order the results show lower efficiency variability and higher goodness-of-fit metrics, the increase is not substantial. Although, for occurrence, the IQR for each model f1 score starts above 0.5, the 25th percentile of test efficiency for the magnitude parameter drops below the acceptance level for fourth order nodes.
Within the training sets, accuracy statistics follow a left-skewed distribution, indicating higher fits, with a mode around 0.8 (0.975 for recall). For testing sets, the distribution is more equalized, with the mode centering between 0.6 and 0.8. Higher goodness-of-fit statistics for binary classification in the training sets do not translate to better results when compared to the testing sets. The distribution of accuracy statistics in the testing sets is closer to the training data for magnitude-based occurrence, and at the same time the mode or mean values are higher due to distributions more closely resembling a normal distribution (Figure 9). This behavior suggests a slight overfitting of binary classified occurrences; however, for both models, there is a group of nodes indicating values of 0 for which no events occurred during the testing period, resulting in false-positive predictions, which confirms previous assumptions. This group is larger when using magnitude-based occurrence, which in turn suggests that the drop in precision from binary classification is transferred toward higher recalls, reducing the number of false-positive predictions.
Occurrence models can be constructed in three different ways: as binary classification (month with event present marked as 1 and without as 0), nonbinary classification (month with event is scaled based on the number of days with the event), and magnitude-based occurrence (the weight of the month constitutes the magnitude of the event in that period).
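The three constructions can be compared in a short sketch. Here the nonbinary weight is taken as the fraction of days in the month below the threshold, and the magnitude is taken as the cumulative flow deficit below the threshold; the magnitude definition in particular is an assumption made for illustration, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def monthly_occurrence(daily_flows_by_month: Sequence[Sequence[float]],
                       threshold: float) -> Tuple[List[int], List[float], List[float]]:
    """Build binary, nonbinary, and magnitude-based occurrence series per month."""
    binary: List[int] = []
    nonbinary: List[float] = []
    magnitude: List[float] = []
    for flows in daily_flows_by_month:
        # Deficit on each drought day (flow below the threshold).
        deficits = [threshold - q for q in flows if q < threshold]
        binary.append(1 if deficits else 0)            # event present or absent
        nonbinary.append(len(deficits) / len(flows))   # share of drought days
        magnitude.append(float(sum(deficits)))         # assumed: cumulative deficit
    return binary, nonbinary, magnitude


# Three short synthetic "months" of daily flows and a threshold of 4.0.
months = [
    [5.0, 6.0, 7.0, 8.0],   # no drought
    [2.0, 3.0, 5.0, 6.0],   # drought on half of the days
    [1.0, 1.0, 2.0, 2.0],   # drought throughout
]
binary, nonbinary, magnitude = monthly_occurrence(months, threshold=4.0)
```

The second and third months are identical under binary classification but differ in both weighted variants, which is the distinction the weighted models are meant to retain.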
The primary occurrence model treats a streamflow drought event as a binary positive classification (true value: 1) and a period with no drought as a negative classification (false value: 0). The length of the event within the aggregation step has no impact on the modeled characteristic. This model reaches an f1 score of about 0.9 for training data (median of acc > 0.8) and f1 = 0.75 for testing (acc = 0.63). A decrease in accuracy statistics of 0.02–0.1 is observed when the magnitude-based occurrence model is used instead of binary classification. While this model uses weights, the direct characteristic is the magnitude of drought and not the length of drought. This supports accurate distribution of model weights between aggregation periods, as nonbinary occurrence models that use drought length as weight indicate the lowest metrics. The median of the f1 score for the training and testing sets in the nonbinary occurrence model is below 0.5 and 0.25, respectively, and accuracy oscillates around 0.5. In addition, this model has the lowest recalls, with IQR spanning all values (0–1), while for the remaining two models it is usually between 0.75 and 1.
While natural processes are characterized by certain repeatability, due to climate change, some processes might become more or less severe. The disappearance of streamflow droughts is reflected in the HOST model by applying the slope component to the trend data, as well as by applying damped harmonics, which reflect stronger signals of disappearance. This behavior might not be desired for forecasting purposes. There is a slight increase in all four accuracy statistics (between 0.04 and 0.1) describing occurrence models without damped functions included, which refers only to studied nodes with damped models found in regular fitting. Flow model variants that include damped models perform slightly better, with r2 values higher by around 0.02–0.05. Component-wise comparison reveals that models that include damped functions deal better with simulating long-term changes, where r2 distribution is mainly within 0.9–1.0 (0.58–0.98 for damped excluded). The seasonal models reach a closer distribution, however, when damped waves are excluded, with overall scores about 5–10% lower.
SUMMARY AND CONCLUSIONS
Drought forecasting remains a challenge as the processes driving the phenomenon are associated with multiple factors across a range of different time scales (Carrao et al. 2018; Hao et al. 2018). Subseasonal, seasonal, or decadal processes affect drought progression, combined with changes in the environment driven by both natural and anthropogenic impacts, which create challenges in drought prediction on longer timeframes (Kam et al. 2014; Hao et al. 2018). Factors such as circulation patterns, snow, or groundwater connectivity play crucial roles in the development of the hydrologic cycle in various regions of the world (AghaKouchak et al. 2015; Golian et al. 2015; Hao et al. 2018; Raczyński & Dyer 2020; Dyer et al. 2022), as confirmed in simulations for hypothetical catchments by Apurv et al. (2017). In addition, the growing role of human interventions like urbanization, population growth, new infrastructure, and agricultural water use is increasingly incorporated into the hydrologic cycle (Yuan et al. 2017). Such a wide range of factors increases the complexity of models designed to detect changes in water availability as they impact streamflow on different scales.
For example, in one region, the seasonality of precipitation or long-term climatic variability associated with changes in sea surface temperatures might impact the annual recurrence of drought, while in other regions, groundwater supplementation to rivers might prevent drought formation even during unfavorable meteorological conditions (Schubert et al. 2004; Haslinger et al. 2014; Woodhouse et al. 2016; Hanel et al. 2018; Raczyński & Dyer 2022). Interestingly, Van Loon et al. (2014) found that the seasonality of climate does not impact meteorological droughts but strongly relates to hydrological droughts through the seasonality of precipitation and temperature. It is often found that no single factor is solely responsible for a significant portion of the variance, and only combined factors included in the model provide enough information to perform valuable simulations (Baker et al. 2008).
To address these challenges, a range of methods has been developed over the years, which can generally be classified into three categories: statistical methods, dynamic modeling, and hybrid approaches (Mishra & Singh 2011; Mariotti et al. 2013; Pozzi et al. 2013). These methods can address drought formation processes on different scales. Slowly varying components of the climate system, such as sea surface temperature or land surface characteristics, affect long-range processes (Goddard et al. 2001; Schubert et al. 2004, 2016; Smith et al. 2012; Roundy & Wood 2015), while, for short-term forecasts, the initial state of the atmosphere plays a more important role (Wedgbrow et al. 2002). Generally, statistical models are better equipped to simulate long-term relations, while dynamic modeling is used for short-term forecasts.
Statistical models utilize empirical relationships from historical records to discern patterns and extrapolate them into the future (Hao et al. 2018). Among the most commonly used statistical methods are time series models, regression models, Markov chains, and machine learning models, as well as probabilistic modeling such as Bayesian methods or copulas (Chowdhary & Singh 2010; Madadgar & Moradkhani 2013; Yuan et al. 2013; Hao et al. 2018; Shin et al. 2020). Time series models, such as AutoRegressive Integrated Moving Average (ARIMA), are well-suited for predicting droughts based on indices with long-term characteristics, like precipitation patterns (Hao et al. 2018); however, this approach assumes a linear relationship between variables as it relies on persistence (Hao et al. 2018; Brunner et al. 2021). Evidence suggests that models like ARIMA or Seasonal AutoRegressive Integrated Moving Average with eXogenous regressors (SARIMAX) are reasonably good for forecasting in 1- to 2-month periods, but their accuracy diminishes for longer forecasts (Mishra & Desai 2005; Durdu 2010; Han et al. 2013; Band et al. 2022). Other popular techniques involve machine learning, such as different types of neural networks, which can model complex interactions; however, these models may require large datasets, and all transformations are hidden, limiting their broader application to different regions without replicating the entire building and fitting procedure, and they are generally prone to overfitting (Mishra & Singh 2011; Rolim da Paz et al. 2011; Belayneh et al. 2014; Sigaroodi et al. 2014; Hosseini-Moghari & Araghinejad 2015; Ali et al. 2018; Hao et al. 2018; Wu et al. 2022). Furthermore, many of these models do not allow access to their internal parameters, hindering adjustments to accommodate changing conditions.
The dynamical models used for short-term modeling primarily aim to reflect the physical processes of the atmosphere and land, providing accurate information about water resource responses to instantaneous conditions. In drought forecasting, these models are typically hydrological and driven by climate forecasts (Hao et al. 2018). Limitations of dynamical models are often associated with biases from long-term patterns not reflected in the physical formulas, and different results yielded by different models, even with the same forcing variables (Wang et al. 2011; Hao et al. 2018).
With these limitations in mind, the HOST model was designed as a pattern recognition framework that could be used either as a statistical method for pattern description or for long-term bias correction in hybrid modeling approaches. The model enables accurate simulations of long-term and seasonal processes and parameters of extreme events, including direct values, magnitudes, and occurrences. Simultaneously, the model provides access to its parameters, allowing for statistical analysis of spatial and temporal aspects of the results. The new model iteration includes several new numerical representations of wave functions that accurately capture short- and long-term distributions and address the issues of linear time progression of variables.
A limitation of the model is its inability to address factors affecting streamflow momentarily, as in dynamic modeling; however, this limitation provides an opportunity to create a link between the two approaches to develop a new hybrid modeling method capable of reflecting both long- and short-term patterns in streamflow, as well as temporal changes from varying atmospheric conditions.
The improved HOST model framework introduced in this work allows for accurate simulations of both long-term and seasonal patterns of drought. Seasonal fluctuations are explained by the newly introduced models with a kge greater than 0.5, while trends reach a kge greater than 0.9. Once the best-fit model is identified, it explains approximately 98% of the long-term variance and around 55% of the seasonality variance.
When applied to test sets, the models generally exhibit lower accuracies, although the decrease is not significant; however, about 20% of nodes show significant underperformance on test datasets. This is attributed to the lack of drought occurrences during the testing period, leading the models to generate false-positive forecasts. This dataset-specific characteristic should be considered during the modeling stage for specific case scenarios in the future. There is only a slight improvement in model statistics when using USGS observation data as input compared to NWM simulations, and there is no significant difference in the model's performance between different river orders (considering nodes of order 3–8).
The binary classification for event occurrence yields the best goodness-of-fit metrics, followed by magnitude-based occurrence models; however, the nonbinary models that use period length as weights show significantly lower performance compared to the previous two. The application of an automatic period multiplier for STL decomposition improves the model's performance on test datasets and reduces the number of false-positive forecasts.
The presented model can accurately simulate seasonal and long-term patterns in the distribution of extreme flows, presenting further opportunities for application and development. The increasing number of datasets, both from land observations and simulations, as well as aerial and satellite measurements combined with big data techniques, has allowed for the construction of multivariable, combined models (Kao & Govindaraju 2010; Hao & AghaKouchak 2013; Sellars et al. 2013; AghaKouchak et al. 2015; Hao & Singh 2015). These models are often computationally expensive, but they can provide integrated and robust simulations of dry conditions thanks to their diverse range of forcing factors, thus increasing their predictive power in different ecosystems (Mishra & Singh 2011; AghaKouchak et al. 2015). The main advantage of the HOST model is its open structure that allows for generating the detailed parameters of simulated processes. The HOST framework is delivered as Python code and allows for a wide analysis of recurrence processes within any time series. This greatly improves applicability to other regions, as well as to applications beyond low streamflow analysis. It must be noted, however, that the performance of the method applied to different datasets has not yet been assessed, and some future changes may be required to adjust the method to other datasets. An additional limitation to the scalability of the method over different datasets is its vulnerability to complicated, multicomposed datasets, where decomposition into three components (long-term, short-term, and residuals) might be insufficient if additional patterns exist within a decomposed series. Future work on expanding the HOST model to other data sources will help address this issue.
The main limitation of the HOST model is its inability to address the shortest timeframe in water resources modeling, representing instantaneous conditions. To address this limitation, the HOST model can be incorporated into a combined modeling system in which separate models are responsible for simulations at short time scales, while long-term and seasonal fluctuations in water availability are handled within the HOST model framework. Such a combined modeling framework could serve as a new foundation for precise flow simulations and drought occurrence forecasting, improving current simulations and providing stakeholders with tools for better water management and planning strategies. The next stage of model development should include constructing such a modeling environment and testing it both in applied use and in other scientific applications.
FUNDING
This research was funded by the National Oceanic and Atmospheric Administration (NOAA), grant number NA19OAR4590411.
AUTHOR CONTRIBUTIONS
KR contributed to conceptualization, methodology, software, formal analysis, investigation, and visualization. JD contributed to resources, supervision, project administration, and funding acquisition. KR and JD both contributed to validation, data curation, writing the original draft, and reviewing and editing the manuscript. All authors read and agreed to the published version of the manuscript.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories. Publicly available datasets of NOAA National Water Model Retrospective v2.1 were analyzed in this study. These data can be found at: https://registry.opendata.aws/nwm-archive/. The current version of the HOST model framework software can be found at: https://github.com/chrisrac/hostmodel.
CONFLICT OF INTEREST
The authors declare there is no conflict.