Abstract
Nzoia River Basin is an important water resources development and management unit not only in Kenya but also in the region. The basin drains into Lake Victoria and frequently experiences severe flooding, and its catchment has recently been dominated by unsustainable human activities. The resulting perturbations, coupled with a changing climate, could have far-reaching consequences in the near future. Efforts to understand the influence of these changes are complicated by a dwindling hydro-meteorological network. This study explored the efficacy of satellite data for estimating runoff time series for the basin. The NAM model in MIKE 11 was set up, calibrated and validated, and a runoff time series was generated. Goddard Earth Observing System model version 5 (GEOS-5) satellite data from NASA were then used to derive runoff series, both with and without calibration, and the results were compared to identify which approach gave better estimates. The optimal runoff model parameters were 12.7 for Umax, 136 for Lmax, 0.2 for CQOF, 1,100 for CKIF, 61.65 for CK1,2, 0.248 for TOF, 0.198 for TIF, 0.0495 for TG and 2,825 for CKBF. These gave a water balance of about 0.98 and a root mean square error of about 460, with R2 of 0.9 and 0.8 for calibration and validation, respectively. Calibrated GEOS-5 data performed reasonably well in runoff estimation for Nzoia basin. While non-calibrated satellite data performed poorly, with an R2 of 0.6 in relation to observed discharge, calibrated data performed considerably better, with an R2 of 0.96 for daily values. Satellite-derived runoff was found to overestimate the discharge, especially the peak discharges. This study showed that satellite data such as NASA's GEOS-5, when adequately calibrated, provide suitable estimates of discharge in sparsely gauged basins.
HIGHLIGHTS
Nzoia River Basin is a critical water resources development and management unit but useful decisions cannot be made since the hydro-meteorological network is declining.
Satellite data has been explored to assess its efficacy in runoff generation.
Surface-groundwater interaction is the main aspect that influences runoff generation.
Better estimates were derived from calibrated GEOS-5.
INTRODUCTION
Meteorological data with poor spatial and temporal resolution typically result in lower-quality hydrological estimates. In a classic domino effect, poor runoff estimation leads to recurring inefficiency of early flood warning systems and overtopping of hydraulic structures, among other effects. These may lead to mass loss of human life and property, as witnessed in the Mississippi lowlands and the Indus valley in 2010 (Gajbhiye 2015). For instance, in the Indus basin, 175 million hectares (about 66% of the land in India) suffered from the detrimental effects of improper management resulting from inaccurate runoff data. Consequently, floods are common in that basin during peak flows. The Federal Emergency Management Agency (FEMA) reports that flooding is the most common natural disaster, leading to more loss of life, property and economic activity (FEMA 2020).
Developing countries face similar issues. Floods occur in Kenya every 2–7 years (Onecan et al. 2016). Between October 2015 and January 2016, 112 lives were lost and more than 100,000 people were displaced (Davies 2016). Because flooding recurs so frequently, those living in the flood plains remain extremely vulnerable even when the magnitude is low. One major factor behind this is the lack of early flood warning systems. Moreover, an assessment of a community in the Lake Victoria basin indicated that citizens living in flood-prone (high-risk) areas perceived floods as ‘inevitable and fairly unpredictable’ (Nyakundi 2010). This view seems surprising considering the severity and frequency of floods in these areas. However, it underscores the hardship that flooding has persistently inflicted on these communities, to the extent that they imagine nothing can be done.
Availability of adequate runoff data for early warnings is critical if communities at risk are to be protected from the devastation caused by floods. The frequency of flooding incidents is expected to increase in the near future (GoK 2012), since the areas likely to receive the highest rainfall are the flood-prone areas in Western Kenya around Mount Elgon, the Elgeyo escarpment and the Cherangani Hills (Onecan et al. 2016). This region is the drainage basin for River Nzoia, which drains into Lake Victoria. The Nzoia basin is of particular interest because it continuously experiences flood-related disasters (Muku & Nyandwaro 2013).
Although weather stations in the Nzoia basin seem to provide reasonably accurate data on the ground (Shilenje & Ogwang 2015), the stations are sparsely distributed, suggesting they are insufficient for runoff estimation. A sparse rain gauge network cannot reflect rainfall variability caused by topography and orography and will result in erroneous estimates of areal rainfall (Andréassian et al. 2001). An alternative source of rainfall data, preferably one that circumvents the limited areal distribution of weather stations, is therefore required. In addition to the poor spatial distribution, data quality from the few operating weather stations is poor. Due to various logistical issues, weather stations in Nzoia basin have so many missing values that, as realized during the study, only 10 stations in the entire basin have 90% complete data.
Satellite data such as the Goddard Earth Observing System model version 5 (GEOS-5) product offer a reliable alternative. As with all remotely sensed data, calibration against the available station data is necessary. This study used calibrated GEOS-5 satellite data to estimate runoff in Nzoia River Basin, an important water management region in Kenya with a decreasing quantity and quality of station data.
Remote sensing devices that estimate runoff from direct observation are not yet commonly used. The usual approach when using remote sensing for runoff estimation is to capture two sets of data: (1) the runoff model input data and (2) the model parameter data. A mathematical model that embodies the parameter data is then used to translate the input data into runoff for the basin (Schultz 1996). Of the methods available for acquiring remotely sensed data, satellites offer the best alternative to complement observed runoff. Once set up, satellites can record data continuously if required, and the data can be stored in the cloud for immediate access by interested parties. However, satellite data (as with all remote sensing data) are subject to attenuation and backscatter errors, so their spatial accuracy may be inadequate for runoff analysis. To address this and take full advantage of their temporal coverage, the satellite data are calibrated before use in runoff estimation.
The runoff model used in the study was the NAM model in MIKE 11. With NAM, it is possible to treat a complex river system with numerous channels within the same modelling framework, as done by Singh et al. (2014). The model has also been reported to perform well when assessed using R2 metrics (Teshome et al. 2020). The combination of these characteristics makes it particularly suitable for use in Nzoia basin.
Hydrological models available in current software employ the use of a cascade/series of linear reservoirs where every reservoir/storage empties into the next until runoff is obtained.
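As a rough illustration of the cascade idea, the sketch below routes an inflow series through linear reservoirs in series, each releasing outflow in proportion to its storage (Q = S/K). The function name, the time constants and the explicit time step are illustrative assumptions; this is not the NAM formulation.

```python
def route_cascade(inflow, time_constants, dt=1.0):
    """Route an inflow series through linear reservoirs in series.

    Each reservoir releases Q = S / K per unit time, where S is its storage
    and K its time constant (same time units as dt). The explicit scheme
    used here needs dt <= K for every reservoir to remain stable.
    """
    if any(k < dt for k in time_constants):
        raise ValueError("each time constant must be >= dt")
    storages = [0.0] * len(time_constants)
    outflow = []
    for q_in in inflow:
        q = q_in
        for i, k in enumerate(time_constants):
            q_out = storages[i] / k          # release proportional to storage
            storages[i] += (q - q_out) * dt  # water balance of this reservoir
            q = q_out                        # outflow feeds the next reservoir
        outflow.append(q)
    return outflow, storages


# Hypothetical example: a single 10 mm pulse routed through two reservoirs.
pulse = [10.0] + [0.0] * 29
flow, remaining = route_cascade(pulse, time_constants=[2.0, 5.0])
```

The routed hydrograph is delayed and attenuated relative to the input pulse, and the water released plus the water still in storage equals the total inflow.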
Common software applied for hydrological modelling includes (Table 1):
Hydrological runoff modelling can be carried out using NAM MIKE 11 with only rainfall and temperature data as forcing. The model is calibrated for the basin by varying the input and storage parameters until the simulated discharge resembles the observed discharge. In total it therefore requires three input data sets: temperature, rainfall and observed discharge. These data were readily available to the researcher, so NAM MIKE 11 was chosen as the preferred model for the runoff analysis.
NAM MIKE 11 is a deterministic conceptual runoff model that balances the inputs and storages of a basin to determine the runoff from a basin. In the case of this project, NAM was prepared with 9 parameters representing the surface zone, root zone and ground water storages.
In their study, Brirhet & Benaabidate (2016) found that lumped and conceptual models were both suitable for use in the same watershed, and that accuracy was not affected when the two were interchanged. A lumped model can therefore be a satisfactory substitute for a spatially distributed one. For Nzoia basin, a lumped model was desirable because of data scarcity, which makes NAM MIKE 11 particularly attractive for this study.
MATERIALS AND METHODS
Study area
From a physiographic and land use point of view, the basin has four distinct zones: a highland zone (2,300–2,850 m above sea level, ASL), a plateau zone (1,900–2,300 m ASL), a transition zone (1,600–1,900 m ASL) and a lowland zone (1,000–1,600 m ASL). From the digital elevation model used in the study, anything above 2,850 m ASL was taken to represent areas occupied by the mountain peaks. These areas reach the maximum elevation of the watershed, up to 4,266 m ASL. The highest peak in the north-western part of the basin is Mount Elgon, while in the north-eastern part it is the Cherangani Hills.
Physiographically, the highland zone is forested. However, it suffers severe land degradation. The plateau zone is the major maize and dairy farming area in the basin. The transition and lowland zones are characterized by a mix of sugarcane and small-scale farming. The lowland areas are generally flat and consequently flood-prone and swampy.
Data collection
NASA satellite data and weather station data for the period 1982–2009 were collected and checked for quality by statistical analysis. A runoff model was then used to estimate runoff for Nzoia basin.
Satellite data was obtained from the NASA Prediction of Worldwide Energy Resources (NASA POWER) service. Weather station data was collected from 14 stations, as shown in Table 2. The stations were chosen based on completeness of data. Except for Kaptagat station, which was selected to ensure a good spatial distribution, all stations had less than 10% missing data. Temperature data was available from only three stations, viz. Kakamega, Eldoret and Kitale.
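The station screening described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (a list per station with `None` for missing values), the function names and the sample values are hypothetical; only the 90% completeness threshold and the idea of retaining one station for spatial coverage come from the text.

```python
def percent_complete(values):
    """Percentage of non-missing entries; None marks a missing record."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None)
    return 100.0 * present / len(values)


def select_stations(records, threshold=90.0, keep_anyway=()):
    """Return names of stations meeting the completeness threshold.

    keep_anyway lists stations retained regardless of completeness,
    for instance to preserve spatial coverage of the basin.
    """
    return [
        name for name, values in records.items()
        if percent_complete(values) >= threshold or name in keep_anyway
    ]


# Hypothetical monthly rainfall records (mm); None = missing value.
records = {
    "A": [12.0, 30.5, None, 80.1, 55.0, 20.0, 10.0, 5.0, 0.0, 40.0],
    "B": [10.0, None, None, 75.0, 60.0, None, 12.0, 4.0, 1.0, 35.0],
    "C": [11.0, 28.0, 33.0, 79.0, 58.0, 22.0, 9.0, 6.0, 2.0, 38.0],
}
strict = select_stations(records)                        # threshold only
with_override = select_stations(records, keep_anyway=("B",))
```

Here station B falls below the threshold but is kept in `with_override`, mirroring how a less complete station can be retained for spatial distribution.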
Software | Description |
---|---|
SMART | Includes agricultural, subsurface drainage flow, in addition to soil and ground water reservoirs to simulate flow path contributions to streamflow. |
Vflo | Uses radar rainfall and GIS data to generate physics based, distributed runoff simulation. |
WEAP | Models runoff and percolation from climate and land-use data, using a choice of linear and non-linear data. |
RS MINERVE | Simulates formation of free surface runoff flow and its propagation in rivers or channels. |
NAM MIKE 11 | Balances input and storages on a basin to determine runoff from a basin. |
MIKE SHE | Builds and simulates ground water and surface water flow by simulating entire land phases and allowing their customization depending on local needs. Especially suitable for analyzing groundwater effects on surface water. |
SWAT tool | A watershed- to river-basin-scale model used to simulate surface water and groundwater quality. The tool can be used to determine the environmental impact of changes in land use and is typically used to study soil erosion and pollution. |
Statistical analysis
Statistical analysis was a core part of the project. It enabled quality control of the input and output data and simplified the data for easy and accurate analysis. It was also a means of organizing, sorting, analyzing and testing the viability of the data at every step of the process.
Statistical operations that were carried out during the study are outlined in Table 3 below:
Processing rainfall and temperature data
In this study, monthly data was used. A regression between concurrent monthly satellite and weather station data gave factors that were used to calibrate the satellite data.
For rainfall data, factors were calculated on a monthly basis for all the stations and then averaged out to represent the entire basin. For the entire watershed, factors were obtained by dividing the monthly total averages of the observed data by similar data from satellites. Therefore, any fraction (value below 1) represented an overestimation by satellite data and any value above 1 represented an underestimation.
(1 represents January, 2 February and so on and so forth)
For both maximum and minimum temperatures (Figure 3 and Figure 4), factors were obtained as the difference between observed (station) and satellite (NASA) data. Positive values therefore indicate an underestimation by the satellite data, negative values indicate an overestimation, and 0 indicates that the two agree.
The satellite data for maximum temperatures overestimate in January, and the overestimation grows to a maximum of about 2 degrees (a difference of −2) in March. It then gradually reduces to almost no error in May, crosses over to an underestimation that peaks at 0.5 degrees in July, falls back to about 0 in September, and shows a small overestimation from October to December.
Minimum temperatures are consistently overestimated, by up to about 3 degrees in September.
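The two kinds of correction described in this section, a multiplicative factor for rainfall (observed divided by satellite) and an additive offset for temperature (observed minus satellite), can be sketched as below. The function names, the `(month, value)` data layout and the sample numbers are assumptions made for illustration, and the sketch uses simple monthly means rather than the regression mentioned in the text.

```python
def monthly_means(series):
    """Average values by calendar month; series is a list of (month, value)."""
    totals, counts = {}, {}
    for month, value in series:
        totals[month] = totals.get(month, 0.0) + value
        counts[month] = counts.get(month, 0) + 1
    return {m: totals[m] / counts[m] for m in totals}


def rainfall_factors(observed, satellite):
    """Multiplicative factor per month: observed mean / satellite mean.

    A factor below 1 means the satellite overestimates; above 1, it
    underestimates.
    """
    obs, sat = monthly_means(observed), monthly_means(satellite)
    return {m: obs[m] / sat[m] for m in obs if m in sat and sat[m] != 0}


def temperature_offsets(observed, satellite):
    """Additive offset per month: observed mean - satellite mean."""
    obs, sat = monthly_means(observed), monthly_means(satellite)
    return {m: obs[m] - sat[m] for m in obs if m in sat}


def calibrate(series, factors=None, offsets=None):
    """Apply monthly factors (rainfall) and/or offsets (temperature)."""
    out = []
    for month, value in series:
        if factors is not None:
            value = value * factors.get(month, 1.0)
        if offsets is not None:
            value = value + offsets.get(month, 0.0)
        out.append((month, value))
    return out


# Hypothetical values: month 1 = January, 2 = February, 3 = March.
rain_obs = [(1, 50.0), (1, 70.0), (2, 40.0)]
rain_sat = [(1, 80.0), (1, 70.0), (2, 32.0)]
factors = rainfall_factors(rain_obs, rain_sat)   # month 1 -> 0.8, month 2 -> 1.25
offsets = temperature_offsets([(3, 26.0)], [(3, 28.0)])   # month 3 -> -2.0
tmax_calibrated = calibrate([(3, 28.0)], offsets=offsets)
```

In this toy case the January factor of 0.8 scales down an overestimating satellite series, while the March offset of −2.0 lowers the satellite maximum temperature to match the station value.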
Model development
This study used the one dimensional (1D) unsteady flow hydrodynamic NAM runoff model available in the MIKE 11 model from the Danish Hydraulic Institute (Danish Hydraulic Institute 2007). This is a deterministic, lumped and conceptual rainfall-runoff model accounting for water storage in up to 4 different storages. During the calibration process, selected parameters (Table 4) were varied until suitable estimates of runoff for the basin were obtained. The calibration period was 1982–1986 with a warm-up period of two years while the validation period was 1987–1989.
Station ID | Station Name | Latitude | Longitude | % completeness |
---|---|---|---|---|
8934161 | Busia | 0.483 | 34.133 | 97.3 |
8934059 | Uholo | 0.185 | 34.304 | 95.7 |
8934134 | Bungoma | 0.583 | 34.567 | 98.9 |
8934096 | Kakamega | 0.283 | 34.767 | 99.7 |
8934072 | Nandi | 0.150 | 34.933 | 93.3 |
8934061 | Malava | 0.446 | 34.851 | 90.3 |
8934016 | Lugari | 0.667 | 34.900 | 97.9 |
8935181 | Eldoret | 0.533 | 35.283 | 99.9 |
8935170 | Lukamanda | 0.633 | 35.050 | 90.6 |
8834013 | Adc | 1.033 | 34.800 | 91.3 |
8934008 | Kiminini | 0.900 | 34.917 | 96.6 |
8834098 | Kitale | 1.000 | 34.983 | 99.3 |
8935104 | Kaptagat | 0.867 | 35.500 | 81.6 |
8935137 | Timboroa | 0.067 | 35.533 | 91.8 |
Parameter | Equation used | Definition of variables |
---|---|---|
Arithmetic mean (Xavg) | | n is the total number of terms. Xi is the ith value of variable X in the population sample. |
Standard deviation (σ) | | |
Coefficient of variation (Cv) | | |
Coefficient of skewness (Cs) | | N is the number of years (35 yrs. for data in the study). |
Probability of exceedance | | f is the number of each event in order of magnitude; b is the number of years in record. |
Highest outlier (yhigh) | | For a sample size of 35 years, Kn = 2.628 (Chow et al. 1988). |
Lowest outlier (ylow) | | |
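For reference, the usual textbook forms of the quantities listed in Table 3 are sketched below. They are assumptions about the expressions used, not a transcription: in particular the sample (n − 1) form of the standard deviation, the Weibull plotting position for the probability of exceedance, and the outlier thresholds in the style of Chow et al. (1988), where \bar{y} and s_y are commonly the mean and standard deviation of the log-transformed data.

```latex
\begin{align}
  X_{avg} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
  \sigma  &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - X_{avg}\right)^2} \\
  C_v     &= \frac{\sigma}{X_{avg}} \\
  C_s     &= \frac{N\sum_{i=1}^{N}\left(X_i - X_{avg}\right)^3}{(N-1)(N-2)\,\sigma^3} \\
  P       &= \frac{f}{b+1} \\
  y_{high} &= \bar{y} + K_n\, s_y \\
  y_{low}  &= \bar{y} - K_n\, s_y
\end{align}
```

With K_n = 2.628 for a 35-year sample, values above y_high or below y_low would be flagged as outliers.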
Splitting the full record into two separate periods for calibration and validation was not possible in this study, because most of the data after 1990 was missing. For viable calibration and validation, the only option was therefore to use the data from 1982 to 1989.
Table 4 shows the catchment parameters and their allowable ranges used in this study.
Parameter | Description | Units | Range |
---|---|---|---|
Umax | Describes interception storage, depression storage and storage in the uppermost layer of the soil. | mm | 5–35 |
Lmax | Describes maximum soil moisture content in the root zone per area that is available for transpiration by vegetation. | mm | 50–350 |
CQOF | Determines ratio of overland flow to infiltration. | N/A | 0–1 |
CKIF | Describes interflow amounts. | hours | 500–1,000 |
CK1,2 | Determines shapes of hydrograph peaks. | hours | 3–72 |
TOF | Describes value of moisture content in root zone above which overland flow is generated. | N/A | 0–0.99 |
TIF | Describes value of moisture content in root zone above which interflow is generated. | N/A | 0–0.99 |
TG | Describes value of moisture content in root zone above which groundwater recharge is generated. | N/A | 0–0.99 |
CKBF | Time constant for routing base flow. | hours | 500–6,000 |
The optimum parameters were identified through an iterative process as the set that gave the least root mean square error, with every parameter within its prescribed range and a water balance within 5%.
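The acceptance criteria can be written out as a small sketch. The function names and the sample discharge values are hypothetical; R2 is taken here as the squared Pearson correlation and the water balance as the ratio of simulated to observed volume, both of which are assumptions about the definitions used. The two parameter ranges shown are those given for Umax and Lmax in Table 4.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated discharge."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def r_squared(observed, simulated):
    """Squared Pearson correlation (one common definition of R2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


def water_balance(observed, simulated):
    """Ratio of total simulated to total observed discharge."""
    return sum(simulated) / sum(observed)


def acceptable(params, ranges, observed, simulated, tolerance=0.05):
    """True if all parameters are in range and water balance is within 5%."""
    in_range = all(ranges[k][0] <= v <= ranges[k][1] for k, v in params.items())
    balanced = abs(water_balance(observed, simulated) - 1.0) <= tolerance
    return in_range and balanced


# Hypothetical discharge values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 380.0]
ranges = {"Umax": (5.0, 35.0), "Lmax": (50.0, 350.0)}
ok = acceptable({"Umax": 12.7, "Lmax": 136.0}, ranges, observed, simulated)
```

Among candidate parameter sets that pass `acceptable`, the one with the smallest `rmse` would be retained.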
RESULTS AND DISCUSSION
Basin characteristics
The results showed that the basin experiences two rainfall peaks every year. The first peak comes from April to June while the second occurs from July to September. The basin receives an average annual rainfall of between 1,000 and 1,500 mm. The Nzoia River is about 330 km long from its source to its outfall into Lake Victoria. From this study, the annual discharge into the lake was estimated to be about 1,780 million m3.
The average annual maximum and minimum temperatures in Nzoia River Basin were 27 °C and 12 °C, respectively. The highest temperatures were observed in April and the minimum temperatures occurred either in July or September, depending on the location within the basin.
Model calibration and sensitivity analysis
Table 5 contains the default parameters, optimized parameters and the maximum percent change in flow obtained from the sensitivity analysis.
Parameter | Default parameters | Calibrated optimal value | % change of flow effected |
---|---|---|---|
Umax | 10 | 12.7 | 7.9 |
Lmax | 100 | 136 | 4.5 |
CQOF | 0.5 | 0.2 | 30.4 |
CKIF | 1,000 | 1,100 | 0.5 |
CK1,2 | 10 | 61.65 | 24.3 |
TOF | 0 | 0.248 | 17.8 |
TIF | 0 | 0.198 | 0.8 |
TG | 0 | 0.0495 | 6.7 |
CKBF | 2,000 | 2,825 | 9.3 |
From the sensitivity analysis, CQOF and CK1,2 effected the largest changes in streamflow while CKIF and TIF had the least effect. This suggested that surface-groundwater interaction is the key factor determining the response to rainfall input. Changing the ratio of overland flow to infiltration (CQOF) or the shape of the hydrograph (CK1,2) is bound to effect large changes in streamflow. On the other hand, interflow (TIF and CKIF) has a delayed and gradual effect on runoff, and thus the effects of changing TIF and CKIF are minimal.
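A one-at-a-time sensitivity analysis of this kind can be sketched as follows. The approach shown (perturb one parameter, hold the others at their base values, record the largest percent change in total flow) is an assumption about how the analysis was run, and `toy_model` is a hypothetical stand-in used only to make the example executable; it is not NAM.

```python
def percent_change_in_flow(model, base_params, name, new_value):
    """Percent change in total simulated flow when one parameter is changed."""
    base_flow = sum(model(base_params))
    trial_params = dict(base_params, **{name: new_value})
    trial_flow = sum(model(trial_params))
    return 100.0 * (trial_flow - base_flow) / base_flow


def rank_sensitivity(model, base_params, trial_values):
    """Largest absolute percent change per parameter, most sensitive first."""
    result = {}
    for name, values in trial_values.items():
        result[name] = max(
            abs(percent_change_in_flow(model, base_params, name, v))
            for v in values
        )
    return sorted(result.items(), key=lambda item: item[1], reverse=True)


def toy_model(params, rain=(10.0, 0.0, 25.0, 5.0)):
    """Hypothetical two-parameter runoff model: direct runoff plus baseflow."""
    return [params["runoff_coeff"] * r + params["baseflow"] for r in rain]


base = {"runoff_coeff": 0.2, "baseflow": 1.0}
ranking = rank_sensitivity(
    toy_model,
    base,
    {"runoff_coeff": [0.1, 0.3], "baseflow": [0.5, 1.5]},
)
```

The resulting ranking plays the same role as the last column of Table 5: parameters at the top are the ones whose adjustment changes streamflow the most.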
Validation of the runoff
The period 1987–1989 was used for validation. The comparison is illustrated in Figure 6. The R2 and the water balance (simulated/observed) obtained were 0.8 and 0.99, respectively. This was considered a reasonable estimate and a viable validation of the runoff model.
Runoff estimation – calibrated vs non-calibrated satellite data
Calibrated satellite weather data provided a better estimate of the observed discharge when used for runoff estimation (Table 6). This suggested that use of calibrated GEOS-5 satellite data provided better estimates compared to weather station data.
Rainfall data set | R-squared | Water balance |
---|---|---|
Station data | 0.91 | 0.98 |
Non-calibrated GEOS-5 satellite data | 0.61 | 0.60 |
Calibrated GEOS-5 satellite data | 0.96 | 0.99 |
From Figure 7, the unprocessed satellite data greatly overestimated the discharge. This is especially true for the peaks.
Runoff estimation – calibrated satellite vs calibrated station data
In this comparison, calibrated station data refers to the runoff generated by the runoff model when the inputs are the raw station temperature and rainfall data, while calibrated satellite data refers to the runoff generated from the improved (calibrated) satellite rainfall and temperature data. Because the station data are temporally scarce, runoff generated from station data was expected to perform poorly compared with runoff generated from calibrated satellite data.
As Figure 8 shows, calibrated satellite data gave better estimates of runoff than station data. However, the R2 and water balance values are considered acceptable for both.
CONCLUSION
NAM was used to perform rainfall-runoff analysis for Nzoia River Basin. The optimal runoff model parameters were 12.7 for Umax, 136 for Lmax, 0.2 for CQOF, 1,100 for CKIF, 61.65 for CK1,2, 0.248 for TOF, 0.198 for TIF, 0.0495 for TG and 2,825 for CKBF. These gave a water balance of about 0.98 and an RMSE of about 460, with R2 of 0.9 and 0.8 for calibration and validation, respectively.
Calibrated GEOS-5 data performed reasonably well in runoff estimation for Nzoia basin. While non-calibrated satellite data performed poorly, with an R2 of 0.6 in relation to observed discharge, calibrated data performed considerably better, with an R2 of 0.96 for daily values. Satellite-derived runoff was found to overestimate the discharge, especially the peak discharges. This study showed that satellite data such as NASA's GEOS-5, when adequately calibrated, provide suitable estimates of discharge in sparsely gauged basins.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.