Abstract
Combining the merits of different data sources through merging algorithms is a promising way to improve the accuracy of satellite rainfall estimates (SREs). The main objective of this study is to assess the accuracy and importance of a fused multistage approach to bias correction. Accordingly, two versions of resampled and spatially bias-corrected Climate Hazards Group Infrared Precipitation (CHIRP) estimates were merged with ground measurements using a conditional merging procedure. The seven performance measures applied to the corrected and merged CHIRP SREs show that the Percent of Detection (POD) and Percent Volume Error (PVE) improved. Depending on the combination of stations used for validation, PVE improved by up to 70% and 50% at some stations for wet and dry periods, respectively. Moreover, the bias-corrected and conditionally merged CHIRP SREs outperformed the CHIRP with station data (CHIRPS) estimates over the sparsely gauged western part of the watershed. However, the devised method was limited in its treatment of dry-day events during bias correction, which in turn affected the performance of the bias correction of the CHIRPS product. Future research should concentrate on such fusing methods to understand the benefits of the various approaches and produce more precise rainfall records.
HIGHLIGHTS
The research provides a fused multi-staged approach for reducing errors in CHIRP and CHIRPS satellite rainfall estimates.
Application of parametric QM for spatial bias correction followed by conditional merging improves the quality of CHIRP SREs.
Bias-corrected and conditionally merged CHIRP estimate outperforms the estimates by CHIRPS.
Incorporating additional ground station records improves the estimates of SREs.
INTRODUCTION
There are a large number of satellite precipitation data sources (Ziarh et al. 2021), which are used in the assessment of hydrological extremes and processes modeling, including climate change impact studies (Ebrahimi et al. 2017). Nevertheless, each source has strengths and gaps in representing the records collected by ground stations (Wang & Zhao 2022). Satellite rainfall products are acclaimed for their wide coverage and cost minimization related to large area coverage compared to in situ measurements (Suliman et al. 2020; Wang & Zhao 2022). On the other hand, the spatial coverage, accessibility, and density of ground stations are limiting factors when considered for use (Katiraie-Boroujerdy et al. 2020).
Systematic errors and random errors are common in satellite rainfall products (Goshime 2020). Error sources are mostly related to imperfection of retrieval algorithm, data source, and postprocessing procedures (Dubovik et al. 2021; Zhang et al. 2021). Compared to some meteorological variables like temperature, which has a steadier geographical and temporal pattern, bias correction of satellite rainfall data is thought to be the most difficult (Soo et al. 2020). Almost all available methods focus on correcting systematic errors while intrinsically trying to correct random errors as well (Dinku et al. 2011). One can argue that it is mandatory to correct patterns (i.e. both spatial and temporal) as much as it is important to correct magnitude (Iqbal et al. 2022). From available methods distribution mapping (DM) tends to address bias by correlating patterns of different rainfall magnitudes (i.e. by relating cumulative distribution function (CDF) of control data to CDF of satellite rainfall products using transfer function) (Valdés-Pineda et al. 2016; Katiraie-Boroujerdy et al. 2020; Soo et al. 2020).
The imperfection of conventional quantile mapping (QM) is that it was proven to perform poorly on a daily time scale. Other modified versions like censored-shifted Gamma DM have shown improvement over the standard QM technique, but still lack the ability to capture extreme events (low and/or high) (Lafon et al. 2013; Ma et al. 2018; Ma et al. 2019). Significant results were also gained by fitting fused distribution functions with different capabilities (Ma et al. 2019), for specific local cases. One alternative that can be applied for DM is the option of non-parametric QM, where CDFs are created only for available magnitudes. However, such methods will be impossible to apply for magnitudes greater than training/calibration data (Lehner et al. 2020). Additionally, transferring information from grid points with ground observation to unrepresented grid points is difficult using such methods.
Two well-known approaches to compare satellite products with rain gauge readings are the point-to-grid and grid-to-grid approaches (Ebrahimi et al. 2017). Ebrahimi et al. (2017) assessed the accuracy of the Tropical Rainfall Measuring Mission (TRMM) 3B42 v7 rainfall product by first implementing the nearest neighbor (NN) and weighted bilinear interpolation (WBL) techniques to create co-located pairs of observations. The study concluded that large errors occur due to spatial mismatch and that WBL performed better in reducing such mismatch. A possible future development area to improve the accuracy of satellite rainfall estimates is accessing good attributes from different sources of data through merging techniques (Kimani et al. 2018). Xie & Xiong (2011) discussed how a two-stage correction, i.e. first applying a statistical bias correction method and then integrating satellite rainfall outputs with ground observations, can significantly enhance performance. In contrast, some scholars preferred applying merging techniques alone (Jongjin et al. 2016; Gebremedhin et al. 2021) and still managed to decrease errors compared to the raw product.
Accurate depth of point rainfall is collected with a dense network of rain gauges, where continuous calibration and follow up are provided (Xie & Xiong 2011). Utilizing the complementary strengths of each source through the use of a merging process allows for the reduction of bias in satellite rainfall products (Jongjin et al. 2016). Merging techniques are valuable in reducing the biases from multiple factors (factors such as rainfall formation type, topography, and others) (Dinku et al. 2011; Xie & Xiong 2011). Based on the assumption that rain gauge data over China are bias-free and Climate Prediction Center morphing method (CMORPH) satellite rainfall products are capable of simulating spatial patterns of rainfall, Xie & Xiong (2011) developed a two-stage conceptual model to produce an enhanced rainfall map. The process involved bias correction (using statistical method) followed by a merging technique (i.e. optimal interpolation technique) (Xie & Xiong 2011). The result indicates that the conceptual method has improved the pattern agreement with independent observation and potentially reduced bias in the satellite product (Xie & Xiong 2011).
Another work by Jongjin et al. (2016) showed that the merging techniques of Conditional Merging (CM), Geographical Differential Analysis, and Geographical Ratio Analysis (GRA) are capable of improving the accuracy of satellite-based rainfall products without applying bias correction techniques. The research finding revealed that for a sparse rain gauge network, the CM technique performed better compared to the other two methods (Jongjin et al. 2016). Gebremedhin et al. (2021) illustrated a promising result gained by implementing the geographically weighted regression method of merging for meso-scale catchments of the Upper Tekeze basin. The method relies on accessing freely available topographical information (the explanatory variable) and incorporating it into the merging procedure (Gebremedhin et al. 2021). The main assumption of the method is that the relationship between the dependent and independent variables is non-stationary in space, so that the regression coefficients vary spatially (Hu et al. 2019).
Working with multiple data sources with higher spatial resolution and longer records is a challenge to conduct data merging as well as perform bias correction. Such problems can be alleviated with the use of simple algorithms. Being an interpreted language, Python enables the easy development of algorithms with multiple open-source packages (Grus 2019). One such work was developed by Gupta et al. (2019), for climate data bias correction. Another work by Guenzi et al. (2017) developed an open source algorithm for the conventional CM technique, to ease the merging process of radar observation with gauge records. However, there are few available open-source algorithms that can support multiple tasks.
Multiple studies (Gebere et al. 2015; Gella 2019; Goshime 2020; Adane et al. 2021) have been done to assess the accuracy of satellite products in different parts of Ethiopia. These studies have demonstrated that raw satellite rainfall outputs overestimate or underestimate rainfall events compared to rain gauge observations. Whereas, other researchers (Valdés-Pineda et al. 2016; Omondi 2017; Gella 2019; Adane et al. 2021) investigated the accuracy of bias correction approaches employing two or more bias correction methods. The difficulty in correcting satellite-based precipitation data with ground data, however, is still a significant issue in the deployment of such data for various uses (Ziarh et al. 2021). Additionally, given scanty records in ground observation, integrating several sources could be useful in addressing propagated errors.
Therefore, the main objective of this study is to assess the accuracy and importance of the fused multistage approach of bias correction. The specific objectives set to address the main objective are to: (1) assess the significance of the conventionally applied NN and bilinear (BL) resampling techniques, (2) evaluate the simple merging method of CM, (3) evaluate the DM-based bias correction method followed by the CM technique, and (4) develop a simple algorithm for spatial bias correction and CM using the Python programming language.
METHODOLOGY
Study area
Dataset
The ground observed rainfall data were obtained from the National Ethiopian Meteorological Agency (EMA); whereas the Climate Hazards Group Infrared Precipitation (CHIRP) satellite rainfall product (SRP) and its modified version with ground station data, Climate Hazards Group Infrared Precipitation with station dataset version 2.0 (CHIRPS 2.0), for the time period from January 2008 to August 2018 on a daily time scale, were obtained from https://data.chc.ucsb.edu. However, it should be noted that the ground observation is available for different lengths of time since the establishment of each station; and that the specified length of period was selected to assure maximum overlapping length of observation from available stations with minimum missing data (on average up to 20%). In the study watershed, the formation of snow is less likely to occur, hence the satellite precipitation product is referred to as SRP.
Freely available CHIRPS 2.0 and CHIRP SRP, with a resolution of 0.05° for a time scale of monthly, decadal, pentadal, and daily (Funk et al. 2015), were downloaded from https://data.chc.ucsb.edu at a daily time scale. CHIRPS is a quasi-global product dependent on the algorithm which fuses estimates from infrared Cold Cloud Duration (CCD) observations and ground observation (Funk et al. 2015).
According to the station location information gathered from the EMA only six stations are found within the watershed. To complement missing data and represent the western part of the watershed as opposed to densely populated station distribution in the eastern, an additional six stations surrounding the watershed were considered. Further analysis of missing data rate and seasonal contribution (Table 1) was conducted to indicate the status of collected data.
Preliminary assessment of missing data rainfall record and rainfall contribution, in dry (November–March) and wet (April–October) periods
Station | Station code | Elevation (m) | Rainfall percentage, wet period (%) | Rainfall percentage, dry period (%) | Missing data, wet period (%) | Missing data, dry period (%) |
---|---|---|---|---|---|---|
Kuyera | KU | 1,936 | 85.2 | 14.8 | 10.5 | 11.1 |
Belela | BE | 1,866 | 88.5 | 11.5 | 7.9 | 16.3 |
Leku | LE | 1,879 | 80.8 | 19.2 | 6.5 | 7.1 |
Kokossa | KOK | 2,400 | 77.3 | 22.7 | 32.8 | 36.5 |
Shamena_Kedida | SHK | 2,072 | 82.1 | 17.9 | 6.7 | 11.2 |
Wondo_genet | WO | 1,880 | 81.6 | 18.4 | 7.8 | 1.7 |
Tula | TU | 1,873 | 78.1 | 21.9 | 4.0 | 3.5 |
Hawassa | HA | 1,750 | 83.7 | 16.3 | 0.1 | 3.1 |
Hawassa_Tabor | HW | 1,750 | 82.8 | 17.2 | 5.2 | 1.9 |
Wateraressa | WT | 2,631 | 83.3 | 16.7 | 16.1 | 19.8 |
Shashemene | SHA | 1,927 | 80.7 | 19.3 | 17.0 | 23.7 |
Kofele | KOF | 2,620 | 78.1 | 21.9 | 5.4 | 6.7 |
The observed data for most stations have few missing values (Table 1). Analysis of the seasonal rainfall patterns was done separately for the dry period (November–March), when only 18% of the rainfall typically falls, and the wet period (April–October), when 82% of the rainfall typically falls. The observations from the Shashemene, Wateraressa, and Kokossa stations have significant missing data, especially for dry periods. This underlines the importance of an efficient method to fill the missing observations.
General method
Ground measurements of rainfall are usually susceptible to errors, which creates additional burden on the modeling efforts. Furthermore, the ground observations are usually set in accessible areas (i.e. close to urban areas, and road side), which does not represent the distribution in remote and inaccessible areas (Dinku et al. 2014; Dile et al. 2018). This is where the remote sensing data will be important. Unfortunately, these data types also have errors. Therefore, we need separate data for bias correction and validation of satellite rainfall products (Gebremedhin et al. 2021). The ground observations are thoroughly assessed for any temporal or spatial inconsistency.
The satellite data were bias-corrected (BC) and spatially disaggregated. The available ground rainfall records were then used for validation by dropping some stations at a time and repeating each process. All stations in the watershed, including those surrounding it, were used. Bias correction was performed using a parametric empirical QM procedure, while downscaling to 1 km resolution was performed by applying NN and bilinear (BL) interpolation methods. Different methods have been introduced to merge satellite rainfall products with in situ observations. These include geographically weighted regression (GWR), CM, geographic differential analysis (GDA), and GRA.
Flow chart of the general method; method implemented in each stage is stated in brackets.
One of the biggest challenges of working with data that are large both spatially and temporally is the availability of comprehensive tools. There are multiple tools, such as ArcGIS, statistical software, and others, to execute one or more of the processes mentioned below (Figure 3). However, it was a challenge to find one tool that can execute all the tasks with minimum computation time and storage requirement. Hence, for this study, simple algorithms built on existing Python modules were developed. The open access libraries applied in these processes include: PyKrige, a kriging toolkit for interpolation tasks (Murphy 2014); pandas (Python Data Analysis Library), for data analysis and manipulation including statistical analysis (McKinney 2011); Geospatial Data Abstraction Library (GDAL/OGR), a translator library for raster and vector geospatial data formats (GDAL/OGR Contributors 2020); NumPy (Numerical Python), for accessing, manipulating, and operating on data in vectors and arrays (Harris et al. 2020); Rasterio, for reading and writing raster files (Gillies 2019); xlrd, a library for reading data and formatting information from Excel files; Shapely, for manipulation and analysis of planar geometric objects (Gillies 2013); and SciPy (Scientific Python), a library for optimization, mathematical calculations, statistics, and others (Jones et al. 2001).
Bias correction and CM algorithm
The developed algorithm is composed of four steps. The main stages of the program include preprocessing of input data, at station bias correction and fitting distribution parameters, spatial bias correction, and CM. The main inputs to the program are raw daily satellite rainfall estimates (SREs) in ‘.tif’ file format and labeled with the date of the observation, and the daily gauge record prepared in ‘.csv’ file format; furthermore the length of the gauge record should be equal to the length of SRP. Additionally, a simple Graphical User Interface (GUI) is prepared using the PyQt widget toolkit (Riverbank Computing 2016) and includes four separate tabs for each stage.
Input data preprocess: at this stage the program helps to prepare the SRP. This includes projection, creating a boundary shape file, clipping, and resampling. Since most of the procedures require interpolation, the spatial reference system should be projected to a metric system, such as Universal Transverse Mercator (UTM). The clipping and boundary file creation are required to reduce the number of grid points to be processed.
Bias correction and distribution fitting: the pre-processed raster SRP files and ground observation data in ‘.csv’ file format are used as input for this stage, where a distribution is fitted to the monthly non-zero events at collocated rain gauge locations. Two widely used distributions, Gamma and Exponential, are available, and either one or both can be selected. The gauge record is fitted to the selected distribution for each month separately, excluding missing and zero events. At this stage, the program can run the bias correction (Equations (1)–(4)) either with performance measures (Equations (6)–(13)) at each rain gauge location or without them. The output at this stage includes the fitted distribution parameters (i.e. for rain gauge records) on a monthly basis, performance measure results, and BC rainfall data at all gauge locations.
All Grid Bias Correction: Using the boundary extent, the fitted distribution parameters are interpolated by applying the ordinary kriging (OK) technique for all months separately. The mapped/gridded parameters are then used to conduct the bias correction at all grid points for the projected raster SREs. A K-fold validation process is also optional at this stage. Hence, the working extent, resolution, number of validation stations at each iteration, and projection method need to be specified in this stage. Alternatively, one can choose to perform only the spatial bias correction without the performance measures. The outputs include the corrected raster file and the performance indicators of each iteration in a ‘.csv’ file.
CM: The corrected raster file and rain gauge records are inputs in this stage. With the specified procedure (i.e. in the following section) the CM is conducted using the inputs. As a cross validation mechanism, the program allows us to randomly drop a specified number of stations and conduct the merging process, while performing validation for the hidden stations. The process can be repeated for a specified number of iterations. For practical application, the program also allows the CM to be conducted without performance check. Based on the choice of simulation the output includes merged raster files and performance measures for each iteration.
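The random station hold-out used for cross-validation in the last two stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's actual code: the function name `holdout_splits` and its parameters are hypothetical, and the station codes are taken from Table 1.

```python
import random


def holdout_splits(stations, n_validation, n_iterations, seed=None):
    """Yield (training, validation) station lists for repeated hold-out.

    Hypothetical helper: in each iteration, n_validation stations are
    drawn at random and hidden from the merging or bias correction step,
    then used only to compute performance measures.
    """
    rng = random.Random(seed)  # a fixed seed makes the splits repeatable
    for _ in range(n_iterations):
        validation = rng.sample(stations, n_validation)
        training = [s for s in stations if s not in validation]
        yield training, validation


# Station codes as listed in Table 1.
codes = ["KU", "BE", "LE", "KOK", "SHK", "WO",
         "TU", "HA", "HW", "WT", "SHA", "KOF"]

for training, validation in holdout_splits(codes, n_validation=3,
                                           n_iterations=2, seed=42):
    print("hidden:", validation, "used for merging:", len(training))
```

Drawing the hidden stations at random in each iteration means that a given station can be validated more than once, which is one plausible reading of why results depend on the combination of stations used for validation.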
Resampling
Recent studies recommended the use of information gathered from multiple sensors and/or satellites to increase the chance of retrieving the best representation from different sources (Dinku et al. 2011; Dubovik et al. 2021). However, incompatible scale and resolution of different satellite observations is a major challenge in mostly used retrieval algorithms (Omondi 2017). Based on the method applied, usually resampled raster data suffer from the inability of maintaining the originally stored information (Omondi 2017). Hence, one should expect that different methods might have different error propagation patterns.
One of the widely utilized methods of resampling is the NN method. This method is advantageous for its less computational time requirement, ease of application, and ability to maintain original stored information (Brandsma & Können 2006). However, Omondi (2017) indicated that this method is the least accurate for it distorts results and sometimes omits or duplicates multiple cell values. With their ability to smoothen output raster and for up-sampling, BL resampling methods are used for averaging values in the nearest four grids (Baboo & Devi 2010).
Rain gauge observations are usually perceived as a true value of rainfall and used for correcting satellite rainfall products. On the contrary, one can argue that most ground observations are full of errors, especially using manual rain gauges in developing countries (Beyene et al. 2018; Dile et al. 2018). Aside from these modest assumptions spatial mismatch (i.e. between rain gauge point observation and pixel value of SRP) is another eminent source of error, especially for low spatial resolution (large pixel size) products (Dinku et al. 2011; Gebere et al. 2015). As a solution to this challenge, many studies (Jongjin et al. 2016; Gebremedhin et al. 2021) have depended on interpolating ground observations to create matching gridded rainfall maps. Readers are recommended to review Hu et al. (2019) and Li & Heap (2008), for an elaborate discussion on available methods of interpolation including the pros and cons of the methods. Conducting the merging process requires creating a similar resolution between SRP and interpolated ground station measurements. Hence, both NN and BL methods of resampling were tested for performance and use in this study.
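The difference between the two resampling methods can be illustrated with a small pure-Python sketch. It assumes an integer refinement factor and an aligned grid, which is a simplification (resampling a 0.05° product to 1 km does not give an integer factor), and the function names are hypothetical rather than those of the study's program.

```python
import math


def resample_nn(grid, factor):
    """Nearest-neighbour upsampling: each fine cell copies its parent
    coarse cell, so no new rainfall values are introduced."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[i // factor][j // factor] for j in range(cols * factor)]
            for i in range(rows * factor)]


def _bracket(index, factor, size):
    """Coarse-grid neighbours and weight for one fine-cell centre."""
    pos = min(max((index + 0.5) / factor - 0.5, 0.0), size - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, size - 1)
    return lo, hi, pos - lo


def resample_bilinear(grid, factor):
    """Bilinear upsampling: each fine cell is a distance-weighted average
    of the four nearest coarse cell centres (edges are clamped)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for i in range(rows * factor):
        y0, y1, wy = _bracket(i, factor, rows)
        row = []
        for j in range(cols * factor):
            x0, x1, wx = _bracket(j, factor, cols)
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


coarse = [[0.0, 4.0],
          [8.0, 12.0]]
fine_nn = resample_nn(coarse, 2)        # contains only 0, 4, 8, 12
fine_bl = resample_bilinear(coarse, 2)  # contains intermediate values
```

In this toy grid a dry coarse cell (0.0 mm) stays dry under NN, whereas BL assigns small positive values to some of its fine cells, which shows how BL can create rainfall where the raw product had none.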
Bias correction



DM techniques were originally developed to bias correct regional and global climate models (McGinnis et al. 2015; Switanek et al. 2017). One of these DM techniques, QM, has been shown to perform well in correcting both regional and global climate circulation models (Teutschbein & Seibert 2012; Heo et al. 2019). Additionally, QM techniques are chosen to bias correct satellite rainfall products for their ability to adjust the standard deviation of daily satellite rainfall products while preserving other moments as well (Katiraie-Boroujerdy et al. 2020).








For a given station and month, the ground observation (without missing data) and the resampled SRP at the coinciding grid point were each fitted to an exponential distribution using the Maximum Likelihood method. The distribution was fitted to non-zero events only, and missing data were excluded. Furthermore, standard χ² and Kolmogorov–Smirnov tests were conducted to confirm a good fit of the distribution to each dataset. Once the distribution parameters at collocated grid points were determined, the parameters were transferred to grid points with no ground observation through interpolation (OK). Hence, the bias correction was conducted for all grid points with and without ground control observation. This would not have been possible with the non-parametric QM technique. Additionally, parametric QM methods allow us to correct maximum events (not recorded by ground stations) that may require extrapolation while transferring distribution parameters.
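The at-station step described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the study's implementation: the function names, the zero wet-day threshold, and the choice to leave dry days unchanged are assumptions. It relies on the fact that the maximum likelihood estimate of the scale of an exponential distribution (with location fixed at zero) is the sample mean.

```python
import math


def fit_exponential_scale(values, wet_threshold=0.0):
    """Maximum likelihood scale of an exponential distribution fitted to
    wet-day values only; missing values (None/NaN) and dry days are
    skipped. For an exponential with zero location the MLE is the mean."""
    wet = [v for v in values
           if v is not None and not math.isnan(v) and v > wet_threshold]
    if not wet:
        raise ValueError("no wet-day values to fit")
    return sum(wet) / len(wet)


def quantile_map(sat_value, sat_scale, obs_scale, wet_threshold=0.0):
    """Map one satellite value through F_obs^-1(F_sat(x)).

    Assumption: dry satellite days are returned as zero, i.e. the sketch
    does not correct dry-day frequency."""
    if sat_value <= wet_threshold:
        return 0.0
    p = -math.expm1(-sat_value / sat_scale)  # exponential CDF of the SRE
    p = min(p, 1.0 - 1e-12)                  # guard against log(0)
    return -obs_scale * math.log1p(-p)       # inverse CDF of the gauge fit


# Hypothetical records for one station and one month (mm/day).
gauge = [2.0, 4.0, 0.0, None, 6.0]
satellite = [1.0, 0.0, 3.0, 2.0]

obs_scale = fit_exponential_scale(gauge)
sat_scale = fit_exponential_scale(satellite)
corrected = [quantile_map(x, sat_scale, obs_scale) for x in satellite]
```

When both datasets are described by an exponential distribution, the mapping reduces algebraically to multiplying each wet-day value by the ratio `obs_scale / sat_scale`. A Gamma fit does not reduce this way and would need a numerical inverse CDF.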
Conditional merging
Implementing a merging technique using two sources enables the capturing of complementary merits of each source, which could reduce bias in satellite rainfall products (Jongjin et al. 2016). Based on OK interpolation, CM is a conventionally used merging technique, which is used to merge radar data and meteorological data (Guenzi et al. 2017). OK is a regression technique that allows for value interpolation while minimizing mean squared error (Guenzi et al. 2017). While performing CM, it is anticipated that a theoretical semivariogram based on datasets of regularly distributed precipitation is fitted to the experimental semivariogram for the operation of OK (Jongjin et al. 2016).
The experimental semivariogram was calculated, and it was then fitted to the theoretical semivariogram using weighted least squares to determine the appropriate variogram parameters like nugget, sill, and range (Jongjin et al. 2016). In this study, CM was applied following the procedure applied by Jongjin et al. (2016). The initial step was the interpolation of the ground station record to a 1 km resolution field using OK; the second step requires extracting the collocated (i.e. with ground observation stations) resampled (1 km resolution) SRP grid cell observations; thirdly, the extracted SRP grid cells will be interpolated to 1 km resolution field using OK; in the fourth step error is calculated by deducting rainfall map created in the third step from resampled SRP observation; lastly this error field is added to the rainfall map created in the first step.
The merging procedure was conducted in two stages. The first stage was before applying the bias correction technique to the original SRP dataset. Next, it was applied after applying the bias correction described in the previous stage. Both products were tested for performance, in which the necessity of the procedure devised in this study is signified.
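The five CM steps described in this section can be sketched as below. To keep the example self-contained, inverse distance weighting is swapped in for OK; this is a deliberate simplification and not the interpolator used in the study, and all function and parameter names are hypothetical. A kriging routine can be passed through the `interpolate` argument instead.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse distance weighting, used here ONLY as a stand-in for OK.
    It is exact at the data points, as OK is when no nugget is used."""
    num = den = 0.0
    for (px, py), v in zip(points, values):
        d = math.hypot(target[0] - px, target[1] - py)
        if d == 0.0:
            return v
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


def conditional_merge(sat_grid, gauge_cells, gauge_values, interpolate=idw):
    """Conditional merging on a regular grid.

    sat_grid     : 2D list of resampled satellite rainfall
    gauge_cells  : list of (row, col) cells that contain a gauge
    gauge_values : gauge rainfall at those cells
    """
    rows, cols = len(sat_grid), len(sat_grid[0])
    # Step 2: satellite values at the cells collocated with gauges.
    sat_at_gauges = [sat_grid[r][c] for r, c in gauge_cells]
    merged = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Step 1: gauge field interpolated to this cell.
            gauge_field = interpolate(gauge_cells, gauge_values, (r, c))
            # Step 3: satellite-at-gauge values interpolated to this cell.
            sat_field = interpolate(gauge_cells, sat_at_gauges, (r, c))
            # Step 4: error = satellite minus its interpolated counterpart.
            error = sat_grid[r][c] - sat_field
            # Step 5: add the error field to the gauge field.
            row.append(gauge_field + error)
        merged.append(row)
    return merged


satellite = [[1.0, 2.0, 3.0],
             [4.0, 5.0, 6.0],
             [7.0, 8.0, 9.0]]
merged = conditional_merge(satellite, [(0, 0), (2, 2)], [2.0, 7.0])
```

With an exact interpolator the merged field reproduces the gauge value at every gauge cell, while away from the gauges it keeps the spatial structure of the satellite field.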
Performance evaluation methods
The validation was conducted at several stages, which makes it possible to show the importance of each process (Figure 3). The objective function measures the goodness of fit between the computed and observed values at a selected element. The choice of the objective function depends upon the need. Both bias decomposition and statistical methods were applied for two sets of data periods (i.e. wet period and dry period). The following statistical measures were used to quantify the accuracy of the bias correction outputs: Percent Volume Error (PVE), Coefficient of Determination (R²), and Root Mean Squared Error (RMSE), which are widely applied in hydrologic modeling.





These indicators depend on the number of events successfully detected by the SREs (H), events measured by rain gauges but missed by the SREs (M), and events detected by the SREs but not measured by a rain gauge (FA). The POD is the ratio of observed rainfall events that are correctly detected by the SREs (H), whereas the FAR is the ratio of rainfall events wrongly detected (FA) by the SREs. The POD and FAR indices range from 0 to 1, with ideal scores of 1 and 0, respectively. The FBI is the ratio of the total number of rainfall events detected by the SREs to the total number of rainfall events observed, with a value range of 0 to ∞. FBI values greater than 1 suggest overestimation by the SREs, whereas values less than 1 imply underestimation, with 1 being the perfect value.
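The categorical and volumetric measures can be computed as in the sketch below. The POD, FAR, and FBI expressions follow their standard contingency-table definitions; the PVE sign convention (negative when the SRE overestimates) and the zero rain/no-rain threshold are assumptions chosen to match how the results are read in this paper, and the function names are hypothetical.

```python
import math


def contingency_counts(obs, sat, threshold=0.0):
    """Count hits (H), misses (M) and false alarms (FA) for paired daily
    gauge (obs) and satellite (sat) series without missing values."""
    hits = misses = false_alarms = 0
    for o, s in zip(obs, sat):
        if o > threshold and s > threshold:
            hits += 1
        elif o > threshold:
            misses += 1
        elif s > threshold:
            false_alarms += 1
    return hits, misses, false_alarms


def detection_scores(obs, sat, threshold=0.0):
    """POD, FAR and FBI; assumes at least one observed and one detected
    rain event, otherwise the ratios are undefined."""
    h, m, fa = contingency_counts(obs, sat, threshold)
    return {
        "POD": h / (h + m),          # share of observed events detected
        "FAR": fa / (h + fa),        # share of detections that are false
        "FBI": (h + fa) / (h + m),   # detected events / observed events
    }


def pve(obs, sat):
    """Percent volume error; negative means the SRE overestimates
    (assumed sign convention)."""
    return 100.0 * (sum(obs) - sum(sat)) / sum(obs)


def rmse(obs, sat):
    """Root mean squared error of the paired series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sat)) / len(obs))


gauge = [0.0, 5.0, 3.0, 0.0, 2.0]
satellite = [1.0, 4.0, 0.0, 0.0, 6.0]
scores = detection_scores(gauge, satellite)
volume_error = pve(gauge, satellite)
```

In this toy series the satellite detects two of the three gauge events, misses one, and raises one false alarm, so the FBI equals 1 even though individual days disagree. This is why the FBI is read together with the POD and FAR.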
RESULTS AND DISCUSSION
The raw CHIRP and CHIRPS products were analyzed, and the results showed that the prior version (CHIRP) had a higher percentage of wet spills or erroneous detections. As a result, the CHIRP version's detection percentage is higher. CHIRP rainfall estimates have a FAR greater than 50%.
Comparison of raw CHIRP and CHIRPS dataset at the Hawassa gauging station.
Resampling
The results from the applied spatial disaggregation methods (BL and NN) were compared to the corresponding grid points with ground observations. For the NN resampling/downscaling method in the dry period (Table 2), where the raw SREs are preserved, the percent of detection of CHIRPS (V2) falls below 30% for most locations. Taken together with a FAR as high as 69% (Supplementary material, Table S1), this indicates that the performance is poor. Similar conclusions were made by Goshime (2020), where the false detections are more than 50%. However, the descriptive statistics show very good agreement at most stations, especially at the Leku, Hawassa, and Hawassa_Tabor stations.
Performance evaluation of NN resampling (i.e. to 1 km resolution) techniques for dry and wet periods, for CHIRP (V0) and CHIRPS (V2)
Station | Wet period PVE (%), V0 | Wet period PVE (%), V2 | Wet period POD, V0 | Wet period POD, V2 | Dry period PVE (%), V0 | Dry period PVE (%), V2 | Dry period POD, V0 | Dry period POD, V2 |
---|---|---|---|---|---|---|---|---|
BE | −21.1 | −19.7 | 0.98 | 0.45 | −89.1 | −60.3 | 0.77 | 0.29 |
HA | −3.38 | 1.81 | 0.98 | 0.41 | −16.8 | 9.49 | 0.86 | 0.3 |
HT | −15.3 | −9.8 | 0.98 | 0.44 | −28.5 | −5.9 | 0.85 | 0.31 |
KOF | −7.8 | −5.3 | 0.99 | 0.38 | −0.26 | 14.3 | 0.9 | 0.26 |
KOK | 35.6 | 38.5 | 0.99 | 0.33 | 50.7 | 57.3 | 0.83 | 0.21 |
KU | −21 | −19.3 | 0.93 | 0.36 | −56.2 | −18.7 | 0.89 | 0.28 |
LE | −25.9 | −24.2 | 0.98 | 0.46 | −13.2 | 3.02 | 0.83 | 0.35 |
SHK | −42.7 | −37.1 | 0.98 | 0.4 | −50.9 | −17.9 | 0.81 | 0.29 |
SHA | −97.2 | −87.2 | 0.98 | 0.37 | −95 | −46.2 | 0.83 | 0.25 |
TU | −20.3 | −27.7 | 0.97 | 0.44 | 3.12 | 11.2 | 0.82 | 0.29 |
WT | −16.1 | −13.4 | 0.97 | 0.38 | −39.9 | −21.2 | 0.86 | 0.23 |
WO | 1 | 4.9 | 0.98 | 0.41 | −4.23 | 17.7 | 0.86 | 0.26 |
On the other hand, CHIRP (V0) performed very well with respect to RMSE (Supplementary material, Table S1) and POD (Table 2). Nevertheless, as illustrated in the above section, this is due to the higher rate of wet spill in V0 SREs compared to the V2 SREs. Very high FAR results are recorded in V0 in comparison to V2, which is due to the fact that V2 is sourced from both ground observation and SRP (Funk et al. 2015). A negative percentage error in volume indicates overestimation, while near-zero results imply very good agreement. The overestimation in volume reaches as high as 89% (at Belela station) using the NN resampling technique, whereas the BL method reduced the overestimation by 2% (Table 3).
Performance evaluation of BL resampling (i.e. to 1 km resolution) techniques for dry and wet periods, for CHIRP (V0) and CHIRPS (V2)
Station | Wet period PVE (%), V0 | Wet period PVE (%), V2 | Wet period POD, V0 | Wet period POD, V2 | Dry period PVE (%), V0 | Dry period PVE (%), V2 | Dry period POD, V0 | Dry period POD, V2 |
---|---|---|---|---|---|---|---|---|
BE | −18.5 | −17.4 | 0.98 | 0.51 | −87.3 | −58.8 | 0.77 | 0.31 |
HA | −5.94 | −0.19 | 0.98 | 0.51 | −19.1 | 6.54 | 0.86 | 0.4 |
HT | −12.7 | −8.38 | 0.98 | 0.51 | −25.4 | −4.2 | 0.84 | 0.37 |
KOF | −8.01 | −5.29 | 0.99 | 0.46 | 0.49 | 15.01 | 0.9 | 0.35 |
KOK | 35.6 | 37.48 | 0.99 | 0.41 | 50.76 | 57.3 | 0.83 | 0.26 |
KU | −20.4 | −17 | 0.93 | 0.46 | −56.1 | −10.9 | 0.89 | 0.33 |
LE | −25.2 | −23.8 | 0.98 | 0.51 | −13.4 | 2.88 | 0.83 | 0.38 |
SHK | −41.6 | −35.9 | 0.98 | 0.47 | −50.6 | −18.8 | 0.81 | 0.38 |
SHA | −95.1 | −91.7 | 0.99 | 0.44 | −95.7 | −54.5 | 0.88 | 0.31 |
TU | −24 | −23.7 | 0.97 | 0.49 | 0.54 | 14.08 | 0.82 | 0.34 |
WT | −16.8 | −13.8 | 0.97 | 0.47 | −40 | −21.5 | 0.9 | 0.3 |
WO | 1.56 | 5.1 | 0.99 | 0.5 | −2.72 | 18.54 | 0.86 | 0.31 |
Comparing the two resampling methods, the BL SREs showed a slight improvement in performance (i.e. descriptive statistics). However, in the bias decomposition results the NN method performed slightly better. This is consistent with the core principle of the resampling techniques: the BL technique assigns new, previously undetected values derived from neighboring grids (Omondi 2017), whereas the NN method does not introduce new observations at the focus grids. For the study period (i.e. 11 years), the total depth of missed and correctly estimated SRP at the collocated ground station is determined by MB and HB (Supplementary material, Tables S3 and S4). A higher positive HB value (wet period at Shashemene, Kofele, and Leku stations) indicates the total overestimated rainfall intensity for events correctly detected by CHIRPS, while a negative value (wet period at Belela, Kokossa, and Wondo_genet stations) implies intensities underestimated by the CHIRP product. The higher negative values of MB in the V2 estimates compared to the V0 SRP indicate that the CHIRP product has fewer missed events than the CHIRPS product.
At this stage of the procedure, it is sufficient to check the percent volume error (PVE) and RMSE to assess the importance of resampling. Overall, the assessment using descriptive statistics (Table 3) illustrates that BL interpolation is the more efficient resampling method. Nevertheless, the PVE results show that at some stations (i.e. bolded in Table 2) the NN method outperformed the BL method. The assessment of bias decomposition (Supplementary material, Tables S3 and S4) illustrates that BL interpolation is the more efficient method of spatial disaggregation. In conclusion, performance before and after resampling did not change markedly. This is attributed to the fact that the CHIRPS algorithm also utilizes the information recorded at ground stations. Nevertheless, Gebremedhin et al. (2021) reported that CHIRPS products can be corrected using ground station records not used in the original product. Refer to Supplementary Information for details on the results of the bias decomposition performance measures.
The comparison of the resampling output clearly shows that the BL resampling technique performs better on the CHIRPS product. The BL resampling technique has also been shown to perform better in the upper Takeze watersheds (Gebremedhin et al. 2021). The total depth of missed events in the wet period is greater than in the dry period for the CHIRPS product, while the reverse is true for the CHIRP product.
Resampling procedures are important in reducing the spatial mismatch between gridded observations and point observations at ground stations. However, resampling methods need to be selected carefully, weighing the pros and cons of the available methods against the objective of the study. In this particular study, BL resampling performed better than the NN resampling technique, which is similar to the finding of Gebremedhin et al. (2021).
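The difference between the two resampling principles can be illustrated on a toy 2 × 2 grid. The sketch below is only an assumption-laden illustration: the function names, the convention that cell centres sit at integer row/column coordinates, and the rainfall values are hypothetical and not taken from the study.

```python
def nearest_neighbour(grid, row, col):
    """Value of the coarse cell whose centre is closest to (row, col)."""
    r = min(max(int(round(row)), 0), len(grid) - 1)
    c = min(max(int(round(col)), 0), len(grid[0]) - 1)
    return grid[r][c]


def bilinear(grid, row, col):
    """Distance-weighted average of the four surrounding cell centres."""
    r0 = min(max(int(row), 0), len(grid) - 2)
    c0 = min(max(int(col), 0), len(grid[0]) - 2)
    dr, dc = row - r0, col - c0
    top = grid[r0][c0] * (1 - dc) + grid[r0][c0 + 1] * dc
    bottom = grid[r0 + 1][c0] * (1 - dc) + grid[r0 + 1][c0 + 1] * dc
    return top * (1 - dr) + bottom * dr


# Hypothetical coarse rainfall grid (mm); the top-left cell is dry.
coarse = [[0.0, 8.0],
          [4.0, 12.0]]
point = (0.25, 0.25)  # fine-grid location inside the dry coarse cell
nn_value = nearest_neighbour(coarse, *point)
bl_value = bilinear(coarse, *point)
```

Here NN returns the original dry value (0 mm), while BL assigns 3 mm from the neighbouring wet cells, which is the kind of newly introduced value referred to above.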
Bias correction
The bias correction technique utilized in this study was parametric empirical QM, which requires determining the best-fitting known distribution for the gauge data and SRP. For similar SRP and control observations (gauge observations), the CDF might vary depending on temporal resolution, geographical location, season, rainfall formation, and other factors (Ma et al. 2019). Accordingly, one should consider the applicability of known distribution functions for the selected co-located data pairs (Soo et al. 2020), meaning that different functions may suit different datasets. Hence, acknowledging the temporal variability of rainfall, distributions were fitted on a monthly basis.
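As a rough sketch of parametric QM for one station and one calendar month, the example below fits an exponential distribution to wet-day amounts and maps each satellite value through the satellite CDF and the inverse gauge CDF. The function names, the maximum-likelihood fit by the wet-day mean, the zero wet-day threshold, and the sample values are assumptions made for illustration; the study's own formulation is given by its Equations (1)–(3).

```python
import math


def fit_exponential_mean(values, wet_threshold=0.0):
    """Maximum-likelihood exponential mean from wet-day amounts only."""
    wet = [v for v in values if v > wet_threshold]
    return sum(wet) / len(wet)


def quantile_map(value, sat_mean, gauge_mean):
    """Map a satellite amount through F_sat, then the inverse of F_gauge."""
    if value <= 0.0:
        return 0.0  # dry days are passed through uncorrected
    p = 1.0 - math.exp(-value / sat_mean)    # satellite CDF
    return -gauge_mean * math.log(1.0 - p)   # inverse gauge CDF


# Hypothetical daily amounts (mm) for one calendar month at one station.
sat_month = [0.0, 2.0, 4.0, 6.0, 0.0, 8.0]
gauge_month = [0.0, 5.0, 0.0, 10.0, 15.0, 0.0]
sat_mean = fit_exponential_mean(sat_month)
gauge_mean = fit_exponential_mean(gauge_month)
corrected = [quantile_map(v, sat_mean, gauge_mean) for v in sat_month]
```

When both distributions are exponential, the mapping reduces algebraically to scaling each wet-day amount by the ratio of the gauge mean to the satellite mean. Zero values are returned unchanged in this sketch, so the wet-day frequency is not altered.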
Accordingly, goodness-of-fit tests, including the Kolmogorov–Smirnov test, showed that the exponential distribution was a good fit for both the resampled CHIRPS product and the ground observations. The Kolmogorov–Smirnov statistic was assessed against the critical value at a 5% significance level. The bias correction (Equations (1)–(3)) at available grid points was also tested for performance.
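A one-sample Kolmogorov–Smirnov check against a fitted exponential distribution might look like the sketch below. The function names and samples are hypothetical. The critical value uses the common large-sample approximation 1.36/√n for the 5% level, which is an assumption here; it is also known to be conservative when the distribution parameter is estimated from the same sample.

```python
import math


def ks_statistic_exponential(sample, mean):
    """Largest gap between the empirical CDF and 1 - exp(-x / mean)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def passes_ks_5_percent(sample, mean):
    """Compare the statistic with the approximate 5% critical value."""
    critical = 1.36 / math.sqrt(len(sample))
    return ks_statistic_exponential(sample, mean) <= critical


# Hypothetical samples: exact exponential quantiles versus constant values.
n, mean = 50, 5.0
exponential_like = [-mean * math.log(1.0 - (i - 0.5) / n)
                    for i in range(1, n + 1)]
constant_like = [5.0] * n
fit_ok = passes_ks_5_percent(exponential_like, mean)
fit_bad = passes_ks_5_percent(constant_like, mean)
```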
CHIRPS RMSE test result for dry and wet periods separately; before indicates the result after resampling and after indicates the result after bias correction.
The limitation of this process (i.e. specific to this study) is that no bias correction was applied to mismatches in no-rainfall events. Hence, one can clearly see that the bias correction resulted in poorer HB and MB performance in comparison to the resampled product. With the aim of correcting grid points with no ground observations, bias correction using interpolated distribution parameters was tested using three hidden stations (Shashemene, Wateraressa, and Tula stations). The result (Table 4) indicates that the pattern was captured well, but the volume was underestimated except for the dry period at Tula station.
Performance result of CHIRPS spatial bias correction using interpolated distribution parameters
Station | RMSE (mm), dry period | RMSE (mm), wet period | PVE (%), dry period | PVE (%), wet period | HB (mm), dry period | HB (mm), wet period | MB (mm), dry period | MB (mm), wet period |
---|---|---|---|---|---|---|---|---|
ShA | 2.87 | 6.59 | −75.09 | −111.24 | −863.50 | −5,010.20 | 704.30 | 2,551.90 |
TU | 4.02 | 7.80 | 5.57 | −37.39 | 144.60 | −3,278.70 | 1,717.60 | 4,486.20 |
WT | 3.53 | 8.24 | −30.11 | −21.17 | −599.70 | −1,966.50 | 1,439.90 | 5,415.50 |
Performance test for CHIRPS using interpolated distribution parameters.
Negative results (i.e. except RMSE) indicate that the estimate underestimates the ground observation at that point, while positive results indicate overestimation. The result (Supplementary material, Figure S1) suggests that pattern information can be transferred from grid points with ground observations to neighboring grid points without them. However, the result (except RMSE) at Tula station does not agree with the results of the other two stations. This might be due to the limitation of the study that no-rainfall events were not corrected.
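One way to picture the transfer of distribution parameters to ungauged grid points is inverse-distance weighting (IDW) of a fitted parameter, as sketched below. IDW is an assumed stand-in, since the interpolation scheme is not specified in this section; the coordinates, parameter values, and function name are likewise hypothetical.

```python
import math


def idw_parameter(target, stations, power=2.0):
    """Inverse-distance-weighted estimate of a parameter at `target`.

    stations: list of ((x, y), parameter_value) for gauged grid points.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a gauged point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical coordinates (km) and fitted exponential means (mm).
gauged = [((0.0, 0.0), 6.0), ((10.0, 0.0), 10.0), ((0.0, 10.0), 8.0)]
gauge_mean_at_target = idw_parameter((5.0, 0.0), gauged)
```

The interpolated parameter can then be used in place of a locally fitted gauge parameter when quantile mapping the satellite value at the ungauged grid point.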
Two conventionally used distributions were tested for bias correcting the resampled CHIRP product. The result indicates that the exponential distribution (Table 5) best represents the wet period, while the dry period is well captured by the gamma distribution (Table 5). The result also shows that false satellite rainfall detections can be eliminated or reduced by implementing moving-window sampling techniques. The positive near-zero results of FBI and PVE (Table 5) show very good agreement and a slight underestimation. However, compared to the NN performance (Table 2), the dry period is not well captured by the exponential distribution.
At-station bias correction performance of the exponential and gamma distributions
Station | Exponential: PVE (%), dry | Exponential: PVE (%), wet | Exponential: POD, dry | Exponential: POD, wet | Gamma: PVE (%), dry | Gamma: PVE (%), wet | Gamma: POD, dry | Gamma: POD, wet |
---|---|---|---|---|---|---|---|---|
BE | −42.76 | 3.03 | 0.77 | 0.98 | 0.01 | 11.94 | 0.56 | 0.89 |
HA | −47.69 | 3.16 | 0.86 | 0.98 | 2.48 | 11.92 | 0.63 | 0.89 |
HT | −39.39 | 0.91 | 0.85 | 0.98 | 1.72 | 11.73 | 0.62 | 0.88 |
KOF | −30.33 | 0.14 | 0.9 | 0.99 | 10.92 | 11.96 | 0.67 | 0.89 |
KOK | −18.53 | 0.83 | 0.83 | 0.99 | 30.75 | 13.85 | 0.54 | 0.88 |
KU | −61.87 | 0.14 | 0.85 | 0.99 | −30.65 | 10.34 | 0.71 | 0.9 |
LE | −26.66 | 1.14 | 0.83 | 0.98 | 10.12 | 10.98 | 0.62 | 0.87 |
SHK | −61.55 | 0.13 | 0.86 | 0.99 | −16.45 | 12.99 | 0.6 | 0.87 |
SHA | −40.96 | 7.02 | 0.83 | 0.94 | 14.42 | 17.54 | 0.54 | 0.84 |
TU | −33.89 | 0.98 | 0.82 | 0.97 | 7.48 | 10.56 | 0.57 | 0.89 |
WT | −33.5 | 2.87 | 0.86 | 0.97 | 7.33 | 14.27 | 0.57 | 0.86 |
WO | −56.04 | 1.92 | 0.85 | 0.98 | −1.95 | 12.37 | 0.58 | 0.88 |
On the contrary, comparing the HB and MB (Supplementary material, Tables S6 and S7) implies that the missed events have not changed; rather, the correctly captured events have increased in intensity. As an illustration, the HB at Shamena_Kedida station increased by 388.2 mm, while the MB increased by only 62.9 mm. The gamma distribution estimates mostly underestimate events, except at Kuyera and Shamena_Kedida stations in the dry period.
The RMSE improved slightly for bias correction by the gamma distribution over the dry period, while wet period performance improved using the exponential distribution (for details refer to Supplementary Information). Hence, combining these distributions, i.e. applying the gamma distribution for the dry period and the exponential distribution for the wet period, provides enhanced BC SREs.
Existing bias correction approaches have been criticized for their inability to correct rainfall events (Lehner et al. 2020). Similarly, applying the parametric QM method without considering no-rainfall events resulted in underperformance in this study. However, significant improvement in the estimates was achieved at some stations, and dry period performance was better than wet period performance. Furthermore, a potential method of spatial bias correction for grid points where no in situ records are available was tested. Encouraging results were achieved through this process, which underlines the importance of transferring bias correction parameters to grid points where ground measurements are not available.
Merged products
The study assessed two merged products, which helps to understand the advantages of each stage in the devised procedure. The first was a merged rainfall product obtained from the available ground observations and the resampled product. The validation procedure used cross validation with 20 iterations: in each iteration two randomly selected stations were dropped, and the merging was done using the remaining 10 stations. After merging, the performance test was done at the hidden (dropped) stations. Two stations were dropped at a time so that the remaining training data stayed sufficient and interpolation did not rely on completely unrepresented or distant observations.
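The station-dropping scheme described above can be sketched as follows. The station codes are those used in the tables; the function name, the random seed, and the use of Python's `random.sample` are illustrative assumptions rather than the study's implementation.

```python
import random

STATIONS = ["BE", "HA", "HT", "KOF", "KOK", "KU",
            "LE", "SHK", "SHA", "TU", "WT", "WO"]


def leave_two_out_splits(stations, iterations=20, seed=0):
    """Yield (training, hidden) station lists for each iteration."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    for _ in range(iterations):
        hidden = rng.sample(stations, 2)
        training = [s for s in stations if s not in hidden]
        yield training, hidden


splits = list(leave_two_out_splits(STATIONS))
```

In each split the merging would use the 10 training stations, and the performance measures would be computed only at the two hidden stations.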
Conditionally merged product RMSE result for all iterations (dots) compared to RMSE results after resampling (solid line) for dry and wet periods.
Bias-corrected and merged product RMSE results for all iterations (dots) compared to RMSE results after resampling (solid line) for dry and wet periods.
The output of merging the BC rainfall map with ground observations was tested using a similar cross-validation procedure over 20 iterations. The result indicates that all iterations show significantly improved performance test values. Comparing the dry period to the wet period, the performance of the dry period improved much more. However, this does not necessarily indicate that the procedure corrects dry period events better than wet period events. Part of this discrepancy could result from the larger number of wet events in the wet period and the larger number of no-rainfall events in the dry period.
Additionally, the study highlights the significance of combining two or more observational sources to improve spatial rainfall estimates. Even though the CHIRPS retrieval algorithm is well known for incorporating ground rainfall measurements, significant performance improvement was achieved by combining the raw product with ground measurements. Similar deductions were made by other studies (Navas et al. 2019; Gebremedhin et al. 2021). Furthermore, the simple CM technique outperformed the devised fused bias correction method, in which the parametric empirical QM method was followed by the CM technique. However, this is due to the inability of the study to consider no-rainfall events in the bias correction process.
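The idea behind conditional merging can be sketched in a few lines: the gauge values are interpolated to the grid, the satellite values sampled at the gauge locations are interpolated in the same way, and the difference between the satellite field and its interpolated counterpart is added back to the gauge field. The sketch below makes several assumptions: inverse-distance weighting replaces the kriging normally used in CM, negative merged values are clipped to zero, gauges sit exactly on grid cells, and all names and values are hypothetical.

```python
import math


def idw(target, points, power=2.0):
    """Inverse-distance-weighted value at `target` from [((x, y), v), ...]."""
    num = den = 0.0
    for (x, y), value in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def conditional_merge(satellite, gauges):
    """satellite: {(x, y): mm}; gauges: {(x, y): mm} located on grid cells."""
    gauge_points = list(gauges.items())
    sat_at_gauges = [(xy, satellite[xy]) for xy in gauges]
    merged = {}
    for xy, sat_value in satellite.items():
        gauge_field = idw(xy, gauge_points)   # interpolated gauge values
        sat_field = idw(xy, sat_at_gauges)    # interpolated satellite-at-gauge
        # Add the satellite's spatial deviation to the gauge field;
        # clipping at zero is an assumption to avoid negative rainfall.
        merged[xy] = max(gauge_field + (sat_value - sat_field), 0.0)
    return merged


# Hypothetical one-dimensional transect (mm) with gauges at both ends.
satellite = {(0, 0): 4.0, (1, 0): 6.0, (2, 0): 9.0}
gauges = {(0, 0): 5.0, (2, 0): 7.0}
merged = conditional_merge(satellite, gauges)
```

At gauged cells the merged value equals the gauge observation, while between gauges the satellite field contributes only its spatial pattern.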
The merged product of CHIRP SREs after BC showed significant improvement compared to the raw product. In particular, the wet period statistics were better than the dry period statistics. Depending on the combination of coupled stations for validation, up to 70% PVE improvement for the wet period and up to 50% for the dry period (Figure 8) was achieved at some stations (Shamena_Kedida, Wateraressa, and Hawassa_Tabor stations).
From the HB performance (Supplementary material, Table S8), one can observe that most correctly detected events are underestimated (i.e. negative values), except at Hawassa and Kokossa in the wet season. The above result is the best at-station performance out of 30 iterations, each with two randomly chosen hidden stations. Comparing the BC and merged CHIRP output (Table 6) with the resampled CHIRPS estimates (Table 2) indicates that some performance measures (i.e. MB, HB, RMSE and POD) (Figure 9) have significantly improved.
Performance result for conditionally merged after bias correction of CHIRP SREs
Station | Dry period PVE (%) | Dry period POD | Wet period PVE (%) | Wet period POD |
---|---|---|---|---|
BE | −53.36 | 0.93 | −13.04 | 0.99 |
HA | 0.68 | 0.86 | 17.42 | 0.89 |
HT | −16.3 | 0.94 | 4.79 | 0.98 |
KOF | −10.17 | 0.91 | −8.75 | 0.95 |
KOK | 45.89 | 0.78 | 34.33 | 0.92 |
KU | −52.92 | 0.87 | −17.77 | 0.8 |
LE | −17.64 | 0.95 | −19.8 | 0.98 |
SHK | −17.64 | 0.87 | −17.77 | 0.8 |
SHA | −81.45 | 0.9 | −72.97 | 0.81 |
TU | 9.78 | 0.9 | −15.77 | 0.98 |
WT | −6.51 | 0.81 | −5.84 | 0.94 |
WO | 15.7 | 0.79 | 16.29 | 0.92 |
Comparison of raw CHIRP and CHIRPS product at collocated ground stations with BC, and SBCM
Station | RMSE (mm): BC CHIRP | RMSE (mm): Raw CHIRP | RMSE (mm): Raw CHIRPS | RMSE (mm): SBCM CHIRP | R²: BC CHIRP | R²: Raw CHIRP | R²: Raw CHIRPS | R²: SBCM CHIRP | PVE (%): BC CHIRP | PVE (%): Raw CHIRP | PVE (%): Raw CHIRPS | PVE (%): SBCM CHIRP |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WA | 96.28 | 105.63 | 100.34 | 100.59 | 0.3 | 0.16 | 0.23 | 0.22 | −3.55 | −16.94 | −14.77 | −6.08 |
TU | 45.13 | 47.23 | 41.26 | 40.22 | 0.58 | 0.52 | 0.63 | 0.67 | −6.96 | −11.57 | −10.38 | −9.31 |
HA | 38.26 | 39.74 | 33.75 | 32.23 | 0.67 | 0.62 | 0.73 | 0.83 | −5.86 | −1.78 | 3.17 | 17.12 |
HT | 43.51 | 46.06 | 44.8 | 42.79 | 0.52 | 0.49 | 0.5 | 0.53 | −6.99 | −14.53 | −9.65 | −5.6 |
SHA | 21.09 | 53.48 | 49.77 | 59.9 | 0.66 | 0.46 | 0.51 | 0.31 | −2.74 | −89.41 | −78.87 | −74.95 |
KO | 33.37 | 43.81 | 36.42 | 39.24 | 0.7 | 0.5 | 0.65 | 0.65 | −6.84 | −2.9 | −0.86 | −9.37 |
WO | 43.04 | 45.55 | 41.87 | 42.66 | 0.66 | 0.58 | 0.66 | 0.69 | 9.66 | 3.54 | 7.34 | 16.18 |
SHK | 46.68 | 51.81 | 48.93 | 42.35 | 0.57 | 0.53 | 0.56 | 0.63 | −11.63 | −39.79 | −33.45 | −22.83 |
LE | 41.69 | 49.93 | 47.13 | 42.35 | 0.61 | 0.55 | 0.61 | 0.7 | 11.48 | −20.07 | −18.81 | −19.37 |
KU | 37.39 | 42.66 | 41.28 | 53.79 | 0.59 | 0.5 | 0.53 | 0.43 | 20.89 | −20.95 | −19.23 | −23.2 |
KO | 76.97 | 105.39 | 103.16 | 93.75 | 0.48 | 0.36 | 0.4 | 0.49 | −3.66 | 43.38 | 42.9 | 37.59 |
BE | 54.5 | 71.38 | 71.36 | 64.01 | 0.68 | 0.45 | 0.44 | 0.53 | −2.95 | −30 | −25.04 | −18.3 |
However, the volume error of the merged CHIRP product was poor at Kuyera station, and at Belela station for the dry period. Additionally, even though the error in volume at Kokossa and Shashemene stations improved relative to the CHIRPS estimate, the overall performance is not satisfactory. In general, the study shows that the devised procedure for correcting CHIRP SREs can be effective in reducing errors when compared to the CHIRPS product.
Spatial distribution of rainfall estimates recorded on 16 August 2017 for (a) rainfall records at stations; (b) resampled (NN) CHIRPS version 2.0; (c) spatially BC and merged CHIRP; and (d) resampled (NN) CHIRP.
Monthly SREs performance
Monthly estimates by both SRPs (i.e. CHIRP and CHIRPS) were assessed to investigate the impact of bias correction at a different temporal resolution. After aggregating the daily raw SRP and BC SREs to monthly resolution, the monthly estimates were compared (Table 7) using three statistical performance measures (Equations (5)–(7)). Performance tests were not conducted seasonally at this stage. The upgraded CHIRPS product has fairly good estimates at originally incorporated gauging stations (e.g. Hawassa gauging station). Examining the error in volume over the study period, BC CHIRP estimates outperformed the remaining estimates at all stations except Hawassa, Kokossa, and Wondo_genet. Additionally, BC CHIRP estimates underestimated observations at Kuyera, Wondo_genet, and Leku stations.
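The aggregation step can be sketched as below, with RMSE as the one measure shown. The date format, function names, and sample values are assumptions for illustration, and RMSE is written in its standard form rather than copied from the study's equations.

```python
import math
from collections import defaultdict


def monthly_totals(daily):
    """daily: [("YYYY-MM-DD", mm), ...] -> {"YYYY-MM": total mm}."""
    totals = defaultdict(float)
    for date, amount in daily:
        totals[date[:7]] += amount
    return dict(totals)


def rmse(observed, estimated):
    """Root mean square error over the months present in both series."""
    months = sorted(set(observed) & set(estimated))
    squared = [(estimated[m] - observed[m]) ** 2 for m in months]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical daily records (mm) at one station.
gauge_daily = [("2017-07-01", 10.0), ("2017-07-02", 5.0), ("2017-08-01", 20.0)]
sre_daily = [("2017-07-01", 8.0), ("2017-07-02", 4.0), ("2017-08-01", 24.0)]
monthly_rmse = rmse(monthly_totals(gauge_daily), monthly_totals(sre_daily))
```

Because daily errors of opposite sign partly cancel within a month, monthly scores can differ noticeably from daily ones, which is why the two resolutions are assessed separately.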
When compared to resampled CHIRPS estimates, Spatially Bias-Corrected and Conditionally Merged (SBCM) estimates showed slightly better correlation at all stations except Wateraressa, Shashemene, and Kuyera (Table 7). Hence, the study underlines the importance of multistage bias correction methods for improving ungauged gridded SREs. This is supported by the comparably better result of SBCM estimates in the under-represented western part of the watershed, which is represented only by the Shamena_Kedida station.
In conclusion, the study shows that the SBCM approach is capable of improving the quality of SREs. The method is expected to improve SREs for other areas, provided that all its assumptions and requirements are fulfilled. In particular, considering the decline in ground station data supplied to archives such as the Climate Hazards Group (CHG) (Funk et al. 2015), the SBCM technique is useful for including unused datasets available from different sources. According to Funk et al. (2015), the number of ground station records in Africa available for the development of CHIRPS 2.0 declined from 2004 stations in the year 2004 to 500 stations in the years after 2010. Hence, the method can be implemented for areas having supplementary rainfall records that were not considered in the development of CHIRPS version 2.0. Furthermore, this study suggests that the performance of the SBCM improves for areas with scarce rain gauge stations. Thus, implementation of the method for under-represented areas is highly encouraged in comparison to using readily available BC or merged SREs.
CONCLUSION
Nowadays, sources of rainfall records are proliferating, as the demand for more accurate high-resolution records is high. However, each source is limited in some aspect of the data quality required. Hence, the importance of developing new and improved techniques for fusing multiple data sources is increasing. Similarly, in light of the systematic and random errors found in satellite rainfall products, the demand for more accurate bias correction techniques is increasing. However, each method has different limitations: for example, some bias correction techniques are better at correcting patterns than extreme events. Other methods used to merge multiple sources struggle to preserve the pattern of in situ observations.
Hence, this study has shown the potential to produce accurate rainfall maps, on a daily time scale, by using a multistage procedure to correct CHIRP and CHIRPS satellite rainfall products. Additionally, by evaluating the performance of each stage, the significance of the method devised in each stage was illustrated. However, the performance achieved is specific to the study area, although this does not prevent the procedure from being tested in other study areas. Furthermore, the study is limited in that it did not bias correct the no-rainfall frequency.
The results show that, for the study area, the BL resampling technique performs better. Additionally, an objectively selected resampling technique helps to reduce the bias due to spatial mismatch between ground observations and the selected grid estimates of SREs. The resampling outputs were bias corrected using the parametric empirical QM method. The result indicates that, when well-fitting distributions specific to a given dataset are used, bias correction of raw satellite rainfall products can improve their performance when compared to in situ observations. Further assessment on transferring bias correction parameters from grid points with in situ records to grid points with no ground observations was performed. The result illustrates that bias correction parameters can be effectively transferred from grid points with control ground stations to other grids through interpolation.
Finally, merging satellite rainfall products with ground observations is highly capable of preserving extreme events as well as rainfall patterns. However, the study was limited in correcting dry-day frequency, which significantly affected the performance of the bias correction. Hence, it is recommended that future studies focus on such fusing techniques to access the merits of different methods and provide improved bias correction procedures.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.