## Abstract

Rainfall is not uniformly distributed across the regions of the world. Rainfall studies are used to characterize the amount, duration, and variability of rainfall in time and space. In Ethiopia, annual and seasonal rainfall distributions vary in both space and time. This study examines the implications of spatiotemporal rainfall distribution for the evaluation of areal rainfall characteristics in the Upper Erer Sub-basin, located in eastern Ethiopia. In the study area, the average annual rainfall at the Dire Dawa, Haramaya, Harar, Grawa, and Gursum stations is 647, 816, 801, 958, and 840 mm, with coefficients of variation of 23, 20, 20, 19, and 31%, respectively. However, the rain gauges are sparsely distributed. Rainfall occurrence and distribution at the gauging stations were found to vary significantly both temporally and spatially. The spatiotemporal rainfall distribution at the stations was assessed using a joint probability of rain days approach. The results indicate that the joint probability of rain days estimated at a monthly time step performs better than estimates based on daily, decadal, and seasonal data. The joint probability approach, together with monthly rainfall amounts, is used to assess areal rainfall estimates for rainfall–runoff modeling.

## HIGHLIGHTS

This study focused on the assessment of spatiotemporal rainfall distribution in the Upper Erer Sub-basin.

Correlations between the MERRA version 2 (MERRA-2) and the observed rainfall were evaluated.

Monthly and seasonal rainfall variability and annual contribution were discussed.

Joint probability of rain days for the monthly rainfall was best fitted for areal rainfall estimation characteristics distribution.

## INTRODUCTION

Rainfall plays a significant role in the hydrologic cycle and in the management of water supplies. Determining the areal rainfall of a catchment is a prerequisite for various water resource and watershed modeling studies (Bayraktar *et al*. 2005; Cheng *et al*. 2012). The average rainfall is usually used to describe the spatial rainfall status of a region and as input to various rainfall–runoff models (Ayoad 1983; Belay *et al*. 2019). Rainfall analysis is therefore imperative in hydrological and climatological studies that seek to characterize rainfall amount, duration, and variability in time and space (Gummadi *et al*. 2018). The distribution of rainfall is not uniform because of regional orographic effects and differing sources of rain. Ethiopia receives rainfall in three main seasons: Summer (June–September) is the long rainy season, Spring (February–May) is the short rainy season, and Winter (October–January) is the third rainy season (Aldabadh *et al*. 1982; NMSA 1996; Seleshi & Zanke 2004; Cheung *et al*. 2008). Rainfall variability is an important feature of semi-arid climates, and in this region climate change is likely to increase it, with Summer rainfall comprising the largest share (Al-Houri 2014; Girma *et al*. 2019).

Inter-seasonal rainfall variation and intra-annual rainfall availability and reliability significantly affect agricultural activities, hydrologic conditions, and livelihoods (Ramos & Martínez-Casasnovas 2006; Hessebo *et al*. 2019). Intra-annual rainfall variability is high in most parts of Ethiopia (Seleshi & Zanke 2004; Mengistu *et al*. 2014). Monthly and seasonal rainfall is more variable than annual rainfall (Aldabadh *et al*. 1982; Seleshi & Zanke 2004; Ngongondo *et al*. 2011). Rainfall fluctuates both within years and between years (Eshetu *et al*. 2016). This is because rainfall occurrence at the gauging stations in the region varies significantly with the source of rain and the regional landscape (Singh 1992). Point rainfall data are recorded at the installed gauging stations (Sen & Habib 2000). These point data are converted to areal rainfall data that represent the quantity of rainfall falling on the region. Hence, areal rainfall estimates are very sensitive to the number and locations of rain gauges (Wilson 1970; Bell & Moore 2000; Ngongondo *et al*. 2011; Cho *et al*. 2017; Kadhim *et al*. 2020). Moreover, average annual rainfall depends on elevation and tends to increase with it (Faures *et al*. 1995; Hessebo *et al*. 2019). Other biophysical factors also affect rainfall and produce marked spatiotemporal variability over short distances (Oettli & Camberlin 2005; Terink *et al*. 2018). The relative difference between rain gauges has been identified as a further feature related to these biophysical factors (Nyssen *et al*. 2005; Bewket & Conway 2007; Taesombat & Sriwongsitanon 2009; Ngongondo *et al*. 2011; Nandargi & Mulye 2012; Zhang *et al*. 2016). These studies indicate that spatiotemporal rainfall variability affects rainfall occurrence and distribution in various regions.

In response to growing concern about rainfall studies, investigators have started to improve techniques for evaluating rainfall distribution and intensity. Researchers convert the point rainfall recorded at a station into the quantity that falls on the region as a whole rather than on the station itself (Yuan *et al*. 2014). This quantity is the areal rainfall of the region. Several areal rainfall estimation methods have emerged so far (Cheng *et al*. 2012; Eruola *et al*. 2015; Cho *et al*. 2017; Kadhim *et al*. 2020), such as the Arithmetic Mean, Statistical, Isohyetal, and Thiessen methods, but no method represents rainfall distribution accurately (Sen & Habib 2000).

In this study area, the network of rainfall measuring stations is inadequate and unevenly distributed. The scattered distribution of gauging stations affects how well rainfall distribution and intensity are captured at the stations. The main concern is the probability that all or some of the stations receive rain jointly (mutually) over the same period. The joint probability of rainfall occurrence at the stations, together with rainfall amounts, is used to assess the areal rainfall estimation characteristics in the region. However, an approach that accounts for the spatiotemporal distribution of mutual rainfall occurrence at the stations is missing. This study aims to fill this gap with a more comprehensive approach that considers mutual rainfall occurrence at the stations, including the amount of rainfall.

## MATERIALS AND METHODS

### Description of the study area

The Upper Erer Sub-basin lies between 9°13′26.4″ N and 9°31′26.4″ N latitude and between 42°4′40.8″ E and 42°20′38.4″ E longitude in the eastern highlands of Ethiopia. The sub-basin is characterized by high topographic relief, with elevations ranging from 1,306 to 3,019 m above mean sea level. Figure 1 shows the meteorological gauging stations and the contour map of the study area. The Upper Erer River Alluvial Valley Plain separates the two major water basins in Ethiopia (Awash and Wabishble) and is found about 10–20 km east and southeast of Harar Town.

### Datasets and methodology

Daily rainfall data for five stations (Dire Dawa, Haramaya, Harar, Gursum, and Grawa) with relatively good quality and long records (>30 years) were collected from the Ethiopian National Meteorological Institute (ENMI). The observed records cover 38 years (1983–2020) for Dire Dawa and Haramaya, 36 years (1985–2020) for Harar, and 35 years (1983–2017) for Grawa and Gursum. Missing values were checked for all five stations, and the proportion is below 20% at each. Because the Normal ratio method is recommended when more than 10% of values are missing, it was used in this study along with the simple arithmetic mean method. A detailed description of the gauging stations is provided in Table 1. Although rain gauges measure rainfall accurately, they are rarely found in mountainous regions, and satellite rainfall data can be used as an alternative source for these regions (NMSA 1996). These data are used to check the applicability of the ground rainfall data in the region.

| Station | Latitude (°N) | Longitude (°E) | Average elevation (m) | Average annual rainfall (mm) | Period of observations | Proportion of missing values (%) |
|---|---|---|---|---|---|---|
| Dire Dawa | 9.60 | 41.86 | 1045 | 647 | 1983–2020 | 9.12 |
| Haramaya | 9.43 | 42.02 | 2025 | 816 | 1983–2020 | 12.92 |
| Harar | 9.30 | 42.08 | 1977 | 801 | 1985–2020 | 18.01 |
| Grawa | 9.13 | 41.83 | 2470 | 958 | 1983–2017 | 14.83 |
| Gursum | 9.35 | 42.39 | 1937 | 840 | 1983–2017 | 8.74 |
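The Normal ratio method mentioned above can be sketched as follows. This is a minimal illustration of the textbook form of the method, in which each neighboring observation is scaled by the ratio of the target station's long-term mean to the neighbor's long-term mean; the function name, the choice of neighbors, and the 12.0 and 15.0 mm readings are assumptions made for the example and are not taken from the study.

```python
def normal_ratio_fill(target_normal, neighbor_normals, neighbor_values):
    """Estimate a missing rainfall value at a target station (normal ratio method).

    target_normal: long-term mean annual rainfall at the target station (mm)
    neighbor_normals: long-term mean annual rainfall at each neighbor (mm)
    neighbor_values: rainfall observed at each neighbor in the missing period (mm)
    """
    if len(neighbor_normals) != len(neighbor_values) or not neighbor_values:
        raise ValueError("need one observed value per neighboring station")
    # Scale each neighbor's observation by the ratio of long-term means
    weighted = [
        (target_normal / normal) * value
        for normal, value in zip(neighbor_normals, neighbor_values)
    ]
    return sum(weighted) / len(weighted)


# Normals are the Table 1 annual means for Haramaya (target) and for
# Harar and Grawa (neighbors); the two daily readings are invented.
estimate = normal_ratio_fill(816.0, [801.0, 958.0], [12.0, 15.0])
print(round(estimate, 2))
```

When all normals are equal, the estimate reduces to the simple arithmetic mean of the neighboring observations, which is the second filling method named in the text.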


Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2) daily precipitation data from January 1983 to December 2020 were used in this study. The dataset was acquired from the National Aeronautics and Space Administration Prediction of Worldwide Energy Resources (NASA POWER) website (https://power.larc.nasa.gov/). It has a spatial resolution of 0.5° × 0.625° (latitude × longitude) (Yoo *et al*. 2008). Spatially, MERRA-2 data tended to closely predict rainfall amounts in high-elevation and rugged mountainous zones (e.g., Mt. Kenya) (*R*^{2} = 0.97), as outlined by Randles *et al*. (2017). This demonstrates the promising potential of satellite remote sensing data (MERRA-2) to complement existing observed meteorological data, which are often marred by inconsistency and scarcity and are hence unreliable for agricultural advisory and other climate-based applications in Kenya and Sub-Saharan Africa at large (Peterson & Easterling 1994). MERRA-2 shows more observable areas with an elevation of more than 1,500 m. It shows a variability ratio closer to one for the monthly time step (Machariaa *et al*. 2020). Satellite-based data products are particularly relevant for supporting rainfall measurements in Africa, where measurement is challenged by low-density rain gauge networks and incomplete observations (Hafizi & Sorman 2022). A similar study shows that evaluating MERRA-2 precipitation datasets is especially important for understanding the spatiotemporal distribution of precipitation (Ayehu *et al*. 2021).

The areal rainfall estimation was highly affected by the number of existing stations within the sub-basin. Continuous records from gauging stations have received sustained attention for understanding rainfall distribution through the variation in rainfall amount and rainy days (Mugalavai *et al*. 2008; Hamal *et al*. 2020). In this study, the MERRA-2 reanalysis dataset was used as a first estimate, and its suitability was checked using linear regression.

A homogeneity test and a correlation analysis were also performed to check the homogeneity and the consistency of the ground-gauged rainfall data, respectively. Double-mass curve techniques were used for the homogeneity test. The relationship between observed rainfall and elevation was also assessed to gauge the dependence of rainfall distribution on topography (Taesombat & Sriwongsitanon 2009). A spatial correlation coefficient of at least 0.7 between the test series and the reference series was required previously (Randles *et al*. 2017). However, another study (Ngetich *et al*. 2014) determined that the correlation (*r*) between the reference series and the candidate station had to be 0.80 or higher to be reliable enough for filling missing data.
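The double-mass curve check can be illustrated with a short sketch. It assumes the common form of the technique, in which the cumulative annual rainfall of the tested station is plotted against the cumulative mean of the reference stations and a change in slope signals a possible inhomogeneity; the function name and the sample numbers are hypothetical.

```python
from itertools import accumulate


def double_mass_points(station_annual, reference_annual):
    """Return (cumulative reference mean, cumulative station total) pairs.

    station_annual: annual totals at the station being tested (mm)
    reference_annual: list of annual-total series, one per reference station
    """
    years = len(station_annual)
    if any(len(series) != years for series in reference_annual):
        raise ValueError("all series must cover the same years")
    # Mean of the reference stations for each year
    ref_mean = [sum(vals) / len(vals) for vals in zip(*reference_annual)]
    return list(zip(accumulate(ref_mean), accumulate(station_annual)))


# Invented annual totals (mm) for one tested station and two references
points = double_mass_points(
    [650, 700, 620],
    [[800, 820, 790], [840, 860, 810]],
)
for ref_cum, station_cum in points:
    print(ref_cum, station_cum)
```

Plotting these pairs and inspecting the slope is left to the reader; a roughly straight line suggests a homogeneous record.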

Rainfall variability was assessed using the coefficient of variation (CV):

$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100$$

where *σ* is the standard deviation and *μ* is the mean. The degree of variability of rainfall events is classified as low (CV < 20%), moderate (20% < CV < 30%), and high (CV > 30%) (NMSA 1996; Peterson *et al*. 1998). The contribution of seasonal rainfall to the total annual rainfall in percent (CT) is also computed for each station.
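As a worked example of the coefficient of variation and its classification, the sketch below uses the annual mean and standard deviation reported for three stations in Table 3. The function names are illustrative, and assigning the boundary values of exactly 20% and 30% to the moderate class is an assumption, since the stated classes leave them undefined.

```python
def coefficient_of_variation(mean, std):
    """Coefficient of variation in percent."""
    return std / mean * 100.0


def classify_cv(cv):
    """Classify variability; boundary values are placed in 'moderate' (assumption)."""
    if cv < 20:
        return "low"
    if cv <= 30:
        return "moderate"
    return "high"


# Annual (mean, standard deviation) in mm, taken from Table 3
annual_stats = {"Dire Dawa": (647, 150), "Grawa": (958, 183), "Gursum": (840, 262)}
for station, (mean, std) in annual_stats.items():
    cv = coefficient_of_variation(mean, std)
    print(station, round(cv), classify_cv(cv))
```

The rounded values (23, 19, and 31%) match the annual CV column of Table 3.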

The joint probability of rain days was computed as:

$$P(\text{joint rain days}) = \frac{n}{N} \times 100$$

where *P*(joint rain days) is the joint probability of rain days (%), *n* is the number of joint rain days, obtained by counting all mutual rain days for a station combination, and *N* is the total number of days in the period under consideration. Combinations of five, four, three, and two rain gauging stations are used for the analysis.
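The counting procedure described here can be sketched for every station combination at once. This is a minimal illustration under stated assumptions: rain occurrence is given as one 0/1 flag per day and station, the station labels A, B, and C and their flags are invented, and the function names are hypothetical.

```python
from itertools import combinations


def joint_rain_probability(rain_flags, stations):
    """Percentage of days on which every listed station recorded rain."""
    n_days = len(rain_flags[stations[0]])
    joint_days = sum(
        all(rain_flags[s][t] for s in stations) for t in range(n_days)
    )
    return 100.0 * joint_days / n_days


def all_combination_probabilities(rain_flags, min_size=2):
    """Joint probability for every combination of min_size or more stations."""
    names = sorted(rain_flags)
    return {
        combo: joint_rain_probability(rain_flags, combo)
        for size in range(min_size, len(names) + 1)
        for combo in combinations(names, size)
    }


# Invented rain-day flags (1 = rain day) for three stations over eight days
flags = {
    "A": [1, 1, 0, 1, 0, 0, 1, 0],
    "B": [1, 0, 0, 1, 0, 1, 1, 0],
    "C": [1, 1, 0, 0, 0, 0, 1, 1],
}
for combo, prob in all_combination_probabilities(flags).items():
    print(combo, prob)
```

Adding a station to a combination can only keep the count of mutual rain days the same or reduce it, so the joint probability never increases as the combination grows.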

Here, *N* is the number of observations, and the average rainfall is calculated by Equation (4):

$$\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i \qquad (4)$$

where $\bar{R}$ is the average rainfall for the combined stations in mm, $R_i$ is the observed rainfall at station *i* among the stations under consideration, and *n* is the number of stations under consideration. The maximum joint probability of rain days for the combined stations is taken as the determinant of areal rainfall estimation for the region. This is used to develop methods for estimating the distribution of areal rainfall characteristics under inadequate data using the rainfall probability approach. The final analysis results are used to identify the time series with the maximum joint probability of rain days in the region.
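The arithmetic average over a station combination can be illustrated as follows. The sketch assumes a simple unweighted mean and uses the mean annual rainfall of two stations from Table 1; the function name is hypothetical.

```python
from statistics import fmean


def areal_mean_rainfall(station_rainfall, combo):
    """Unweighted arithmetic mean of rainfall (mm) over the stations in combo."""
    return fmean([station_rainfall[name] for name in combo])


# Mean annual rainfall (mm) from Table 1
annual_mm = {"Haramaya": 816, "Harar": 801}
print(areal_mean_rainfall(annual_mm, ("Haramaya", "Harar")))
```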

## RESULTS AND DISCUSSIONS

### Correlation assessment for MERRA-2 and observed rainfall records

The coefficients of determination (*R*^{2} > 0.5) indicate a good correlation between the ground observations and the MERRA-2 reanalysis data for the region.

Spatially, the MERRA-2 data tended to closely predict rainfall amounts in the high-elevation and rugged mountainous zones, as suggested by Abate (2009). Table 2 shows the linear regression between ground rain gauge observations and MERRA-2 data for each station; all coefficients of determination exceed 0.5 (*R*^{2} > 0.5). The ground rain gauge observations and the MERRA-2 data are correlated, which supports the use of the ground-gauged data for the analysis in the sub-basin. The ground rainfall data are consistent with the satellite-observed data in the region. This suggests that the ground rainfall data are representative of the stations in the study area and may be used to estimate the areal rainfall characteristics.

| Station | Regression equation | Coefficient of determination (*R*^{2}) |
|---|---|---|
| Dire Dawa | Y = 1.1222x | 0.752 |
| Haramaya | Y = 0.9353x | 0.799 |
| Gursum | Y = 0.5634x | 0.549 |
| Harar | Y = 0.9x | 0.689 |
| Grawa | Y = 0.784x | 0.750 |
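The regression equations in Table 2 have no intercept, so a fit through the origin is one way such slopes could be obtained. The sketch below is an illustration only: which series is treated as x and which as y is an assumption, the sample values are invented, and the *R*^{2} shown uses the centered total sum of squares, which is one of several conventions for no-intercept fits and may differ from the one used in the study.

```python
def fit_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def r_squared(x, y, slope):
    """1 - SS_res / SS_tot, with SS_tot centered on the mean of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Invented monthly totals (mm): x = reanalysis, y = gauge (assumed roles)
reanalysis = [10.0, 40.0, 80.0, 120.0]
gauge = [12.0, 38.0, 85.0, 118.0]
slope = fit_through_origin(reanalysis, gauge)
print(round(slope, 4), round(r_squared(reanalysis, gauge, slope), 3))
```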


### Rainfall distribution and elevation of gauging stations

The relationship between mean annual rainfall and station elevation is strong (*R*^{2} = 0.9844). Rainfall stations located in neighboring areas show the same pattern in terms of rainfall amount and station position (Taesombat & Sriwongsitanon 2009). This agrees with similar studies suggesting that high elevations receive more precipitation than lower elevations (Gelaro *et al*. 2017; Al-Ozeer *et al*. 2020). Figure 3 shows that annual rainfall increases with elevation in the region. This indicates that the annual rainfall amount is positively correlated with elevation; that is, high elevations tend to receive more rainfall and low elevations less.

### Rainfall variability and seasonal contribution

The average annual rainfall at Dire Dawa, Haramaya, Harar, Grawa, and Gursum is 647, 816, 801, 958, and 840 mm, with coefficients of variation of 23, 20, 20, 19, and 31%, respectively (Table 3). The results show that seasonal rainfall is more variable than annual rainfall (Girma *et al*. 2019). Spring rainfall was more variable than Summer rainfall and less variable than Winter rainfall: the mean coefficient of variation across the stations is 27, 41, and 81% for Summer, Spring, and Winter, respectively. Summer and Spring make the largest contributions to annual rainfall, at 53 and 36%, respectively. This agrees with the study by Lebel *et al*. (1987), in which Spring contributes up to 40% of the annual rainfall over northeastern, central, and southwestern Ethiopia.

| Station name | Summer μ | Summer σ | Summer CV (%) | Summer CT (%) | Spring μ | Spring σ | Spring CV (%) | Spring CT (%) | Winter μ | Winter σ | Winter CV (%) | Winter CT (%) | Annual μ | Annual σ | Annual CV (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dire Dawa | 309 | 96 | 31 | 53 | 260 | 118 | 45 | 36 | 78 | 69 | 88 | 12 | 647 | 150 | 23 |
| Haramaya | 447 | 98 | 22 | 55 | 283 | 95 | 33 | 34 | 86 | 71 | 83 | 11 | 816 | 162 | 20 |
| Harar | 408 | 106 | 26 | 52 | 306 | 132 | 43 | 37 | 87 | 60 | 69 | 12 | 801 | 157 | 20 |
| Grawa | 500 | 101 | 20 | 54 | 361 | 149 | 41 | 35 | 97 | 74 | 75 | 12 | 958 | 183 | 19 |
| Gursum | 424 | 148 | 35 | 53 | 331 | 145 | 44 | 36 | 85 | 76 | 88 | 10 | 840 | 262 | 31 |
| **Mean** | **418** | | **27** | **53** | **308** | | **41** | **36** | **89** | | **81** | **11** | **816** | | **23** |


where *μ* stands for the mean rainfall (mm), *σ* stands for the standard deviation (mm), CV stands for the coefficient of variation (%), and CT stands for the seasonal rainfall contribution (%). Bold values indicate the mean.

The Winter season is characterized by significant inter-annual and intra-seasonal variability, which makes Winter rainfall highly variable. Large-scale rainfall variability has also been reported alongside the highest rainfall amounts (Hamal *et al*. 2020). In addition, a similar study confirmed high rainfall variability in most Ethiopian regions (Bewket & Conway 2007).

### Seasonal distribution of rain days and rainfall amounts

As shown in Table 4, mean annual rain days ranged from 67 to 100 and the mean annual rainfall amount from 647 to 958 mm across the stations. Dire Dawa had the lowest mean annual rain days and rainfall amount. Summer rain days account for more than half of the annual rain days, and Spring rain days account for roughly one-third or more at all stations. This agrees with the considerable variations in annual and seasonal rainfall amounts listed by Girma *et al*. (2019). As discussed by Bekele-Biratu *et al*. (2018), the number of rain days with rainfall ≥ 0.1 mm ranged from 101 to 116 from June to October, the months of the monsoon season, which is the Summer rainy season in India.

| Season | Variable | Dire Dawa | Haramaya | Harar | Grawa | Gursum |
|---|---|---|---|---|---|---|
| Annual | Rainy days | 67 | 90 | 100 | 83 | 73 |
| Annual | Rainfall amount (mm) | 647 | 816 | 801 | 958 | 840 |
| Summer | Rainy days | 36 | 50 | 52 | 44 | 39 |
| Summer | Rainfall amount (mm) | 309 | 447 | 408 | 500 | 424 |
| Spring | Rainy days | 24 | 30 | 37 | 29 | 26 |
| Spring | Rainfall amount (mm) | 260 | 283 | 306 | 361 | 331 |
| Winter | Rainy days | 8 | 10 | 12 | 9 | 7 |
| Winter | Rainfall amount (mm) | 78 | 86 | 87 | 97 | 85 |


### Annual, Summer, Spring, Winter rain days variability and distribution

Rainy days are the days on which 1 mm or more of rain is recorded (Seleshi & Zanke 2004). The results in Table 5 show that the variability of Summer rainy days is less than 20% for Haramaya and Dire Dawa and less than 30% for the remaining stations. The variability of annual rainy days was also lowest for Haramaya and Dire Dawa, at 16 and 18%, respectively. For the Spring season, Harar has the highest variability (39%) and Haramaya the lowest (26%). For the Winter season, the variability is highest at Dire Dawa (coefficient of variation = 74%) and exceeds 50% at the remaining stations.
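Counting rainy days by season from a daily record can be sketched as below. The 1 mm threshold and the season months follow the definitions given in this paper; the function names and the sample record are hypothetical, and grouping is by calendar month only, so the Winter season (October–January) is not split by the year boundary it crosses.

```python
from datetime import date

RAIN_DAY_THRESHOLD_MM = 1.0  # a rainy day has 1 mm or more of rain


def season_of(month):
    """Map a calendar month to the season names used in this paper."""
    if 6 <= month <= 9:
        return "Summer"
    if 2 <= month <= 5:
        return "Spring"
    return "Winter"  # October to January


def count_rain_days(daily):
    """Count rainy days per season from (date, rainfall_mm) pairs.

    Missing values (None) are skipped rather than treated as dry days.
    """
    counts = {"Summer": 0, "Spring": 0, "Winter": 0}
    for day, mm in daily:
        if mm is not None and mm >= RAIN_DAY_THRESHOLD_MM:
            counts[season_of(day.month)] += 1
    return counts


# Invented daily record
sample = [
    (date(2020, 1, 15), 0.4),   # below threshold, not counted
    (date(2020, 3, 2), 5.2),    # Spring
    (date(2020, 7, 10), 12.0),  # Summer
    (date(2020, 7, 11), 1.0),   # Summer, exactly at the threshold
    (date(2020, 8, 1), None),   # missing value, skipped
    (date(2020, 11, 3), 3.1),   # Winter
]
counts = count_rain_days(sample)
print(counts)
```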

| Station | Summer μ | Summer σ | Summer CV (%) | Spring μ | Spring σ | Spring CV (%) | Winter μ | Winter σ | Winter CV (%) | Annual μ | Annual σ | Annual CV (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dire Dawa | 36 | 7 | 19 | 24 | 9 | 37 | 8 | 6 | 74 | 67 | 12 | 18 |
| Haramaya | 50 | 7 | 15 | 30 | 8 | 26 | 10 | 6 | 59 | 90 | 14 | 16 |
| Harar | 52 | 13 | 25 | 37 | 14 | 39 | 12 | 6 | 54 | 100 | 25 | 25 |
| Grawa | 44 | 11 | 25 | 29 | 10 | 33 | 9 | 6 | 62 | 83 | 19 | 23 |
| Gursum | 39 | 9 | 24 | 27 | 9 | 34 | 8 | 5 | 67 | 73 | 17 | 23 |
| Mean | 44 | | 22 | 29 | | 34 | 9 | | 63 | 83 | | 21 |


As provided in Table 5, the maximum and minimum coefficients of variation of Summer rain days are 25 and 15%, respectively. The variability of Spring rain days ranges from 26 to 39%, and for Winter the variability is greater than 50% at all stations, with a maximum of 74%. Annual rain days have a variability of 16 to 25%. The annual rain days variability is lower than the seasonal variability, and among the seasons the minimum variability occurs in Summer. The maximum variability occurs in Winter, with a mean of 63% across the stations (Bewket & Conway 2007; Bekele-Biratu *et al*. 2018), which suggests that the stations receive rainfall that is variable in amount and distribution.

### Annual rain days and relationship between amount and distance and elevation variance

The stations considered for the analysis lie within a radius of about 70 km of the centroid of the sub-basin (Figure 1). The elevation of the gauging stations ranges from 1,045 m (Dire Dawa) to 2,470 m (Grawa) (Table 1). In Table 6, the relative distance from the centroid of the sub-basin shows that Haramaya and Harar are the closest stations, at 18 and 21 km, respectively. Gursum is the farthest, at 69 km from the centroid. Haramaya and Harar, which are also the closest to each other, are therefore highly likely to determine the occurrence and amount of rainfall among the stations.

The values are described as correlation coefficients, distance between stations in kilometers, and the elevation difference in meters in the order from left to right. Bold values are the maximum.

For rainy days, Dire Dawa–Gursum has a very high correlation coefficient (0.99), with a relative distance of 110 km and an elevation difference of 892 m. Haramaya–Harar also has an extremely high correlation coefficient (0.95), with a relative distance of 16 km and an elevation difference of 48 m. The smallest correlation coefficient is 0.71, for both Harar–Dire Dawa and Harar–Gursum, whose relative distances and elevation differences are 41 km and 932 m and 82 km and 40 m, respectively.

The rain days correlation coefficients for Haramaya–Dire Dawa and Haramaya–Gursum are both 0.75, with relative distances of 25 and 89 km and elevation differences of 980 and 88 m, respectively. This shows that the elevation–rainfall relationship is negative and that the relative distance between Haramaya and Gursum (89 km) exceeds 80 km. The rain days correlation coefficient therefore depends not only on elevation but also on relative distance. In general, the correlation coefficients for rainy days against elevation and against distance exceed 0.50 for all stations (*r* > 0.50), which implies high correlation, as suggested by Kendall & Gibbons (1990). That rain days depend on relative distance as well as elevation was also confirmed by a similar study (Ayoad 1983). However, the correlation coefficient was significant for the elevation–rainfall agreement, as suggested by Oettli & Camberlin (2005).
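The pairwise quantities discussed here can be sketched as follows. The elevations are those listed in Table 1, and the resulting differences agree with the values quoted in the text; the Pearson formula is the standard one and is an assumption about how the study's correlation coefficients were computed, and the rain-day series in the example are invented.

```python
from itertools import combinations
from math import sqrt

# Station elevations (m) as listed in Table 1
ELEVATION_M = {
    "Dire Dawa": 1045,
    "Haramaya": 2025,
    "Harar": 1977,
    "Grawa": 2470,
    "Gursum": 1937,
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def elevation_differences(elevation):
    """Absolute elevation difference (m) for every pair of stations."""
    return {
        (a, b): abs(elevation[a] - elevation[b])
        for a, b in combinations(elevation, 2)
    }


diffs = elevation_differences(ELEVATION_M)
print(diffs[("Haramaya", "Harar")], diffs[("Dire Dawa", "Gursum")])

# Invented annual rain-day counts for two stations over four years
print(round(pearson_r([60, 72, 65, 80], [85, 95, 88, 104]), 2))
```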

As provided in Table 7, the probability of rain is 31% for Haramaya and 33% for Harar, which lie 18 and 21 km from the centroid of the sub-basin, respectively. For Haramaya–Harar, the correlation coefficient of mean annual rainfall amount is 0.98, the relative distance is 16 km, and the elevation difference is 48 m, so the two stations are well correlated and close in both distance and elevation. Haramaya–Gursum (0.97) and Harar–Gursum (0.95) also have extremely high correlation coefficients, although their relative distances and elevation differences are 89 km and 88 m and 82 km and 40 m, respectively. The correlation coefficients (*r* > 0.50) show that the correlations are in good agreement for all stations (Kendall & Gibbons 1990).

The values are described as correlation coefficients, distance between stations in kilometers, and the elevation difference in meters in the order from left to right. Bold values are the maximum.

For Haramaya–Harar, the correlation is extremely high and the relative distance and elevation difference are both at their minimum, which indicates that the correlation coefficient depends on the relative distance and elevation difference between the two stations. The smallest correlation coefficient is 0.68, for Grawa–Dire Dawa, where the distance between the stations is 110 km and the elevation difference is 1,425 m. Because both the relative distance and the elevation difference are large, the correlation coefficient for Grawa–Dire Dawa is relatively small. For Gursum–Harar, the correlation coefficient is 0.95, and the relative distance and elevation difference are 82 km and 40 m. This shows that when the elevation difference is minimal, the correlation for mean annual rainfall can be extremely high even if the distance between the two stations is large.

The correlation coefficient for Gursum–Haramaya is extremely high (0.97), and the distance and elevation difference are 89 km and 88 m, respectively. Although the relative distance between the two stations is large (89 km), the elevation difference is minimal and the correlation coefficient for mean annual rainfall amount is extremely high. For Gursum–Haramaya and Gursum–Harar, the correlation coefficients are 0.97 and 0.95, respectively, but the elevation difference is not positively related to the mean annual rainfall.

The correlation coefficients for Haramaya–Dire Dawa and Harar–Dire Dawa are 0.79 and 0.81, with relative distances of 25 and 41 km and elevation differences of 980 and 932 m, respectively. The correlation coefficient of Haramaya–Dire Dawa is thus smaller than that of Harar–Dire Dawa, and its elevation difference is larger. For Grawa–Gursum and Grawa–Haramaya, the correlation coefficients are 0.88 and 0.85, and the relative distances and elevation differences are 112 and 39 km and 533 and 445 m, respectively. The relative distance from Grawa to Gursum is large (112 km), which indicates that the elevation difference alone does not determine the correlation of mean annual rainfall amount; the relative distance also has an impact. However, as discussed by Mishra (2013), significant variations were detected for stations within 15 km of each other.

### Joint probability of rain days estimation with daily, decadal, monthly, and seasonal variations

The joint probability of rainfall incidence for different combinations of gauging stations in a given period is calculated by counting the joint rainfall events over the whole period under consideration (Supplementary Material, Appendix Table S1), dividing by the number of observation records, and multiplying by 100%. Identifying the implication of the spatiotemporal areal rainfall distribution of the gauging stations in the sub-basin requires applying the joint probability method to continuous observed rainfall time series. The joint probability of rain days indicates how frequently all or some of the gauging stations have rain simultaneously during the period under consideration.

The joint combination of the gauging stations considers the rainfall incidence at a period, which will be used to estimate the areal rainfall characteristics using the probability approach method.

The daily joint probabilities of station combinations receiving rain mutually are all less than 15%, except for the Haramaya–Harar combination, which is greater than 15%. The joint probabilities of all stations receiving rain mutually are 25.37 and 39.47% for the decadal and monthly periods, respectively. The monthly joint probabilities of rain days are greater than 39% for all combinations, with a maximum of 62.50%. The minimum joint probabilities of the stations receiving rain mutually in the Summer and Spring seasons are 68.42 and 67.42%, with maximum values of 89.47 and 89.47%, respectively, as shown in Appendix Table S1. The analysis indicates that the Summer season has a high probability of joint rainfall occurrence and that the daily time series has the smallest probability of joint rainfall for the stations in the region.
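The rise in joint probability from daily to coarser time steps can be illustrated with a small sketch. It rests on assumptions that are not stated in the paper: a period counts as wet at a station if at least one day in it is a rain day, and periods are fixed-length blocks (10 days here) rather than calendar decades or months. The function names and the flags are invented.

```python
def aggregate_rain_flags(daily_flags, block_length):
    """Flag each block of days as wet if any day in the block is wet."""
    return [
        any(daily_flags[i:i + block_length])
        for i in range(0, len(daily_flags), block_length)
    ]


def joint_probability(flag_series):
    """Percentage of periods in which all stations are flagged wet."""
    n_periods = len(flag_series[0])
    joint = sum(all(flags) for flags in zip(*flag_series))
    return 100.0 * joint / n_periods


# Invented daily rain flags (1 = rain day) for two stations over 20 days
station_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
station_b = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

daily = joint_probability([station_a, station_b])
decadal = joint_probability(
    [aggregate_rain_flags(station_a, 10), aggregate_rain_flags(station_b, 10)]
)
print(daily, decadal)
```

In this toy record the two stations share only one rain day out of 20, yet both are wet in each 10-day block, which mirrors the pattern reported above of low daily and higher decadal, monthly, and seasonal joint probabilities.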

### Joint arithmetic average rainfall estimation with daily, decadal, monthly, and seasonal variations

For the arithmetic average areal rainfall estimates, at least five temporal scales were analyzed, as suggested by Eruola *et al*. (2015), namely daily, decadal, monthly, Spring season, and Summer season. Rain gauge combinations of two, three, four, and five stations were compared at each timescale to identify the best combinations, and the areal rainfall estimated by the joint arithmetic mean was assessed for each timescale. The arithmetic average rainfall for the stations under daily, decadal, monthly, and seasonal observed rainfall was calculated by averaging the rainfall amounts from January 1984 to December 2020 for the different periods (Appendix Table S2).

For each combination of gauging stations, the rainfall amounts in a given period are averaged to give the arithmetic average rainfall for the region. About 23 gauging station combinations were developed to identify the areal average rainfall in the different periods: daily, decadal, monthly, Summer season, and Spring season.
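
As an illustration of the combination step, the sketch below enumerates station subsets and computes the unweighted mean for each. The helper names are hypothetical, and the mean annual totals quoted in the abstract are used only as sample inputs. Enumerating every subset of two to five stations yields 26 combinations, whereas the study reports about 23, so this sketch does not reproduce the study's exact selection.

```python
from itertools import combinations


def station_subsets(stations, min_size=2):
    """All subsets of the station list with at least min_size members."""
    return [
        subset
        for size in range(min_size, len(stations) + 1)
        for subset in combinations(stations, size)
    ]


def arithmetic_areal_mean(depths, subset):
    """Unweighted mean of the rainfall depths (mm) of the stations in subset."""
    return sum(depths[name] for name in subset) / len(subset)


# Mean annual totals (mm) quoted in the abstract, used only as sample inputs.
annual_mm = {
    "Dire Dawa": 647.0,
    "Harar": 816.0,
    "Haramaya": 801.0,
    "Grawa": 958.0,
    "Gursum": 840.0,
}

subsets = station_subsets(list(annual_mm))
areal_means = {
    subset: arithmetic_areal_mean(annual_mm, subset) for subset in subsets
}
```

The same two functions could be applied to daily, decadal, monthly, or seasonal depths by swapping the input dictionary for the depths of the period of interest.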

### The effects of joint rain gauge combination change on the joint probability of rain days estimation approach

For the Upper Erer river basin, the joint probability of rain days at the daily time step is below 20% for every station combination and below approximately 15% for most of them, whereas the minimum decadal and monthly joint probabilities are 25.37 and 39.47%, respectively. The maximum joint probability of rain days is approximately 89%, which occurs for two-station combinations. At the decadal time step, the joint probabilities exceed 40% for all two-station combinations, 35% for three-station combinations, and 25% for both the four-station and five-station combinations.

The maximum and minimum monthly joint probability values, 62.50 and 39.47%, were noted for two- and five-station combinations, respectively. The seasonal joint probability has a maximum of 89.47% and a minimum of 67.42% for two- and five-station combinations, respectively. As confirmed by Jackson (1977, 1981), monthly rainfall is more closely related to the number of rain days than to the mean daily rainfall amount.

### Relationship between joint probability of rain days and joint arithmetic mean method

For the 23 station combinations, the linear regression of the joint arithmetic mean against the probability of joint rain days shows that, among the daily, decadal, and monthly periods, the monthly time step gives the highest coefficient of determination (*R*^{2} = 0.9784). The joint arithmetic monthly areal rainfall therefore agrees closely with the probability of joint monthly rain observations. The seasonal linear regressions for Summer and Spring also show a good relationship between the joint areal seasonal mean rainfall and the probability of joint seasonal rain observations.
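
The regression indicators discussed here, a slope and a coefficient of determination, can be obtained from an ordinary least-squares fit. The sketch below assumes a straight line with an intercept, which the section does not state explicitly, and uses hypothetical paired values rather than the study's 23 combinations.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series of at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical pairs: joint arithmetic mean (x) and joint probability (y).
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
slope, intercept, r_squared = fit_line(x, y)
```

A slope near 1 together with a high *R*^{2} indicates that the two quantities vary almost one-to-one, which is the criterion used when comparing timescales.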

Furthermore, this study also revealed that areal rainfall estimations are very sensitive to the position and number of rain gauging stations, which was also noted by other scholars (Bell & Moore 2000; Terink *et al*. 2018). Similarly, as confirmed by Nandargi & Mulye (2012), the accuracies of the areal rainfall estimation tend to increase when the number of stations increases.

### Assessing the joint probability of rain days approaches

The correlations between the joint probability of rain days estimated at five different timescales and the observed joint arithmetic areal rainfall were assessed using scatter diagrams (Table 8). The results show that the joint probability approach at the monthly timescale is better for estimating the areal rainfall of the region than at the daily and decadal timescales. The coefficients of determination for the seasonal timescales (Summer and Spring) are slightly higher than for the monthly timescale, with Spring the highest (Table 8). However, the slope of the fitted line is closer to 1 for the monthly timescale than for the others, which indicates that the relationship is in better agreement on a monthly basis.

Joint probability of rain days estimation for different periods (Table 8)

| Indicators | Daily | Decadal | Monthly | Summer | Spring |
|---|---|---|---|---|---|
| *A* | 3.2184 | 1.8925 | 0.8943 | 0.8659 | 1.1683 |
| *R*^{2} | 0.8267 | 0.9704 | 0.9784 | 0.9864 | 0.9921 |

*A* and *R*^{2} denote the slope of the fitted line and the coefficient of determination, respectively, for the regression drawn with the joint arithmetic mean on the horizontal axis and the joint probability on the vertical axis.

## CONCLUSIONS

This study identified up-to-date areal rainfall estimation characteristics and the spatial and temporal rainfall behavior in the Upper Erer Sub-basin. Moreover, the implications of the daily, decadal, monthly, and seasonal rainfall distribution characteristics were analyzed using an efficient approach over the sub-basin. The correlation for Haramaya–Harar is high, and the relative distance and elevation difference between these stations are the smallest. The correlation coefficient depends on the relative distance and elevation difference between the two stations. The smallest correlation coefficient, 0.68, is for Grawa–Dire Dawa, where the distance between the stations is 110 km and the elevation difference is 1,425 m. The correlation coefficients for mean annual rainy days and mean annual rainfall amount were determined for all gauging stations (*r* > 0.50).

Linear regression of the total monthly records from MERRA-2 against the ground rain gauge observations was analyzed. The coefficients of determination (*R*^{2} > 0.5) indicate a high correlation between the ground observations and the MERRA-2 reanalysis data for the region. The arithmetic average areal rainfall for the daily, decadal, monthly, and seasonal observed rainfall was determined by averaging the rainfall amounts from January 1984 to December 2020 for the different time series. The probability of joint daily rain is less than 20% for all station combinations. For the decadal time series, the probability of joint rain days is greater than 25% for all combinations. For the monthly time series, the probability is greater than 50% for most station combinations; only eight of the 23 combinations are below 50%.

The scatter-diagram correlation between the joint probability estimates at five different timescales and the observed joint arithmetic areal rainfall showed that direct averaging by any known areal rainfall estimation method does not describe the sub-basin rainfall characteristics well unless the probability of joint rainy days is also determined. The joint probability of rain days estimated for the daily and decadal time series did not determine the likely output. For monthly rain, however, the joint probability determines the sub-basin average areal rainfall characteristics reasonably well. The areal rainfall of the sub-basin is therefore recommended to be determined by combining the probability of joint monthly rainfall with the rainfall amount for water resource assessment in rainfall–runoff modeling. The analysis was conducted with a small number of scattered stations, and future investigations should consider additional gauging stations in different regions to further evaluate the present method. In particular, denser gauging station sampling for the integration of observations across timescales needs to be explored.

## DISCLOSURE

This research work is part of the Ph.D. thesis of the corresponding author; the co-author is an advisor of the corresponding author.

## ACKNOWLEDGMENTS

The authors would like to acknowledge the National Meteorological Institute of Ethiopia for allowing the use of the meteorological data.

## FUNDING

The research work received no external funding.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the article or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.