Abstract
In this study, three high-resolution gridded rainfall datasets, viz., Integrated Multi-Satellite Retrievals for Global Precipitation Measurement (IMERG), Modern-Era Retrospective Analysis for Research and Applications 2 (MERRA-2), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN), were collected, analyzed, and compared against the ground-based rain gauge observations of the India Meteorological Department (IMD) for the Bagmati river basin in the Bihar State of India from 2001 to 2014. Comparisons were performed at daily, monthly, seasonal, and annual time scales using statistical parameters, contingency tests, trend analysis, and the rainfall anomaly index. Although MERRA-2 had the highest probability of detection (POD), it also recorded the most false alarms, and the analysis showed that IMERG data matched the observed data most closely, whereas MERRA-2 and PERSIANN underestimated the extreme values. At the monthly scale, IMERG again had the best Coefficient of Determination (R²) and Nash-Sutcliffe Efficiency (NSE) values. IMERG also performed well in detecting rainfall trends and identifying wet and dry years. Overall, IMERG was the most suitable dataset at all time scales for future studies in the basin.
HIGHLIGHTS
Gridded rainfall datasets of IMERG, MERRA-2, and PERSIANN can be used as the substitute for IMD rainfall data in relatively ungauged and data-scarce regions of the Bagmati river basin.
IMERG datasets have the potential to be used for real-time hydrological and climatological studies.
IMERG performed best in cumulative distribution function, trend analysis, statistical evaluation, and box plot analysis.
INTRODUCTION
Rainfall data are a key input to most hydrological and climatological studies. They are used in flood forecasting, climate modeling, water resources management, agriculture, and urban planning (Hamal et al. 2020; Setti et al. 2020; Gautam & Pandey 2022; Ranjan & Singh 2022; Kumar et al. 2023). All these applications require accurate and reliable datasets that are easy to access. Generally, the source of these datasets is a ground-based rain gauge network. In most developing countries, however, rain gauge networks are sparse and inadequate, which results in an insufficient supply of data (Hughes 2006; Nanda et al. 2016; Kumar et al. 2017; Rincón-Avalos et al. 2022). These networks are generally expensive to establish and maintain, particularly in remote or difficult-to-access regions. Rain gauges give estimates of rainfall only at point locations (Nanding et al. 2015). They are also affected by measurement errors due to wind-induced under-catch or evaporation (Yeditha et al. 2020). To address this scarcity of reliable data, gridded rainfall datasets are often used as an alternative (Yeditha et al. 2020; Bhattacharyya et al. 2022; Gautam & Pandey 2022). They have better spatial and temporal resolution, which makes it possible to study rainfall in remote or inaccessible regions (Su et al. 2008; Kidd & Huffman 2011). They provide detailed information, which is important for understanding local variations in rainfall and helps in the continuous monitoring of rainfall (Saikrishna et al. 2021). Most of the gridded datasets are free of cost and easily accessible without restriction in real or near real-time (Sorooshian et al. 2000; Gelaro et al. 2017; Huffman et al. 2020). The gridded datasets have significantly improved studies involving climate change, flood, landslide, soil erosion, and drought monitoring (Hong et al. 2007; Wang et al. 2016; Xiao et al. 2020; Hinge et al. 2021; Yeditha et al. 2022; Suroso et al. 2023).
The gridded rainfall datasets are broadly categorized into three types, viz., gauge-based datasets (observed), satellite-based datasets, and reanalysis datasets (Wang et al. 2020; Bhattacharyya et al. 2022). The gauge-based datasets are derived from measurements taken at ground level using rain gauges (Boers et al. 2016). The data are then interpolated onto a regular grid using various methods, such as inverse distance weighting or kriging (Chua et al. 2022). Examples of gauge-based datasets include the Global Historical Climatology Network (GHCN) (Menne et al. 2012) and the Global Precipitation Climatology Centre (GPCC) (Schneider et al. 2015). The satellite-based datasets are derived from satellite measurements of precipitation using microwave and infrared sensors (Ray et al. 2022). Examples of satellite-based datasets include the Tropical Rainfall Measuring Mission (TRMM) (Huffman et al. 2007), Integrated Multi-Satellite Retrievals for Global Precipitation Measurement (IMERG) (Huffman et al. 2020), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) (Sorooshian et al. 2000). The reanalysis datasets are derived from atmospheric models that combine observations from various sources, such as satellites, surface stations, and radiosondes, to produce a comprehensive record of the Earth's climate over several decades (Bhattacharyya et al. 2022). Examples of reanalysis datasets include the Modern-Era Retrospective Analysis for Research and Applications 2 (MERRA-2) (Gelaro et al. 2017) and the European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERA-Interim) (Dee et al. 2011).
Gridded datasets must undergo evaluation against reliable ground-based observed rainfall to be suitable for hydrological and climatological studies (Xiao et al. 2020). This is necessary because observation methods, instrumentation, and rainfall extraction in gridded products can introduce errors. Additionally, natural variables like climate, rainfall intensity, season, and topography can vary, affecting dataset performance and precision across different areas (AghaKouchak et al. 2012; Sun et al. 2018). Uncertainties in rainfall data can impact hydrologic and climatological modeling predictions, highlighting the importance of initial dataset evaluation.
Many researchers around the globe have evaluated the gridded rainfall datasets against ground-based rainfall measurements. For instance, Miri et al. (2019) showed that Global Precipitation Measurement (GPM)-IMERG outperformed other satellite datasets (TRMM_3B42 and PERSIANN-Climate Data Record (CDR)) over Iran, with R² values of 0.6 and 0.83 at daily and annual scales, respectively. According to a study by Tang et al. (2020) in China, the precision of IMERG precipitation estimates rose from 2001 to 2018 due to improvements in satellite sensors. Ramos Filho et al. (2022) showed that in areas with few rain gauges, Satellite Precipitation Products (SPPs) can be used to establish precipitation thresholds. Bhattacharyya et al. (2022) performed a comparative analysis of different gridded rainfall datasets and India Meteorological Department (IMD) gauge-based gridded datasets over India. Notably, the PERSIANN-CDR dataset exhibited a predominant drying trend, while both the Climate Prediction Centre (CPC) and MERRA-2 datasets showed a wetting trend across all extreme rainfall indices. Based on a comparison of seven SPPs in the Godavari basin, India, Reddy & Saravanan (2022) concluded that finer-resolution SPPs can be used for regional-scale hydrological studies. Sireesha et al. (2020) appraised the performance of the IMD, TRMM, GPCC, and MERRA-2 precipitation datasets in the Sina basin. Based on all the tests, the TRMM datasets were found suitable for climatic or hydrological studies. Using the hydrological model J2000, Kumar et al. (2017) assessed the TMPA-3B42 v7 dataset against rain gauge observations in the Kopili river basin and concluded that bias-corrected TRMM precipitation has the potential to substitute for the observed rainfall dataset in data-limited regions or ungauged basins, owing to the dataset's superior spatial and temporal resolution.
The Bagmati river is an international river flowing through Nepal and India. In India, it flows through the state of Bihar. The region is prone to floods, and the existing rain gauge network in the Bagmati basin is relatively sparse, leading to a lack of adequate data for hydrological and climatological studies. Despite the importance of the Bagmati river basin in Bihar, no studies were found that employed satellite-based datasets like IMERG and PERSIANN, or reanalysis datasets like MERRA-2, to understand the hydrological and climatological conditions of the region. This highlights the need for more research in this area, especially given the challenges posed by floods in this densely populated and developing state.
The main objectives of this paper are (1) to compare three gridded datasets, viz., IMERG and PERSIANN (satellite-based datasets) and MERRA-2 (reanalysis datasets) with respect to observed daily, monthly, seasonal, and annual rain gauge datasets of the Bagmati river basin of Bihar and (2) to investigate the limitations in these gridded rainfall datasets.
STUDY AREA AND DATA COLLECTION
Study area
Data used
The analysis was carried out based on the available daily data from the years 2001 to 2009 and monthly and annual data from 2001 to 2014 of the rain gauge stations of IMD. The different gridded daily rainfall datasets for the same period were downloaded. The source of different gridded datasets is mentioned in their respective sections. The Digital Elevation Model (DEM) tiles of 90 m of Shuttle Radar Topography Mission (SRTM) were downloaded from the website of the United States Geological Survey (USGS) Earth Explorer (https://earthexplorer.usgs.gov/) for preparing the digital elevation model, basin delineation, and river network map.
Daily gridded rainfall data
In this study, three gridded datasets are used, viz., IMERG, MERRA-2, and PERSIANN. In these datasets, the area of interest is divided into a regular grid of cells, and the amount of rainfall that occurred in each cell is estimated based on observations from various sources, such as satellite sensors, rain gauges, radar, or atmospheric models (Tang et al. 2020; Ray et al. 2022).
IMERG
The IMERG gridded SPP dataset was developed by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) for GPM. It leverages data from satellites in low-earth orbit and geostationary satellites, combining microwave-calibrated infrared satellite estimates to provide accurate estimates of rainfall. The quickest version of IMERG provides rainfall data within 4 h of the observation (Huffman et al. 2020). In this study, the ‘Final Run’ datasets of IMERG have been used. These datasets were downloaded from the website of GIOVANNI (https://giovanni.gsfc.nasa.gov/giovanni/).
MERRA-2
The MERRA-2 gridded reanalysis dataset was developed by NASA. It utilizes the Atmospheric General Circulation Model and the Gridpoint Statistical Interpolation (GSI) atmospheric analysis system to estimate rainfall. The dataset incorporates rainfall estimates derived from assimilating data from satellites, surface stations, and aircraft into the atmospheric model (Gelaro et al. 2017). The data are updated with a latency period of 14 h. In this study, the MERRA-2 datasets were downloaded from the website of NASA Power Data Access Viewer (https://power.larc.nasa.gov/data-access-viewer/).
PERSIANN
The PERSIANN gridded satellite-based precipitation product (SPP) was developed by the Center for Hydrometeorology and Remote Sensing (CHRS), USA. The product utilizes infrared brightness temperature data from geostationary satellites to estimate precipitation (Sorooshian et al. 2000). The product's fastest variant can provide data within 15–60 min of observation. In this study, the PERSIANN datasets were downloaded from the website of the CHRS Data Portal (https://chrsdata.eng.uci.edu/).
The details of various gridded rainfall datasets used in this study are tabulated in Table 1.
Summary of gridded rainfall datasets used for the study area
Sl. No. | Name (data provider) | Spatial resolution | Spatial coverage | Temporal resolution (min) | Temporal coverage | References |
---|---|---|---|---|---|---|
1. | IMERG (NASA and JAXA) | ![]() | ![]() | 30 | 2000–present | Huffman et al. (2020) |
2. | MERRA-2 (NASA) | ![]() | Global | 60 | 1980–present | Gelaro et al. (2017) |
3. | PERSIANN (CHRS, USA) | ![]() | ![]() | 60 | 2000–present | Sorooshian et al. (2000) |
METHODOLOGY
Flowchart of methodology
The gridded datasets were evaluated against the benchmark IMD datasets at daily, monthly, seasonal, and annual scales to identify the gridded dataset that best captures the rainfall pattern and distribution of the observed datasets. For this, monthly, seasonal, and annual datasets were computed from the daily gridded datasets for the years 2001 to 2014. The Cumulative Distribution Function (CDF) plot and contingency test were performed using daily data. At the monthly level, monthly rainfall box plots and statistical metrics were analyzed. At the seasonal level, the seasonal rainfall box plot and the Mann–Kendall (M–K) and Sen's Slope (SS) test results were analyzed. At the annual level, the annual rainfall plot, descriptive statistics, and the rainfall anomaly index (RAI) were analyzed. Based on all the analyses, a heat map was prepared to rank the gridded datasets. The details of these analyses are described in the following sections.
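As an illustration of the aggregation step, the following Python sketch sums a daily series into monthly, seasonal, and annual totals. It is not the authors' code: the function name `aggregate_daily` is hypothetical, and the month-to-season mapping (winter Jan–Feb, summer Mar–May, monsoon Jun–Sep, post-monsoon Oct–Dec, as commonly used with IMD data) is an assumption, since the season boundaries are not stated here.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumed season boundaries (common IMD convention); not taken from the paper.
SEASON_BY_MONTH = {
    1: "winter", 2: "winter",
    3: "summer", 4: "summer", 5: "summer",
    6: "monsoon", 7: "monsoon", 8: "monsoon", 9: "monsoon",
    10: "post-monsoon", 11: "post-monsoon", 12: "post-monsoon",
}


def aggregate_daily(daily):
    """Sum daily rainfall (mm) into monthly, seasonal, and annual totals.

    daily: dict mapping datetime.date -> rainfall depth in mm.
    Returns three dicts keyed by (year, month), (year, season), and year.
    """
    monthly = defaultdict(float)
    seasonal = defaultdict(float)
    annual = defaultdict(float)
    for day, rain in daily.items():
        monthly[(day.year, day.month)] += rain
        seasonal[(day.year, SEASON_BY_MONTH[day.month])] += rain
        annual[day.year] += rain
    return dict(monthly), dict(seasonal), dict(annual)


# Synthetic example: 1 mm of rain on every day of 2001.
start = date(2001, 1, 1)
synthetic = {start + timedelta(days=k): 1.0 for k in range(365)}
monthly, seasonal, annual = aggregate_daily(synthetic)
```

The same function can be applied to each gridded dataset and to the rain gauge series so that all four sources are aggregated in an identical way before comparison.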
Contingency test
The contingency test assesses the ability of a satellite to capture rainy and non-rainy days using categorical metrics (Bharti & Singh 2015; Navale et al. 2020).
Four categorical metrics were evaluated in contingency tests: Hit (H), which represents the number of days when rainfall events were recorded at both the rain gauge station and the satellite; Miss (M), which represents the number of days when rainfall events were recorded by the rain gauge station but not by the satellite; False Alarm (F), which represents the number of days when rainfall events were recorded by the satellite but not observed in the rain gauge station; and Correct Negative (Q), which represents the number of days when neither the rain gauge station nor the satellite recorded any rainfall events, indicating non-rainy days.
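The four counts can be turned into detection scores with a few lines of Python. The sketch below is illustrative only: the rain/no-rain threshold is not given in the text, so the `threshold` parameter (default 0.0 mm) is an assumption, and POD and FAR are written in their commonly used forms, POD = H/(H + M) and FAR = F/(H + F). The example counts are the IMERG values for Benibad reported in Table 4.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hit: int
    miss: int
    false_alarm: int
    correct_negative: int


def contingency_counts(observed, estimated, threshold=0.0):
    """Count H, M, F, and Q from paired daily rainfall series (mm/day).

    A day is treated as rainy when rainfall exceeds `threshold`
    (assumed value; the paper does not state the threshold used).
    """
    h = m = f = q = 0
    for obs, est in zip(observed, estimated):
        obs_rain = obs > threshold
        est_rain = est > threshold
        if obs_rain and est_rain:
            h += 1
        elif obs_rain:
            m += 1
        elif est_rain:
            f += 1
        else:
            q += 1
    return Contingency(h, m, f, q)


def pod(c):
    """Probability of detection: share of observed rainy days that were detected."""
    total = c.hit + c.miss
    return c.hit / total if total else float("nan")


def far(c):
    """False alarm ratio: share of estimated rainy days that were not observed."""
    total = c.hit + c.false_alarm
    return c.false_alarm / total if total else float("nan")


# IMERG counts at Benibad, taken from Table 4.
benibad_imerg = Contingency(hit=869, miss=160, false_alarm=777, correct_negative=1878)
print(round(pod(benibad_imerg), 2))  # 0.84
print(round(far(benibad_imerg), 2))  # 0.47
```

A higher POD and a lower FAR both indicate better detection, so the two scores should be read together rather than in isolation.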
Cumulative distribution function
To compare the daily distribution of gridded and observed rain gauge datasets, the CDF was plotted. The CDF for daily rainfall datasets gives information about the probability of observing rainfall equal to or below a particular value (Park 2018).
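A minimal Python sketch of an empirical CDF is given below to make the comparison concrete. The function name `empirical_cdf` and the sample values are hypothetical; the sketch simply returns, for any rainfall depth x, the fraction of days with rainfall equal to or below x.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return F, where F(x) is the fraction of values that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one value is required")

    def cdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical daily rainfall samples (mm) for a gauge and a gridded product.
gauge_cdf = empirical_cdf([0.0, 0.0, 1.0, 5.0, 10.0])
grid_cdf = empirical_cdf([0.0, 0.5, 1.5, 4.0, 6.0])

# Difference between the two CDFs at a chosen depth, here 5 mm.
gap_at_5mm = grid_cdf(5.0) - gauge_cdf(5.0)
```

Evaluating both functions over the same set of depths shows where a gridded product departs from the gauge distribution, for example at the high depths that correspond to extreme events.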
M–K and SS test




Once the M–K test statistic has been computed, the p-value is obtained. A p-value below 0.05 (the significance level) indicates that the observed trend is statistically significant. More details of the method of performing this test can be found in the studies of Kumar et al. (2022) and Zhang et al. (2023).


If the SS value is positive, it indicates an increasing trend, while a negative value suggests a decreasing trend.
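For readers who want to reproduce the trend tests, the sketch below implements the M–K test and Sen's slope in their widely used textbook form. It is an assumption that the study used this exact variant (two-sided test, normal approximation with tie correction, no autocorrelation correction); the function names are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def mann_kendall(series):
    """Return (S, Z, p) for the two-sided Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("at least three observations are required")
    # S: number of increasing pairs minus number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the standard correction for tied values.
    tie_sizes = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def sens_slope(series):
    """Return Sen's slope: the median of all pairwise slopes."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


# Hypothetical seasonal totals (mm) with a steady decline.
totals = [50.0, 40.0, 30.0, 20.0, 10.0]
s, z, p = mann_kendall(totals)
slope = sens_slope(totals)  # negative, i.e. a decreasing trend
```

With short records such as the 14 years used here, the normal approximation is only approximate, so p-values close to 0.05 should be interpreted with care.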
Statistical metrics
The gridded rainfall datasets are evaluated against the rain gauge observations using several performance criteria. If $O_i$ = ith rain gauge data, $G_i$ = ith gridded rainfall data, $\bar{O}$ and $\bar{G}$ = means of the rain gauge and gridded datasets, respectively, and N = number of observations, then the statistical metrics used to evaluate the gridded rainfall datasets are as follows:
- (i) Coefficient of determination ($R^2$): It is used to assess the goodness of fit between the observed rainfall from IMD and gridded rainfall data. It quantifies the proportion of the variance in the observed rainfall that can be explained by the gridded rainfall data. $R^2$ ranges from 0 to 1, where a value of 1 indicates a perfect fit, suggesting that the gridded rainfall data can explain all the variance in the observed rainfall. The correlation coefficient is denoted R, and its square gives $R^2$. R is computed as (Gautam & Pandey 2022):

$$R = \frac{\sum_{i=1}^{N}\left(O_i-\bar{O}\right)\left(G_i-\bar{G}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i-\bar{O}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(G_i-\bar{G}\right)^2}}$$

- (ii) Root mean square error (RMSE): It measures how far the gridded rainfall values deviate from the rain gauge values. It ranges from 0 to ∞. A lower RMSE value represents a better estimation. It is computed as (Gautam & Pandey 2022):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(G_i-O_i\right)^2}$$

- (iii) NSE: It measures the predictive ability of the gridded data. It ranges from −∞ to 1. A value of 1 represents a perfect match with the observed rainfall. It is computed as (Nash & Sutcliffe 1970):

$$\mathrm{NSE} = 1-\frac{\sum_{i=1}^{N}\left(O_i-G_i\right)^2}{\sum_{i=1}^{N}\left(O_i-\bar{O}\right)^2}$$

- (iv) Percent of Bias (Pbias): It indicates the average overestimation or underestimation of the gridded values relative to the observed values. It ranges from −∞ to ∞. Pbias has an optimum value of 0, with negative values suggesting underestimation and positive values suggesting overestimation. It is computed as (Gautam & Pandey 2022):

$$\mathrm{Pbias} = \frac{\sum_{i=1}^{N}\left(G_i-O_i\right)}{\sum_{i=1}^{N}O_i}\times 100$$
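The four metrics can be computed with the short Python sketch below. It uses the standard textbook definitions, which are assumed to match those of the cited references; the function names and sample values are hypothetical. The sign of Pbias follows the convention stated above, so a negative value means that the gridded data underestimate the rain gauge data.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(obs, grid):
    """Square of the Pearson correlation between gauge and gridded series."""
    mo, mg = _mean(obs), _mean(grid)
    cov = sum((o - mo) * (g - mg) for o, g in zip(obs, grid))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_g = sum((g - mg) ** 2 for g in grid)
    r = cov / math.sqrt(var_o * var_g)
    return r * r


def rmse(obs, grid):
    """Root mean square error, in the units of the input (mm)."""
    return math.sqrt(sum((g - o) ** 2 for o, g in zip(obs, grid)) / len(obs))


def nse(obs, grid):
    """Nash-Sutcliffe efficiency; 1 is a perfect match."""
    mo = _mean(obs)
    residual = sum((o - g) ** 2 for o, g in zip(obs, grid))
    spread = sum((o - mo) ** 2 for o in obs)
    return 1 - residual / spread


def pbias(obs, grid):
    """Percent bias; negative means the gridded data underestimate."""
    return 100 * sum(g - o for o, g in zip(obs, grid)) / sum(obs)


# Hypothetical monthly totals (mm): gauge versus gridded product.
gauge = [10.0, 20.0, 30.0, 40.0]
gridded = [12.0, 18.0, 33.0, 41.0]
scores = {
    "R2": r_squared(gauge, gridded),
    "RMSE": rmse(gauge, gridded),
    "NSE": nse(gauge, gridded),
    "Pbias": pbias(gauge, gridded),
}
```

Note that $R^2$ and NSE are undefined for a constant series because the variance terms in their denominators become zero, so such cases need separate handling.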
The performance categorization for R², NSE, and Pbias (Sithara et al. 2020) is tabulated in Table 2.
Performance categorization for statistical index
Parameter | Range | Performance inference |
---|---|---|
R² | ![]() | Acceptable |
R² | ![]() | Very good |
NSE | ![]() | Unsatisfactory |
NSE | ![]() | Satisfactory |
NSE | ![]() | Good |
NSE | ![]() | Very good |
Pbias | ![]() | Very good |
Pbias | ![]() ![]() | Good |
Pbias | ![]() ![]() | Satisfactory |
Pbias | ![]() | Unsatisfactory |
RAI for wet and dry year validation
Rainfall anomaly refers to the deviation between the observed rainfall and long-term average rainfall for a particular region or period. Positive anomalies indicate above-average rainfall, while negative anomalies indicate below-average rainfall. The RAI is calculated as (Van Rooy 1965; El-Tantawi et al. 2021).
The classification of the years according to the RAI is tabulated in Table 3 (Bougara et al. 2021).
Classification of the year according to the RAI
RAI | Classification |
---|---|
≥3.00 | Extremely wet |
2.00–2.99 | Very wet |
1.00–1.99 | Moderately wet |
0.50–0.99 | Slightly wet |
0.49 to −0.49 | Near normal |
−0.50 to −0.99 | Slightly dry |
−1.00 to −1.99 | Moderately dry |
−2.00 to −2.99 | Very dry |
≤−3.00 | Extremely dry |
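The RAI calculation and the classification in Table 3 can be sketched in Python as follows. The formula is the commonly cited Van Rooy form, in which positive anomalies are scaled by the mean of the highest values and negative anomalies by the mean of the lowest values. The number of extreme values (`n_extremes`, default 10) and the function names are assumptions and may differ from what was used in this study.

```python
def rainfall_anomaly_index(totals, n_extremes=10):
    """Return the RAI for each annual rainfall total (mm).

    n_extremes is the number of highest/lowest values averaged to scale
    the anomalies (assumed default of 10; not confirmed by the paper).
    """
    n = len(totals)
    if not 0 < n_extremes < n:
        raise ValueError("n_extremes must be between 1 and len(totals) - 1")
    ordered = sorted(totals)
    mean_all = sum(totals) / n
    mean_low = sum(ordered[:n_extremes]) / n_extremes
    mean_high = sum(ordered[-n_extremes:]) / n_extremes
    if mean_high == mean_all or mean_low == mean_all:
        raise ValueError("series has no variability")
    rai = []
    for p in totals:
        anomaly = p - mean_all
        if anomaly >= 0:
            rai.append(3 * anomaly / (mean_high - mean_all))
        else:
            rai.append(-3 * anomaly / (mean_low - mean_all))
    return rai


def classify_rai(value):
    """Map an RAI value to the classes listed in Table 3."""
    if value >= 3.0:
        return "Extremely wet"
    if value >= 2.0:
        return "Very wet"
    if value >= 1.0:
        return "Moderately wet"
    if value >= 0.5:
        return "Slightly wet"
    if value > -0.5:
        return "Near normal"
    if value > -1.0:
        return "Slightly dry"
    if value > -2.0:
        return "Moderately dry"
    if value > -3.0:
        return "Very dry"
    return "Extremely dry"


# Hypothetical annual totals (mm); two extremes are used for this short series.
annual_totals = [800.0, 1000.0, 1200.0, 1400.0, 1600.0]
rai_values = rainfall_anomaly_index(annual_totals, n_extremes=2)
classes = [classify_rai(v) for v in rai_values]
```

Applying the same two functions to the observed and gridded annual totals gives one class per year and dataset, which allows wet and dry years to be compared directly.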
RESULTS AND DISCUSSION
Thiessen polygon: (a) Rain gauge stations; (b) IMERG grids; (c) MERRA-2 grids; and (d) PERSIANN grids.
Gridded datasets evaluation
Daily scale
Contingency test for daily rainfall datasets
The contingency metrics for all the datasets at the Benibad, Dheng, Hayaghat, and Kamtaul rain gauge stations are presented in Table 4. Across all stations, MERRA-2 consistently exhibits the highest number of hits (950, 937, 961, and 947 days at Benibad, Dheng, Hayaghat, and Kamtaul, respectively), indicating that it captures more of the observed precipitation events than IMERG and PERSIANN. However, IMERG and PERSIANN are not far behind in terms of hit counts, with values that are generally close to those of MERRA-2. On the other hand, MERRA-2 also exhibits a higher number of false alarms than IMERG and PERSIANN. This suggests that MERRA-2 might overestimate the occurrence of precipitation events (Arshad et al. 2021). The larger grid size of MERRA-2 compared to IMERG and PERSIANN may contribute to the higher number of false alarms. With a larger grid size, each MERRA-2 grid cell covers a broader area, potentially including regions with no or minimal precipitation at the gauge location (Ramos Filho et al. 2022). This can lead to overestimation of rainfall occurrence and result in more false alarms. In contrast, IMERG and PERSIANN, with their finer grid sizes, are able to capture smaller-scale precipitation events more accurately and with fewer false alarms.
Contingency metrics for different gridded datasets
Station | Contingency metrics (days) | IMERG | MERRA-2 | PERSIANN |
---|---|---|---|---|
Benibad | Hit | 869 | 950 | 799 |
Benibad | Miss | 160 | 79 | 230 |
Benibad | False alarm | 777 | 1,030 | 645 |
Benibad | Correct negative | 1,878 | 1,625 | 2,010 |
Dheng | Hit | 893 | 937 | 822 |
Dheng | Miss | 136 | 92 | 207 |
Dheng | False alarm | 819 | 1,046 | 684 |
Dheng | Correct negative | 1,869 | 1,642 | 2,004 |
Hayaghat | Hit | 910 | 961 | 784 |
Hayaghat | Miss | 119 | 68 | 245 |
Hayaghat | False alarm | 890 | 982 | 674 |
Hayaghat | Correct negative | 1,798 | 1,706 | 2,014 |
Kamtaul | Hit | 876 | 947 | 812 |
Kamtaul | Miss | 153 | 82 | 217 |
Kamtaul | False alarm | 758 | 1,032 | 684 |
Kamtaul | Correct negative | 1,930 | 1,656 | 2,004 |
CDF for daily rainfall datasets
CDF plots for the following stations (a) Benibad; (b) Dheng; (c) Hayaghat; and (d) Kamtaul.
Monthly scale
Box plot analysis
Box plots of monthly rainfall for the following stations: (a) Benibad; (b) Dheng; (c) Hayaghat; and (d) Kamtaul.
Statistical performance of monthly rainfall datasets
Statistical performance of monthly rainfall datasets
Station | Statistical parameters | IMERG | MERRA-2 | PERSIANN |
---|---|---|---|---|
Benibad | R² | 0.83 | 0.77 | 0.74 |
Benibad | NSE | 0.82 | 0.76 | 0.74 |
Benibad | Pbias | −2.64 | 10.16 | 6.12 |
Benibad | RMSE | 57.62 | 67.10 | 70.54 |
Dheng | R² | 0.87 | 0.62 | 0.75 |
Dheng | NSE | 0.86 | 0.61 | 0.72 |
Dheng | Pbias | −5.96 | 15.10 | 19.28 |
Dheng | RMSE | 58.97 | 100.02 | 83.96 |
Hayaghat | R² | 0.77 | 0.77 | 0.69 |
Hayaghat | NSE | 0.77 | 0.75 | 0.69 |
Hayaghat | Pbias | −3.06 | 13.21 | 7.51 |
Hayaghat | RMSE | 66.59 | 68.79 | 77.00 |
Kamtaul | R² | 0.77 | 0.69 | 0.68 |
Kamtaul | NSE | 0.72 | 0.68 | 0.68 |
Kamtaul | Pbias | −15.00 | 4.33 | 1.25 |
Kamtaul | RMSE | 69.88 | 74.66 | 75.40 |
Monthly correlation of rainfall datasets for the following stations: (a) Benibad; (b) Dheng; (c) Hayaghat; and (d) Kamtaul.
Seasonal scale
Box plot analysis
Box plot of seasonal rainfall for the following stations: (a) Benibad; (b) Dheng; (c) Hayaghat; and (d) Kamtaul.
Trend validation by Mann–Kendall and Sen's slope tests
The results of the M–K and SS tests for the summer and monsoon seasons are tabulated in Table 6, and those for the post-monsoon and winter seasons are tabulated in Table 7. For all rainfall datasets except MERRA-2, the p-values generally exceeded 0.05 across stations and seasons, indicating an overall insignificant rainfall trend in the basin. The Sen's slope also suggested an overall decreasing trend at all stations across all seasons except the winter season. For instance, in the summer season, there is an insignificant decreasing trend in all the datasets except MERRA-2, which shows an insignificant increasing trend at all stations. For the SW monsoon season, the test results indicate a decreasing trend in all the datasets except MERRA-2, which shows an increasing trend at all stations. However, the PERSIANN datasets show a significant decreasing trend at all the stations except Dheng. For the post-monsoon season, the observed rainfall shows an insignificant increasing trend at all the stations except Dheng, where there is an insignificant decreasing trend. For the winter season, the results indicate an insignificant increasing trend in rainfall at all stations except Dheng, where there is no trend. Zakwan & Ara (2019) and Kumar et al. (2022) have both documented a similar pattern of overall decreasing rainfall trends. Zakwan & Ara (2019) examined IMD datasets for the entirety of Bihar and reported a decline in rainfall during the monsoon, post-monsoon, and winter seasons. Their analysis, which included parametric and non-parametric trend assessments, indicated declining trends for nearly all months except May. Notably, May showed an increasing trend in rainfall, a phenomenon possibly linked to climate change, as suggested by Zakwan & Ara (2019).
Overall, the results showed that IMERG reproduced the observed trends most closely in all seasons, followed by PERSIANN. MERRA-2 showed the opposite trend to that of the observed rainfall, which is possibly due to inaccurate estimates of rainfall magnitude, as discussed earlier.
Annual scale
Descriptive statistics of annual rainfall datasets
Descriptive statistics of annual rainfall datasets
Rainfall datasets | Minimum (mm) | Maximum (mm) | Mean (mm) | Standard deviation (mm) |
---|---|---|---|---|
Observed | 782.28 | 2,077.80 | 1,225.16 | 335.15 |
IMERG | 871.34 | 1,981.36 | 1,320.16 | 307.10 |
MERRA-2 | 783.73 | 1,578.23 | 1,098.19 | 213.58 |
PERSIANN | 821.01 | 1,533.12 | 1,091.61 | 238.62 |
Wet and dry year detection by gridded rainfall datasets
RAI for the Bagmati basin: (a) observed; (b) IMERG; (c) MERRA-2; and (d) PERSIANN.
Ranking of gridded datasets
CONCLUSIONS
In this study, three gridded datasets, viz., IMERG, MERRA-2, and PERSIANN, were analyzed and compared with the observed rain gauge datasets. The analysis was performed on station- and basin-level data at daily, monthly, seasonal, and annual time scales. Based on the CDF plot analysis of daily rainfall data, it was found that IMERG had the daily rainfall distribution most similar to the observed data, while MERRA-2 and PERSIANN strongly underestimated extreme rainfall events. This indicates that MERRA-2 is able to capture a significant portion of rainfall events but fails to accurately represent the intensity or magnitude of extreme rainfall events. Based on the monthly gridded rainfall datasets, it was found that all gridded datasets perform well in capturing the variability of rainfall, as indicated by very good R² and NSE values. The M–K and SS tests showed a decreasing trend of rainfall in the summer and monsoon seasons, whereas in the post-monsoon and winter seasons there is an increasing trend at most of the stations. The trends were mostly insignificant at all stations. IMERG reproduced the observed trends most closely in all seasons, followed by PERSIANN and MERRA-2. On the annual scale, wet and dry year detection by the different rainfall datasets showed that IMERG had the pattern of rainfall anomalies most similar to the observed dataset. Based on all the analyses, IMERG, PERSIANN, and MERRA-2 were ranked 1, 2, and 3, respectively. However, this study also highlighted critical limitations in these gridded rainfall datasets arising from sensor technology, data acquisition methods, processing algorithms, and data dissemination. Enhancing these aspects holds the key to future data quality improvements.
Overall, IMERG datasets can be considered a reliable source of rainfall data at all temporal scales for the Bagmati river basin where IMD data are scarce. MERRA-2 and PERSIANN, in contrast, are better suited for use at monthly, seasonal, and annual time scales. The differences between the IMERG estimates and the observed rainfall data are attributed to the fundamental differences in their measurement methods. Despite these differences, IMERG datasets have demonstrated remarkable results in various hydrological and climatological studies worldwide. Thus, IMERG datasets may be used, and their performance checked, in further studies such as real-time monitoring of flood, drought, soil erosion, and others in the Bagmati basin.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.