ABSTRACT
Many reanalyses have been used in climate-related studies globally, but these products often contain significant biases, especially in precipitation and temperature (P&T) data. The ability of these reanalyses to reproduce P&T patterns varies significantly at the regional level, particularly in areas with considerable spatio-temporal variability, like India. In this study, five reanalyses, namely the Climate Forecast System Reanalysis (CFSR), the European Centre for Medium-Range Weather Forecasts (ECMWF) fifth generation Reanalysis (ERA-5), the Modern-Era Retrospective Analysis for Research and Application (MERRA-2), the Japan Meteorological Agency (JMA) 55-year Reanalysis (JRA-55), and the National Centre for Medium-Range Weather Forecasting (NCMRWF) Reanalysis (IMDAA), are assessed for their ability to reproduce the annual and seasonal variability in spatio-temporal patterns of P&T data with respect to the India Meteorological Department (IMD) gridded data for 30 years from 1991 to 2020. The reanalyses were evaluated using spell analysis and statistical and categorical indices for seasonal and annual daily values. MERRA-2 followed by JRA-55 exhibited the best performance for precipitation across India, as gauge-based precipitation observations were used with updated forcing. Similarly, JRA-55 followed by MERRA-2 performed better than the other datasets for temperature. These findings improve the understanding of the reliability and relevance of reanalyses for climatic and hydrological studies in India.
HIGHLIGHTS
Comprehensive analysis of precipitation and temperature data of CFSR, ERA-5, MERRA-2, JRA-55, and IMDAA with 30 years of IMD records.
MERRA-2 gives good results for the detection of mean and extreme precipitation in India.
All reanalyses except CFSR show good correlation in the Aw and BSh climate zones.
JRA-55 provides the best results for temperature compared to the other reanalyses based on all statistical indices.
INTRODUCTION
Global reanalysis datasets are frequently used for research and other purposes, especially as a replacement for observed climate variables (Gupta et al. 2020). Reanalyses provide long-term data that are uniform, consistent, and dependable for various atmospheric variables. The datasets released by the European Centre for Medium-Range Weather Forecasts (ECMWF), National Aeronautics and Space Administration (NASA), National Centers for Environmental Prediction (NCEP), and Japan Meteorological Agency (JMA) are the most widely utilised global reanalyses (Singh et al. 2021). The regional reanalysis dataset from the National Centre for Medium-Range Weather Forecasting (NCMRWF) is also used frequently in India. These products perform considerably differently from one location to the next because each dataset uses different observational data, data assimilation techniques, and modelling approaches and strategies, and has a different spatial and temporal resolution. Reanalysis products aim to provide a consistent global dataset by integrating observations with numerical models, but variations in data coverage (e.g., sparse observations in remote areas), sensor limitations, or evolving measurement techniques can introduce errors. These biases affect variables such as temperature, precipitation, wind, and soil moisture, impacting their accuracy over different regions and time periods. It is therefore worthwhile to investigate how accurately various reanalyses reproduce observed climate data (Shaikh et al. 2022) and trend patterns (Mehta & Yadav 2021) in various regions.
Researchers around the world have made several attempts to determine the most appropriate reanalysis for a given geographic area or climate by comparing it with observed data (Verma et al. 2023). Many researchers have evaluated the first-generation reanalyses for the consistency of the two primary hydrological variables, precipitation and temperature, using observed gauge and climate data as well as satellite data at global and regional scales (Lorenz & Kunstmann 2012; Vose et al. 2012; Manzanas et al. 2014; Blacutt et al. 2015; Jafarpour et al. 2022). These studies show that the reanalyses performed differently for various areas and climate zones. Later, second-generation reanalyses were introduced and used in numerous studies. These included the National Centers for Environmental Prediction Reanalysis-2 (NCEP R2), ECMWF 40-year Reanalysis (ERA-40), Japan Meteorological Agency (JMA) 25-year Reanalysis (JRA-25), ECMWF Interim Reanalysis (ERA-Interim), and Modern Era Retrospective Analysis for Research and Application (MERRA) (Hodges et al. 2011; Outten et al. 2013; Iwasaki et al. 2014; Lin et al. 2014; Kang & Ahn 2015). ERA-Interim and MERRA accurately represented the inter-annual variability of global monsoon precipitation (Lin et al. 2014), because these datasets have a higher resolution than the others. In an analysis of extratropical synoptic-scale cyclones, the Climate Forecast System Reanalysis (CFSR), MERRA, ERA-Interim, and JRA-25 performed better in the northern hemisphere, whereas only ERA-Interim and CFSR were similar in the southern hemisphere (Hodges et al. 2011). ERA-Interim and MERRA have been found to accurately reflect temperature and precipitation dynamics over the continental United States (Dhanya & Villarini 2017). Iwasaki et al. (2014) analysed the diurnal cycle in East Asia using four reanalyses, namely JRA-55, ERA-Interim, CFSR, and MERRA.
Among these reanalyses, JRA-55 captured the diurnal phase change of the low in the east and gave a good representation of precipitation over the Tibetan Plateau as it passed over East Asia. When the global energy and water balance was examined using JRA-55, ERA-Interim, MERRA, and CFSR, all of them revealed an energy imbalance at the surface and in the upper atmosphere, potentially as a result of an underestimation of incoming energy and of the emitted heat fluxes (Kang & Ahn 2015). In addition, JRA-55 was found to overestimate precipitation over the tropical convergence zone and temperature near the South Pacific convergence zone.
Several studies have used reanalyses to highlight certain climate changes in the Indian subcontinent (Misra et al. 2012; Kar & Rana 2013). The performance of reanalysis datasets for climate research in regions like India varies with several factors, including the quality and density of the observational data assimilated into the reanalysis, the resolution of the dataset, and the underlying model physics and data assimilation techniques. For India, accurate rainfall data are critical due to the region's complex monsoonal systems and extreme spatio-temporal variability in rainfall. When used to replicate retrospective monsoon droughts from 1980 to 2005, CFSR, MERRA and ERA-Interim showed anomalies in monsoon offset, direction, and spatial and temporal variability, and did not perform consistently for monsoon season temperature or precipitation (Mishra & Shah 2014). Rana et al. (2015) investigated seasonal rainfall variability over the Indian subcontinent using gridded observations, satellite data and reanalyses, and found that CFSR and ERA-Interim overestimate monsoon season rainfall variability in East India with respect to India Meteorological Department (IMD) data. A comparison of gridded IMD data with different reanalyses (NCEP R1 and R2, CFSR, ERA-Interim, MERRA-Land, and JRA-55) revealed that JRA-55 overestimates rainfall over North India, whereas the distribution in MERRA-Land is similar to that of IMD over a major part of India, except for North India and the high-rainfall regions, where it underestimates precipitation. For temperature analysis in India, ERA-Interim and JRA-55 perform better than other reanalyses (Ghodichore et al. 2018).
Several investigators have used reanalysis datasets to evaluate the dependability of two key hydrologic variables, namely precipitation and temperature (Miao et al. 2015; Angelil & Alexander 2016; Alexander et al. 2020). CFSR, ERA-Interim, MERRA, and JRA-55 show large discrepancies in replicating observed extreme precipitation and temperature globally (Angelil & Alexander 2016). Most previous studies considered precipitation only. Some studies examined both temperature and precipitation in India, but the datasets they used are older versions. This study evaluates the performance of recently introduced reanalyses with high spatio-temporal resolution in reproducing precipitation and temperature features, including extremes, in India. The objectives of this study are (i) to identify the reanalysis dataset that most effectively replicates the long-term averages of IMD observed data in India and (ii) to assess the performance of the reanalysis datasets over the major climate zones of India in reproducing the long-term averages and extreme events of the IMD observed data.
STUDY AREA AND DATA USED
The analysis period of 1991–2020 is chosen taking into account the availability of all datasets, and the study extent is from 5 to 45°N latitude and 65 to 105°E longitude. Daily precipitation and temperature at a height of 2 m above the ground are the comparison variables. All sub-daily precipitation and temperature data are converted to daily values (mm/day for precipitation and °C for temperature). The key characteristics of the datasets employed in this research are shown in Table 1.
Main characteristics of datasets used in this study
Dataset | IMD | CFSR | ERA-5 | JRA-55 | MERRA-2 | IMDAA |
---|---|---|---|---|---|---|
Period coverage | Precipitation 1901 to present and temperature 1951 to present | 1979 to present | 1950 to present | 1958 to present | 1980 to present | 1979 to present |
Temporal resolution – Precipitation | Daily | Hourly | Hourly | Hourly | Hourly | Hourly |
Temporal resolution – Temperature | Daily | 6-h | Hourly | 6-h | Hourly | Hourly |
Spatial resolution | 0.25 × 0.25° | 0.5 × 0.5° | 0.1 × 0.1° | 0.5625 × 0.5625° | 0.25 × 0.25° | 0.12 × 0.12° |
Precipitation units | mm/day | kg/m² | m | mm/day | kg/m²/s | kg/m² |
Temperature units | °C | K | K | K | K | K |
Organization | IMD | NCEP | ECMWF | JMA | NASA GMAO | NCMRWF |
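The unit harmonisation implied by Table 1 can be sketched as follows. This is a minimal illustration and not the processing code used in the study: it assumes that hourly flux values are hourly means, that hourly depths are hourly accumulations, and that the daily mean temperature is the arithmetic mean of the available time steps. All function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600.0


def daily_precip_from_flux(hourly_flux_kg_m2_s):
    """Hourly mean precipitation flux (kg/m²/s) -> daily total (mm/day).

    Assumes one value per hour; 1 kg of water per m² equals a 1 mm layer.
    """
    return sum(flux * SECONDS_PER_HOUR for flux in hourly_flux_kg_m2_s)


def daily_precip_from_depth_m(hourly_depth_m):
    """Hourly accumulated precipitation depth (m) -> daily total (mm/day)."""
    return sum(depth * 1000.0 for depth in hourly_depth_m)


def daily_mean_celsius(subdaily_kelvin):
    """Sub-daily 2 m temperatures (K) -> daily mean temperature (°C)."""
    return sum(subdaily_kelvin) / len(subdaily_kelvin) - 273.15


if __name__ == "__main__":
    # Synthetic values for one day, for illustration only.
    print(daily_precip_from_flux([1.0e-4] * 24))
    print(daily_precip_from_depth_m([0.001] * 24))
    print(daily_mean_celsius([300.15, 302.15, 298.15, 300.15]))
```

Datasets that store precipitation as kg/m² per time step need only a sum over the day, since 1 kg/m² already corresponds to 1 mm.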
India Meteorological Department
The reference data comprise the gridded precipitation and temperature data from IMD. An updated version of the angular distance weighting algorithm (Shepard 1968) is used to create these gridded datasets from station data. The gridded daily precipitation data are available for a period of 120 years, from 1901 to 2020, at a resolution of 0.25 × 0.25°, covering India from 66 to 100°E and 6 to 39°N (Pai et al. 2014). This dataset clearly delineates the regions with heavy and light rainfall and accurately depicts the orographic influence on precipitation. The IMD gridded daily mean temperature data are provided for the period from 1951 to 2020 at a 1 × 1° resolution (Srivastava et al. 2009). This dataset is frequently utilised for the analysis of extremes (Dash & Kjellstrom 2011; Deshpande et al. 2016; Vinnarasi & Dhanya 2016).
Climate forecast system reanalysis
The CFSR, the third-generation product from NCEP, includes significant improvements such as a higher resolution of T382 (38 km) with 64 vertical levels extending from the surface up to 0.26 hPa. The CFSR employs the NCEP coupled forecast system model, which consists of a spectral atmospheric model and a modular ocean model, and assimilates data with a three-dimensional variational (3D-VAR) scheme based on the grid-point statistical interpolation (GSI) technique (Saha et al. 2010). CFSR data are accessible from 1979 to 2010 at 6-h time increments, and CFSv2, an extension of CFSR, is accessible from 2011 onwards from NCAR's Research Data Archive (https://rda.ucar.edu).
ERA-5
ECMWF has released ERA-5, its most recent high-resolution reanalysis, which offers several enhancements over ERA-Interim (Hersbach 2016). The analysis is produced at 1-h time steps using a more sophisticated 4D-Var assimilation approach. Higher horizontal and vertical resolution is one of the key advantages of ERA-5 over ERA-Interim. ERA-5 employs a horizontal resolution of 31 km (0.28°) and 137 vertical levels, encompassing the atmosphere from the surface up to 0.01 hPa (80 km), and it has been available since 1979.
Modern era retrospective analysis for research and application
NASA has replaced MERRA-1 with MERRA-2, which is produced by the GMAO using Version 5.12.4 of the GEOS-5 data assimilation system. MERRA-2 employs significantly improved methodologies for assimilation with the atmospheric general circulation model, which make it possible to use observations from more recent microwave sounders and hyperspectral infrared radiance instruments. Precipitation is computed as the atmospheric moisture sink derived from the atmospheric temperature and humidity profiles along with cloud-radiative interactions. The temporal and spatial resolutions of MERRA-2 are 1 h and 0.5 × 0.625°, respectively. Production of MERRA-2 began in June 2014 in four processing streams, which later converged into a single near-real-time product in mid-2015 (Gelaro et al. 2017; Reichle et al. 2017). The MERRA-2 products are accessible online (https://daac.gsfc.nasa.gov).
JRA-55
JRA-55, the second reanalysis product from JMA, is an upgraded version of the Japanese 25-year Reanalysis (JRA-25) and uses the 4D-VAR data assimilation method with variational bias correction for satellite data. The global horizontal resolution of JRA-55 is reduced Gaussian TL319 (55 km), and it is available at 6-h time steps for a period of 55 years beginning in 1958 (Ebita et al. 2011) from NCAR's Research Data Archive (https://rda.ucar.edu/datasets/ds628.0).
IMDAA
IMDAA is a regional atmospheric reanalysis created as part of the National Monsoon Mission (NMM) project of the Ministry of Earth Sciences, Government of India, by the NCMRWF, India, and the Met Office (MO), UK. It is the first of its kind (Mahmood et al. 2018; Ashrit et al. 2020). IMDAA is a high-resolution reanalysis for South Asia and surrounding regions based on a cutting-edge data assimilation and numerical weather prediction (NWP) system. The IMDAA data have 63 vertical levels and a horizontal resolution of 12 km (0.12 × 0.12°). The IMDAA atmospheric regional reanalysis is produced in 6-h assimilation cycles using a mesoscale version of the Met Office Unified Model (UM) and the 4D-Var data assimilation technique. The UM takes its lateral boundary conditions from the ERA-Interim global reanalysis (Dee et al. 2011). Currently, the reanalysis is accessible for the modern meteorological satellite era, from 1979 to 2018 (extended until December 2020). Both satellite and conventional observations are assimilated within a 6-h window in the IMDAA assimilation system. More information about the IMDAA assimilation system, including a detailed description of its operation and the outcomes of its 2-year (2008–2009) trial period, can be found in Mahmood et al. (2018).
METHODOLOGY
Because the datasets have different spatial resolutions, all precipitation data are re-gridded to a common grid of 0.25 × 0.25° to make comparison easier. Similarly, the reanalysis outputs are re-gridded to 1 × 1° for the temperature analysis. The data are further divided into the pre-monsoon (February to May; FMAM), monsoon (June to September; JJAS), and post-monsoon (October to January; ONDJ) seasons, since these seasonal divisions can explain the majority of the significant annual variations in precipitation and temperature. Only land points are considered in all datasets.
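The seasonal split described above can be sketched as below. The text does not say how January is attributed when the ONDJ season crosses a calendar year, so this sketch assumes that January belongs to the post-monsoon season that began in the previous October. The names `season_key` and `seasonal_means` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Month -> season label, following the FMAM / JJAS / ONDJ definition.
SEASON_BY_MONTH = {
    2: "FMAM", 3: "FMAM", 4: "FMAM", 5: "FMAM",
    6: "JJAS", 7: "JJAS", 8: "JJAS", 9: "JJAS",
    10: "ONDJ", 11: "ONDJ", 12: "ONDJ", 1: "ONDJ",
}


def season_key(day):
    """Return (season_year, season_label) for a calendar date."""
    label = SEASON_BY_MONTH[day.month]
    # Assumption: January is grouped with the preceding October-December.
    season_year = day.year - 1 if day.month == 1 else day.year
    return season_year, label


def seasonal_means(daily_series):
    """Average (date, value) pairs within each (season_year, season)."""
    buckets = defaultdict(list)
    for day, value in daily_series:
        buckets[season_key(day)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


if __name__ == "__main__":
    # Synthetic daily values for one grid cell, for illustration only.
    series = [
        (date(2000, 12, 31), 2.0),
        (date(2001, 1, 1), 4.0),
        (date(2001, 2, 1), 1.0),
    ]
    print(seasonal_means(series))
```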
The seasonal analysis for precipitation and temperature includes calculating the frequency of extremely wet and extremely warm days and spells for each of the three seasons. In this study, days receiving more than 100 mm of precipitation are classified as extremely wet days. An extreme wet spell is a run of consecutive extremely wet days that ends when precipitation falls below 100 mm/day. The minimum spell duration for a grid is 3 days, although the maximum spell duration may differ (Rajeevan et al. 2010; Sushama et al. 2014; Vinnarasi & Dhanya 2016; Chaudhary et al. 2017). Some studies use a range of 64.5–124.4 mm/day for heavy rain, considering precipitation below 64.5 mm/day to be light (Bhatla et al. 2019). For temperature, a similar methodology is used, and the reanalyses are evaluated for extremely warm days. Days with a mean temperature above the 90th percentile of the climatological mean temperature for the corresponding grid are considered extremely warm days in this study. It should be noted that the threshold for extremely warm days in India therefore varies both spatially and temporally. Extremely warm spells are runs of consecutive extremely warm days that end with cooler temperatures. The minimum duration of an extreme warm spell is 1 day.
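The day and spell definitions above can be illustrated for a single grid cell as follows. This is a sketch under stated assumptions rather than the study's implementation: a spell is taken to be a run of consecutive days strictly above the threshold, and the 90th percentile is computed by linear interpolation between ranks, which the text does not specify. Both function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def find_spells(values, threshold, min_duration):
    """Return (start_index, length) for each run of consecutive values
    above `threshold` that lasts at least `min_duration` days."""
    spells = []
    run_start = None
    for i, value in enumerate(values):
        if value > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_duration:
                spells.append((run_start, i - run_start))
            run_start = None
    # Close a run that is still open at the end of the series.
    if run_start is not None and len(values) - run_start >= min_duration:
        spells.append((run_start, len(values) - run_start))
    return spells


if __name__ == "__main__":
    # Synthetic daily precipitation (mm/day), for illustration only.
    precip = [0, 120, 130, 110, 5, 150, 0]
    print(find_spells(precip, threshold=100, min_duration=3))
    # Synthetic daily mean temperatures (°C) for the warm-day threshold.
    temps = [24, 25, 26, 27, 28, 29, 30, 31, 32, 33]
    warm_threshold = percentile(temps, 90)
    print(find_spells(temps, threshold=warm_threshold, min_duration=1))
```

Passing `min_duration=3` corresponds to the wet-spell definition and `min_duration=1` to the warm-spell definition.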
A 2 × 2 contingency table, created for all reanalyses with IMD as the observation data, is shown in Table 2. Based on the contingency table, three categorical indices, the false alarm ratio (FAR), probability of detection (POD), and critical success index (CSI), are determined for the frequency of extremely wet days in the precipitation datasets and of extremely warm days in the temperature datasets. Table 3 lists the formula, range, and ideal value of each categorical index.
2 × 2 Contingency table for precipitation threshold P (100 mm) and temperature threshold of 90th percentile
 | | Reference dataset (IMD): event detected (Yes), IMD > P | Reference dataset (IMD): event not detected (No), IMD < P |
---|---|---|---|
Reanalysis dataset (R_ana) | Event detected (Yes), R_ana > P | Hits (H) | False alarm (F) |
Reanalysis dataset (R_ana) | Event not detected (No), R_ana < P | Miss (M) | Correct rejections (T) |
List of categorical indices used
Score | Formula | Range | Ideal value |
---|---|---|---|
Probability of detection (POD) | H/(H + M) | 0–1 | 1 |
False alarm ratio (FAR) | F/(H + F) | 0–1 | 0 |
Critical Success Index (CSI) | H/(H + M + F) | 0–1 | 1 |
POD is the fraction of observed events that the reanalysis correctly identified. POD ranges from 0 to 1, where 0 indicates that none of the observed events were detected and 1, the ideal value, indicates that all of them were. FAR, on the other hand, is the fraction of events indicated by the reanalysis that did not actually occur. A FAR of 0, the optimum value, means that the reanalysis produced no false alarms. CSI combines the characteristics of both FAR and POD and reflects the overall skill of the reanalysis relative to the observations. CSI ranges from 0 to 1, with 0 denoting no skill and 1 denoting perfect skill. More information about the categorical indices and the contingency table is given in Wilks (2011). In this way, the ability of the reanalyses to replicate the magnitude and pattern of precipitation and temperature variations in India was assessed.
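The contingency counts and the three indices can be computed as in the following sketch. It assumes paired daily values from IMD and one reanalysis at a single grid cell, treats a value strictly above the threshold P as an event (values equal to P count as non-events), and returns NaN when a denominator is zero. The function names are hypothetical.

```python
def contingency_counts(obs, rea, threshold):
    """Count hits (H), false alarms (F), misses (M) and correct
    rejections (T) for paired observed and reanalysis values."""
    hits = false_alarms = misses = correct_rejections = 0
    for o, r in zip(obs, rea):
        obs_event = o > threshold
        rea_event = r > threshold
        if obs_event and rea_event:
            hits += 1
        elif rea_event:
            false_alarms += 1
        elif obs_event:
            misses += 1
        else:
            correct_rejections += 1
    return hits, false_alarms, misses, correct_rejections


def _ratio(numerator, denominator):
    # An index is undefined (NaN) when no relevant events exist.
    return numerator / denominator if denominator else float("nan")


def categorical_indices(hits, false_alarms, misses):
    """POD = H/(H+M), FAR = F/(H+F), CSI = H/(H+M+F)."""
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
    }


if __name__ == "__main__":
    # Synthetic daily precipitation (mm/day), for illustration only.
    imd = [120, 50, 130, 10, 105, 0]
    reanalysis = [110, 101, 90, 5, 150, 0]
    h, f, m, t = contingency_counts(imd, reanalysis, threshold=100)
    print(h, f, m, t)
    print(categorical_indices(h, f, m))
```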
RESULTS AND DISCUSSION
In order to understand and compare the performance of the reanalysis products in India, the results of the precipitation analysis are discussed in Section 4.1, while the results for the temperature analysis are shown in Section 4.2. The results are further elaborated on the basis of long-term mean patterns, spell analysis and categorical indices. Overall, the results show that while all reanalysis products are able to reproduce the mean distribution pattern of precipitation and temperature in India, none of the products were able to capture the extreme precipitation events. Moreover, the spatio-temporal distribution of the precipitation and temperature were observed to vary in all reanalysis products in comparison with the IMD reference data. The details of the comparison are further elaborated in the following sections.
Analysis of precipitation characteristics
Results of annual precipitation characteristics
Climatological annual cycle of mean annual precipitation (mm/day) of IMD-gridded data from 1991 to 2020; mean precipitation difference of various datasets with respect to IMD-gridded dataset are also shown.
The spatial distribution of mean annual precipitation (mm/day) of IMD-gridded data and bias plots between IMD and reanalyses from 1991 to 2020.
The IMD defines wet years and dry years based on the annual departure of rainfall from the long-period average (LPA). A wet year occurs when the annual rainfall is 10% or more above the LPA, indicating significantly above-normal precipitation. Conversely, a dry year is characterised by an annual rainfall 10% or more below the LPA, signifying a substantial deficit in precipitation. Between 1991 and 2020, three wet years (1994, 2005 and 2019) and three dry years (2002, 2004 and 2009) were identified based on the above definition for further analysis.
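The ±10% rule can be written as a short classification routine. In this sketch the LPA defaults to the mean of the supplied years, which is an assumption made for illustration; the IMD computes the LPA over its own base period, so an externally supplied `lpa` value would normally be passed. The function name and the rainfall totals in the example are hypothetical.

```python
def classify_years(annual_rainfall, lpa=None, departure_pct=10.0):
    """Split years into wet and dry by percentage departure from the LPA.

    annual_rainfall: dict mapping year -> annual rainfall total (mm).
    lpa: long-period average (mm); defaults to the mean of the input.
    """
    if lpa is None:
        lpa = sum(annual_rainfall.values()) / len(annual_rainfall)
    wet_years, dry_years = [], []
    for year in sorted(annual_rainfall):
        departure = 100.0 * (annual_rainfall[year] - lpa) / lpa
        if departure >= departure_pct:
            wet_years.append(year)
        elif departure <= -departure_pct:
            dry_years.append(year)
    return wet_years, dry_years


if __name__ == "__main__":
    # Synthetic annual totals (mm), for illustration only.
    totals = {2001: 1000.0, 2002: 880.0, 2003: 1120.0, 2004: 1000.0}
    print(classify_years(totals))
```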
During the wet years, IMD daily precipitation exhibits the highest rainfall value ranging from 16 to 20 mm/day over the Western Ghats and Northeast India, while the lowest rainfall value (<1 mm/day) is observed over the Northwestern desert regions. In comparison, most of the reanalyses appear to be in agreement with the spatial distribution of the mean daily precipitation patterns over most of Central India. However, significant underestimation is observed over the Western Ghats and Northeast India in all reanalyses, with the exception of IMDAA, which shows overestimation over Northeast India. Overall, JRA-55, ERA-5 and MERRA-2 perform better over the wet years (Supplementary Figure S9). Similarly for the dry years, the highest precipitation (>16 mm/day) is observed over a portion of Northeastern India, while the Western Ghats exhibit mean daily precipitation of around 10–15 mm/day. Among the reanalyses, JRA-55, IMDAA and CFSR over-estimate the precipitation over the northern half of India, particularly the foothills of the Himalayas. Additionally, all reanalyses underestimate the precipitation over the Western Ghats. Overall, ERA-5 seems to perform better than other reanalyses for the dry years, followed by MERRA-2 (Supplementary Figure S10).
Daily precipitation analysis results: Statistical indices
Statistical measures of reanalyses against IMD observations
Precipitation | ERA-5 RMSE | ERA-5 R² | JRA-55 RMSE | JRA-55 R² | MERRA-2 RMSE | MERRA-2 R² | CFSR RMSE | CFSR R² | IMDAA RMSE | IMDAA R² |
---|---|---|---|---|---|---|---|---|---|---|
Entire India | 2.270 | 0.924 | 1.555 | 0.873 | 1.702 | 0.854 | 1.696 | 0.855 | 1.780 | 0.873 |
Aw climate zone | 3.582 | 0.912 | 2.778 | 0.861 | 2.874 | 0.85 | 4.775 | 0.518 | 2.748 | 0.868 |
BSh climate zone | 2.489 | 0.852 | 2.464 | 0.777 | 2.456 | 0.763 | 3.063 | 0.562 | 2.268 | 0.785 |
BWh climate zone | 2.736 | 0.681 | 3.652 | 0.589 | 2.755 | 0.616 | 5.288 | 0.428 | 2.854 | 0.59 |
Cwa climate zone | 3.721 | 0.888 | 3.372 | 0.865 | 3.309 | 0.824 | 4.451 | 0.777 | 5.119 | 0.863 |
Temperature | ERA-5 RMSE | ERA-5 R² | JRA-55 RMSE | JRA-55 R² | MERRA-2 RMSE | MERRA-2 R² | CFSR RMSE | CFSR R² | IMDAA RMSE | IMDAA R² |
Entire India | 5.039 | 0.988 | 3.778 | 0.991 | 3.817 | 0.978 | 4.970 | 0.985 | 4.249 | 0.981 |
Aw climate zone | 1.113 | 0.979 | 0.500 | 0.992 | 1.319 | 0.965 | 1.737 | 0.982 | 0.781 | 0.987 |
BSh climate zone | 0.679 | 0.990 | 0.644 | 0.993 | 0.991 | 0.977 | 1.093 | 0.986 | 0.915 | 0.987 |
BWh climate zone | 0.977 | 0.989 | 1.154 | 0.989 | 1.927 | 0.972 | 1.812 | 0.978 | 1.517 | 0.981 |
Cwa climate zone | 1.359 | 0.992 | 0.746 | 0.995 | 1.515 | 0.969 | 2.029 | 0.982 | 0.974 | 0.991 |
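The two statistical measures in the table above can be computed as in this sketch. The text does not define R², so it is assumed here to be the squared Pearson correlation coefficient between the IMD and reanalysis series. Under that assumption R² measures co-variation only and is insensitive to a constant bias, which RMSE does capture. The function names are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between paired observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Squared Pearson correlation coefficient (assumed definition of R²)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    covariance = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return covariance * covariance / (var_obs * var_sim)


if __name__ == "__main__":
    # Synthetic series: the "reanalysis" has a constant +1 bias.
    imd = [1.0, 2.0, 3.0, 4.0]
    reanalysis = [2.0, 3.0, 4.0, 5.0]
    print(rmse(imd, reanalysis), r_squared(imd, reanalysis))
```

The synthetic example shows why both measures are reported together: a series with a constant offset is perfectly correlated with the reference but still has a non-zero RMSE.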
Density scatter plots of daily precipitation values between IMD and other reanalyses for the period 1991–2020.
The spatial distribution of annual frequency of extreme wet days in IMD and frequency of extreme wet days bias plots between IMD and reanalyses.
The spatial distribution of seasonal frequency of extreme wet days in IMD and frequency of extreme wet days bias plots between IMD and reanalyses for monsoon seasons.
Result of spell analysis
Annual frequency of extreme wet days
The spatial distribution of the annual frequency of extreme precipitation in IMD, as well as the bias in this frequency between the reanalyses and IMD, is shown in Figure 6. The highest annual frequency of precipitation extremes is observed over the Western Ghats and the Northeast (10–20 events/year), whereas an average annual frequency of <1 event/year is observed over the rest of India. The reanalyses reveal significant regional disparities, particularly over areas with considerable precipitation such as the Western Ghats and the Northeast. CFSR and IMDAA overestimate the frequency of extreme precipitation over Northeast India, while both underestimate it over the Western Ghats. MERRA-2 performs slightly better than ERA-5 and JRA-55 over India. In comparison with IMD, MERRA-2 shows a similar distribution over India, except in some parts of the Western Ghats and Northeast India, where it tends to underestimate. ERA-5 underestimates the frequency of extreme precipitation over the high-rainfall regions of the Western Ghats and Northeast India. Over the wet years (1994, 2005 and 2019), the highest mean number of extreme wet days (>15) in the IMD data is observed in the Western Ghats region. In comparison, all reanalyses underestimate the extreme events over the Western Ghats, while performing well over the rest of India. Overall, JRA-55, ERA-5 and MERRA-2 perform better for extreme events in the wet years in India (Supplementary Figure S11).
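The frequency and bias values discussed above can be sketched for one grid cell as follows. The sign convention, reanalysis minus IMD, is an assumption, so that negative values indicate underestimation. The function names and the data layout (a mapping from year to a list of daily values) are hypothetical.

```python
def annual_frequency(daily_by_year, threshold):
    """Mean number of days per year with values above `threshold`.

    daily_by_year: dict mapping year -> list of daily values (mm/day).
    """
    counts = [
        sum(1 for value in days if value > threshold)
        for days in daily_by_year.values()
    ]
    return sum(counts) / len(counts)


def frequency_bias(reanalysis_by_year, imd_by_year, threshold=100.0):
    """Bias in extreme-day frequency (events/year), reanalysis minus IMD."""
    return annual_frequency(reanalysis_by_year, threshold) - annual_frequency(
        imd_by_year, threshold
    )


if __name__ == "__main__":
    # Synthetic, heavily shortened "years" of daily precipitation.
    imd = {1991: [120, 0, 150], 1992: [0, 101, 0]}
    reanalysis = {1991: [90, 0, 130], 1992: [0, 0, 0]}
    print(annual_frequency(imd, 100.0))
    print(frequency_bias(reanalysis, imd))
```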
Seasonal frequency of extreme wet days
The spatial distribution of the frequency of extreme precipitation in IMD for the monsoon season, as well as the bias in this frequency between the reanalyses and IMD, is shown in Figure 7. The highest mean seasonal frequencies of extreme precipitation for the monsoon season are seen across the Western Ghats and the Northeast (6–16 events/year), whereas the majority of India experiences less than one extreme precipitation event per year on average. The reanalyses show large regional disparities, particularly over areas with a lot of precipitation like the Western Ghats and the Northeast. For the pre-monsoon season, the frequency of extremely wet days is close to zero over a major part of India, except the Northeast, where 1–4 events/year are observed by IMD. All the reanalyses perform well in the pre-monsoon season over India, exhibiting precipitation patterns similar to the reference IMD data. In the post-monsoon season, the Southeast coastal areas receive extremely wet days, while the rest of India receives almost no extreme events. Fewer than two events/year are observed by IMD in this season, and all the reanalyses perform well, with no discrepancies between observed and reanalysis data. Heavy precipitation in India occurs mostly in the monsoon season, which is why only the variation in precipitation during the monsoon season across IMD observations and reanalyses is assessed in this study.
Extreme precipitation analysis: Categorical indices
Spatial plots of categorical indices for detection of extreme wet days.
Analysis of temperature characteristics
Result of annual temperature characteristics
Climatological annual cycle of mean annual temperature (°C) of IMD-gridded data from 1991 to 2020; mean temperature difference of various datasets with respect to IMD-gridded dataset are also shown.
The spatial distribution of mean annual temperature (°C) of IMD-gridded data and bias plots between IMD and reanalyses from 1991 to 2020.
Daily temperature analysis results: Statistical indices
Density scatter plots of daily temperature values between IMD and other reanalyses for the period 1991–2020.
The spatial distribution of annual frequency of extreme warm days in IMD and frequency of extreme warm days bias plots between IMD and reanalyses.
The spatial distribution of seasonal frequency of extreme warm days in IMD and frequency of extreme warm day bias plots between IMD and reanalyses.
The spatial distribution of annual frequency of extreme warm spells in IMD and frequency of extreme warm spell bias plots between IMD and reanalyses.
The spatial distribution of seasonal frequency of extreme warm spells in IMD and frequency of extreme warm spell bias plots between IMD and reanalyses.
Result of spell analysis
Annual frequency of extreme warm days
The spatial distribution of the annual frequency of extreme warm days in IMD, together with the corresponding bias plots between IMD and the reanalyses, is shown in Figure 12. The highest annual frequency of extreme temperature is observed over the majority of India (35–40 events/year), whereas a small part of India experiences only 16–30 events/year. The reanalyses show significant regional disparities, particularly over areas with few gauge stations, such as the North and Northeast. The frequency of temperature extremes is consistently underestimated in all reanalyses over Northeast India, while all of them overestimate the temperature over the western part of India, in the states of Gujarat and Rajasthan. All reanalyses perform equally well over the central part of India, where the bias between IMD observations and reanalysis data is nil. Over the east and west coastal areas, underestimation is observed in all reanalyses; over the rest of India, MERRA-2 shows a temperature distribution similar to that of IMD. ERA-5 and IMDAA underestimate the frequency of extreme temperature over the high-rainfall regions of the Western Ghats and Northeast more than the other reanalyses do.
Seasonal frequency of extreme warm days
The spatial distribution of the seasonal frequency of extreme temperature in the IMD dataset and the temperature bias plots between reanalyses and IMD are shown in Figure 13. The highest seasonal frequencies of extreme temperature for the monsoon, pre-monsoon and post-monsoon seasons are observed all over India (12–14 events/year). The reanalyses reveal significant regional differences, particularly over the central portion of India. For the pre-monsoon season, the frequency of extremely warm days is strongly over- or underestimated by the reanalyses in India. In the monsoon and post-monsoon seasons, JRA-55 performs better than the other reanalyses over central India in the Cwa and Aw climate zones. In the post-monsoon season, IMDAA performs slightly better over the Ganga river basin in the Cwa climate zone.
Annual frequency of extreme warm spell
The spatial pattern of the annual frequency of extreme warm spells in IMD and the corresponding bias between reanalyses and IMD are shown in Figure 14. The highest annual frequency of extreme warm spells observed in India is 5–5.5 events/year, whereas the average annual frequency over a major part of central India is 2.5–4 events/year. The reanalyses show significant regional disparities, particularly over areas with few gauge stations, such as North and Northeast India. The frequency of extreme warm spells across Northeast India is consistently underestimated in all reanalyses, while all of them overestimate warm spells over the western part of India, in the states of Gujarat and Rajasthan. All reanalyses perform equally well over the central part of India, where the bias between IMD observations and reanalysis data is nil. Over the east and west coastal areas, underestimation is observed in all reanalyses; over the rest of India, MERRA-2 exhibits a warm spell distribution similar to that of IMD. ERA-5 and CFSR underestimate the occurrence of extreme warm spells over the Western Ghats and Northeast more than the other reanalyses do.
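Counting warm spells differs from counting warm days because consecutive exceedances are grouped into a single event. The sketch below shows one common way to do this for a single grid cell. It is an assumption-laden illustration rather than the study's method: the function name `count_spells`, the 40 °C threshold, and the minimum spell length of three days are hypothetical choices.

```python
def count_spells(daily_values, threshold, min_length=3):
    """Count runs of at least `min_length` consecutive days above `threshold`.

    Each qualifying run is counted once, however long it lasts.
    """
    spells = 0
    run_length = 0
    for value in daily_values:
        if value > threshold:
            run_length += 1
            # Count the spell at the moment it first reaches the minimum length.
            if run_length == min_length:
                spells += 1
        else:
            run_length = 0
    return spells


# Hypothetical daily maximum temperatures (deg C) at one grid cell.
daily_tmax = [41, 42, 43, 30, 41, 42, 30, 44, 45, 46, 47]

# Runs above 40 deg C have lengths 3, 2 and 4; only the first and last qualify.
n_spells = count_spells(daily_tmax, threshold=40, min_length=3)
print(n_spells)  # 2
```

Dividing the count by the number of years in the record gives a frequency in events/year, and the difference between a reanalysis and IMD at each grid cell gives the spell bias.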
Seasonal frequency of extreme warm spell
The spatial pattern of the seasonal frequency of extreme warm spells in the IMD dataset and the warm spell bias between reanalyses and the IMD dataset are shown in Figure 15. The highest seasonal frequencies of extreme warm spells for the monsoon, pre-monsoon and post-monsoon seasons are observed all over India (1.6–1.8 events/year). The reanalyses reveal significant regional disparities, particularly over Northeast and Northwest India in the monsoon season. For the pre-monsoon season, the frequency of extreme warm spells is over- or underestimated by all reanalyses in some parts of Northeast India. The Himalayan region (Bwk climate zone) shows no variation in reanalysis performance. All datasets perform equally well for all climate zones and all seasons.
Extreme temperature analysis: Categorical indices
Spatial plots of categorical indices for detection of extreme warm days.
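Categorical indices such as the POD reported in this study are derived from a contingency table of observed versus estimated events. The following is a minimal sketch under stated assumptions: the function names, the event threshold, and the sample series are hypothetical, and the false alarm ratio (FAR) is included only as a commonly used companion index, not as one confirmed to be used here.

```python
def contingency_counts(reference, reanalysis, threshold):
    """Return (hits, misses, false_alarms) for days exceeding `threshold`."""
    hits = misses = false_alarms = 0
    for observed, estimated in zip(reference, reanalysis):
        observed_event = observed > threshold
        estimated_event = estimated > threshold
        if observed_event and estimated_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif estimated_event:
            false_alarms += 1
    return hits, misses, false_alarms


def probability_of_detection(hits, misses):
    """POD = hits / (hits + misses); NaN when no events were observed."""
    total = hits + misses
    return hits / total if total else float("nan")


def false_alarm_ratio(hits, false_alarms):
    """FAR = false alarms / (hits + false alarms); NaN when none were estimated."""
    total = hits + false_alarms
    return false_alarms / total if total else float("nan")


# Hypothetical daily series at one grid cell; the threshold is illustrative only.
reference = [60, 10, 70, 5, 80, 0]
reanalysis = [55, 52, 20, 0, 90, 1]

hits, misses, false_alarms = contingency_counts(reference, reanalysis, threshold=50)
pod = probability_of_detection(hits, misses)
far = false_alarm_ratio(hits, false_alarms)
```

In this example two of the three observed events are detected, so the POD is 2/3, and one of the three estimated events is a false alarm, so the FAR is 1/3.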
CONCLUSIONS
The aim of this study was to evaluate the effectiveness of several reanalyses in capturing both the mean and the spatio-temporal variability of precipitation and temperature in India. Five reanalyses, namely CFSR, ERA-5, MERRA-2, JRA-55, and IMDAA, were employed to examine the seasonal variability of temperature and precipitation over the 30-year period from 1991 to 2020 for the pre-monsoon (FMAM), monsoon (JJAS), and post-monsoon (ONDJ) seasons. The performance of these reanalyses was compared with the IMD gridded rainfall and temperature data for India. The accuracy of the reanalyses in reproducing the temperature and precipitation patterns in India was examined using a variety of statistical and categorical metrics. The main conclusions of the study are summarised as follows:
(1) Most of the key properties of seasonal precipitation were captured very effectively by all of the reanalyses except ERA-5, which strongly underestimates the precipitation pattern in India, and all the reanalyses underestimated the mean temperature variation in India. However, there are notable regional and seasonal discrepancies between reanalyses and reference data for geographically complex regions with limited data.
(2) The seasonal analysis of spells across India reveals that no single reanalysis is capable of accurately capturing both precipitation and temperature features in India. For instance, MERRA-2 performed better than other reanalyses in estimating precipitation but less well at estimating temperature.
(3) Seasonal precipitation variability analysis revealed that all reanalyses underestimated extreme precipitation in the Western Ghats and Northeast, except IMDAA and CFSR, which overestimate extreme precipitation over Northeast India, especially during the monsoon season (JJAS). For the post-monsoon (ONDJ) and pre-monsoon (FMAM) seasons, there is no variation between IMD observations and reanalysis data in capturing extreme precipitation.
(4) The highest overestimation of extreme precipitation events is seen in IMDAA over Northeast India and in CFSR over some parts of Northeast India and the Cwa climate zone.
(5) MERRA-2 is able to give good results for mean precipitation and also for extreme precipitation detection in India for all seasons.
(6) In the case of categorical indices, the POD of extreme precipitation events shows that IMDAA gives the best results, followed by MERRA-2, over a major part of India.
(7) All reanalyses except CFSR show a good correlation in the Aw and BSh climate zones, with correlation values around 0.8. All reanalyses perform well for the Cwa climate zone, whereas for the dry arid region of the Kutch and Thar desert their performance is not as good as in other climate zones.
(8) All datasets appear to underestimate mean temperature across the Himalayas, but over the rest of India all reanalyses perform well. JRA-55, ERA-5 and MERRA-2 capture the temperatures over major parts of India particularly closely.
(9) Seasonal extreme temperature variation shows both overestimation and underestimation in all datasets in India for all seasons, while for the monsoon and post-monsoon seasons JRA-55 performs better than the other reanalyses over central India.
(10) Seasonal extreme temperature variation shows overestimation in all datasets over some parts of India for all seasons. For monsoon-season temperature variation, JRA-55 gives better results all over India, especially in the dry Bwh climate zone, where all other datasets overestimate.
(11) Although the general correlation for temperature across all datasets is relatively good, it can be concluded from all statistical indices that JRA-55 and MERRA-2 perform better than ERA-5, CFSR and IMDAA in terms of temperature values.
(12) All reanalyses show a good correlation in the Aw, BSh, Bwh and Cwa climate zones, with rather high correlation values, but JRA-55 provides the best findings for temperature when compared to other reanalyses based on all statistical indices.
(13) In the case of categorical indices, the POD of extreme temperature events shows that all reanalyses give their best results over India except in the Himalayas.
FUTURE SCOPE OF THIS STUDY
Based on the overall results, these reanalysis products could be utilized in various hydrological studies and applications, for example as climatological forcing for hydrological modeling. These products could also be used in future by government authorities, stakeholders, water resource departments, NGOs, flood management agencies, etc. for better planning and management. The results could be improved in future by using higher-resolution and updated versions of the reanalyses. This study had some limitations arising from the large variations in the results caused by the lack of observation stations in difficult terrain. The available observed data may differ from the actual conditions at these locations, which reduces the overall performance of the reanalyses in replicating observed data.
ACKNOWLEDGEMENTS
The authors would like to thank the University Teaching Department, Chhattisgarh Swami Vivekanand Technical University (CSVTU) Bhilai for providing the necessary infrastructure facilities, such as a workstation and Geo-server, for the successful completion of this research work. The authors would also like to thank the IMD, NCEP, ECMWF, NASA, and JMA for providing access to the reanalysis datasets. The authors gratefully acknowledge the invaluable suggestions of the reviewers, which helped to improve the quality of this manuscript.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.