## Abstract

To explore the evolvement mechanism of hydrometeorological elements, the spatial–temporal distribution of precipitation in the Huai river basin is studied by statistical drawing and empirical orthogonal function (EOF) decomposition. To combine the predictive results of precipitation objectively, information fusion in data assimilation is introduced to merge the improved National Centers for Environmental Prediction coupled forecast system model version 2 (CFSv2) with the multilinear regression model. Firstly, in terms of time, the annual precipitation tends to decline at most stations over the 30 years, and precipitation is mainly concentrated in the flood season. The spatial distribution resembles the topographic features, and precipitation gradually decreases from south to north. Secondly, for statistical forecasting, the relationship between precipitation and global sea surface temperature (SST) is explored, and a prediction equation is established with SST and the average precipitation. Thirdly, for dynamic model forecasting, the CFSv2 original model and the CFSv2 statistical downscaling model are used to analyze the influence of model deviation on fusion prediction. The optimum interpolation assimilation method is applied to realize the optimal integration of statistical and dynamic model prediction. Finally, the standardized precipitation index (SPI) is calculated from the combined forecast of annual precipitation to evaluate drought conditions. The results show that SST is an important factor affecting precipitation, which may be applied as a forecasting direction with other factors. Merging the CFSv2 original model with the statistical model does not greatly improve the prediction skill, which remains lower than that of the statistical model alone. However, the prediction skill obtained by merging the CFSv2 statistical downscaling model with the statistical model is better than that of either model alone. These results indicate that when the prediction difference between the models is large, the merged prediction error cannot be minimized, whereas when the prediction skill levels are equal, the merged result improves. So, it is necessary to revise the climate dynamic model by downscaling. What is more, the obtained drought levels match the actual disaster conditions, providing hydrological and meteorological theoretical support for the prevention of natural disasters.

## HIGHLIGHTS

Information fusion in data assimilation is introduced to merge precipitation forecasting from an improved CFSv2 model and a regression model.

SST is an important factor that may be applied as a precipitation forecasting direction with other factors.

When prediction skill levels between two models are equal, there is an improvement in the merged result. So, it is necessary to revise the climate dynamic model by statistical downscaling.

## INTRODUCTION

Precipitation is a natural phenomenon that has a great impact on human life. When precipitation exceeds a certain limit, it may cause floods, and when precipitation is under a certain baseline, it can lead to drought. Therefore, it is crucial to study precipitation, which is one of the major causes of natural disasters. China is a country with numerous river basins in diverse climate types. The Huai river basin is the third important river basin in China that is worth exploring. As for geographical distribution, the Huai river basin is considered as the boundary of northern and southern China. It is located between the temperate and subtropical monsoon climate zones, where many geographical elements are transitional and cross-distributed. From the point of view of economic development, the plains of the Huai river basin have one of the most exotic and salubrious agricultural cultures. Here, the population density is high and the economy is growing at a fast rate. In terms of dry and wet conditions, the basin lies on the 800 mm annual isoprecipitation line and is the distribution boundary of paddy and dryland. Precipitation is complicated, localized, and concentrated. Also, history tells us that severe floods and droughts have occurred here with huge losses to the economy. For example, the Hongze Lake, the Lixiahe Plain, and the coastline of northern Jiangsu are all the products of previous floods. From 1949 to 2010, drought occurred in the Huai river basin almost every year. Today, as global warming intensifies, hydrometeorological disasters resulting from extreme precipitation events in the Huai river basin have become more frequent, causing great harm to living species. As an object of cross study, precipitation connects the field of water conservancy with that of atmospheric science. So, research on the evolvement mechanism and forecasting model of precipitation can not only offer a solid scientific foundation and an important basis for sustainable development, national security, and social stability in regions of climatic transition, but is also of great significance in terms of broad application prospects in interdisciplinary research and knowledge integration (Gharbia *et al.* 2016).

Experts and scholars in hydrology and meteorology all over the world have done a lot of research and achieved fruitful results. Their research not only undergoes a transformation from qualitative descriptions to quantitative analyses over a period of time, but also has great depth and is wide in its scope (Lehmann *et al.* 2015; Wang *et al.* 2018). As for temporal–spatial distribution, precipitation in the Huai river basin changes from year to year. Gu *et al.* (2010) pointed out that the intra-annual distribution of precipitation had obvious inhomogeneity, and there was a marked inter-annual variation, predominantly in the north. In most areas, drought and flood had a good correlation with the precipitation concentration degree (PCD), more so in the south. Xia *et al.* (2012) compared the extreme precipitation events with the annual maximum series (AM) and the peak over threshold series (POT). The results showed that during the last five decades, the daily maximum precipitation was in the upward trend at most stations, with only a few stations recording a downward trend, and neither a positive nor negative trend was obvious. Wang *et al.* (2017) explored the temporal–spatial distribution characteristics of extreme precipitation indices by applying the Mann–Kendall test and Kriging interpolation. They found that there was a gradual and slow decrease in the total precipitation and heavy rainfall days (R10 and R20), while there was a slow increase in the daily maximum precipitation, the maximum precipitation during a five-day period, and precipitation intensity. All the extreme precipitation indices mentioned above showed no obvious change over the last 55 years. Abdulrazzaq *et al.* (2019) integrated TRMM (Tropical Rainfall Measuring Mission) data and the standardized precipitation index (SPI) to monitor meteorological drought, which proved to be an effective tool to map the spatial distribution of drought assessment level condition.

In precipitation forecasting, the statistical method and the climate model are widely employed (Shao *et al.* 2017; Wu *et al.* 2018; Mafi Gholami & Baharlouii 2019). When using the statistical method, Wu *et al.* (2015) decomposed a variety of predictive variables by MEOF (multivariate empirical orthogonal function decomposition), and the Markov prediction model was constructed by using spatial models and the corresponding time coefficient. It was found that the model had good prediction skills for precipitation in Northeast Asia. Liu & Duan (2017) established a statistical prediction model by the regression analysis method based on time-scale decomposition. The model could well describe the correlation between the sensible heat of the Tibetan plateau in spring and the monthly precipitation in eastern China in summer and achieved a good quantitative prediction of local precipitation. Hu *et al.* (2019) took the average summer precipitation in the western mountain area as the dependent variable, and SST in winter and the intensity of the Northwest Pacific subtropical high in spring as predictive variables to build a model. The predictive results were good. When using the climate model, Wang *et al.* (2014) analyzed 11 ACGM models and found that the models had poor simulation and prediction ability for summer precipitation anomalies in the northern hemisphere, especially for the inter-annual variability of the summer monsoon between 5°N and 30°N in East Asia. As far as the CFSv1 and CFSv2 models of the American National Centers for Environmental Prediction (NCEP) were concerned, they were weak in simulating large-scale monsoon circulation in Asia, and their ability to forecast precipitation on a seasonal time-scale and a basin space scale needed to be improved (Luo *et al.* 2013; Saha *et al.* 2014; Moya Álvarez *et al.* 2020). Oo *et al.* (2019, 2020) assessed the future climate change projections using multiple global climate models and analyzed the streamflow response to changing climate conditions using the SWAT model. It was found that the monthly minimum temperature rise was a bit higher than the maximum temperature rise in all seasons, with more low flows in the dry season and more high flows in the wet season, which could cause more natural hazards.

The studies mentioned above form a basis for the current understanding of precipitation in the Huai river basin (Wang *et al.* 2015; Krishnamurti *et al.* 2016; Rodrigues *et al.* 2019). A few researchers continue to study the possible influencing factors from a combination of the statistical method and cause analysis. The relationship between precipitation and SST in different coastal areas has not been discussed much. The integration of statistical and dynamic model prediction mostly focuses on the temporal–spatial propagation characteristics of prediction error but does not consider the information near the prediction point. Given this, this paper establishes a whole system from the occurrence, development, and application of precipitation in the Huai river basin. It combines hydrology methods with meteorology methods to explore the spatial–temporal distribution of precipitation, selects SST as a sole explanatory variable for precipitation prediction to explore influencing factors of precipitation from the water cycle of ocean, atmosphere, and hydrology, constructs the dynamic-statistical forecasting model using the optimum interpolation assimilation method, and forewarns drought conditions. The objectives of this study are: (1) to analyze historical precipitation data by using a statistical drawing map in hydrology; (2) to investigate precipitation characteristics by EOF decomposition in meteorology; (3) to establish a forecasting model of SST to precipitation by using regression analysis and a climate dynamic model; (4) to study information fusion of statistical and dynamic forecasting results. This paper takes Huai river basin as an example. The structure of this paper is organized as follows. The Introduction is given in Section 1. Data and methods are described in Section 2. Results and analysis are given in Section 3. Conclusions are presented in Section 4. Discussions are proposed in Section 5.

## DATA AND METHODS

### Hydrometeorological data and station selection

To study the evolvement mechanism of precipitation and its mutual responses with SST, the daily precipitation data from January 1, 1971 to December 31, 2000 are selected from the China meteorological data sharing network. Accordingly, the grid data of the global monthly SST for the multi-year average between 1971 and 2000 are obtained from the National Oceanic and Atmospheric Administration (NOAA) in the United States with a grid spacing of 2° × 2°. The precipitation and SST data series are in fact available from 1971 to 2018. However, this paper mainly aims to study the natural factors influencing precipitation with less disturbance from human activity, and the impact of human activity on climate change has increased rapidly since the beginning of this century, that is, from the year 2001 (Tursunova 2017). So, this study takes the 30-year data series up to 2000.

Based on the distribution characteristics of ocean currents and the climate-related factors in the Huai river basin, the main global oceans, namely, the Pacific, Atlantic, and Indian Oceans, are divided into 11 sea areas to find out the key areas that can affect precipitation. The specific divisions of these global oceans are given in Table 1. Then, 10 representative stations, evenly distributed across the river basin and chosen for their continuous and complete data records, are selected; these are given in Figure 1. Initially, the conditions of precipitation at a few stations are analyzed. If the results are good, more comprehensive research will be done at more stations later.

| Ocean | Number | Longitude | Latitude |
|---|---|---|---|
| Pacific Ocean | 1 | 122°E–178°E | 28°N–18°N |
| Pacific Ocean | 2 | 168°E–168°W | 62°N–32°N |
| Pacific Ocean | 3 | 138°E–148°W | 22°N–2°N |
| Pacific Ocean | 4 | 122°W–98°W | 32°S–8°S |
| Pacific Ocean | 5 | 148°E–178°E | 28°N–2°N |
| Pacific Ocean | 6 | 122°W–78°W | 28°N–2°N |
| Indian Ocean | 7 | 48°E–102°E | 28°N–2°N |
| Indian Ocean | 8 | 28°E–112°E | 2°S–32°S |
| Atlantic Ocean | 9 | 62°W–2°W | 62°N–42°N |
| Atlantic Ocean | 10 | 62°W–2°W | 48°N–18°S |
| Atlantic Ocean | 11 | 32°W–2°W | 12°S–28°S |


As for the CFSv2 model, the NCEP's Global Forecast System (GFS) model is adopted as the atmospheric component, the GFDL's Modular Ocean Model version 4 (MOM4) is applied as the ocean component, and the four-layer Noah land surface model is used as the land component. The CFSv2 model contains 16 ensemble members, with the horizontal resolution increased to T126 (close to 100 km). The model has a 9-month forecast period for 28 years (1982–2010), a 6-h time resolution, and 12 atmospheric variables. The CFSv2 model began operational forecasting in March 2011, and this model (T126L64) generates three forecast products simultaneously: four runs (0, 6, 12, and 18 UTC cycles) with a predictive period of 9 months, one run (0 UTC cycle) with a forecast of one season (about 123 days), and three runs (6, 12, and 18 UTC cycles) with forecasting for 45 days. This paper selects historical forecasting data from 1982 to 2000 corresponding to the statistical model and historical return data from the month of March in each year to establish the prediction test data set. For the start time, the runs initialized at 0, 6, 12, and 18 UTC in the start month are combined. The predictive period is 9 months. The results of different start times are collected and averaged, and the data are adopted from January to December for research and application.

### Methods

Many methods are used in this study. In the spatial and temporal distribution of precipitation, statistical drawing, the Kriging spatial interpolation method, and empirical orthogonal function (EOF) decomposition are employed. In the precipitation forecasting fusion model, the multilinear regression model, the CFSv2 original model, the CFSv2 statistical downscaling model, information fusion in data assimilation, and the optimum interpolation assimilation method are used. The specific technical route is shown in Figure 2.

#### Trend analysis and spatial detection method

For precipitation distribution, the variation characteristics of the annual average precipitation in the whole basin and the average precipitation in the flood season from June to September at 10 representative stations are analyzed by using the temporal change curve and the spatial distribution map. Furthermore, the monthly precipitation for multi-year average is described with the histogram of monthly precipitation in each year to study the inter-monthly distribution of precipitation in the river basin (Chen *et al.* 2019a, 2019b).

#### Mann–Kendall trend analysis

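For a time series *x*_{1}, *x*_{2}, …, *x*_{n}, the statistic *S* is shown as

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i)$$

where sgn(·) is the sign function. When *x*_{j} − *x*_{i} is less than, equal to, or more than 0, the values of sgn(*x*_{j} − *x*_{i}) are −1, 0, or 1, respectively. The formula of statistic *Z* when *S* is less than, equal to, or more than 0 can be expressed as follows:

$$Z = \begin{cases} \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}}, & S < 0 \\ 0, & S = 0 \\ \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}}, & S > 0 \end{cases}$$

where Var(*S*) = *n*(*n* − 1)(2*n* + 5)/18 when the series contains no tied values.

When the value of *Z* is more than 0, the trend is thought to be increasing, while when *Z* is less than 0, the trend is decreasing. When the absolute value of *Z* is more than or equal to 1.64, 1.96, and 2.58, the time sequence passes the significance test with a reliability of 90, 95, and 99%, respectively.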
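The test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the variance assumes no tied values, and the precipitation totals in the usage lines are invented.

```python
import math

def sign(value):
    """Sign function: -1, 0, or 1 for negative, zero, or positive input."""
    return (value > 0) - (value < 0)

def mann_kendall(series):
    """Return the Mann-Kendall statistics (S, Z) for a sequence of numbers.

    Assumption: no tied values, so Var(S) = n(n - 1)(2n + 5) / 18.
    """
    n = len(series)
    s = sum(sign(series[j] - series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

def classify_trend(z):
    """Map Z to a trend direction and the highest reliability level passed (%)."""
    direction = "increasing" if z > 0 else "decreasing" if z < 0 else "none"
    reliability = None
    for threshold, level in ((2.58, 99), (1.96, 95), (1.64, 90)):
        if abs(z) >= threshold:
            reliability = level
            break
    return direction, reliability

# Synthetic annual totals (mm), invented for illustration only.
annual_precipitation = [820.0, 790.5, 805.2, 760.1, 770.8, 735.4, 748.9, 701.3]
s_stat, z_stat = mann_kendall(annual_precipitation)
print(s_stat, round(z_stat, 3), classify_trend(z_stat))
```

For series with many repeated values, the variance would need the usual tie correction, which this sketch omits.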

#### EOF decomposition method

*et al.* 2016). That is, the data *X* is decomposed into two parts, which are the time function *Z* and the spatial function *V*, namely,

$$X = VZ$$

The specific steps are as follows: (1) The original precipitation data *X* is processed by the anomaly method or normalized treatment. (2) Covariance matrix *A* is obtained by using the formula *A* = *XX*^{T}. (3) The eigenvalues and eigenvectors of the real symmetric matrix *A* are calculated by the Jacobi method, where *λ*_{h} is the *h*th eigenvalue, *V*_{h} is the *h*th eigenvector, *h* = 1, …, *H*, and *H* is the total number. (4) All the eigenvalues are arranged in a non-ascending order by the sink and float method, and the ordinal numbers of the corresponding eigenvectors change accordingly. (5) According to *λ*_{h} and the total variance of *X*, the contribution rate of the *h*th eigenvector to *X*, which is expressed as *ρ*_{h}, and the contribution rate of the first *h* eigenvectors to *X*, which is described as *p*_{h}, are both calculated. (6) The time coefficient *Z*_{h} is obtained from *X* and the main *V*_{h}. The quantity of the main *V*_{h} is determined by the analysis purpose and the analysis object. (7) The calculated results are output.
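Steps (1) to (6) can be illustrated with a short, dependency-free Python sketch. Several choices here are assumptions rather than details taken from the paper: the data are laid out as stations (rows) by time (columns), step (1) uses anomalies about each station's temporal mean, Python's built-in `sorted` stands in for the sink and float method, and the names `jacobi_eigen` and `eof_decompose` are hypothetical.

```python
import math

def jacobi_eigen(a, tol=1e-12, max_sweeps=50):
    """Eigenvalues and eigenvectors (columns of v) of a real symmetric matrix,
    computed with cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def eof_decompose(x):
    """EOF decomposition of x, given as a list of station rows over time."""
    m, n = len(x), len(x[0])
    # (1) Anomalies: remove each station's temporal mean.
    means = [sum(row) / n for row in x]
    anomalies = [[val - means[i] for val in row] for i, row in enumerate(x)]
    # (2) Covariance-type matrix A = X X^T.
    a = [[sum(anomalies[i][t] * anomalies[j][t] for t in range(n))
          for j in range(m)] for i in range(m)]
    # (3) Eigenvalues and eigenvectors by the Jacobi method.
    eigvals, vecs = jacobi_eigen(a)
    # (4) Non-ascending order (built-in sort instead of sink and float).
    order = sorted(range(m), key=lambda h: eigvals[h], reverse=True)
    lam = [eigvals[h] for h in order]
    modes = [[vecs[i][h] for i in range(m)] for h in order]
    # (5) Contribution rate of each mode and cumulative contribution.
    total = sum(lam)
    rho = [value / total for value in lam]
    cumulative, running = [], 0.0
    for value in rho:
        running += value
        cumulative.append(running)
    # (6) Time coefficients Z_h = V_h^T X.
    z = [[sum(mode[i] * anomalies[i][t] for i in range(m)) for t in range(n)]
         for mode in modes]
    return {"eigenvalues": lam, "modes": modes, "contribution": rho,
            "cumulative": cumulative, "time_coefficients": z,
            "anomalies": anomalies}

# Synthetic 3-station, 4-year field, invented for illustration only.
field = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
result = eof_decompose(field)
print([round(r, 3) for r in result["contribution"]])
```

Because the eigenvectors are orthonormal, summing each mode multiplied by its time coefficient recovers the anomaly field, which mirrors the decomposition of *X* into *V* and *Z*.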

#### Statistical forecasting model

The analysis of correlation and the establishment of the statistical model are as follows:

1. *Z*-score normalization. Each sample value *x* is standardized as

$$z = \frac{x - \bar{x}}{S}$$

In the formula, $\bar{x}$ is the mean of sample data, and *S* is the standard deviation. After *Z*-score normalization, the processed data have a mean value of 0 and a standard deviation of 1.

2. Correlation analysis. The correlation coefficient is applied as an index to describe the closeness of interdependency between the data time-series. The greater the absolute value of the correlation coefficient, the closer the relationship between the two time-series is. This study uses Spearman's correlation analysis to explore the relationship, which makes no assumption about the distribution of the original variables and is therefore widely applicable (Zhao *et al.* 2017). It assumes two time-series as *X* and *Y* with the same sample length *N*, and the correlation coefficient *r* is expressed as

$$r = \frac{\sum_{i=1}^{N} (R_i - \bar{R})(Q_i - \bar{Q})}{\sqrt{\sum_{i=1}^{N} (R_i - \bar{R})^2 \sum_{i=1}^{N} (Q_i - \bar{Q})^2}}$$

where *R*_{i} and *Q*_{i} are the ranks of the *i*th values, and $\bar{R}$ and $\bar{Q}$ are the mean ranks, of *X* and *Y*, respectively.

3. *T*-test. The significance of the correlation coefficient is checked by the *t*-test. Then, the key areas of SST that are significantly related to precipitation can be picked out.

4. Statistical forecasting equation

In hydrometeorology, it is generally believed that the precipitation series exist the remarkable seasonal variation in a yearly cycle. So, there might be a stronger teleconnection relationship between the SST and precipitation within a year than in other timescales. What is more, from cause analysis, it takes time for the hydrological cycle to happen and develop. That is to say, the effect of SST on the corresponding precipitation needs a certain propagation time. Combined with the actual situation prevailing in the Huai river basin, the SST of the first half of the year and the precipitation of the latter half of the year have been selected to find some regularity. SST in the key regions and periods is taken as an independent variable. The average precipitation is taken as the dependent variable. Then, regression analysis is carried out.
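The four steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the helper names are hypothetical, the *Z*-score uses the sample standard deviation, the test statistic *t* = *r*·sqrt((*N* − 2)/(1 − *r*²)) is the common form for a correlation coefficient and is assumed here because the original formula did not survive, and the SST and precipitation values are invented.

```python
import math

def zscore(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def t_statistic(r, n):
    """t statistic for a correlation r from n pairs (assumed form, n - 2 d.o.f.)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Synthetic series, invented for illustration only: mean SST (deg C) of one sea
# area in the first half of each year, and basin-average precipitation (mm) in
# the latter half of the same year.
sst_first_half = [26.1, 26.4, 25.9, 26.8, 26.3, 27.0, 26.0, 26.6]
precip_second_half = [510.0, 540.0, 480.0, 600.0, 530.0, 590.0, 500.0, 575.0]

r = spearman(zscore(sst_first_half), zscore(precip_second_half))
t = t_statistic(r, len(sst_first_half))
intercept, slope = fit_line(sst_first_half, precip_second_half)
print(round(r, 3), round(t, 3), round(intercept, 1), round(slope, 1))
```

With several key sea areas as predictors, the single-variable fit would be replaced by a multiple linear regression; the one-predictor case is shown only to keep the sketch short.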

#### CFSv2 statistical downscaling model

The Climate Forecasting System (CFS) of the NCEP in the USA is a dynamic seasonal forecasting system with fully coupled ocean–land–atmosphere components (Sun *et al.* 2019). At present, the second generation of the CFS (CFSv2) is widely used in operational units and research institutions. Past research indicates that the CFSv2 predictions of summer precipitation in East Asia are not good (Luo *et al.* 2013; Saha *et al.* 2014). Cheng *et al.* (2016) noted that although the precipitation forecast for the Huai river basin by the CFSv2 model was not accurate, it showed some prediction skill for certain abnormal events and for the main modes of the temporal and spatial structure, which could provide a reference for short-term climate prediction. So, this paper opts for the CFSv2 original model. Then, the H500 circulation forecast field of the CFSv2 model and the SST of the first half of the year in the observation field are selected as predictor variables to forecast precipitation. To give the CFSv2 model a forecasting skill comparable to that of the statistical method, the field information coupling method is applied to establish the statistical downscaling correction model. This treatment can minimize the direct error of precipitation forecasting by the CFSv2 original model. Then, the predictive results of the CFSv2 original model and the CFSv2 statistical downscaling model are each compared with the actual data. Correlation coefficients between the predicted data and the measured data are also given. All these can not only describe the precipitation prediction of the CFSv2 model after downscaling, but also verify the necessity of downscaling correction before information fusion.

*t*, the EOF is employed to decompose the variable fields of the predicted factors and predicted values, respectively. Kaiser's standard (Wilks 2006) is also applied to keep the main model, so that the original variable field form can be counted back. Kaiser's standard formula is shown in the following equation:

$$\lambda_m > \frac{T}{K} \sum_{k=1}^{K} s_k^2$$

In the formula, *λ*_{m} is the retained characteristic value of the EOF, *s*_{k}^{2} is the *k*th variance of the decomposed variable, *K* is the number of variables, *T* is the threshold parameter, and *T* = 0.7 (Jolliffe 1972, 2002). This step is to remove the needless noise in the variable field to achieve filtering. Secondly, singular value decomposition (SVD) is used to decompose the predicted factors and predicted values to extract the coupling variation between two variable fields. Finally, the SVD model, the corresponding time coefficient, and the predicted time period *T* + 1 are applied to forecast the factor field. Also, the multiple linear regression method is used to make statistical downscale prediction (Liu & Fan 2012, 2013). The modeling process is shown in Figure 3, where *Y*(*t* + Δ*t*, *x*) is precipitation in the predicted year, *R*_{i}(*x*) is the SVD space model of precipitation, and *K*_{i}(*t* + Δ*t*) is the SVD time coefficient in the predicted year.
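As a small illustration of the truncation rule, the sketch below keeps the EOF modes whose eigenvalue exceeds the fraction *T*/*K* of the total variance. The function name and the numbers are hypothetical, and the rule is written in the form given in Wilks (2006), which is assumed to match the equation used in the paper.

```python
def retained_modes(eigenvalues, variances, threshold=0.7):
    """Indices of EOF modes kept by Kaiser's rule.

    A mode is kept when its eigenvalue exceeds (T / K) * sum of the K variances.
    """
    k = len(variances)
    cutoff = threshold / k * sum(variances)
    return [m for m, value in enumerate(eigenvalues) if value > cutoff]

# Invented eigenvalues and variances for a 3-variable field (their sums agree,
# as they would for a covariance matrix).
print(retained_modes([3.0, 0.8, 0.2], [1.0, 1.0, 2.0]))
```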

#### Optimal interpolation assimilation method

*et al.* 2019; Rodrigues *et al.* 2019), where the model error of the background field is considered to have unbiased characteristics (Krishnamurti *et al.* 2016; Dai *et al.* 2018). The specific process is as follows. The predictive results of the dynamic model are regarded as a ‘background field’, and the predictive results of the statistical method are considered as an ‘observed value’ in assimilation. Based on these assumptions, fusion prediction is formed with the predictive results of two models, which can be expressed as

$$Y_p = X_d + W (X_e - X_d)$$

In the formula, *Y*_{p} is the fusion of the predictive results, *X*_{d} is the predictive results by the dynamic model, *X*_{e} is the predictive results by the statistical method, and *W* is the optimal weight coefficient matrix, which is given by

$$W = B (B + R)^{-1}$$

In the formula, *B* represents the error covariance matrix of predictive results by the dynamic model, and *R* represents the error covariance matrix by the statistical method, which can be obtained by error statistics between the predictive results of the statistical method or the dynamic model and the historical observed data, respectively. After the error covariance matrix is calculated, the optimal weight coefficient matrix *W* can be obtained. This treatment can fully make use of the a-priori information from statistical forecasting data and dynamic model predictive data. The weighting function determined by the minimum variance estimation can also help get the optimal forecast values in a statistical sense to realize the optimal integration of the predictive results by the statistical and dynamic models.
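For a single forecast quantity, the matrices *B*, *R*, and *W* reduce to scalars, and the fusion can be sketched as below. This one-dimensional case is an assumption made for brevity: the function names are hypothetical, the error variances are estimated as mean squared hindcast errors under the unbiasedness assumption stated above, and all numbers are invented.

```python
def error_variance(predictions, observations):
    """Mean squared error of past predictions, assuming unbiased errors."""
    n = len(predictions)
    return sum((p - o) ** 2 for p, o in zip(predictions, observations)) / n

def optimal_weight(b, r):
    """Scalar optimum interpolation weight W = B / (B + R)."""
    return b / (b + r)

def fuse(x_d, x_e, w):
    """Fused forecast Y_p = X_d + W * (X_e - X_d)."""
    return x_d + w * (x_e - x_d)

# Synthetic hindcasts and observations (mm), invented for illustration only.
observed = [100.0, 120.0, 90.0, 110.0]
dynamic_hindcast = [110.0, 110.0, 100.0, 100.0]    # plays the role of X_d
statistical_hindcast = [105.0, 115.0, 95.0, 105.0]  # plays the role of X_e

b = error_variance(dynamic_hindcast, observed)      # background error variance
r = error_variance(statistical_hindcast, observed)  # 'observation' error variance
w = optimal_weight(b, r)

# Fuse a new pair of forecasts: the less accurate model receives less weight.
print(w, fuse(100.0, 110.0, w))
```

In this scalar setting the expected error variance of the fused forecast is *BR*/(*B* + *R*), which is smaller than both *B* and *R*; when *B* is much larger than *R*, *W* approaches 1 and the fused forecast stays close to the statistical one.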

#### Fusion of predictive results by the statistical model and the CFSv2 model

The effects of model deviation on fusion prediction are analyzed by comparative research. Firstly, the predictive results of the CFSv2 original model are merged with those of statistical prediction by the optimal interpolation assimilation method. The dynamic-statistical forecasting results are compared with the observed data by correlation analysis, and meanwhile, they are also compared with the results of the CFSv2 original model and the statistical model, respectively. This treatment is to explore whether the prediction error from the fusion of two models can be minimized when the forecasting difference between the two models is too big, especially when the predictive level of the dynamic model is much lower than that of the statistical model. Secondly, the CFSv2 statistical downscaling model is established by the multiple linear regression method based on the CFSv2 original model. The process described above is repeated with the CFSv2 statistical downscaling model. The fusion results are analyzed by contrasting them with the measured data and with the predictive data of the statistical model and the CFSv2 statistical downscaling model, respectively. This treatment is to explore whether deviation correction is necessary before information fusion, so that the statistical and dynamic models reach about the same prediction skill level, and whether the fusion results can then be improved.

In this paper, to begin with, the spatial–temporal distribution characteristics of precipitation are explored. Precipitation data are analyzed to obtain the features using the statistical mapping method, and EOF decomposition is applied to carry out the comparison and validation. Secondly, the global multi-year average SST data are sorted and divided, and *Z*-score normalization and correlation analysis are used to calculate the relationship between precipitation in the Huai river basin and SST in different sea areas, both in the preceding period and in the same period. Thirdly, the forecasting equation of precipitation is established based on statistical correlation with SST. The CFSv2 original model and the CFSv2 statistical downscaling model are used to forecast precipitation. By considering prediction error and the information near the prediction point, information fusion between the dynamic model and the statistical method is applied to obtain the optimal prediction results. Finally, the measured data are selected to verify and supplement the analysis, and the drought level is characterized by the SPI, providing an idea and direction for precipitation forecasting and natural disaster prevention in the Huai river basin.

## RESULTS AND ANALYSIS

### Characteristics of temporal distribution for precipitation

#### Inter-annual and seasonal distribution of precipitation

The maximum and the minimum values of annual precipitation and precipitation in the flood season at the representative 10 stations are given in Table 2. The inter-annual variation and the seasonal variation are described in Figure 4(a)–4(j) as follows.

| Index | Item | Yanzhou | Linyi | Rizhao | Kaifeng | Zhumadian | Bozhou | Huaiyin | Sheyang | Gushi | Bengbu |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Maximum annual precipitation | Year | 1990 | 1974 | 1974 | 1992 | 1982 | 1979 | 1991 | 1972 | 1991 | 1991 |
| | Value (mm) | 1,127.9 | 1,228.6 | 1,295 | 999.8 | 1,791.6 | 1,267.6 | 1,339.4 | 1,525.2 | 1,576.6 | 1,483 |
| Minimum annual precipitation | Year | 1976 | 1988 | 1997 | 1986 | 1992 | 1978 | 1978 | 1978 | 1978 | 1978 |
| | Value (mm) | 406 | 529.5 | 510.3 | 352.3 | 476.3 | 472.7 | 552 | 535.1 | 543.7 | 442.1 |
| Maximum precipitation in the flood season | Year | 1990 | 1970 | 1974 | 1992 | 1982 | 2000 | 2000 | 1990 | 1991 | 1991 |
| | Value (mm) | 859.8 | 1,076.8 | 897.8 | 812.2 | 1,537.2 | 921.6 | 967 | 1,028.9 | 1,085.7 | 947.4 |
| Minimum precipitation in the flood season | Year | 1997 | 1998 | 1997 | 1981 | 1999 | 1988 | 1994 | 1981 | 1999 | 1978 |
| | Value (mm) | 239.6 | 367.9 | 246.9 | 124.2 | 351.2 | 287.3 | 310 | 125.8 | 206.5 | 198.7 |


In Figure 4(a), it can be seen that at the Yanzhou station, precipitation in the flood season accounts for most of the annual precipitation, and the change in precipitation during the flood season basically keeps pace with the change in annual precipitation. From 1971 to 2000, precipitation shows a downward trend as a whole, except for fluctuations in some individual years. Precipitation generally does not stay below the average for more than 3 consecutive years, and it fluctuates up and down in an approximately 3-year cycle. The only exception is the period from 1986 to 1989, in which precipitation is lower than the average for four consecutive years. This indicates that precipitation over the 30 years at the Yanzhou station has maintained a steadily downward trend with an even inter-annual distribution.

In Figure 4(b), it can be seen that at the Linyi station, the changes in flood-season precipitation and annual precipitation are basically the same. But from 1985 to 1987, the opposite situation appears: the annual precipitation decreases first and then rises, while precipitation in the flood season rises first and then decreases. By and large, precipitation follows a downward trend. Specifically, the annual precipitation is continuously lower than the average from 1976 to 1983. This shows that precipitation over the 30 years at the Linyi station is unevenly distributed, with a run of low values during the early study period.

In Figure 4(c), it can be seen that at the Rizhao station, the variation in annual precipitation and the precipitation in the flood season basically keep the same pace, with the maximum and the minimum precipitation taking place in the same year. But from 1979 to 1983, the annual precipitation decreases first, rises, then drops again, while precipitation in the flood season reduces continuously. From 1992 to 1994, the annual precipitation and precipitation in the flood season show opposite change characteristics. Overall, there is no significantly low or high period of precipitation at the Rizhao station.

In Figure 4(d), it can be seen that at the Kaifeng station, the trends of the annual precipitation and precipitation in the flood season are consistent with each other; only in 1982 and 1983 does the opposite situation occur. From 1978 to 1983, precipitation is lower than the average for six consecutive years, but the difference is small. Therefore, precipitation at the Kaifeng station remains around the average, with an even inter-annual distribution and no obvious upward or downward trend.

In Figure 4(e), it can be seen that at the Zhumadian station, precipitation shows a slightly downward trend with an uneven inter-annual distribution. Precipitation in the flood season changes in basically the same way as annual precipitation, but the two move in opposite directions from 1972 to 1974 and agree only partly from 1992 to 1995. Moreover, the difference between the maximum and the minimum precipitation is quite large. The maximum precipitation in the flood season and the maximum annual precipitation are almost the same, and both occur in 1982. It is inferred that there might have been a flood in summer or a drought in spring or winter of that year. This inference is consistent with the actual flood in the summer of 1982. Because a water conservancy project had been constructed in 1980, the flood damage was mitigated and the losses were reduced.

In Figure 4(f), it can be seen that at the Bozhou station, precipitation shows a slightly upward trend with an uneven inter-annual distribution, while precipitation in the flood season is relatively uniform with close extreme values. There are many differences between precipitation in the flood season and in the annual year. These reveal that precipitation in the flood season has a relatively weak impact on annual precipitation. Furthermore, precipitation is above or below the average for at least three consecutive years. This indicates that this area may often suffer from flood or drought. This inference conforms to the actual local conditions.

In Figure 4(g), it can be seen that at the Huaiyin station, both annual precipitation and precipitation in the flood season fluctuate regularly, with almost identical changes. From 1983 to 1986, the annual precipitation stays close to the average, with only small positive and negative departures. The inter-annual distribution is not uniform, and there is a discernible upward trend. The upward trend of annual precipitation is more obvious than that of the flood season, which indicates that precipitation in the non-flood season increases significantly.

In Figure 4(h), it can be seen that at the Sheyang station, both the annual precipitation and precipitation in the flood season decline continuously from 1972 to 1978. The maximum and the minimum are recorded in the first and the last year of the study period, respectively. Precipitation is very irregular in the first half of the 30 years and fluctuates constantly with large differences in the latter half. The annual precipitation shows a downward trend, and precipitation in the flood season changes in almost the same way.

In Figure 4(i), it can be seen that at the Gushi station, the overall precipitation shows a downward trend. The changes in annual precipitation and precipitation in the flood season largely agree with each other, except from 1971 to 1973 and from 1992 to 1994. The fluctuation in annual precipitation is regular, many values are near the average, and the distribution is relatively uniform, whereas the distribution in the flood season is uneven.

In Figure 4(j), it can be seen that at the Bengbu station, the distribution of annual precipitation is fairly consistent with that of precipitation in the flood season. Except for the maximum and the minimum, most values are around the average, and the fluctuation is regular. Annual precipitation shows an overall upward trend, while precipitation in the flood season does not show a clear upward or downward trend. This indicates that the contribution of precipitation in the other seasons to annual precipitation is increasing to some extent.

In Table 2 and Figure 4, it can be seen that the years in which the maximum and the minimum values occur at the 10 stations are scattered irregularly over the period from 1971 to 2000. The average maximum and minimum annual precipitation are 1,363.47 and 482 mm, respectively. The average maximum and minimum precipitation in the flood season are 1,013.47 and 245.81 mm, respectively. The variance in the maximum annual precipitation is greater than that in the flood season, while the variance in the minimum annual precipitation is smaller than that in the flood season. Furthermore, the annual precipitation shows a downward trend at nearly half of the representative stations, while it shows an upward trend only at the Huaiyin station. At the other stations, it fluctuates stably with a relatively small magnitude. Precipitation in the flood season behaves in basically the same way as annual precipitation, but its degree of impact on annual precipitation varies. Usually, precipitation in the flood season accounts for most of the annual precipitation, but at the Zhumadian and Bengbu stations, the influence of the other seasons on annual precipitation is increasing.

The basin average precipitation in the whole year and in the flood season from 1971 to 2000 in the Huai river basin is shown in Figure 5. The maximum annual average precipitation occurs in 1998 at 1,115.4 mm, and the minimum in 1978 at 600.1 mm. The maximum in the flood season occurs in 2000 at 740.9 mm, and the minimum in 1997 at 352.4 mm. The annual average precipitation is above the multi-year average in 17 years and below it in 13 years, while the average precipitation in the flood season is above the multi-year average in 15 years and below it in the other 15. Both series fluctuate stably with a slight downward trend. After precipitation has been below or above the average for 1–3 years, the reverse condition soon appears, so there are no long runs of consecutive years above or below the average. The minimum annual precipitation at several stations also occurs in 1978, which indicates that drought might have occurred in part or all of the river basin in that year. This inference is consistent with the actual severe drought of 1978 in the Huai river basin.

#### Inter-monthly distribution of precipitation

In Figure 6, it can be seen that precipitation is highest in July at 202.4 mm, nearly a quarter of the annual total. Precipitation in both August and June also exceeds 100 mm. All three months belong to the flood season, which accounts for 64% of annual precipitation at the basin scale. This shows that precipitation in the Huai river basin is concentrated in the flood season from June to September, which is consistent with the subtropical monsoon climate of the region. There is a large difference between summer and winter, with much more precipitation in summer than in winter.

In general, precipitation is decreasing, and its distribution is uneven. Extreme precipitation and extreme drought events occur irregularly and frequently. To reduce their impact, disaster prevention and mitigation strategies must be put in place early, and flood control and drought relief plans, along with emergency plans, need to be drawn up in detail.

### Mann–Kendall trend analysis and test

The Mann–Kendall method is used to test the trends at the 10 representative stations in the Huai river basin at the significance levels *α* = 0.10, 0.05, and 0.01. The *Z* values, calculated by using formulas (1) and (2), are given in Table 3. The annual precipitation and precipitation in the flood season are declining at six stations, while they are rising or show no obvious trend at the other stations. The data at four stations pass the significance test at *α* = 0.10, and only one of them also passes the test at *α* = 0.05. Generally speaking, a downward trend exists from 1971 to 2000, but it is not significant at many stations. These results match the conditions described in Section 3.1.

Station | Yanzhou | Linyi | Rizhao | Kaifeng | Zhumadian | Bozhou | Huaiyin | Sheyang | Gushi | Bengbu | Huai river basin |
---|---|---|---|---|---|---|---|---|---|---|---|
Annual precipitation | −1.69* | −1.81* | −1.97** | 0.05 | −0.49 | 0.30 | 1.22 | −0.40 | −1.66* | 0.53 | −0.56 |
Precipitation in the flood season | −1.78* | −1.65* | −2.04** | 0.02 | −0.23 | 0.77 | 0.78 | −0.61 | −1.78* | 0.24 | −0.75 |

*Note*: *, ** and *** are thought to have passed the significance test of *α* = 0.10, 0.05, and 0.01, respectively.
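The test statistic behind Table 3 can be sketched as follows. Formulas (1) and (2) are not reproduced in this section, so the sketch assumes the standard Mann–Kendall form without a correction for tied values; the function names and the two-sided normal critical values (1.645, 1.960, 2.576) are illustrative choices, not taken from the paper.

```python
import math

def mann_kendall_z(series):
    """Return the Mann-Kendall Z statistic for a time-ordered sequence."""
    n = len(series)
    # S counts later-minus-earlier pairs that increase, minus those that decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis (assumption: no tie correction).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0

def significance_marker(z):
    """Map |Z| to the markers used in the note of Table 3."""
    a = abs(z)
    if a >= 2.576:
        return "***"  # alpha = 0.01
    if a >= 1.960:
        return "**"   # alpha = 0.05
    if a >= 1.645:
        return "*"    # alpha = 0.10
    return ""

# Markers recomputed from two Z values listed in Table 3.
print(significance_marker(-1.69), significance_marker(-1.97))
```

A negative *Z* indicates a downward trend and a positive *Z* an upward trend. With these thresholds, the markers recomputed from the *Z* values in Table 3 agree with those printed in the table.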

### Characteristics of spatial distribution for precipitation

The Kriging spatial interpolation method is used to describe the spatial distribution characteristics. Based on the station locations, the spatial relationship between data points is described by a covariance function, and the variogram assigns a weight to each point. The value at a neighbouring ungauged point is then estimated as a weighted average of the observed values. The advantage is that the weights are chosen so that the estimate is unbiased (the expected error is 0) and the estimation variance is minimized, which gives the best linear unbiased prediction.
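As an illustration of the weighting step, the following is a minimal sketch of ordinary Kriging at a single target point. The exponential variogram and its parameters (`sill`, `rng`, `nugget`), the station coordinates, and the precipitation values are all assumptions made for the example; the paper does not state which variogram model or software was used.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, rng=100.0, nugget=0.0):
    """Assumed variogram model gamma(h); all parameters are illustrative."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-h / rng))
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition

def ordinary_kriging(coords, values, target, **vario_kwargs):
    """Estimate the value at `target` from gauged points; returns (estimate, weights)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Pairwise station distances and station-to-target distances.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=-1)
    # Kriging system: variogram matrix bordered by the unbiasedness constraint.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = exponential_variogram(d, **vario_kwargs)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exponential_variogram(d0, **vario_kwargs)
    solution = np.linalg.solve(a, b)
    weights = solution[:n]  # the constraint forces the weights to sum to 1
    return float(weights @ values), weights

# Hypothetical stations (km) and precipitation values (mm), for illustration only.
stations = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
precip = [900.0, 800.0, 700.0, 600.0]
estimate, weights = ordinary_kriging(stations, precip, (50.0, 50.0))
print(estimate, weights)
```

The last row and column of the system enforce that the weights sum to 1, which is what makes the estimate unbiased; the remaining equations minimize the estimation variance.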

#### Spatial distribution of annual average precipitation from 1970 to 2000

The spatial distribution of annual average precipitation over 30 years is shown in Figure 7. In Figure 7, it can be seen that the maximum precipitation appears in the Dabie Mountains of the southwest basin, and the minimum occurs in the Heze–Kaifeng–Zhengzhou area of the northwest basin, both of which are located in a high mountain terrain in the west basin. There is a decreasing trend of distribution characteristics from south to north. The contour line of precipitation is approximately parallel to the latitude line. The isohyet of 900 mm roughly parallels with the main stream of the Huai river basin. The area of high values in the southwest basin protrudes to the north, the area of median values in the northeast basin protrudes to the north, while the area of low values in the northwest basin protrudes to the south. So, the distribution of precipitation from 800 to 900 mm is narrow and uneven in the west basin, and the contour lines are in an intensive state. There is a disparity in the amount of precipitation between north and south, while it is relatively broad and uniform in the east basin with little difference. These conditions are closely related to the terrain.

#### Spatial distribution of average precipitation in the flood season from 1970 to 2000

The spatial distribution of average precipitation in the flood season over 30 years is shown in Figure 8. In Figure 8, it can be seen that the maximum precipitation in the flood season appears in the Dabie Mountains of the southwest basin, and the minimum occurs near the Kaifeng station of the northwest basin, both of which are located in a high mountain terrain in the west basin. The precipitation in the flood season is more uneven than the annual precipitation, and the contour lines are more distorted. The decreasing feature from south to north is still basically maintained. The contour lines in the west basin are dense, while the contour lines in the east basin are very sparse. The area of high values in the southwest basin protrudes to the north, which is similar to that of the annual precipitation. But the area of median values in the east basin protrudes more to the north and some relatively high values appear near the Linyi station. The area of low values in the northwest basin protrudes to the south and shrinks more from wide to narrow strip. The effect of topography on the precipitation in the flood season is relatively obvious.

#### EOF decomposition of precipitation

Precipitation data are processed by EOF decomposition, which yields the spatial patterns of the first and second modes (see Figures 9 and 10) and their time coefficients (see Figures 11 and 12). The first mode, shown in Figure 9, is considered equivalent to the mean field. Comparing Figure 9 with Figure 7 shows that the results from EOF decomposition and from Kriging spatial interpolation are similar: the area of large values is located in the southern basin, and precipitation decreases from south to north. This suggests that the description of annual average precipitation in the Huai river basin is relatively credible. In Figure 11, the time coefficients fluctuate up and down, showing that basin precipitation alternates between increases and decreases. In particular, the time coefficients of the first mode in 1974, 1983, 1990, 1998, and 1999 are strongly positive, which indicates a large amount of precipitation in these 5 years. The time coefficients in 1985, 1988, and 1991 are strongly negative, which indicates a lack of precipitation in these 3 years. These inferences are in good agreement with the historical measurements. In Figure 10, the second mode decreases progressively from south to north. The zero line passes through the central part of the river basin, so precipitation in the southern part changes inversely to precipitation in the northern part: when the annual precipitation in the southern part increases, it decreases in the northern part. The annual precipitation is higher in the eastern and southern regions and lower in the western region. This is consistent with the actual situation.
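For readers who want to reproduce this kind of analysis, a minimal EOF sketch based on the singular value decomposition is given here. It is an illustration under stated assumptions, not the authors' procedure: the input layout (years by stations), the function name, and the `remove_mean` flag are hypothetical, and the paper does not say whether the time mean was removed before decomposition. Leaving the mean in tends to make the first mode resemble the mean field, as described above.

```python
import numpy as np

def eof_decomposition(field, n_modes=2, remove_mean=True):
    """EOF analysis of `field` with shape (n_years, n_stations).

    Returns (patterns, time_coeffs, explained):
      patterns    - spatial modes, one row per mode
      time_coeffs - time coefficients, one column per mode
      explained   - fraction of total variance carried by each mode
    """
    data = np.asarray(field, dtype=float)
    if remove_mean:
        data = data - data.mean(axis=0)  # anomalies relative to each station's mean
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    patterns = vt[:n_modes]
    time_coeffs = u[:, :n_modes] * s[:n_modes]
    explained = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return patterns, time_coeffs, explained

# Synthetic example: one spatial pattern modulated in time (not real data).
years = np.arange(5.0)
pattern = np.array([1.0, 2.0, 2.0]) / 3.0  # unit-length spatial pattern
field = np.outer(years, pattern)
patterns, time_coeffs, explained = eof_decomposition(field)
print(explained)
```

The sign of each mode is arbitrary: flipping the sign of a spatial pattern together with its time coefficients leaves the reconstruction unchanged, so "positive" and "negative" time coefficients are only meaningful relative to the plotted pattern.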

### Statistical forecasting model

The multiple linear regression equation is established, which takes the SST of obvious correlation as an independent variable and the basin average precipitation as a dependent variable.

#### Relevance over the same period between precipitation and SST

Correlation analysis and significance testing are carried out on the normalized sequences of precipitation and SST. The results, obtained by using formulas (5) and (6), are given in Table 4. The significance levels of all the sea areas are less than 0.05, except for the 7th sea area located in the Atlantic Ocean. Although the significance level of the 11th sea area is less than 0.05, its correlation coefficient is only −0.594. This shows that the annual precipitation in the Huai river basin is clearly correlated with SST in most sea areas. In the 5th and 6th sea areas located in the Pacific Ocean, the 8th and 9th sea areas located in the Atlantic Ocean, and the 10th sea area located in the Indian Ocean, the significance levels are even lower than 0.01, which indicates an extremely significant correlation together with high correlation coefficients. SST can therefore be regarded as a predictor variable for precipitation.

Sea area | Correlation coefficient (*r*) | Significance level |
---|---|---|
1 | 0.776 | 0.005 |
2 | 0.643 | 0.024 |
3 | 0.657 | 0.020 |
4 | 0.671 | 0.017 |
5 | 0.727 | 0.007 |
6 | 0.860 | 0.000 |
7 | 0.531 | 0.075 |
8 | −0.734 | 0.007 |
9 | 0.734 | 0.007 |
10 | 0.776 | 0.003 |
11 | −0.594 | 0.042 |

#### Relevance in advance between precipitation and SST

If SST is to be used as a predictor variable, SST from an earlier period must be selected so that it leads precipitation; only then can its forecasting value be assessed. Correlation coefficients between the SST sequence for the first half of the year and the precipitation sequence for the latter half of the year, denoted as *r*, are calculated together with their significance levels (see Table 5). This reanalysis shows no significant correlation between precipitation and SST in the 2nd, 4th, 8th, 9th, 10th, and 11th sea areas. The significance level of the 7th sea area is less than 0.05, showing an obvious correlation. Only for the 6th sea area is the correlation extremely significant. At the same time, the sign of the correlation reverses in some sea areas, for example from positive to negative. Therefore, if SST is taken as an influencing factor to forecast annual precipitation, a preliminary lead-time analysis should be carried out. Correlations are likely to differ between time periods, so the selection of sea areas needs to be adjusted accordingly.

Sea area | Correlation coefficient (*r*) | Significance level |
---|---|---|
1 | −0.829 | 0.042 |
2 | −0.600 | 0.208 |
3 | −0.809 | 0.042 |
4 | −0.714 | 0.111 |
5 | −0.829 | 0.042 |
6 | −0.943 | 0.005 |
7 | −0.829 | 0.042 |
8 | 0.600 | 0.208 |
9 | −0.600 | 0.208 |
10 | −0.714 | 0.111 |
11 | 0.600 | 0.208 |

Considering the results of the two correlation analyses in Sections 3.4.1 and 3.4.2, four sea areas (the 1st, 3rd, 5th, and 6th) are selected as stable sea areas based on their relatively high correlation coefficients. Because the most significant correlation and the highest correlation coefficient occur in the 6th sea area, SST in the 6th sea area can be considered to have the most stable correlation with precipitation in the Huai river basin. The 6th sea area is therefore chosen as the key area, where the lagged correlation between SST and precipitation is significantly negative.
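The screening can be illustrated with the values in Tables 4 and 5. The rule used below, keeping the sea areas whose significance level is below 0.05 in both analyses, is an assumption made for this sketch; the paper states only that the choice rests on relatively high correlation coefficients. The rule does, however, return the same four sea areas. The `pearson_r` helper shows one common way to compute a correlation coefficient and is likewise an assumption, since formulas (5) and (6) are not reproduced in this section.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# (r, significance level) per sea area, copied from Table 4 (same period).
same_period = {
    1: (0.776, 0.005), 2: (0.643, 0.024), 3: (0.657, 0.020), 4: (0.671, 0.017),
    5: (0.727, 0.007), 6: (0.860, 0.000), 7: (0.531, 0.075), 8: (-0.734, 0.007),
    9: (0.734, 0.007), 10: (0.776, 0.003), 11: (-0.594, 0.042),
}
# (r, significance level) per sea area, copied from Table 5 (SST leading).
lagged = {
    1: (-0.829, 0.042), 2: (-0.600, 0.208), 3: (-0.809, 0.042), 4: (-0.714, 0.111),
    5: (-0.829, 0.042), 6: (-0.943, 0.005), 7: (-0.829, 0.042), 8: (0.600, 0.208),
    9: (-0.600, 0.208), 10: (-0.714, 0.111), 11: (0.600, 0.208),
}

def stable_sea_areas(same, lag, alpha=0.05):
    """Sea areas that are significant at `alpha` in both analyses (assumed rule)."""
    return sorted(a for a in same if same[a][1] < alpha and lag[a][1] < alpha)

stable = stable_sea_areas(same_period, lagged)
key_area = min(stable, key=lambda a: lagged[a][1])  # most significant lagged correlation
print(stable, key_area)
```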

#### Forecasting model establishment

In formula (10), *Y* is the forecast precipitation in the latter half of the year, and *X*_{1}, *X*_{2}, *X*_{3}, and *X*_{4} are the average SST in the first half of the year in the 1st, 3rd, 5th, and 6th sea areas, respectively.
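A regression of this form can be fitted by ordinary least squares. The sketch below is illustrative only: the data are synthetic, the coefficients in `true_coeffs` are made up for the demonstration and are not those of formula (10), and the function names are hypothetical.

```python
import numpy as np

def fit_multilinear(X, y):
    """Least-squares fit of y = b0 + b1*X1 + ... + bk*Xk; returns [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # leading column of ones for b0
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

def predict(coeffs, X):
    """Apply fitted coefficients to new predictor rows."""
    X = np.asarray(X, dtype=float)
    return coeffs[0] + X @ coeffs[1:]

# Synthetic stand-in for 30 years of first-half-year SST in four sea areas.
rng = np.random.default_rng(0)
sst = rng.normal(size=(30, 4))
true_coeffs = np.array([550.0, -20.0, -15.0, -10.0, -40.0])  # made up for the demo
precip = true_coeffs[0] + sst @ true_coeffs[1:]              # noise-free target

coeffs = fit_multilinear(sst, precip)
print(np.round(coeffs, 3))
```

With real data the fit would not be exact, and the residuals at extreme years are where a regression of this kind is typically weakest.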

Observed SST data from 1971 to 2000 are substituted into the forecasting equation (see formula 10) to obtain the predicted values of precipitation (see Figure 13). A visual comparison of the two curves shows that the trend of the predicted values is in good agreement with that of the measured values, although there are large deviations at the extreme points. Correlation coefficients between predicted and measured precipitation in the latter half of the year at 10 representative stations vary between 0.50 and 0.70, as shown in Table 6. The correlation coefficient for the Huai river basin from 1970 to 2000 is 0.57, which passes the 5% significance test. The forecasting equation therefore captures the changing trend of precipitation both qualitatively and quantitatively, but it is not adequate for extreme values, and the prediction accuracy in some years does not meet the requirement. This shows that although SST is an important factor, it cannot fully satisfy actual forecasting needs. The factors affecting precipitation in the river basin are complex and diverse, so the prediction model should not be established using SST alone; other factors should be added to the forecasting model to improve its accuracy.

Correlation coefficient | Yanzhou | Linyi | Rizhao | Kaifeng | Zhumadian | Bozhou | Huaiyin | Sheyang | Gushi | Huai river basin |
---|---|---|---|---|---|---|---|---|---|---|
Statistical model | 0.55 | 0.62 | 0.69 | 0.67 | 0.52 | 0.53 | 0.60 | 0.65 | 0.63 | 0.57 |

### Fusion of prediction results by the statistical model and the CFSv2 model

Precipitation predictions from 1982 to 2000 by the statistical model, the CFSv2 original model, the CFSv2 statistical downscaling model, and the dynamic-statistical information fusion model using formulas (8) and (9) are shown in Figure 14. The prediction results are validated in two ways: correlation coefficients provide a qualitative measure of agreement, and the root mean squared error (RMSE) provides a quantitative measure of error.

On the one hand, correlation coefficients are calculated (see Table 7). Figure 14 and Table 7 show, firstly, that the forecasts of the CFSv2 original model differ greatly from the measured data, with a correlation coefficient of only 0.12. This agrees with the fact that quantitative precipitation prediction cannot rely on climate models alone. Secondly, the CFSv2 statistical downscaling model improves significantly on the CFSv2 original model, raising the correlation coefficient from 0.12 to 0.60. Owing to its good prediction skill, the CFSv2 statistical downscaling model can be applied for dynamic-statistical information fusion. Thirdly, the predicted values of the statistical model and the CFSv2 original model differ greatly in magnitude, and their correlation coefficients are 0.57 and 0.12, respectively. Information fusion with the CFSv2 original model brings little improvement: its correlation coefficient is 0.34, which is far less than the 0.57 of the statistical model. This shows that when the prediction level of the dynamic model is much lower than that of the statistical model, the prediction error after fusing the two models cannot be minimized. Finally, the predicted values of the statistical model and the CFSv2 statistical downscaling model are close to each other, with correlation coefficients of 0.57 and 0.60, respectively. After information fusion with the CFSv2 statistical downscaling model, the correlation coefficient reaches 0.71, which is higher than both 0.57 and 0.60, so precipitation prediction is improved.

Forecasting model | Statistical model | CFSv2 original model | CFSv2 statistical downscaling model | Information fusion with CFSv2 original model | Information fusion with CFSv2 statistical downscaling model |
---|---|---|---|---|---|
Correlation coefficient | 0.57 | 0.12 | 0.60 | 0.34 | 0.71 |
RMSE | 89.60 | 129.48 | 82.37 | 103.25 | 72.31 |

On the other hand, the RMSE between predicted and measured values is applied as a second evaluation criterion to assess the differences between the models quantitatively. The RMSE values in Table 7 are 89.60 for the statistical model, 129.48 for the CFSv2 original model, 82.37 for the CFSv2 statistical downscaling model, 103.25 for information fusion with the CFSv2 original model, and 72.31 for information fusion with the CFSv2 statistical downscaling model. It can be seen that the CFSv2 original model has poor prediction skill and that statistical downscaling is an effective way to improve the predicted values. After this series of corrections, the values predicted by dynamic-statistical information fusion with the CFSv2 statistical downscaling model are the closest to the measured values. In addition, both validations show that when the statistical model and the dynamic model have more or less the same skill level, fusion can improve the results significantly. It is therefore necessary to correct the forecasting deviation before information fusion.
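The idea of weighting two forecasts by their error variances can be sketched in a few lines. Formulas (8) and (9) are not reproduced in this section, so the following assumes the textbook scalar form of optimum interpolation, in which one forecast is treated as the background and the other as an observation. The function names are hypothetical, and using the squared RMSE from Table 7 as an error variance is an assumption of this sketch.

```python
import math

def rmse(pred, obs):
    """Root mean squared error between predicted and observed sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def oi_gain(var_background, var_other):
    """Scalar optimum-interpolation gain: weight given to the second forecast."""
    return var_background / (var_background + var_other)

def oi_fuse(background, other, var_background, var_other):
    """Move each background value toward the other forecast by the gain."""
    gain = oi_gain(var_background, var_other)
    return [b + gain * (o - b) for b, o in zip(background, other)]

# Error variances approximated by squared RMSE values from Table 7 (assumption).
var_stat = 89.60 ** 2
var_downscaled = 82.37 ** 2
var_original = 129.48 ** 2

# Weight given to the statistical forecast when it is fused with each dynamic forecast.
gain_vs_downscaled = oi_gain(var_downscaled, var_stat)
gain_vs_original = oi_gain(var_original, var_stat)
print(round(gain_vs_downscaled, 3), round(gain_vs_original, 3))
```

When the two error variances are similar, the gain is close to 0.5 and the fused value is close to a plain average. When one forecast is much less accurate, the gain shifts almost all the weight to the other. This weighting is only optimal for unbiased forecasts, which is consistent with the recommendation to correct the forecasting deviation before fusion.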

### Drought conditions by the standardized precipitation index

The basin average precipitation for the latter half of the year from 1970 to 2000 is 549 mm. The amount of precipitation can be obtained from the values predicted by information fusion with the CFSv2 statistical downscaling model. Drought conditions can then be deduced directly from the standardized precipitation index (SPI) based on the drought levels used in China (Chen *et al.* 2019a, 2019b), as given in Table 8. Table 8 shows that moderate and light droughts often take place in the Huai river basin, while severe drought occurs rarely, although the risk of drought is always present. According to China's yearbook of meteorological disasters, drought events occurred in the Huai river basin in 1977, 1978, 1981, 1986, 1988, 1992, 1994, 1997, and 1999. These records are basically consistent with the results identified in Table 8, with only slight differences in the drought levels. Therefore, the dynamic-statistical information fusion model can not only be used to predict precipitation more accurately but can also present drought conditions through the SPI, providing a reference for disaster warning.
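The mapping from SPI to drought level can be written as a small lookup. The thresholds below (−0.5, −1.0, −1.5, −2.0) are an assumption of this sketch: they agree with every entry in Table 8, but the "Extreme Drought" class and the treatment of values that fall exactly on a boundary do not appear in the table and are included only for completeness.

```python
def drought_level(spi):
    """Classify an SPI value into a drought level (assumed thresholds)."""
    if spi > -0.5:
        return "No Drought"
    if spi > -1.0:
        return "Light Drought"
    if spi > -1.5:
        return "Moderate Drought"
    if spi > -2.0:
        return "Severe Drought"
    return "Extreme Drought"  # assumption: this class is not present in Table 8

# (year, SPI) pairs copied from Table 8.
table8 = [
    (1973, -0.20), (1975, -0.35), (1976, -0.45), (1977, -0.52), (1978, -1.07),
    (1981, -0.66), (1986, -1.20), (1988, -1.42), (1989, -0.48), (1992, -0.52),
    (1993, -0.41), (1994, -1.72), (1995, -0.13), (1997, -1.68), (1999, -1.36),
]
levels = {year: drought_level(spi) for year, spi in table8}
drought_years = [year for year, level in levels.items() if level != "No Drought"]
print(drought_years)
```

The years classified as drought by this lookup are exactly the nine drought years cited from the yearbook of meteorological disasters.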

Year | SPI | Drought level | Year | SPI | Drought level | Year | SPI | Drought level |
---|---|---|---|---|---|---|---|---|
1973 | −0.20 | No Drought | 1975 | −0.35 | No Drought | 1976 | −0.45 | No Drought |
1977 | −0.52 | Light Drought | 1978 | −1.07 | Moderate Drought | 1981 | −0.66 | Light Drought |
1986 | −1.20 | Moderate Drought | 1988 | −1.42 | Moderate Drought | 1989 | −0.48 | No Drought |
1992 | −0.52 | Light Drought | 1993 | −0.41 | No Drought | 1994 | −1.72 | Severe Drought |
1995 | −0.13 | No Drought | 1997 | −1.68 | Severe Drought | 1999 | −1.36 | Moderate Drought |

## CONCLUSIONS

Research on the spatial–temporal distribution of precipitation and the forecasting model using dynamic-statistical information fusion is a highly meaningful exercise. In this study, statistical analysis and EOF decomposition are employed to explore precipitation distribution in the Huai river basin. The dynamic-statistical forecasting model by the optimum interpolation assimilation method is established to predict basin precipitation and drought conditions with the SPI. The results are as follows:

*Time*: The changes in annual and flood-season precipitation are not identical at every station, but on the whole there is a consistent trend from 1970 to 2000. At most stations, although precipitation varies widely from year to year, it fluctuates around the average and seldom stays continuously low or high for long. The maximum precipitation occurs in 1998, which coincides with the extraordinary 1998 flood in the Yangtze River. The correlation of extreme precipitation between river basins may be worth exploring further.

*Space*: In the hydrological statistical charts, the spatial features of annual and flood-season precipitation resemble the terrain of the river basin. Precipitation is evenly distributed with a natural transition; the blocking effect of the mountains in the western basin is especially strong, and the eastern coastal area is affected by marine factors from the Pacific Ocean and by sedimentation from the Yellow River. In the meteorological EOF decomposition, the first mode reflects the average field, which indicates that the overall transition of the precipitation distribution is relatively homogeneous. High or low values of the time coefficient correspond to the years in which extreme floods or droughts appear. The second mode reflects an anti-phase pattern between the north and the south of the river basin. This is consistent with the slightly downward trend at the northern stations and the slightly upward trend at the southern stations.

*SST*: Precipitation in the latter half of the year is significantly correlated with SST in the first half of the year in four sea areas of the Pacific Ocean. The Huai river basin is strongly affected by the monsoon from the western Pacific, into which the Huai river flows. In summer in particular, typhoons from the Pacific Ocean near the equatorial region also exert great influence, resulting in extreme precipitation. It is therefore necessary to select the appropriate preceding period in order to identify the critical ocean area. This also shows that SST has complex and changeable impacts on precipitation in the terrestrial watershed. The meteorological elements interact with each other in regular patterns, so more influencing factors can be studied later.

*Precipitation prediction*: For the statistical model, the forecasting equation for the latter half of the year is established from SST data in the first half of the year. The overall forecast trend is acceptable, but the accuracy of extreme values needs to be improved. This means that SST alone cannot meet the practical requirements of precipitation forecasting. Other factors may be integrated into the forecasting system to build a multivariate forecasting equation. For the dynamic prediction model, the CFSv2 original model selects the 500 hPa height field of the current year and the observed SST field in the first half of the year as predictor variables. The CFSv2 statistical downscaling model is set up by the field information coupling method. After correction, the correlation coefficient between predicted and observed precipitation rises from 0.12 to 0.60, so the CFSv2 statistical downscaling model has good prediction skill.

*Fusion*: For the dynamic-statistical prediction model, when the prediction skill of the CFSv2 original model is much lower than that of the statistical model, the prediction error after fusing the two models cannot be minimized. When the CFSv2 statistical downscaling model and the statistical model have roughly the same skill level, the fused results improve further. Forecasting deviation should therefore be corrected before information fusion. Dynamic-statistical fusion with the CFSv2 statistical downscaling model can be applied to objectively evaluate and analyze drought conditions in the Huai river basin by means of the SPI, providing an important reference for decision-making.
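
The dependence of the merged result on the relative skill of the two models can be illustrated with a minimal scalar sketch of an optimum-interpolation-style merge. It is not the configuration used in this study: the function name `oi_merge`, the precipitation values, and the error variances are hypothetical, and the two forecasts are assumed to be unbiased with independent errors and known error variances.

```python
def oi_merge(x_stat, x_dyn, var_stat, var_dyn):
    """Merge two forecasts with a scalar optimum-interpolation weight.

    Assumes (hypothetically) that both forecasts are unbiased, that their
    errors are independent, and that the error variances are known.
    Returns the merged forecast and its expected error variance.
    """
    if var_stat < 0 or var_dyn < 0 or var_stat + var_dyn == 0:
        raise ValueError("error variances must be non-negative and not both zero")
    # Weight given to the dynamic forecast: large when it is the more accurate one.
    gain = var_stat / (var_stat + var_dyn)
    merged = x_stat + gain * (x_dyn - x_stat)
    # Expected error variance of the merged forecast under the assumptions above.
    var_merged = (1.0 - gain) * var_stat
    return merged, var_merged


# Illustrative values only (mm and mm^2); they are not taken from the study.
# Comparable skill: the merged error variance is halved.
equal_value, equal_var = oi_merge(900.0, 1000.0, 100.0, 100.0)

# Very unequal skill: the merged error variance barely improves on the
# better forecast, so fusion adds almost nothing.
unequal_value, unequal_var = oi_merge(900.0, 1000.0, 100.0, 10000.0)
```

Under these idealized assumptions the merge never performs worse than the better input, but the gain vanishes as the skill gap widens. In practice, biased inputs or misestimated error variances can push the merged forecast below the better model, which is one plausible reason for correcting the dynamic model before fusion.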

## DISCUSSION

In terms of the theories and methods for precipitation in the Huai river basin, this study only attempts to improve and apply some commonly used statistical methods and dynamic models in hydrology and meteorology. In future work, more theories, methods, techniques, and models can be used for research on hydrometeorological cross-coupling. Furthermore, this study presents only preliminary ideas and results, which must be supplemented through inspection, correction, and development in applications.

With regard to the spatial–temporal evolution characteristics of precipitation, the study period is limited to 1970–2000, when human interference was limited and conditions were close to the natural state. Precipitation conditions after 2000, which are more affected by climate change, underlying surface conditions, and human activities, could be analyzed comparatively in later work.

As for precipitation prediction and forecasting, this study makes only a preliminary exploration of combining dynamic and statistical models. Uncertainties arising from the spatial-scale differences between CFSv2 and gauge-based precipitation data always exist. With the progress of assimilation technology, computing power, and physical frameworks in the era of big data, there is still much room to improve the CFSv2 model in many aspects, such as systematic deviation, mid-high latitude circulation, and monsoon simulation. Suitable climate models can then be selected and combined with statistical methods through advanced coupling technology, which is useful for short-term climate prediction and hydrological forecasting.

In the future, the proposed theory, method, and model of precipitation should be applied as a system to test its general applicability and reasonableness in many more areas. The performance of the system will be evaluated, and the available information will be mined as fully as possible, especially its ability to predict information at the temporal and spatial scales of concern. In addition, the proposed system needs to be applied over a long period. Its scope may then be further expanded, and the predicted occurrence time may be made more accurate in practical work.

## ACKNOWLEDGEMENTS

The authors acknowledge the support of the National Key Research and Development Program of China (Nos. 2021YFC3201101, 2016YFA0601501, 2017YFC1502405, 2017YFC1502706, and 2017YFA0605002), the China Postdoctoral Science Foundation (Nos. 2020T130309 and 2019M651892), the China Scholarship Council (Nos. 201808320128 and 201808320127), the Belt and Road Fund on Water and Sustainability of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, China (No. 2020nkzd01), the Jiangsu Water Resources Science and Technology Project (Nos. 2020022 and 2021024), the Meteorological Open Research Fund in the Huai River Basin (No. HRM201702), and the Nanjing University of Information Science & Technology Research Foundation (No. 2017r097). The authors also thank those who offered helpful suggestions and corrections on the earlier draft of this study, which helped to improve its content.

## CONFLICT OF INTEREST

The authors declare that there is no conflict of interest.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.