## Abstract

Natural streamflow reconstruction is essential for assessing long-term trends, variability, and patterns of streamflow, and is critical for addressing the implications of climate change for adaptive water resources management. This study proposed a simple statistical approach named NSR-SVI (natural streamflow reconstruction based on streamflow variation identification). As a hybrid model coupling Pettitt's test with an iterative algorithm and the iterative cumulative sum of squares algorithm, it determines the reconstructed components and implements the recombination depending only on the information of change points in observed annual streamflow records. Results showed that NSR-SVI is suitable for reconstructing natural series and can provide stable streamflow processes under different human influences to better serve the hydrologic design of water resource engineering. In addition, combining the proposed approach with the cumulative streamflow curve provides an innovative way to investigate the attribution of streamflow variation, and its performance has been verified by comparison with relevant results in a nearby basin.

## HIGHLIGHTS

A statistical approach is proposed to improve the accuracy of hydrological variation detection, and further reconstruct natural streamflow only depending on the variation information of streamflow.

The hybrid method has better performance on detecting the multiple change points in mean and variance.

The proposed approach combining the cumulative curve provides a way to investigate the attribution of streamflow variation.

## INTRODUCTION

Water resources management and hydraulic engineering design depend closely on the quality of the hydrologic data used in planning and design. The data series is required to belong to a single statistical population, which is called the assumption of stationarity (Xiong & Guo 2004). Unfortunately, this requirement is often difficult to meet because of the non-stationarity in hydrological series triggered by climate change and large-scale human activities (Seidou & Ouarda 2007). Non-stationarity can make current and future streamflow differ from the historical streamflow employed in designs, implying that the original design, operation, and management strategies of water resources projects and river ecological protection may no longer be appropriate in the current changing environment and may consequently impose a greater risk. As reported in the literature, many rivers in the world have been greatly altered by water resources projects that control flow to meet human needs (Naiman *et al.* 2008). Observed hydrological data from various countries and regions have demonstrated significant inconsistency or non-stationarity, which is influenced by water infrastructure, channel modifications, drainage works, land-cover change, and land-use change (Milly *et al.* 2008). This implies that the risk from river streamflow variations may be indistinguishable in today's world and creates a demand for natural streamflow and stable streamflow series under different human-impacted environments in engineering design and water management.

The variation in streamflow is often exhibited by a regime shift, known as a break, abrupt change, discontinuity or inhomogeneity in different areas, meaning a shift of the flow system from one regime to another, and the location of the regime shift in time is called the change point at which the parameters of the underlying distribution or the parameters of the model used to describe the time series abruptly change (Beaulieu *et al.* 2008). The changing parameters, including mean, variance, trend (regression), intercept, frequency, correlation coefficient, system information, and combinations thereof, are summarized by Beaulieu *et al.* (2008) and are also regarded as types of abrupt shifts. As the types of an abrupt shift are complex and interwoven, it may be difficult to identify change points with one method (Zhang *et al.* 2019). Indeed, different methods may be required for different climate/hydrological elements or the same climate/hydrological element at different time scales. However, different methods may yield conflicting conclusions when applied to the same series. Thus, a need has arisen for a careful discussion and comparison of these methods to offer general guidance for a methodology that is particularly suited to detect a certain type of abrupt shift or a combination of several types (Kundzewicz & Robson 2005; Reeves *et al.* 2007).

Recently, there have been an increasing number of studies using change point detection methods to detect artificial or natural discontinuities and regime shifts in climatic and hydrological series (Beaulieu *et al.* 2008; Zhang *et al.* 2016). Depending on the delay of detection, change point detection methods can be classified into two categories: real-time detection (or simply online detection) and retrospective detection (Adams & MacKay 2007). Real-time change point detection targets applications that require an immediate response, while retrospective change point detection requires a longer reaction period and tends to give more robust and accurate detection (Liu *et al.* 2013). The change point detection used in climatic and hydrological fields is usually retrospective detection.

Many change point detection methods have been developed for the retrospective detection of abrupt shifts. However, most methods focus on demonstrating the abrupt change in mean. For instance, methods, including Student's *t*-test, Bayesian analysis, Pettitt's test, Mann–Whitney U test, Wilcoxon–Mann–Whitney test, rank-sum test, cumulative sum test, Kruskal–Wallis test, cumulative deviations test, Worsley likelihood ratio test, the standard normal homogeneity test, Spearman's rho, Mann–Kendall test, Kendall's tau, seasonal Kendall test, regression-based method, intervention analysis, Lanzante method, Rodionov sequential method, and informational method, are for the detection of change in mean. Few methods can be used to obtain accurate results in the detection of variance shifts and other types of shifts due to overlapping variation information, which poses a challenge to reconstruct the natural streamflow statistically. Thus, this study was designed to build a hybrid model consisting of Pettitt's test, an iterative algorithm, the iterative cumulative sum of squares (ICSS) algorithm, and a segment over-whitening procedure to detect multiple types (mean and variance) of abrupt change in hydrological data, and obtain the information of locations and magnitudes of the abrupt shifts, with the purpose of reconstructing the natural streamflow series and stable streamflow series under certain human-impacted environments by statistical methods.

With this consideration in mind, the objectives of the present study are to (1) detect the information of change points in mean and variance and answer the question whether the abrupt changes occurred in the streamflow series and what types of change happened, (2) rebuild the natural streamflow series, according to all of the variation information including the number and locations of change points and the magnitudes of abrupt changes, (3) investigate the reliability of the reconstructed flow, and (4) explore joint application with the cumulative curve to quantify the contribution of precipitation variation and anthropogenic interference to streamflow decrease.

## STUDY AREA AND DATA

### Study area

Two basins, that is, the Mahuyu River basin and the Tuwei River basin in the core of the Loess Plateau, China, were selected (Figure 1(a)). Therein, the Mahuyu River basin without trend or abrupt change in rainfall was chosen as the target basin to test the hybrid model, and the Tuwei River basin was used as a case area to verify the applicability of the proposed model.

The Mahuyu River, a second-order tributary of the Yellow River, originates from the Naopan Mountain in Hengshan County, Shaanxi Province, and covers an area of 372 km^{2} (Figure 1(b)). It has a total length of 41.8 km from the estuary, with an average slope of 5.9. The Mahuyu River basin is characterized by steep hillslopes with incised channels. The basin is exposed to an extratropical semiarid continental monsoon climate. The average annual precipitation in the catchment varies between 200 and 500 mm, of which 75% occurs during the flood season from June to September. Mahuyu gauge is the catchment outlet.

The Tuwei River, a branch of the Yellow River, flows from the northwest to the southeast (Figure 1(c)). The river has a total length of 139.6 km and covers an area of 3,253 km^{2} (Wang *et al.* 2011). The upper-middle reaches of the river are the sandy shoal region and loess hilly-gully region that are typical landscapes in this basin. The annual precipitation is 402 mm, and the pan evaporation is 1,853 mm (Yang *et al.* 2019). Gaojiachuan gauge is the control gauge of the Tuwei River basin.

The features of the two basins, common in the Loess Plateau, such as sparse vegetation and loose soil, cause serious soil erosion (Huang *et al.* 2020). To prevent soil erosion, many check-dams have been built in both basins. However, the rivers are intercepted by the check-dams and the water is retained in the reservoirs, which leads to continually decreasing streamflow in the downstream channels. It is well known that the Loess Plateau is subject to severe water resource shortages and a fragile ecological environment (Li *et al.* 2017; Zhao *et al.* 2019). The abrupt changes occurring in the streamflow series could have a serious influence on water security and environmental protection in this area. From this perspective, this study provides a new statistical method for restoring the natural streamflow and stable streamflow series under certain human-impacted environments to serve water resources management and ecological protection in the Loess Plateau.

### Data

The time series of regional average annual rainfall, covering 1962–2010 for the Mahuyu River basin and 1956–2005 for the Tuwei River basin and shown in Figure 2, were collected from precipitation stations in and around the study areas (see Supplementary Material). Figure 2(a) demonstrates that there is no apparent trend or abrupt change in the precipitation series of the Mahuyu River basin, and an augmented Dickey–Fuller (ADF) unit root test (Dickey & Fuller 1979) returns a statistic of −5.8, which is below the critical value of −3.6 at the 1% level, implying that the series is stationary. In contrast, breakpoints in mean and variance were detected in the precipitation data of the Tuwei River basin (Figure 2(b)).

Annual measured streamflow data at Mahuyu and Gaojiachuan gauges were obtained from the hydrological bureau of the Yellow River Conservancy Commission (YRCC). The measured runoff series at two hydrological gauges were both obviously nonstationary or inconsistent.

## METHODS

### Pettitt's test for testing single-step change point in mean

Pettitt's test considers a sequence of random variables $X_t$ ($t = 1, \ldots, N$). It tests the null hypothesis, $H_0$: the $X$ variables follow one or more distributions that have the same location parameter (no change), against the alternative hypothesis: a change point exists. Pettitt's test does not detect a change in the distribution if there is no change in location, and thus, the test is only suited for the detection of change in the mean. The test uses a version of the Mann–Whitney statistic, $U_{t,N}$, that tests whether two sample sets, $x_1, \ldots, x_t$ and $x_{t+1}, \ldots, x_N$, are from the same population. The test statistic $U_{t,N}$ is given by (Gao *et al.* 2011):

$$U_{t,N} = U_{t-1,N} + \sum_{i=1}^{N} \operatorname{sgn}(x_t - x_i), \quad t = 2, \ldots, N, \qquad U_{1,N} = \sum_{i=1}^{N} \operatorname{sgn}(x_1 - x_i),$$

where sgn is a sign function:

$$\operatorname{sgn}(x_t - x_i) = \begin{cases} 1, & x_t - x_i > 0 \\ 0, & x_t - x_i = 0 \\ -1, & x_t - x_i < 0 \end{cases}$$

The test statistic $U_{t,N}$ counts the number of times a member of the first sample exceeds a member of the second sample. The maximum of the absolute values, $|U_{t,N}|$, gives the position of a possible change point if $1 \le t < N$. The statistic used in the test is

$$K_{t,N} = \max_{1 \le t < N} |U_{t,N}|,$$

and its significance is judged with the associated significance probability $p$.

A ‘downward shift’ in the level from the beginning of the series is indicated by a large $K^{+}_{t,N} = \max_{1 \le t < N} U_{t,N}$ ($K^{+}_{t,N}$ denotes positive $K_{t,N}$), and an ‘upward shift’ is indicated by a large $K^{-}_{t,N} = -\min_{1 \le t < N} U_{t,N}$ ($K^{-}_{t,N}$ denotes negative $K_{t,N}$; Kropp & Schellnhuber 2011). The change point of the series is located at $K_{t,N}$ if the significance probability $p$ is equal to or greater than 0.95.

Pettitt's test is often considered a good exploratory tool for detecting a change point because it requires no assumption about the distribution of the data and is not sensitive to outliers or skewed distributions (Xie *et al.* 2014). However, it has some limits for application in hydrology. For example, the test works well only for single change point detection, and the assumption of independence or lack of serial correlation should be met before the test is used for the detection of a change point (Busuioc & von Storch 1996). In addition, the test often fails or is invalidated when short hydrological or climate time series are tested. Thus, the over-whitening procedure was employed in this study to remove serial autocorrelation in the hydrological series, and the iterative procedure (Inclan & Tiao 1994) was introduced to perform multiple change point detection, as described in what follows.
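The single change point search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the probability returned is the classical approximation $p \approx 2\exp\{-6K^2/(N^3+N^2)\}$ from Pettitt's original test, which is an assumption here because the exact expression used in this study is not reproduced in the text. Under that approximation a small value signals a significant change.

```python
import math


def sign(value):
    """Sign function: 1, 0 or -1."""
    return (value > 0) - (value < 0)


def pettitt_u(x):
    """Return U_{t,N} for t = 1, ..., N-1 using the recursive form."""
    n = len(x)
    u_values = []
    running = 0
    for t in range(n - 1):
        running += sum(sign(x[t] - x[i]) for i in range(n))
        u_values.append(running)
    return u_values


def pettitt_change_point(x):
    """Locate the most likely single change point in the mean.

    Returns (t_hat, k_stat, p_approx), where t_hat is the 1-based index
    of the last observation before the shift.
    """
    n = len(x)
    u_values = pettitt_u(x)
    k_stat = max(abs(u) for u in u_values)
    t_hat = next(i for i, u in enumerate(u_values) if abs(u) == k_stat) + 1
    # Assumption: classical approximate significance probability.
    p_approx = min(1.0, 2.0 * math.exp(-6.0 * k_stat ** 2 / (n ** 3 + n ** 2)))
    return t_hat, k_stat, p_approx


# Hypothetical series with a downward shift after the fifth value.
series = [10, 10, 10, 10, 10, 1, 1, 1, 1, 1]
t_hat, k_stat, p_approx = pettitt_change_point(series)
```

For this toy series the largest $U_{t,N}$ is positive, which corresponds to the ‘downward shift’ case described above.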

### Iterative procedure for identifying multiple change points

Three algorithms are usually used for identifying multiple change points, that is, binary segmentation (BS; Edwards & Cavalli-Sforza 1965), segment neighborhood (SN; Auger & Lawrence 1989), and pruned exact linear time (PELT; Killick *et al.* 2012), of which BS is the most widely used multiple change point search method. In this study, the BS algorithm was employed together with Pettitt's test for detecting multiple change points. The procedure applies the single change point test statistic to pieces of the series, dividing them consecutively after a possible change point is found, in an effort to isolate each point. The procedure continues until no change points are found in any part of the data. At this point, all possible change points are obtained. However, not all of these points, *CP*(*k*), are true change points because of the masking effect; thus, if there are two or more possible change points, a systematic check of the points found in the previous iterations is performed (Inclan & Tiao 1994). The first step is ordering all possible points, and then Pettitt's test is redone between *CP*(*i* − 1) and *CP*(*i* + 1) to check the point *CP*(*i*). The point is kept if the check result is the same as the previous result; otherwise, it is eliminated. If a new significant point is found in the retest, the new point should be added to *CP*(*k*). This two-stage checking is repeated until the number of change points does not change and the locations do not move by more than a specified amount. Finally, when the algorithm has converged, all the true change points have been found.
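The two stages described above (recursive splitting, then rechecking each candidate between its neighbours) can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names, the `alpha` threshold, and the `min_len` guard are hypothetical, the significance is the classical Pettitt approximation rather than the exact expression used in this study, and convergence is taken as an unchanged set of locations rather than a movement tolerance.

```python
import math


def pettitt(x):
    """Single change point by Pettitt's test: (t_hat, p_approx)."""
    n = len(x)
    best_t, best_k, running = 0, 0, 0
    for t in range(n - 1):
        running += sum((x[t] > v) - (x[t] < v) for v in x)
        if abs(running) > best_k:
            best_t, best_k = t + 1, abs(running)
    # Assumption: classical approximate significance probability.
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, p


def binary_segmentation(x, alpha=0.05, min_len=5):
    """Stage 1: split recursively until no segment shows a change."""
    found = []

    def split(lo, hi):
        if hi - lo < 2 * min_len:
            return
        t, p = pettitt(x[lo:hi])
        if t < 1 or p > alpha:
            return
        found.append(lo + t)
        split(lo, lo + t)
        split(lo + t, hi)

    split(0, len(x))
    return sorted(found)


def recheck(x, change_points, alpha=0.05, max_iter=20):
    """Stage 2: retest each candidate between its two neighbours."""
    current = sorted(change_points)
    for _ in range(max_iter):
        bounds = [0] + current + [len(x)]
        updated = set()
        for i in range(1, len(bounds) - 1):
            lo, hi = bounds[i - 1], bounds[i + 1]
            t, p = pettitt(x[lo:hi])
            if t >= 1 and p <= alpha:
                updated.add(lo + t)
        updated = sorted(updated)
        if updated == current:
            break
        current = updated
    return current


# Hypothetical series with two downward shifts in the mean.
series = [10] * 12 + [5] * 12 + [1] * 12
candidates = binary_segmentation(series)
confirmed = recheck(series, candidates)
```

Each returned value is the 0-based index of the first observation after a shift, so the toy series above is split into three segments of equal length.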

### ICSS for detecting change points in variance

… *et al.* 2019). This algorithm includes two procedures: the centered cumulative sum of squares (CCSS) and the iterative procedure. The CCSS is regarded as the test statistic $D_k$ in this algorithm to estimate the number of changes and the points in time of variance shifts, and is calculated as follows:

$$D_k = \frac{C_k}{C_N} - \frac{k}{N}, \qquad C_k = \sum_{t=1}^{k} x(t)^2, \qquad k = 1, \ldots, N,$$

where $N$ is the length of the series $x(t)$. According to the $D_k$ value, the sudden changes in variance can be identified in the time series as follows: if $D_k$ oscillates around zero, it can be interpreted as no change in the variance over the whole period. If $D_k$ departs from zero, it suggests that there are one or more shifts in variance, and if the maximum of $\sqrt{N/2}\,|D_k|$ exceeds the boundary values that are obtained from the asymptotic distribution of $D_k$, assuming constant variance, a significant shift in variance occurs at $k$. The 5% significance level was selected in the study, and ±1.358 was the asymptotic critical value.

For multiple breakpoints, however, the usefulness of the $D_k$ function is questionable due to the ‘masking effect’. To avoid this, Inclan and Tiao designed an iterative procedure that uses successive application of the $D_k$ function at different points in the time series to look for a possible shift in the hydrological time series. The details of the iterative algorithm have been presented above. In addition, the ICSS algorithm often tends to overestimate the number of breakpoints because the assumption of independence in time series data is usually violated. Thus, the removal of autocorrelation embedded in the time series must be performed.
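A minimal sketch of the centered cumulative sum of squares for a single variance shift is given below. The function names are hypothetical, the input is assumed to be a zero-mean series (such as the mean-removed series used later in this study), and only the single-shift search is shown; the iterative part follows the same split-and-recheck logic described for the mean.

```python
import math


def centered_cusum_squares(x):
    """Return D_k = C_k / C_N - k / N for k = 1, ..., N."""
    n = len(x)
    total = sum(v * v for v in x)
    d_values, running = [], 0.0
    for k, v in enumerate(x, start=1):
        running += v * v
        d_values.append(running / total - k / n)
    return d_values


def variance_change_point(x, critical=1.358):
    """Return (k_star, statistic); k_star is None if not significant.

    The statistic is max_k sqrt(N / 2) * |D_k|, compared with the
    asymptotic 5% critical value of 1.358.
    """
    n = len(x)
    d_values = centered_cusum_squares(x)
    scaled = [math.sqrt(n / 2.0) * abs(d) for d in d_values]
    k_star = max(range(n), key=scaled.__getitem__) + 1
    statistic = scaled[k_star - 1]
    return (k_star if statistic > critical else None), statistic


# Hypothetical zero-mean series whose amplitude triples after k = 40.
series = [1, -1] * 20 + [3, -3] * 20
k_star, statistic = variance_change_point(series)
```

For a series with constant variance, $D_k$ stays near zero and the function returns `None` for the location.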

### Over-whitening procedure for autocorrelation removal

The over-whitening (O-W) procedure introduced above is a new approach developed by Şen (2017) to reduce the original time series serial dependence or to remove the autocorrelation pattern in time series data before the series are tested statistically for trend and change point. It adds a completely random (white noise) time series with zero mean and appropriate standard deviation to the original hydrological time series with the purpose of rendering the original series into a serially independent counterpart with the same trend component. For a smoothed time series, it is true that the procedure does not usually harm the trend component in the original series (Şen 2017). However, a white noise series with a standard deviation from the entire series could alter the original trend and change in some segments with a small standard deviation when the observed hydrological series exhibits nonstationary change. Thus, this study proposes the segment O-W procedure (Zhang *et al.* 2018).

The segment O-W procedure first removes trends from the original time series to eliminate the effect of the trend in the series on the serial correlation. Second, it segments the detrended series in accordance with the variance change and builds various white noise series based on different subseries. Finally, it adds the white noise series to the associated original data segment (including the trend component) to implement the over-whitening.
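The three steps above can be sketched as follows. This is only a rough illustration under several assumptions that are not taken from the source: a single least-squares line stands in for the trend removal, the white noise in each segment is Gaussian with a standard deviation equal to `noise_scale` times the standard deviation of the detrended data in that segment, and `noise_scale`, `seed`, and the function names are hypothetical.

```python
import random
import statistics


def linear_detrend(x):
    """Residuals of x about its least-squares straight line."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x)) / sxx
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]


def segment_over_whiten(x, variance_cps, noise_scale=1.0, seed=0):
    """Add segment-specific white noise to the original series.

    variance_cps holds 0-based indices where a new variance regime starts.
    """
    rng = random.Random(seed)
    residual = linear_detrend(x)
    bounds = [0] + sorted(variance_cps) + [len(x)]
    whitened = list(x)
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Noise level follows the spread of the detrended data in the segment.
        segment_sd = statistics.pstdev(residual[lo:hi])
        for t in range(lo, hi):
            whitened[t] = x[t] + rng.gauss(0.0, noise_scale * segment_sd)
    return whitened


# Hypothetical series with a quieter second half.
series = [5.0, 9.0, 4.0, 10.0, 5.0, 9.0, 6.0, 6.5, 6.0, 6.5, 6.0, 6.5]
sar_like = segment_over_whiten(series, variance_cps=[6])
```

Because the noise is added to the original data, the trend component is retained, and the quieter segment receives proportionally smaller noise than it would if a single standard deviation from the entire series were used.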

### Natural streamflow reconstruction based on streamflow variation identification

The methods of natural streamflow restoration mainly involve hydrological models (such as SWAT and VIC), restoration water volume (RWV) by investigating water consumption from different water departments, and rainfall–runoff models (Blöschl *et al.* 2013). The RWV approach investigating water consumptions of different departments is often influenced by the collected data that may be incomplete and inaccurate so that the result often shows large deviation from real natural runoff. The rainfall–runoff model generally fails in areas where the precipitation–runoff relationship is weak. Hydrological models, especially physically based distributed hydrological models, have good predictive performance, but these models are usually supported by massive amounts of various types of data (Yu *et al.* 2018). In summary, these restoration methods have their own application limits or data requirements, and they cannot be employed in some areas due to their inapplicability or because of data scarcity.

In this study, a simple approach depending only on the variation information of streamflow, namely natural streamflow reconstruction based on streamflow variation identification (NSR-SVI), was proposed to reconstruct the natural streamflow and stable streamflow series under certain human-impacted environments. The NSR-SVI is a purely statistical method which couples the over-whitening procedure, Pettitt's test, the iterative algorithm, and the iterative cumulative sum of squares algorithm. Therein, Pettitt's test was employed to detect the change point in the mean, and the iterative algorithm was used with it to perform multiple change point detection. The over-whitening procedure was introduced to remove the autocorrelation in the observed data prior to change point detection because Pettitt's test assumes serially independent data. In addition, the iterative cumulative sum of squares algorithm, which integrates the iterative algorithm and the cumulative sum of squares algorithm, was applied to detect multiple breakpoints in variance. The reconstruction procedure of the natural streamflow series and stable streamflow, shown in Figure 3, is as follows:

Step 1: The observed streamflow series is divided into trend and residual series by piecewise linear regression, and the partial autocorrelation coefficient function (PACF) test is applied to the residual. If the result is significant, the series needs to be segmented based on the change points in variance obtained by ICSS. Otherwise, the observed series is independent and meets the requirement of Pettitt's test.

Step 2: The segment O-W procedure is employed to remove the serial correlation in the segmented original series, and the significant autocorrelation-removed series (SAR series) can be achieved.

Step 3: Depending on the change point(s) in mean detected by Pettitt's test together with the BS iterative algorithm, the SAR series is segmented, and the segmental mean is obtained. Then, the segmental mean is removed from the SAR series, and a new series with a mean of 0 is generated, namely, a mean-removed series (MR series).

Step 4: The standard deviations of the segmental data before and after the change point in variance are calculated in the MR series. The ratio of the standard deviations before and after the point is then computed, which determines whether the latter part should be magnified or reduced. Then, a reconstructed sequence with 0 mean, or rather, a change in variance-removed sequence (CVR series), can be obtained, in which the latter segment has the same standard deviation as the front part of the series.

Step 5: The natural streamflow or stable streamflow under different environments can be achieved by combining the mean of the SAR series during certain periods and the CVR series.
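Steps 3 to 5 amount to simple arithmetic on segments, which the sketch below illustrates. It assumes the change points are already known (0-based indices where a new regime starts), uses the first variance segment as the reference for rescaling, and takes the base period as an index range; the function names and the toy numbers are hypothetical.

```python
import statistics


def segments(n, change_points):
    """Split range(n) into (start, stop) pairs at the change points."""
    bounds = [0] + sorted(change_points) + [n]
    return list(zip(bounds[:-1], bounds[1:]))


def remove_segment_means(sar, mean_cps):
    """Step 3: subtract each segment's mean to get the MR series."""
    mr = list(sar)
    for lo, hi in segments(len(sar), mean_cps):
        seg_mean = statistics.fmean(sar[lo:hi])
        for t in range(lo, hi):
            mr[t] = sar[t] - seg_mean
    return mr


def equalize_variance(mr, var_cps):
    """Step 4: rescale later segments to the first segment's spread."""
    segs = segments(len(mr), var_cps)
    ref_sd = statistics.pstdev(mr[segs[0][0]:segs[0][1]])
    cvr = list(mr)
    for lo, hi in segs[1:]:
        seg_sd = statistics.pstdev(mr[lo:hi])
        ratio = ref_sd / seg_sd if seg_sd > 0 else 1.0
        for t in range(lo, hi):
            cvr[t] = mr[t] * ratio
    return cvr


def reconstruct(sar, mean_cps, var_cps, base_period):
    """Step 5: add the base-period mean back onto the CVR series."""
    cvr = equalize_variance(remove_segment_means(sar, mean_cps), var_cps)
    lo, hi = base_period
    base_mean = statistics.fmean(sar[lo:hi])
    return [base_mean + v for v in cvr]


# Hypothetical SAR series: mean and spread both halve after index 4.
sar = [10.0, 12.0, 10.0, 12.0, 5.0, 6.0, 5.0, 6.0]
natural = reconstruct(sar, mean_cps=[4], var_cps=[4], base_period=(0, 4))
```

Choosing a different `base_period` yields the stable streamflow series for another level of human influence instead of the natural series.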

## RESULTS AND DISCUSSION

### Removal of serial autocorrelation

The autocorrelation in the streamflow series was detected by a PACF which can provide lag-*n* autocorrelation coefficients without interactive influences. The lag-*n* autocorrelation coefficients of the residual series before and after the autocorrelation removal are presented in Figure 4(a) and 4(b). The result indicates that there is serial correlation in the observed streamflow time series data (Figure 4(a)). Thus, the data series needs to be performed with the segment O-W procedure for removing the serial correlation to ensure the correct detection result can be obtained by Pettitt's test. After the removal of serial correlation, the significant autocorrelation-removed series (SAR series) has no obvious autocorrelation on all lags (Figure 4(b)). This indicates that the segment O-W procedure is efficient in removing serial correlation (especially the higher-lag serial correlation) and the SAR series is considered to be independent and suitable for Pettitt's test.

It should also not be ignored that the trend component in the original series may be harmed by the removal of serial correlation. Such damage has been a concern in applications of the pre-whitening (PW; von Storch 1995) and trend-free pre-whitening (TFPW; Yue *et al.* 2002) procedures. A question in this research was therefore whether the over-whitening procedure reduced the serial dependence of the original time series without harming the trend component. The answer is found in Figure 4(c), which shows the original streamflow time series and the SAR streamflow series. The figure indicates that the SAR series was similar to the original streamflow series, with only a small error, implying that the segment O-W procedure also effectively limits the loss of the trend component or information during the removal of autocorrelation. This shows that the over-whitening procedure can remove the serial correlation in hydrological time series data without significant harm to the trend component.

### Change point detection in mean

Pettitt's test coupled with the BS iterative algorithm was employed in this study for detecting multiple change points in the mean. The SAR streamflow series at Mahuyu gauge was checked, as shown in Figure 5.

Figure 5(a) and 5(b) show that two change points in mean were detected in the SAR streamflow series at Mahuyu gauge, occurring in 1971 and 1997, respectively. The corresponding statistics were at their maxima and the probabilities were greater than 0.90 at the change points, indicating that the two change points were both significant. However, only the change point in 1997 was found when Pettitt's test was used alone. This confirms that Pettitt's test works well in single change point detection and needs to be coupled with the iterative algorithm or another procedure if employed for multiple change point detection.

Because the ICSS algorithm requires that the tested series has no change in trend, the segmental mean was removed from the SAR streamflow series at Mahuyu gauge (Figure 6(a)). The MR series was used for the change point detection in variance (Figure 6(b)).

### Change point detection in variance

Figure 7 shows the result of variance detection and only a single change point occurring in 1979 was found in the MR series at Mahuyu gauge.

According to the location of change point in variance, the variation information in variance of the SAR streamflow series can be obtained, as shown in Figure 8(a). Then, variance magnification was implemented in the latter part, and a CVR series was generated (Figure 8(b)).

### Reconstruction of the natural streamflow

Three base periods representing different levels of human impact were determined from the detected change points in mean, that is, 1962–1971, 1971–1997, and 1997–2010, considering that no variation occurred in the precipitation series of the Mahuyu River basin. Therein, the period of 1962–1971 represents the phase in which few human activities influenced the hydrological processes, close to the natural streamflow, and the periods of 1971–1997 and 1997–2010 were designated as two other phases with different levels of human influence. Depending on these phases, the natural streamflow and stable streamflow series under different human-impacted environments can be reconstructed, representing hydrological change without or with human interference. Considering the need to verify the reconstructed result, the first phase was selected in this study (Figure 9(a)), and the mean streamflow during this period was combined with the CVR series to reconstruct the natural streamflow in 1962–2010, namely the NSR-SVIed streamflow, as shown in Figure 9(b).

### Model verification

To verify the statistical reliability of the reconstructed result, the Kolmogorov–Smirnov test was introduced. The result showed that the latter segment of the data (after 1971) was not significantly different from the earlier data, indicating that the reconstructed natural streamflow data had the same distribution and satisfied the consistency or stationarity assumption required in hydrological design.

In addition, the precipitation–runoff scatter diagram was employed to compare, and that of the measured and NSR-SVIed streamflow series is drawn together in Figure 10. As shown in this figure, the original series before 1971 was relatively stationary and the precipitation–runoff relationship was significantly linear (*R*^{2} = 0.65) under the influence of small-scale human interference. Compared with the measured data after 1971, it can be found that the reconstructed or NSR-SVIed natural data exhibited a better correlation with precipitation, most of which versus precipitation fell within the interquartile range of the measured streamflow series before 1971, confirming the acceptability of the NSR-SVI approach in reconstructing natural streamflow.

As is well known, the violin plot, which combines a boxplot with a kernel density plot, has an advantage in revealing the distribution of data. Figure 11 shows the violin plots of the two natural streamflow series obtained by the NSR-SVI approach and the precipitation–runoff (P-R) model. It can be seen from the figure that the proposed method resulted in a distribution similar to that of the P-R model, implying that the statistical approach depending on flow variation performs comparably to the P-R model in natural streamflow reconstruction.

The above triple comparisons demonstrate that the proposed statistical method depending on flow variation is suitable for the reconstruction of natural streamflow series, and can serve the hydrologic design of water resource engineering. In application prospects, it can serve as a feasible method for areas where the hydrological models, restoring the water volume, etc. cannot be used due to the lack of data, and may also be considered as a cross-validation approach employed in a basin with sufficient data.

### Model application

As mentioned above, the NSR-SVI method has been shown to provide an alternative for rebuilding the natural streamflow series. In addition, a series of extended applications can be conducted based on the reconstructed streamflow. This study mainly explored the joint application of NSR-SVI with the cumulative curve for quantifying the contribution of precipitation variation and anthropogenic interference to streamflow decrease in the Tuwei River basin, aiming to guide catchment adaptive water resources management (Li *et al.* 2020).

Specifically, a scenario considering precipitation variation was set up, different from natural (without precipitation variation and human impact) and measured streamflow (with precipitation and human impacts). And a streamflow under precipitation variation scenario or a precipitation variation-impacted (PV-impacted) streamflow can be reconstructed based on the natural streamflow, depending on the variation information detected in the observed precipitation. Furthermore, the accumulations of natural streamflow and PV-impacted streamflow were computed and drawn, respectively. Finally, the contribution of precipitation variation to streamflow change was estimated through the deviation between cumulative curves.
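The comparison of cumulative curves can be illustrated with a short sketch. It assumes that the contribution of precipitation variation is read as the final gap between the cumulative natural and cumulative PV-impacted curves, and that whatever remains of the total gap between natural and measured streamflow is assigned to human interference; this split, the function name, and the toy numbers are assumptions made for illustration.

```python
from itertools import accumulate


def cumulative_attribution(natural, pv_impacted, measured):
    """Split the total streamflow decrease using cumulative curves."""
    cum_natural = list(accumulate(natural))
    cum_pv = list(accumulate(pv_impacted))
    cum_measured = list(accumulate(measured))
    total_decrease = cum_natural[-1] - cum_measured[-1]
    by_precipitation = cum_natural[-1] - cum_pv[-1]
    # Assumption: the remainder is attributed to human interference.
    by_human = total_decrease - by_precipitation
    return {
        "total": total_decrease,
        "precipitation": by_precipitation,
        "human": by_human,
        "precipitation_share": by_precipitation / total_decrease,
    }


# Hypothetical annual streamflow values in arbitrary volume units.
result = cumulative_attribution(
    natural=[10, 10, 10, 10],
    pv_impacted=[10, 10, 8, 8],
    measured=[10, 9, 6, 5],
)
```

The year in which each cumulative curve starts to depart from the natural curve can be read from the same accumulated lists.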

Therein, the calculation of PV-impacted streamflow in the Tuwei River basin proceeds as follows: (1) The natural streamflow series at Gaojiachuan gauge was reconstructed by the NSR-SVI approach. (2) The change point in mean of the basin precipitation series was tested and found in 1978, as shown in Figure 12(a). The precipitation variation in mean (PV_{m}) can be expressed as the ratio of the segmental means (MP_{1} and MP_{2}) before and after 1978. The latter PV_{m}-impacted streamflow was then adjusted by the equation MR_{2} = MR_{1}*MP_{2}/MP_{1} in Figure 12(b), where MR_{1} denotes the mean of the natural streamflow sequence before 1978 and MR_{2} represents the adjusted mean. (3) The segmental means of the precipitation series and those of the natural streamflow series were removed, respectively, to obtain the MR series, including MR precipitation (Figure 12(c)) and MR natural streamflow. (4) The change point in variance of the MR precipitation series was detected in 1968 (Figure 12(c)), and the influence of the precipitation variation in variance (PV_{v}) can be applied to the segmental streamflow after the point by the ratio of the segmental standard deviations (SP_{1} and SP_{2}) before and after 1968. Then, the PV_{v}-impacted streamflow can be constructed by the equation SS_{2} = SS_{1}*SP_{2}/SP_{1} in Figure 12(d), where SS_{2} indicates the standard deviation of the constructed PV_{v}-impacted streamflow and SS_{1} that of the natural streamflow. (5) The PV_{v}-impacted streamflow was combined with the segmental mean in the PV_{m}-impacted streamflow, and the streamflow under precipitation variation, that is, PV-impacted streamflow, can be constructed, as shown in Figure 12(e).
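The five steps above reduce to two ratio adjustments, which can be sketched as follows. The function name and all numbers are hypothetical; the sketch assumes the natural series has the same mean before and after the mean change point, and that change points are given as 0-based indices of the first affected year.

```python
import statistics


def pv_impacted_streamflow(natural, mean_cp, var_cp, mp1, mp2, sp1, sp2):
    """Apply precipitation changes in mean and variance to natural flow.

    mp1, mp2: precipitation means before and after the mean change point.
    sp1, sp2: precipitation standard deviations before and after the
              variance change point.
    """
    mr1 = statistics.fmean(natural[:mean_cp])   # natural mean before the shift
    mr2 = mr1 * mp2 / mp1                       # MR2 = MR1 * MP2 / MP1
    seg_mean_before = mr1
    seg_mean_after = statistics.fmean(natural[mean_cp:])
    result = []
    for t, value in enumerate(natural):
        # Deviation from the segmental mean (the MR natural streamflow).
        deviation = value - (seg_mean_after if t >= mean_cp else seg_mean_before)
        if t >= var_cp:
            deviation *= sp2 / sp1              # SS2 = SS1 * SP2 / SP1
        result.append((mr2 if t >= mean_cp else mr1) + deviation)
    return result


# Hypothetical natural series; variance shift at index 2, mean shift at 4.
natural = [10.0, 12.0, 10.0, 12.0, 10.0, 12.0, 10.0, 12.0]
pv_series = pv_impacted_streamflow(
    natural, mean_cp=4, var_cp=2, mp1=400.0, mp2=300.0, sp1=100.0, sp2=50.0
)
```

In this toy case the spread halves from the third value onward and the mean drops by a quarter from the fifth value onward, mirroring the order of the 1968 and 1978 change points.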

Figure 12(f) compares the natural and PV-impacted streamflow. Before 1968, the box of the PV-impacted streamflow is the same as that of the natural streamflow, implying a stable phase without precipitation variation. During 1968–1978, the quartile range of the PV-impacted series is obviously smaller than that of the natural series, indicating the impact of a significant abrupt change in the variance of precipitation. After 1978, the decreases in both the quartile range and the mean of the PV-impacted streamflow demonstrate the coupled impacts of precipitation variations in variance and mean.

To verify the validity of PV-impacted streamflow, the precipitation–runoff scatter diagram was employed and the result is shown in Figure 13.

Figure 13 shows that the data points of PV-impacted streamflow versus precipitation were basically distributed around the cyan trend line before 1968, implying the stationarity of the streamflow in the natural period. However, the orange marks were mostly clustered below the cyan dashed line, which is consistent with the fact mentioned above that the variance and mean of the precipitation data decreased abruptly after 1968 and 1978, respectively. Thus, the above analysis indicates that the calculated PV-impacted streamflow is reasonable and the method used is feasible.
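A check of this kind can be sketched numerically. The example below assumes the reference trend line is an ordinary least-squares line fitted to the points up to the split year, which the text does not state. The function name `fraction_below_reference_line` and the data are hypothetical.

```python
import numpy as np

def fraction_below_reference_line(years, precip, flow, split_year):
    """Share of later points lying below a line fitted to the earlier points."""
    years = np.asarray(years)
    precip = np.asarray(precip, dtype=float)
    flow = np.asarray(flow, dtype=float)
    before = years <= split_year

    # Fit flow = slope * precip + intercept on the earlier period only.
    slope, intercept = np.polyfit(precip[before], flow[before], deg=1)

    # Residuals of the later points relative to the extended line.
    residual_after = flow[~before] - (slope * precip[~before] + intercept)
    return float(np.mean(residual_after < 0))

# Illustrative values only: the later flows are shifted 5 units below the early relation.
demo_years = np.arange(1965, 1973)
demo_precip = np.array([400.0, 420.0, 380.0, 410.0, 390.0, 360.0, 350.0, 340.0])
demo_flow = 0.25 * demo_precip
demo_flow[demo_years > 1968] -= 5.0
share_below = fraction_below_reference_line(demo_years, demo_precip, demo_flow, 1968)
```

A share close to one corresponds to the clustering below the dashed line described for Figure 13.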

To further assess the contribution of precipitation variation to the streamflow decrease, the cumulative streamflow curve, a simple but comprehensive graphical method, was applied in this study. Figure 14 shows the annual accumulations of the natural streamflow reconstructed by NSR-SVI, the PV-impacted streamflow, and the measured streamflow at the Gaojiachuan hydrological gauge. It is apparent in the figure that the accumulation curve of the natural streamflow by NSR-SVI was nearly a straight line, indicating that the series was stationary and implying the validity of the NSR-SVI approach in natural streamflow reconstruction. The two other curves, representing the observed and PV-impacted streamflow, were found to deviate from the natural series after 1971 and 1978, respectively. It follows from the deviations that the streamflow in the Tuwei River basin was mainly affected by human activities and the changed underlying surface conditions during 1971–1978, while climate change started to play an important role in the streamflow decrease after 1978. The calculation result showed that streamflow decreased by approximately 4.83 billion m^{3} over the past five decades (1956–2005), including the total influence of precipitation variation and human interference, that is, Δ*R*_{P} + Δ*R*_{H} in the figure. Based on the accumulation curves, the contribution of precipitation variation to the streamflow decrease was obtained by Δ*R*_{P}/(Δ*R*_{P} + Δ*R*_{H}), accounting for 44% of the total decreased streamflow, and anthropogenic interference (e.g., human-induced water withdrawal and underlying surface change) contributed 56% of the streamflow reduction. This result was consistent with the study in the nearby Kuye River basin (Guo *et al.* 2016), which indicates that 43.50% of the runoff reduction is attributed to climate variations and 56.50% to human activities in strongly human-induced periods.
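The attribution from the cumulative curves can be sketched as follows. This is one plausible reading rather than the authors' implementation: Δ*R*_{P} is taken as the end-of-record gap between the natural and PV-impacted accumulations, and Δ*R*_{H} as the gap between the PV-impacted and measured accumulations. The function name `attribute_decrease` and the numbers are hypothetical.

```python
import numpy as np

def attribute_decrease(natural, pv_impacted, observed):
    """Return the shares of the total decrease due to precipitation and to human impact."""
    cum_natural = np.cumsum(natural)
    cum_pv = np.cumsum(pv_impacted)
    cum_observed = np.cumsum(observed)

    # Deviations between the cumulative curves at the end of the record.
    delta_rp = cum_natural[-1] - cum_pv[-1]     # precipitation variation
    delta_rh = cum_pv[-1] - cum_observed[-1]    # human interference
    total = delta_rp + delta_rh

    return delta_rp / total, delta_rh / total

# Illustrative annual volumes only (not the Gaojiachuan data).
demo_natural = np.array([10.0, 10.0, 10.0, 10.0, 10.0])
demo_pv = np.array([10.0, 10.0, 9.0, 8.0, 8.0])
demo_observed = np.array([10.0, 9.0, 7.0, 6.0, 5.0])
share_p, share_h = attribute_decrease(demo_natural, demo_pv, demo_observed)
```

With this definition the two shares always sum to one, which matches the complementary 44% and 56% split reported for the basin.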

## CONCLUSIONS

In this study, a new statistical approach named NSR-SVI, based on the conjunctive use of the over-whitening procedure, Pettitt's test method, and the ICSS algorithm, was proposed to detect streamflow variation (i.e., abrupt changes in mean and variance) and to reconstruct natural streamflow. Results showed that (1) the segmented over-whitening procedure was applicable to remove the serial correlation in hydrological time series without significant harm to the trend component; (2) Pettitt's test method coupled with the iterative algorithm and ICSS accurately detected multiple change points in mean and variance and gave their locations; (3) the NSR-SVI approach, depending only on flow variation, was suitable for the reconstruction of natural streamflow series and can provide a series of stable streamflow under various anthropogenic interferences, so as to better serve areas where hydrological models, the restoring water volume method, and rainfall–runoff models cannot be used due to the lack of data. It can also be used as a cross-validation approach for hydrologic design in a basin with abundant data.

Additionally, the joint application of NSR-SVI with the cumulative curve was investigated in this study to quantify the contributions of precipitation variation and anthropogenic interference to the streamflow decrease in the basin. The quantified contributions of precipitation variation and anthropogenic interference were 44 and 56%, respectively, in the Tuwei River basin, consistent with the published results for the nearby Kuye River basin. This implies that the proposed NSR-SVI approach combined with the cumulative streamflow curve can provide an innovative way to investigate the attributions of catchment streamflow variation.

## ACKNOWLEDGEMENTS

This research is supported by the National Natural Science Foundation of China (51979005 and 51809005), the Natural Science Basic Research Program of Shaanxi (2020JM-250), and the State Key Laboratory of Eco-hydraulics in Northwest Arid Region, Xi'an University of Technology (2018KFKT-4). We cordially thank the editor and anonymous reviewers for their pertinent and professional suggestions and comments, which greatly helped to improve the quality of this paper.

## AUTHOR CONTRIBUTIONS

C. D.: validation, formal analysis, data curation, writing-original draft. H. Z.: conceptualization, supervision. V. P. S.: writing-review and editing, supervision. T. Z.: visualization. J. Z.: investigation. H. D.: validation.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.