Rain gauge networks provide direct precipitation measurements and have been widely used in hydrology, synoptic-scale meteorology, and climatology. However, rain gauge observations are subject to a variety of error sources, and quality control (QC) is required to ensure their reasonable use. To enhance the automatic detection of anomalies in data, a novel multi-source data quality control (NMQC) method is proposed for hourly rain gauge data. It employs a phased strategy to reduce the misjudgment risk caused by the uncertainty of radar and satellite remote-sensing measurements. NMQC is applied to the QC of hourly gauge data from more than 24,000 hydro-meteorological stations in the Yangtze River basin in 2020. The results show that the detection ratio of anomalous data is 1.73‰, and only 1.73% of these anomalous data are suspicious data that need to be confirmed by experts. Moreover, the distribution characteristics of the anomaly data are consistent with the climatic characteristics of the study region as well as with the measurement and maintenance modes of the rain gauges. Overall, NMQC has a strong ability to label anomaly data automatically while identifying a low proportion of suspicious data. It can greatly reduce manual intervention and shorten the time during which anomaly data affect operational work.

  • The quantitative and fused application of radar and satellite data was addressed for the quality control (QC) of hourly rain gauge data.

  • Precipitation quality anomaly events were defined to improve the traceability of QC.

  • A phased QC strategy was adopted to reduce the misjudgment risk caused by the uncertainty of remote-sensing measurements.

Precipitation data are essential for numerous operational applications in hydrology, synoptic-scale meteorology, and climatology such as hydrological modeling (Mohammadi et al. 2024), weather forecasts (Imhoff et al. 2023), flood warning (Ma et al. 2024), drought forecasting (Mohammadi 2023), and decision-making service (Hassani et al. 2023). Precipitation data can be obtained by different means, e.g., rain gauge station, radar, and satellite. Among them, rain gauges provide direct precipitation measurements, which are routinely used as ground truth; radar and satellite precipitation, by contrast, are obtained based on the modern remote-sensing techniques, which have to be calibrated against the rain gauge data (Nešpor & Sevruk 1999; Martinaitis 2008; Qi et al. 2016; WMO 2021). In general, the direct precipitation measurements provided by gauge networks have higher accuracy than remote-sensing measurement systems. However, rain gauge observations are also subject to inaccuracies caused by random and systemic errors, and the main causes include wind-induced error, wetting and evaporation losses, instrumentation malfunctions, transmission errors, poor observation environment, and mistakes made during data processing (Groisman & Legates 1994; Nešpor & Sevruk 1999; Nešpor et al. 2000; Adam & Lettenmaier 2003; Yang et al. 2005; Baltas et al. 2016).

Quality control (QC) is the best known component of quality management systems to ensure the highest possible reasonable standard of accuracy for the optimum use of these data by all possible users (Zahumenský 2004). Although many sophisticated QC procedures have been carried out in various hydrological and meteorological research projects (Ren et al. 2010; Qi 2015; Blenkinsop et al. 2017), QC of rain gauge data has been a challenge, especially at hourly or sub-hourly scales, because of their high spatial and temporal variability with skewed intensity spectra (Li & Sun 2021; Sha et al. 2021).

In this study, the existing QC procedures are divided into two types: single-source QCs and multi-source QCs. Single-source QCs mainly use data from the checked rain gauge station or its neighbor stations, and include limit value checks, station or regional extreme value checks, internal consistency checks, time consistency checks, and spatial consistency checks (Upton & Rahimi 2003; Kondragunta & Shrestha 2006; Ren et al. 2010; Schneider et al. 2014; Blenkinsop et al. 2017). Although these procedures have played an important role in many projects, most of the data they label as suspected anomalies require further confirmation by human visual checks. With the continued growth of data volume, these procedures are resource-intensive and can cause delays. Meanwhile, radar and satellite products are being used in an increasing number of applications because of their wide spatial and temporal coverage (Lengfeld et al. 2020; Adane et al. 2021; Liu et al. 2021; Thiruvengadam et al. 2021; Gebremicael et al. 2022; Zhao et al. 2022). On this basis, some multi-source QCs have been developed to increase the efficiency of gauge precipitation QC by introducing data from different observation systems (Hill 2013; Qi & Zhang 2013; Qi 2015; Qi et al. 2016; Zhao et al. 2018; Sha et al. 2021), and some new technologies have been adopted for the automatic labeling of anomalies in data, e.g., decision trees (Qi et al. 2016), neural networks (Zhao et al. 2018), and deep learning (Sha et al. 2021). However, analysis shows that the effectiveness of these methods depends on the support of multi-source data.

The existing multi-source QCs mainly focus on the use of radar data, and there are few studies on the comprehensive application of radar and satellite. The accuracy of ground-based radar data is easily affected by complex terrain and its usage is limited in areas of poor or no radar coverage. However, satellites, especially geostationary satellites, can prove to be an excellent data source providing high spatial and temporal resolutions for regions where radar networks are missing or unevenly distributed. In this paper, a novel multi-source data quality control method called NMQC is proposed for the hourly rain gauge data by the quantitative and fusion application of radar and satellite data. Notably, although NMQC is designed to be as automatic as possible to improve timeliness and reduce the human workload, it is not fully automatic owing to the skewness of precipitation frequency distribution. The contributions are summarized as follows:

  • Precipitation quality anomaly event (PQAE) is defined with massive data exploration and analysis to improve the traceability management of QC, and several types of QC-oriented parameters are designed to support the implementation of related algorithms based on the multi-source data.

  • A phased strategy is applied to logically divide QC procedures into two steps of PQAE detection and PQAE diagnosis to reduce the misjudgment risk caused by the uncertainties from radar and satellite remote-sensing measurements.

  • A hybrid QC mode combining automatic and human-based processing is retained to avoid the incorrect elimination of rare extreme precipitation values, which behave similarly to spurious outliers in fully automatic QC procedures; these true extremes are very important for describing the variability of precipitation.

Study area description

The study region is the Yangtze River Basin (24.50°–35.75°N, 90.55°–122.42°E) with a total area of ∼1.8 million km², accounting for nearly 19% of China's land area. With a length of over 6,300 km, the Yangtze River is the longest river in Asia and the third longest river in the world. The Yangtze River Basin stretches from the eastern Tibetan Plateau to the East China Sea with a wide range of climate variability and diverse ecosystems (Zhang et al. 2014). Affected by its unique geographic location, special land-sea thermal differences, and seasonal variations of atmospheric circulation, the Yangtze River Basin is a typical East Asian monsoon region that is sensitive and vulnerable to climate changes, characterized by simultaneous rainy and hot weather in summer, and cool and dry conditions in winter. In recent years, against the background of climate warming, the annual precipitation in the Yangtze River Basin shows a trend of dry and wet polarization, and extreme precipitation events occur frequently, which brings new challenges to the QC of precipitation data (Lin et al. 2021; Cheng et al. 2022).

Three types of data are adopted in this study. The station data are from more than 24,000 automated hydro-meteorological observation stations, as shown in Figure 1, including the hourly precipitation, air temperature, weather phenomena, and station metadata. Satellite data are a set of L2 processing level nominal grid products obtained from the FengYun-4A (FY-4A), which is the first flight unit of the second-generation geostationary meteorological satellite in China (Lu et al. 2017; Shao et al. 2020), including Cloud Top Temperature (CTT), CLoud Type (CLT), and CLoud Mask (CLM). FY-4A products are updated every 15 min (or intensive observation of 7 times per hour) at a spatial resolution of 0.04° × 0.04°. Moreover, Cloud Total Amount (CTA) product is used for auxiliary cloud parameters, which are produced by China's 3D Cloud Analysis System and updated every 1 h at a spatial resolution of 0.05° × 0.05° (Shi et al. 2019). Radar data comprise the Composite Reflectivity (CR) product, which is produced by Severe Weather Automatic Nowcasting System and updated every 6 min at a spatial resolution of 0.01° × 0.01° (Han & Wo 2018). The time range of data is from January to December 2020. The hourly precipitation is the checked data for QC and others are auxiliary.
Figure 1

Spatial distribution map of hydro-meteorological observation stations in the Yangtze River Basin.

For the convenience of data pre-processing, a quality control reference domain (QCRF) is adopted to decide the valid spatial and temporal ranges of each data source and to make the best spatial-temporal match between the hourly rain gauge data and the gridded multi-source data. The hourly rain gauge data are accumulations over the hour, and the rainfall is usually caused by the precipitation clouds around the checked station. According to the neighborhood principle, the valid spatial range of station data is a circular area with a given radius, and the valid spatial range of grid products is a rectangular area extending a given number of steps around the grid cell in which the checked station is located. For a given checked station, the QCRF is defined as follows:
formula
(1)
where is the time range, is the hour of QC time, is the previous hour; is the spatial range of the station data, is the given radius; is the spatial range of grid products, is the spatial resolution, is the total grid cells, is the given extending steps; and is the temporal frequency of grid product. It is clear that , , and are known for the given QC time and product, only and are uncertain. In this study, they are obtained by extensive experiments, and the results show that NMQC works best when n is 3 for FY-4A and CTA products, n is 1 for CR product, and is from 50.0 to 70.0 with unit of km.
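As a rough illustration of the reference domain, the sketch below selects the neighbor stations inside a circular radius and the grid cells in a window extending n steps around the cell that contains the checked station. The function names, the haversine distance, the south-west grid origin, and the (2n + 1) × (2n + 1) window size are assumptions made for illustration, not the implementation used in NMQC.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighbor_stations(checked, stations, radius_km):
    """Ids of stations within radius_km of the checked station (itself excluded)."""
    return [
        s["id"]
        for s in stations
        if s["id"] != checked["id"]
        and haversine_km(checked["lat"], checked["lon"], s["lat"], s["lon"]) <= radius_km
    ]


def grid_window(lat, lon, lat0, lon0, res, n, n_rows, n_cols):
    """(row, col) indices of the window extending n cells around the station's cell.

    lat0/lon0 are the coordinates of the grid origin (assumed south-west corner);
    the window is clipped at the grid edges.
    """
    row = int((lat - lat0) / res)
    col = int((lon - lon0) / res)
    rows = range(max(row - n, 0), min(row + n, n_rows - 1) + 1)
    cols = range(max(col - n, 0), min(col + n, n_cols - 1) + 1)
    return [(r, c) for r in rows for c in cols]
```

With n = 3 the window holds 49 cells away from the grid edges, and with n = 1 it holds 9; the radius would be chosen from the 50.0 to 70.0 km range reported above.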

Definition of precipitation quality anomaly event

The concept of anomaly detection is introduced to improve the traceability of QC. The occurrence of anomalous precipitation data with large deviations from real precipitation is called a precipitation quality anomaly event (PQAE). Two categories, real-time and non-real-time events, are defined by comprehensively analyzing the causes, spatial-temporal distribution characteristics, and application sensitivity of a large number of anomaly data. Real-time events are gross errors in data, which have a great impact on real-time applications such as weather forecasts and warnings and can usually be labeled from the data of a single time. Non-real-time events are systematic errors in data; eliminating them provides high-quality data support for non-real-time applications such as climate analysis and disaster assessment, and they must be labeled by analyzing the data changes over a period of time. NMQC focuses on the processing of real-time events, and non-real-time events are to be discussed separately. Real-time events are further subdivided into five types for more refined analysis, as presented in Table 1.

Table 1

Definition of real-time precipitation quality anomaly events

Event code | Event name | Event definition
CSPQAE | Clear sky precipitation quality anomaly event | Anomalous precipitation is observed on a clear day; there is no limit on the amount of precipitation.
PSPQAE | Pseudo small precipitation quality anomaly event | Anomalous precipitation is caused by the equipment tipping without rain, or by the gathering or melting of fog, dew, frost, and snow; the amount of precipitation ranges from 0.1 to 0.3 mm.
IPQAE | Isolated precipitation quality anomaly event | Only the checked station has precipitation within a specific spatial range; the amount of precipitation is greater than 0.3 mm.
SLPQAE | Single larger precipitation quality anomaly event | Precipitation is significantly greater than that of all neighbor stations within a specific spatial range; the amount of precipitation is greater than or equal to 1.0 mm.
LPQAE | Larger precipitation quality anomaly event | Precipitation is significantly greater than that of neighbor stations or historical extremes within a specific spatial range; the amount of precipitation is greater than or equal to 1.0 mm.

Calculation of QC-oriented parameters

Quality control factor

Quality control factors (QCFs) are designed based on station, radar, and satellite data with massive amounts of data exploration and analysis, which are sensitive to the quality confirmation of hourly rain gauge data. Station data are mainly used to design QCFs to characterize the precipitation extremes and spatial distribution around the checked station.

For radar data, although quantitative precipitation estimation (QPE) can be directly compared with the rain gauge data, it is usually calibrated by rain gauge data in real time to improve the accuracy of radar QPE (Koch et al. 2005; Xiao et al. 2008). Considering the coupling between rain gauge data and QPE product, only CR product is adopted, and QPE is directly estimated based on CR product according to the classical Z–R relationship (Fang et al. 2018).
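To make the reflectivity-to-rainfall step concrete, the sketch below inverts a Z–R power law, Z = a·R^b, for a reflectivity given in dBZ. The default coefficients a = 200 and b = 1.6 are the widely quoted Marshall–Palmer values and are used here only as an assumption; the coefficients that NMQC applies to convective, stratiform, and warm cloud precipitation are not given in this section. The function names and the averaging of scans into an hourly amount are likewise illustrative.

```python
def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b for the rain rate R (mm/h), with reflectivity in dBZ."""
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity factor Z
    return (z_linear / a) ** (1.0 / b)


def hourly_qpe(dbz_scans, a=200.0, b=1.6):
    """Approximate the hourly accumulation (mm) as the mean rain rate of the scans
    within the hour; returns None when no scan is available."""
    if not dbz_scans:
        return None
    rates = [rain_rate_from_dbz(d, a, b) for d in dbz_scans]
    return sum(rates) / len(rates)
```

For example, a constant 40 dBZ over the ten 6-min scans of an hour gives roughly 11.5 mm under these assumed coefficients.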

For satellite data, CTT, CLM, CLT, and CTA products are used because the satellite rainfall estimation cannot be used directly in conjunction with gauge data, especially at hourly scale (Hughes 2006). Studies have shown that these parameters are closely related to ground precipitation (Liu et al. 2009; Han et al. 2011; Kim & Kwon 2011; Yuan & Hu 2015; Jin et al. 2018; Ombadi et al. 2021). The rainfall intensity has a good corresponding relationship with CTT, and CTT gradually decreases and its change range becomes more concentrated along with the increase of rainfall intensity, but CTT is less sensitive to small precipitation. Thus, CLM, CLT, and CTA are introduced to enhance the robustness of satellite QCFs when precipitation is small or not obvious.

The set of QCFs can be denoted as , which are calculated based on QCRF of the corresponding data or products. measures the history precipitation extremes around the checked station, represents the spatial frequency of stations where non-zero precipitation is observed, is the maximum of radar CR, is the average of QPE for the Z–R relationship of different precipitation clouds, is the minimum of CTT, is the average of CTA, is the cloud type with the highest proportion, represents the probability of clear sky, and the higher value corresponds to the lower probability of rainfall.

For the given checked station and QC time, denotes the time order, denotes the order number of grid cells, and QCFs can be described as follows
formula
(2)
formula
(3)
where , , and are the latitude, longitude, and altitude of station, is the month of QC time, is the extreme matching function and is used to obtain the monthly maximum hourly precipitation based on the defined QC parameter file, is the number of neighbor stations, and is the number of neighbor stations with rainfall. To reduce the boundary-effect influence on the stations located at the edge of the precipitation system, is calculated based on QCRF with and QCRF with denoted by and , respectively.
formula
(4)
formula
(5)
where is the maximum CR at time , is the CR of grid cell at time ; , , and is the hourly QPE of the Z–R relationship of convective, stratiform, and warm cloud precipitation, respectively.
formula
(6)
formula
(7)
where is the CTT of grid point at time ; is the cloud total amount of grid point .
The CLT product divides clouds into seven types: Clear, Water, Super Cooled, Mixed, Ice, Cirrus, and Overlap, which are abbreviated as , , , , , , and , respectively. Usually, precipitation will not occur in the area of Clear and Cirrus clouds. The valid values of CLM product include Cloud, Probably Cloud, Probably Clear, and Clear. And it is believed that precipitation will not occur in the area marked Clear.
formula
(8)
formula
(9)
where is the set of cloud types, s is the member of S, is the amount of s grid cell at time t; , is the amount of clear grid cell at time t; , is the amount of Clear cloud grid cell at time t, and is the amount of Cirrus cloud grid cell at time t.
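The radar and satellite factors described above reduce to simple aggregations over the grid cells and scan times of the reference domain. The sketch below shows one possible reading using nested lists (one inner list per scan time). All function names are hypothetical, and the clear-sky score in particular is an assumed stand-in (the mean of the CLM "Clear" fraction and the CLT Clear/Cirrus fraction), not the formula of Equation (9).

```python
from collections import Counter

# Cloud types that the text treats as non-precipitating.
NON_RAIN_TYPES = {"Clear", "Cirrus"}


def _flatten(scans):
    """Flatten a list of per-scan cell lists into one list of cell values."""
    return [v for scan in scans for v in scan]


def max_cr(cr_scans):
    """Maximum composite reflectivity over all scans and cells in the window."""
    return max(_flatten(cr_scans))


def min_ctt(ctt_scans):
    """Minimum cloud top temperature over all scans and cells in the window."""
    return min(_flatten(ctt_scans))


def mean_cta(cta_cells):
    """Average cloud total amount over the cells in the window."""
    return sum(cta_cells) / len(cta_cells)


def dominant_cloud_type(clt_scans):
    """Cloud type with the highest share of cells."""
    return Counter(_flatten(clt_scans)).most_common(1)[0][0]


def clear_sky_score(clm_scans, clt_scans):
    """Assumed clear-sky score in [0, 1]; higher means rainfall is less likely."""
    clm = _flatten(clm_scans)
    clt = _flatten(clt_scans)
    clm_clear = sum(1 for v in clm if v == "Clear") / len(clm)
    clt_clear = sum(1 for v in clt if v in NON_RAIN_TYPES) / len(clt)
    return 0.5 * (clm_clear + clt_clear)
```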

Quality control decision index

To relax the accuracy requirement for QCFs and reduce the complexity of the algorithms, a hierarchical fuzzy matching strategy is adopted to convert the QCFs of radar and satellite, respectively, into a quality control decision index, which is divided into 10 grades from 0 to 9 to match the pre-made hourly precipitation grade. It should be noted that the unit of hourly precipitation is millimeters (mm). In detail, 0 indicates that it may be a clear sky, 1 indicates it may be rain, and 2 ∼ 9 respectively indicate that the hourly precipitation is in the interval [0.1,1.0), [1.0,2.5), [2.5,10.0), [10.0,20.0), [20.0,30.0), [30.0,40.0), [40.0,60.0), and ≥60.0. In addition, a default value M is set to improve the robustness of the algorithms when data are missing or the conversion is not feasible. The conversion rules of the quality control decision index are depicted in Figure 2, where QCDIS and QCDIR denote the quality control decision index for satellite and radar, respectively. The diagnosis rules are designed based on QCDIS and QCDIR.
Figure 2

Conversion rules of QCDIS and QCDIR.
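The grade intervals listed above can be expressed as a small lookup. The sketch below maps an hourly precipitation amount in mm to grades 2 to 9; the function name and the choice to return None for amounts below 0.1 mm or for missing values are assumptions, since grades 0 and 1 and the default value M depend on the conversion rules in Figure 2.

```python
from bisect import bisect_right

# Lower bounds (mm) of grades 2..9, taken from the intervals listed in the text.
GRADE_LOWER_BOUNDS = [0.1, 1.0, 2.5, 10.0, 20.0, 30.0, 40.0, 60.0]


def precipitation_grade(pre_mm):
    """Map hourly precipitation (mm) to grade 2..9; None if missing or below 0.1 mm."""
    if pre_mm is None or pre_mm < GRADE_LOWER_BOUNDS[0]:
        return None
    # bisect_right counts how many lower bounds are <= pre_mm (1..8).
    return 1 + bisect_right(GRADE_LOWER_BOUNDS, pre_mm)
```

Using `bisect_right` keeps each interval closed on the left and open on the right, so 1.0 mm falls in grade 3 rather than grade 2.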

Station weather type classification

Oriented to the requirements of QC, an empirical parameterized method is adopted to classify the weather type around the checked station by comprehensively analyzing multi-source QCFs. The station weather type (SWT) is divided into three types of clear, rainless, and rain as presented by Equation (10), where clear indicates that it is clear sky, rainless is no obvious precipitation, and rain is for precipitation, and the values of related QCFs are fitted with extensive experiments and application analysis. Different diagnosis rules are set for the same PQAEs according to station weather types.
formula
(10)

The phased QC strategy for PQAEs

Because of the uncertainties in radar and satellite remote-sensing measurements, the relationship between gauge rainfall and radar–satellite rainfall data is very complex, and there is a high risk of misjudgment when radar and satellite data are used directly to determine the quality of gauge data (Hughes 2006; Hill 2013; Qi & Zhang 2013). Thus, a phased strategy is designed to logically divide the QC procedures into the two steps of PQAE detection and PQAE diagnosis to reduce the misjudgment risk. The phased QC strategy for PQAEs is depicted in Figure 3. First, the PQAE detection algorithm mainly uses rain gauge data to label as much of the suspected anomaly data as possible, which is similar to single-source QCs. Second, the PQAE diagnosis algorithm makes a comprehensive analysis of the suspected anomaly data based on the QC parameters, and the suspected PQAEs are relabeled as normal or anomaly PQAEs. The data for normal PQAEs are available, and the data for anomaly PQAEs are suspicious or erroneous. The available, suspicious, and erroneous data are denoted by T, D, and F, respectively. Suspicious and erroneous data are the final anomaly data labeled by NMQC, and suspicious data need to be further confirmed by data quality analysis experts. Hence, as few data as possible should be labeled suspicious, to make NMQC more automatic and reduce manual intervention.
Figure 3

Flowchart of the phased QC strategy for PQAEs.
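The two-phase flow can be sketched as a small driver that only passes records flagged during detection on to diagnosis. The names `run_phased_qc`, `detect`, and `diagnose`, the record fields, and the toy rules are all hypothetical; they stand in for the actual detection and diagnosis algorithms and only illustrate how the T, D, and F labels arise.

```python
def run_phased_qc(records, detect, diagnose):
    """Two-phase labeling: T = available, D = suspicious, F = erroneous."""
    labels = {}
    for rec in records:
        # Phase 1: detection flags suspected anomalies, mainly from gauge data.
        if not detect(rec):
            labels[rec["id"]] = "T"
            continue
        # Phase 2: diagnosis relabels each suspected record using multi-source evidence.
        label = diagnose(rec)
        if label not in ("T", "D", "F"):
            raise ValueError(f"unexpected label: {label!r}")
        labels[rec["id"]] = label
    return labels


# Toy stand-ins (assumptions, not the NMQC rules).
def toy_detect(rec):
    """Suspect a record when it reports rain while no neighbor station does."""
    return rec["pre"] > 0.0 and rec["wet_neighbors"] == 0


def toy_diagnose(rec):
    """Mark as erroneous under a clear sky, otherwise leave for expert review."""
    return "F" if rec["swt"] == "clear" else "D"
```

Keeping the two phases as separate callables mirrors the intent of the strategy: remote-sensing evidence is consulted only for the small share of records that detection has already singled out.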

PQAE detection algorithm

The PQAE detection algorithm is developed by combining spatial analysis with Isolation Forest to improve the efficiency of NMQC, as shown in Figure 3(a). Spatial analysis is essential for rain gauge data QC. However, because it relies on calculations over the data of neighbor stations, its efficiency is limited when the amount of data is large, and it mainly focuses on detecting anomaly data for heavy precipitation rather than small precipitation. Isolation Forest is an unsupervised machine learning algorithm that uses the mechanism of isolating outliers to perform anomaly detection (Ding & Fei 2013; Yao et al. 2022); it does not need a labeled training dataset prepared in advance and makes no assumptions about the probability distribution of the checked data. It is one of the most widely used algorithms in the field of anomaly data detection. First, Isolation Forest is adopted to quickly detect and label the suspected anomaly data. Second, the labeled suspected anomaly data are checked using the percentile-based spatial analysis method to obtain more refined spatial distribution parameters. Finally, comprehensive analyses restricted by business rules are made to classify the corresponding PQAEs.
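The second step, the percentile-based spatial check, might look like the sketch below, which derives a few spatial distribution parameters for one suspected record from its neighbor stations. The parameter names, the 95th percentile, and the 0.1 mm wet threshold are assumptions chosen for illustration; the Isolation Forest screening that precedes this step and the business rules that follow it are not shown.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def spatial_parameters(checked_mm, neighbor_mm, q=95.0, wet_threshold=0.1):
    """Spatial distribution parameters of a suspected record (hypothetical names)."""
    if not neighbor_mm:
        # No neighbors in the reference domain: nothing to compare against.
        return {"wet_fraction": None, "neighbor_pq": None, "exceeds_all": False}
    wet = [v for v in neighbor_mm if v >= wet_threshold]
    return {
        # Share of neighbor stations that observed non-zero precipitation.
        "wet_fraction": len(wet) / len(neighbor_mm),
        # q-th percentile of the neighbor amounts.
        "neighbor_pq": percentile(neighbor_mm, q),
        # True when the checked value is larger than every neighbor value.
        "exceeds_all": checked_mm > max(neighbor_mm),
    }
```

A record with a wet fraction of zero would be a candidate for an isolated precipitation event, while one that exceeds all neighbors by a wide margin would be a candidate for a single larger precipitation event; the actual thresholds are part of the business rules mentioned above.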

PQAE diagnosis algorithm

The PQAE diagnosis algorithm is implemented as shown in Figure 3(b). The data for a Clear Sky Precipitation Quality Anomaly Event (CSPQAE) are directly confirmed as erroneous, because the corresponding SWT is clear, which has been comprehensively assessed with multi-source data. For other PQAEs, considering that the recognition of small precipitation is more uncertain than that of heavy precipitation, the diagnosis rules are designed to vary with the hourly precipitation, denoted by pre, as follows: (1) when and there are no sufficient reasons for error, it is directly labeled with T as available. For instance, if SWT is rain, PSPQAE will be labeled with T; (2) when and there are no sufficient reasons for available or erroneous data, it is labeled with D as suspicious and confirmed by manual intervention if necessary. As defined above, the values 0 and 1 of QCDIS are confirmed by multiple QCFs and therefore have higher accuracy than the corresponding values of QCDIR. In contrast, when precipitation reaches a certain grade, the relationship between radar QPE and gauge precipitation is more accurate than that between CTT and gauge precipitation. Therefore, QCDIS has higher priority than QCDIR for small precipitation, and QCDIR has higher priority than QCDIS for heavy precipitation.

To facilitate the expression, the hourly precipitation grade is denoted by PREGRD. According to the above analysis, the threshold method is adopted to design four different diagnosis rules of on the basis of SWT, PREGRD, QCDIR, and QCDIS. Taking as an example, when SWT is rainless and , it is used for comprehensive diagnosis of Isolated Precipitation Quality Anomaly Event (IPQAE), Single Larger Precipitation Quality Anomaly Event (SLPQAE), and Larger Precipitation Quality Anomaly Event (LPQAE). First, check if the value of QCDIR is 0,1, or M, and if so, the data are labeled according to the pre-designed decision tables. Then, if not, the following steps are performed: (1) the parameters and are, respectively, defined and threshold c is given; (2) when , if PREGRD is 4 or 5, QCDIR is 2, and QCDIS is 0, the data are flagged with F as erroneous, otherwise they are flagged with T as available; (3) when , if QCDIS is 1 or M, the data are flagged with D as suspicious, or if , they are flagged with T as available, otherwise they are flagged with D as suspicious.

NMQC is evaluated by analyzing the QC results of hourly rain gauge data from more than 24,000 hydro-meteorological stations in the Yangtze River Basin from January to December 2020. Data status flags are adopted to track and analyze the QC process of NMQC, as presented in Table 2. During PQAE detection, data that are not labeled are available and are flagged with C0, and the labeled data are suspected anomaly data, which are flagged with C1. During PQAE diagnosis, C1 data are further confirmed as belonging to one of three categories: available, suspicious, and erroneous, which are flagged with D0, D1, and D2, respectively. In addition, taking the hourly rain gauge data from June to August in Hubei Province as an example, a test dataset referred to as tSet is created to evaluate the performance of NMQC, in which the anomaly dataset, referred to as aSet, has already been labeled by data quality analysis experts.

Table 2

Data status flags

Flag | Description
C0 | Detected as true, data is available
C1 | Detected as suspected anomaly data
D0 | Detected as suspected anomaly data, diagnosed as available data
D1 | Detected as suspected anomaly data, diagnosed as suspicious data
D2 | Detected as suspected anomaly data, diagnosed as erroneous data

On the basis of that, the detection ratio, hit ratio, false detection ratio, and miss detection ratio are used as metrics to evaluate the detection performance quantitatively, and the anomaly ratio for a single station is defined to understand the spatial distribution of anomaly data. The metrics are presented as follows:
formula
(11)
formula
(12)
formula
(13)
formula
(14)
formula
(15)
formula
(16)
formula
(17)
formula
(18)
where , , , and are the detection ratios of C1, D1, D2, and D1 + D2 data, is the total amount of QC data, , , and are the total amounts of C1, D1, and D2 data; , , and are the hit, false detection, and miss detection ratios of the anomaly data, is the total amount of data in aSet, is the total amount of anomaly data labeled by NMQC in tSet, is the total amount of anomaly data labeled by NMQC as well as that appears in aSet, is the total amount of anomaly data labeled by NMQC but that does not appear in aSet; is the anomaly ratio for a given station, is the total amount of QC data for the station, and is the total amount of anomaly data labeled by NMQC for the station.

Detection performance of anomaly data

According to the statistical analysis of QC results, the detection ratios of C1, D1, D2, and D1 + D2 data are 3.7, 0.03, 1.7, and 1.73‰, respectively. It is clear that the detection ratio of anomaly data is reduced significantly after PQAE diagnosis. Concretely, the ratios of D0, D1, and D2 in C1 data are 53.74, 0.87, and 45.39%, respectively. Only 46.26% of the suspected anomaly data labeled during PQAE detection are ultimately confirmed as anomaly data during PQAE diagnosis, and the proportion of suspicious data among the final anomaly data is 1.73%. The monthly detection ratios of C1 data and D1 + D2 data are shown in Figure 4. It can be seen that the detection ratios for both C1 and D1 + D2 data are lower in the summer half year (from April to September) than in the winter half year (from January to March and from October to December), but the differences between them are higher in the summer half year than in the winter half year. In other words, NMQC has a high false detection ratio in the summer half year during PQAE detection.
Figure 4

Monthly detection ratio of C1 data and D1 + D2 data.

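The hit ratio, false detection ratio, and miss detection ratio of NMQC are verified on the test dataset tSet. The results are listed in Table 3: the total amount of data in aSet is 7,231, the total amount of anomaly data labeled by NMQC in tSet is 7,215, the amount labeled by NMQC but not appearing in aSet is 124, and the amount labeled by NMQC that also appears in aSet is 7,101. The statistical results show that the values of these metrics are better for anomaly data below 1.0 mm than for anomaly data greater than or equal to 1.0 mm, and the overall hit, false detection, and miss detection ratios are 98.20, 1.71, and 1.80%, respectively.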
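The overall ratios can be reproduced from the totals in Table 3 with a few lines of arithmetic. In the sketch below the hit ratio is the share of aSet that NMQC also labels, and the miss detection ratio is its complement; taking the aSet total as the denominator of the false detection ratio is an assumption, adopted because it reproduces the reported 1.71%. The variable and function names are illustrative.

```python
def ratio_percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


# Totals from the last row of Table 3.
n_aset = 7231    # anomaly data labeled by experts (aSet)
n_false = 124    # labeled by NMQC but not in aSet
n_hit = 7101     # labeled by NMQC and also in aSet

hit_ratio = ratio_percent(n_hit, n_aset)
miss_ratio = 100.0 - hit_ratio
# Assumption: the false detection ratio is taken relative to the aSet total.
false_ratio = ratio_percent(n_false, n_aset)

print(round(hit_ratio, 2), round(false_ratio, 2), round(miss_ratio, 2))
# prints: 98.2 1.71 1.8
```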

Table 3

Verification results for NMQC on the test dataset tSet

Precipitation grade (mm) | Anomaly data in aSet | Anomaly data labeled by NMQC in tSet | Labeled by NMQC but not in aSet | Labeled by NMQC and in aSet
≥1.0 | 413 | 403 | 13 | 390
[0.1,1.0) | 6,818 | 6,812 | 101 | 6,711
Total | 7,231 | 7,215 | 124 | 7,101

Distribution characteristics of anomaly data

The distribution characteristics of anomaly data detected by NMQC are addressed to further verify the rationality of QC results and provide references for the application of the method. The main results are shown in Figure 5. First, the type differences and monthly change of anomaly data are explored by analyzing the distribution of PQAEs. Second, the proportion of erroneous and suspicious anomaly data and the temporal distribution of anomaly data are discussed to understand the automatic labeling ability of NMQC. Third, the spatial distribution of anomaly data is described to know the quality of hourly rain gauge data in different regions. Finally, the grades distribution of anomaly data is described in words to understand the quality of different grades of precipitation as well as the main grades of each PQAE.
Figure 5

Distribution characteristics of anomaly data.

PQAEs distribution

To understand the event distribution of anomaly data, the ratio of different PQAEs is counted and the results show that the highest proportion belongs to PSPQAE with 89.52%, IPQAE is 4.84%, SLPQAE is 3.00%, CSPQAE is 2.58%, and LPQAE is the lowest with 0.06%. The monthly distribution of PQAE is as shown in Figure 5(a). Due to the large difference in the proportion of different PQAEs, especially the proportion of PSPQAEs, which is much higher than others, the data amount is logarithmically converted to make the analysis clearer. It can be seen that the amount of PSPQAE is less in the summer half year than in the winter half year; the amount of IPQAE is higher in March and August; the amount of SLPQAE is higher in March and from June to August; the amount of CSPQAE is higher from February to April; and the amount of LPQAE is higher in March and July.

Temporal distribution

Since the proportion of suspicious data is much lower than that of erroneous data, the data amount is logarithmically converted to better analyze the monthly change of anomaly data, as shown in Figure 5(b). It can be seen that there are more anomaly data in the winter half year than in the summer half year. Meanwhile, the amount of erroneous data in the summer half year is less than that in other months, but the amount of suspicious data in the summer half year is more than that in other months. The main reason is that extreme weather such as local strong convection makes it more difficult to confirm the quality of data in the summer half year. The monthly change of anomaly data is consistent with that of erroneous data because the proportion of erroneous data is as high as 98.27%.

Spatial distribution

The statistical results show that the ratio of stations with anomaly data is 83.75%. Among these stations, the proportion of stations with is 58.72%, is 35.57%, is 4.18%, is 1.21%, and is 0.32%. The proportion of stations with () is 94.29%. Although most stations have anomaly data, stations with a high anomaly data ratio are relatively few. The spatial distribution of stations with a higher anomaly ratio ( ≥ 0.57%) is shown in Figure 5(c); the stations are distributed relatively uniformly, with no obvious regional characteristics except in areas with sparse station density. According to the analysis of stations with higher , the main anomaly data are CSPQAE and PSPQAE, and these data mainly result from a poor observation environment and from instrument maintenance. Taking Yimencun station as an example, it has the highest anomaly ratio, 5.01%; among its anomaly data, PSPQAEs account for 59.09%, CSPQAEs for 38.41%, and other events for 2.50%.

Grades distribution

Non-zero precipitation is analyzed to further understand the distribution of anomaly data across precipitation grades. Anomaly data are mainly concentrated below 1.0 mm, which accounts for 94.36%. This is consistent with the fact that small precipitation amounts make up most of the observation samples. In addition, for each PQAE type, the share of its events falling in each precipitation grade is counted. PSPQAEs all fall in [0.4, 1.0) mm, which is consistent with the definition; CSPQAEs occur at any grade but are mainly concentrated below 1.0 mm (93.05%); IPQAEs are all ≥0.4 mm, with 63.96% in [0.4, 1.0) mm and 36.04% at ≥1.0 mm; SLPQAEs are all ≥1.0 mm, with 90.68% in [1.0, 20.0) mm and 9.32% at ≥20.0 mm; LPQAEs are all ≥1.0 mm, with 6.18% in [1.0, 20.0) mm, 86.8% in [20.0, 60.0) mm, and 7.01% at ≥60.0 mm.
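The grade statistics above amount to binning each non-zero hourly value into left-closed intervals and counting shares. The following Python sketch shows one way to do this, assuming the boundaries 0.4, 1.0, 20.0, and 60.0 mm quoted in the text. The lowest grade "(0, 0.4)", the function names, and the sample amounts are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

# Grade boundaries in mm, taken from the intervals quoted in the text.
GRADE_EDGES = [0.4, 1.0, 20.0, 60.0]
# The first label is an assumption: the text gives no name for amounts below 0.4 mm.
GRADE_LABELS = ["(0, 0.4)", "[0.4, 1.0)", "[1.0, 20.0)", "[20.0, 60.0)", ">=60.0"]


def precipitation_grade(amount_mm):
    """Return the grade label of a non-zero hourly precipitation amount."""
    if amount_mm <= 0:
        raise ValueError("only non-zero precipitation is graded")
    # bisect_right makes every interval closed on the left and open on the right.
    return GRADE_LABELS[bisect_right(GRADE_EDGES, amount_mm)]


def grade_shares(amounts_mm):
    """Return the percentage of samples that fall in each grade."""
    if not amounts_mm:
        raise ValueError("at least one sample is required")
    counts = {label: 0 for label in GRADE_LABELS}
    for amount in amounts_mm:
        counts[precipitation_grade(amount)] += 1
    return {label: 100.0 * n / len(amounts_mm) for label, n in counts.items()}


# Hypothetical anomaly amounts in mm, for illustration only.
sample = [0.2, 0.5, 0.9, 3.9, 23.0, 75.0]
shares = grade_shares(sample)
for label in GRADE_LABELS:
    print(f"{label:13s} {shares[label]:6.2f}%")
```

Applying `grade_shares` separately to the amounts of each PQAE type would give per-type shares of the kind reported in the paragraph.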

Case study

Pseudo small precipitation quality anomaly event

At 10:00 on 29 January 2020 (Beijing time, here and below), 102 stations showed PSPQAE in southeastern Sichuan and the border areas of Chongqing, Hubei, and Guizhou (28.30°–32.30°N, 103.30°–109.30°E); these stations account for 1.7% of all stations in the region. The analysis of QC parameters shows that SWT is rainless, QCDIS is 0, and QCDIR is 0. The data are automatically confirmed as erroneous according to the DRULE1 diagnosis rule.

Further analyses to better understand the PSPQAEs are shown in Figure 6. The hourly average temperature at national stations from 27 to 31 January 2020 is depicted in Figure 6(a). A large-scale precipitation process occurred on 27 January 2020 and resulted in low temperatures. The precipitation gradually stopped and the temperature rose slightly on the 28th. The weather cleared on the 29th, and nocturnal radiative cooling together with the earlier precipitation provided favorable conditions for fog, dew, and frost. Weather phenomenon data from 128 national stations in the region on the 29th show that one or more of fog, dew, and frost occurred at 96 stations, and ice occurred at 18 high-altitude stations. The spatial distribution of stations with PSPQAE and of national stations recording fog, dew, frost, and ice is shown in Figure 6(b); large-scale condensation occurred in the region on 29 January 2020. Moreover, the hourly number of stations showing PSPQAEs from 21:00 on the 28th to 20:00 on the 29th is shown in Figure 6(c). There was no obvious change during the night of the 28th. As the temperature rose, the number increased rapidly on the morning of the 29th, reaching a maximum of 102 stations at 10:00, and then declined quickly. These facts are consistent with the formation and dissipation of fog, dew, and frost.
Figure 6. Analysis of PSPQAEs in southeastern Sichuan and the border areas of Chongqing, Hubei, and Guizhou at 10:00 on 29 January 2020.

Clear sky precipitation quality anomaly event

Approximately 3.9, 3.2, and 2.3 mm of rainfall were observed at Loutai, Xiaojiagou, and Santai stations, respectively, at 10:00 on 30 June 2020, and were detected as CSPQAEs. The analysis of QC parameters shows that the distance between each station is less than 10.0 km; is lower than 0.9%, and there is no precipitation at other neighbor stations; is 0.4%, and only one station has 0.5 mm precipitation; SWT is clear. According to the multi-source data diagnosis rules, the data are automatically confirmed as erroneous. For further understanding, the spatial distributions of CR, CTT, and CTA around the stations at 10:00 on 30 June 2020 are shown in Figure 7. Manual verification shows that the recorded precipitation was caused by instrument maintenance at the stations. Because the stations are close together and were maintained at about the same time, such similar anomaly data are easily misjudged by single-source QC methods. With multi-source data, this is avoided and the anomaly data are automatically labeled.
Figure 7. The spatial distribution of multi-source data around Loutai, Xiaojiagou, and Santai stations at 10:00 on 30 June 2020.

Isolated precipitation quality anomaly event

About 23.0 mm of rainfall was observed at Guihua station at 14:00 on 28 August 2020 and was detected as an IPQAE. It was labeled as C1 data during PQAE detection and automatically confirmed as available according to the DRULE3 diagnosis rule during PQAE diagnosis. The analysis of QC parameters shows that none of the 52 neighbor stations within a 50.0 km radius recorded precipitation, nor did any of the 33 neighbor stations in the 50.0–70.0 km ring; is 43.5 dBZ; is 205.0 K; is Super Cooled Cloud; is 0.0%; is 100%; SWT is rainless, PREGRD is 6, QCDIR is 5, and QCDIS is 8. The spatial distribution of CR, CTT, and CTL around the station at 14:00 on 28 August 2020 is shown in Figure 8. The precipitation at Guihua station was produced by small-scale strong convective cloud clusters. Without multi-source data diagnosis, this observation would be falsely labeled as anomaly data because of its uneven spatial distribution. The use of multi-source data can therefore reduce the false detection ratio.
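The neighbor-station part of this case (no precipitation within 50.0 km or in the 50.0–70.0 km ring) can be sketched in Python as follows. This is only an illustration of a spatial isolation check under stated assumptions, not NMQC's actual detection rule, which is not given in this section. The haversine great-circle distance, the function names, the station dictionaries, and all coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def neighbor_summary(target, neighbors, inner_km=50.0, outer_km=70.0):
    """Count dry and wet neighbors in the inner disc and in the outer ring."""
    summary = {"inner": {"dry": 0, "wet": 0}, "outer": {"dry": 0, "wet": 0}}
    for station in neighbors:
        d = haversine_km(target["lat"], target["lon"], station["lat"], station["lon"])
        if d <= inner_km:
            zone = "inner"
        elif d <= outer_km:
            zone = "outer"
        else:
            continue  # stations beyond the outer radius are ignored
        state = "wet" if station["precip_mm"] > 0 else "dry"
        summary[zone][state] += 1
    return summary


def is_spatially_isolated(target, neighbors):
    """True if the target is wet while every neighbor within the outer radius is dry."""
    s = neighbor_summary(target, neighbors)
    return target["precip_mm"] > 0 and s["inner"]["wet"] == 0 and s["outer"]["wet"] == 0


# Hypothetical stations; the coordinates are invented for illustration.
target = {"lat": 30.0, "lon": 110.0, "precip_mm": 23.0}
neighbors = [
    {"lat": 30.2, "lon": 110.0, "precip_mm": 0.0},  # about 22 km away, dry
    {"lat": 30.5, "lon": 110.0, "precip_mm": 0.0},  # about 56 km away, dry
    {"lat": 31.0, "lon": 110.0, "precip_mm": 5.0},  # about 111 km away, ignored
]
print(neighbor_summary(target, neighbors), is_spatially_isolated(target, neighbors))
```

A value flagged by such a check is only a candidate. In the case above, the radar and satellite parameters are what allow the isolated 23.0 mm report to be confirmed as available rather than rejected.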
Figure 8. The spatial distribution of multi-source data around Guihua station at 14:00 on 28 August 2020.

In this paper, NMQC, a multi-source data quality control method, is proposed for hourly rain gauge data. First, PQAEs are defined to improve the traceability of the QC process. Second, several types of QC-oriented parameters are designed through the quantitative and fusion application of radar and satellite multi-source data to support the implementation of the algorithms. Third, a phased QC strategy is adopted that logically divides the QC procedure into two steps, PQAE detection and PQAE diagnosis, to reduce the misjudgment risk caused by the uncertainty of radar and satellite remote-sensing measurements. Finally, NMQC is evaluated using the QC results for hourly rain gauge data from more than 24,000 hydro-meteorological stations in the Yangtze River Basin from January to December 2020.

Overall, NMQC performs well in detecting anomaly data and has a strong ability to label erroneous data automatically. Compared with the operationally used single-source QC method (Ren et al. 2010), NMQC has a higher anomaly data detection ratio and a lower proportion of suspicious data. For NMQC (Section 3.1), the ratios of suspicious, erroneous, and total anomaly data are 0.03‰, 1.7‰, and 1.73‰, respectively, and the proportion of suspicious data is 1.73%. For the operationally used QC method, the corresponding ratios are 0.44‰, 0.18‰, and 0.62‰, and the proportion of suspicious data is 70.97%. The main reasons are as follows: (1) the operationally used QC method mainly focuses on the QC of heavy precipitation, whereas NMQC adds the QC of small precipitation, for which observation samples are far more numerous than for heavy precipitation; (2) for the single-source QC method, the confirmation of anomaly data is restricted by the available information because it relies mainly on spatio-temporal checks within the gauge network itself, whereas NMQC makes a comprehensive judgment from multiple perspectives by introducing satellite and radar data, which greatly improves the automatic labeling ability for anomaly data.
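The quoted proportions of suspicious data can be reproduced from the per-mille ratios with a short Python check. The sketch assumes that the three values reported for each method are, in order, the suspicious, erroneous, and total anomaly ratios. That reading is an inference from the arithmetic rather than something stated explicitly in this section, and the function name is hypothetical.

```python
def suspicious_share(suspicious_permille, erroneous_permille):
    """Return (total anomaly ratio in per mille, suspicious share of anomalies in %)."""
    total = suspicious_permille + erroneous_permille
    return total, 100.0 * suspicious_permille / total


# NMQC: 0.03 per mille suspicious and 1.7 per mille erroneous (assumed reading).
nmqc_total, nmqc_share = suspicious_share(0.03, 1.70)

# Operationally used single-source method: 0.44 and 0.18 per mille.
oper_total, oper_share = suspicious_share(0.44, 0.18)

print(f"NMQC: total={nmqc_total:.2f} per mille, suspicious={nmqc_share:.2f}%")
# prints "NMQC: total=1.73 per mille, suspicious=1.73%"
print(f"Operational: total={oper_total:.2f} per mille, suspicious={oper_share:.2f}%")
# prints "Operational: total=0.62 per mille, suspicious=70.97%"
print(f"NMQC erroneous share={100.0 - nmqc_share:.2f}%")
# prints "NMQC erroneous share=98.27%"
```

Under this reading, the coincidence of the two "1.73" figures for NMQC is numerical: 0.03 divided by 1.73 is about 0.0173.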

Moreover, it should be noted that, unlike some existing multi-source QC methods, NMQC is not a fully automatic QC procedure; suspicious anomaly data are further confirmed by data quality analysis experts in subsequent operational work. Qi et al. (2016) developed an automated gauge QC scheme based on the consistency of hourly gauge and radar QPE observations to support the production of gridded QPE products in radar coverage areas. NMQC, by contrast, is designed for processing the gauge data itself. In other words, the two methods target different application scenarios. We believe that this is an important issue that must be carefully considered when designing QC procedures. Sha et al. (2021) presented a supervised automated QC system for a sparse gauge station network that uses convolutional neural networks with gridded precipitation and elevation analysis data as input, which is a meaningful exploration. Although manual QC of the raw gauge observations is required to create binary quality labels (good or bad) for the supervised training dataset, we consider this a possible direction for future improvement, in which NMQC could serve as a better tool for producing multi-class labels.

Additionally, the results shown in Section 3 are consistent with the climatic characteristics of the Yangtze River Basin as well as the measurement and maintenance modes of rain gauges. In particular, NMQC has a high false detection ratio in the summer half year during PQAE detection, and after PQAE diagnosis there are more anomaly data in the winter half year than in the summer half year. On the one hand, rain and heat occur in the same period in the Yangtze River Basin, and local short-duration heavy precipitation and showers are frequent in the summer half year (Cheng et al. 2022). The spatial distribution of such precipitation usually resembles that of anomalous precipitation, so it is easily falsely detected during PQAE detection. However, the authenticity of such precipitation can be determined during PQAE diagnosis by using satellite and radar data, which eliminates the false detections. On the other hand, tipping bucket rain gauges are the main observation instruments, and they cannot accurately measure solid precipitation (Colli et al. 2014; Martinaitis et al. 2015; Choi et al. 2022). Snow or sleet occurs in most areas of the Yangtze River Basin in the winter half year (Zhang et al. 2016). Although tipping bucket rain gauges are disabled in areas of the upper reaches with prolonged snowfall, they remain in use in the middle and lower reaches during sleet, which can produce a large amount of anomaly data. Meanwhile, data quality is also affected by relatively loose rain gauge maintenance requirements in winter, when high-impact severe rainfall is unlikely in the Yangtze River Basin. For these reasons, tipping bucket rain gauge precipitation data from the winter half year should be used cautiously and in combination with the synoptic background. More in-depth research on the QC of winter precipitation will be required in the future.

Finally, the present diagnosis rules are sensitive to the values of the relevant QC parameters, which were obtained through a large number of experiments. In the future, efforts will be made to develop more intelligent methods for setting parameter values. In addition, although the hit, false detection, and miss detection ratios of NMQC are discussed on the labeled test dataset, the test dataset needs to be expanded for a more comprehensive evaluation.

In this study, we presented NMQC, a novel quality control method for hourly rain gauge data. It enhances the automatic labeling of anomaly data and the robustness of the algorithms through the quantitative and fusion application of radar and satellite multi-source data, and it was applied in the Yangtze River Basin. The main conclusions are summarized as follows:

  • (1) NMQC has a strong ability to label anomaly data automatically, which can greatly reduce manual intervention and shorten the impact time of anomaly data on applications. The detection ratio of anomaly data for NMQC is 1.73‰, of which only 1.73% is suspicious data. The hit, false detection, and miss detection ratios on the test dataset are 98.20%, 1.17%, and 1.80%, respectively.

  • (2) The distribution characteristics of anomaly data detected by NMQC are consistent with the climatic characteristics of the Yangtze River Basin as well as the measurement and maintenance modes of rain gauges. This supports the feasibility of NMQC. Specifically, the amounts of the different PQAEs vary greatly, with PSPQAE accounting for the highest proportion at 89.52%, and the anomaly data are mainly concentrated below 1.0 mm. Stations with high anomaly data ratios account for a small proportion, and their spatial distribution is relatively uniform with no obvious regional characteristics. There are fewer anomaly data in the summer half year than in other months, but more suspicious data.

In summary, NMQC performs well in detecting anomaly data while identifying a low proportion of suspicious data, and it has a strong ability to label erroneous data automatically. Consequently, it can greatly reduce manual intervention and shorten the impact time of anomaly data, which makes it feasible for operational work.

This work was supported by the Yangtze River Basin Meteorological Open Fund Project (No.CJLY2022Y08), the Key Laboratory of South China Sea Meteorological Disaster Prevention and Mitigation of Hainan Province Open Fund Project (No.SCSF202209), and the China Yangtze Power Company Limited Scientific Research Project (No.2423020002).

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Adam J. C. & Lettenmaier D. P. 2003 Adjustment of global gridded precipitation for systematic bias. Journal of Geophysical Research: Atmospheres 108 (D9), 4257.

Blenkinsop S., Lewis E., Chan S. C. & Fowler H. J. 2017 Quality control of an hourly rainfall dataset and climatology of extremes for the UK. International Journal of Climatology 37 (2), 722–740.

Cheng G. W., Liu Y. L., Chen Y. & Gao W. 2022 Spatiotemporal variation and hotspots of climate change in the Yangtze river watershed during 1958–2017. Journal of Geographical Sciences 32 (1), 141–155.

Choi J. H., Chang K. H., Kim K. E. & Bang K. S. 2022 Improvement of rainfall measurements by using a dual tipping bucket rain gauge. Asia-Pacific Journal of Atmospheric Sciences 59 (2), 271–280.

Colli M., Lanza L. G., Barbera P. L. & Chan P. W. 2014 Measurement accuracy of weighing and tipping-bucket rainfall intensity gauges under dynamic laboratory testing. Atmospheric Research 144, 186–194.

Gebremicael T. G., Deitch M. J., Gancel H. N., Kumar L., Haile G. G., Beyene A. N. & Croteau A. C. 2022 Satellite-based rainfall estimates evaluation using a parsimonious hydrological model in the complex climate and topography of the Nile River Catchments. Atmospheric Research 266, 105939.

Groisman P. Y. & Legates D. R. 1994 The accuracy of United States precipitation data. Bulletin of the American Meteorological Society 75 (2), 215–228.

Han F. & Wo W. F. 2018 Design and implementation of SWAN2.0 platform. Journal of Applied Meteorological Science 29 (1), 25–34.

Han D., Yan W., Ren J. Q. & Zhao X. B. 2011 Cloud type classification algorithm for CloudSat satellite based on Support Vector Machine. Transactions of Atmospheric Sciences 34 (5), 583–591.

Hassani M. R., Niksokhan M. H., Janbehsarayi S. F. M. & Nikoo M. R. 2023 Multi-objective robust decision-making for LIDs implementation under climatic change. Journal of Hydrology 617, 128954.

Hill D. J. 2013 Automated Bayesian quality control of streaming rain gauge data. Environmental Modelling & Software 40, 289–301.

Hughes D. A. 2006 Comparison of satellite rainfall data with observations from gauging station networks. Journal of Hydrology 327 (3–4), 399–410.

Imhoff R. O., Cruz L. D., Dewettinck W., Brauer C. C., Uijlenhoet R., Heeringen K. V., Velasco-Forero C., Nerini D., Ginderachter M. V. & Weerts A. H. 2023 Scale-dependent blending of ensemble rainfall nowcasts and numerical weather prediction in the open-source pysteps library. Quarterly Journal of the Royal Meteorological Society 149 (753), 1335–1364.

Jin Y., Cossuth J. H., Bankert R. L., Doyle J. D. & Ryglicki D. R. 2018 Applying Infrared Satellite Brightness Temperature to Understand Forecast Errors of Rapid Intensifying Hurricanes. In: 33rd Conference on Hurricanes and Tropical Meteorology, American Meteorological Society, Vol. 16C.8, pp. 16–20.

Kim D. R. & Kwon T. Y. 2011 Characteristics of satellite brightness temperature and rainfall intensity over the life cycle of convective cells-case study. Atmosphere 21 (3), 273–284.

Koch S. E., Ferrier B., Stoelinga M. T., Szoke E. & Kain J. S. 2005 The use of simulated radar reflectivity fields in the diagnosis of mesoscale phenomena from high-resolution WRF model forecasts. In: Preprints, 11th Conference on Mesoscale Processes, American Meteorological Society, Vol. J4J.7, pp. 1–9.

Kondragunta C. R. & Shrestha K. 2006 Automated real-time operational rain gauge quality-control tools in NWS hydrologic operations. In: 20th Conference on Hydrology, Vol. P2.4.

Lengfeld K., Kirstetter P. E., Fowler H. J., Yu J. & Gourley J. J. 2020 Use of radar data for characterizing extreme precipitation at fine scales and short durations. Environmental Research Letters 15 (8), 085003.

Lin Q., Chen J., Li W., Huang K. L., Tan X. Z. & Chen H. 2021 Impacts of land use change on thermodynamic and dynamic changes of precipitation for the Yangtze River Basin, China. International Journal of Climatology: A Journal of the Royal Meteorological Society 41 (6), 3598–3614.

Lu F., Zhang X. H., Chen B. Y., Liu H., Wu R. H., Han Q., Feng X. H., Li Y. & Zhang Z. Q. 2017 FY-4 geostationary meteorological satellite imaging characteristics and its application prospects. Journal of Marine Meteorology 37 (2), 1–12.

Martinaitis S. M. 2008 Effects of Multi-Sensor Radar and Rain Gauge Data on Hydrologic Modeling in Relatively Flat Terrain. Master thesis, Florida State University.

Martinaitis S. M., Cocks S. B., Qi Y., Kaney B. T. & Howard K. 2015 Understanding winter precipitation impacts on automated gauge observations within a real-time system. Journal of Hydrometeorology 16 (6), 2345–2363.

Mohammadi B., Vazifehkhah S. & Duan Z. 2024 A conceptual metaheuristic-based framework for improving runoff time series simulation in glacierized catchments. Engineering Applications of Artificial Intelligence 127, 107302.

Nešpor V. & Sevruk B. 1999 Estimation of wind-induced error of rainfall gauge measurements using a numerical simulation. Journal of Atmospheric and Oceanic Technology 16 (4), 450–464.

Nešpor V., Krajewski W. F. & Kruger A. 2000 Wind-induced error of raindrop size distribution measurement using a two-dimensional video disdrometer. Journal of Atmospheric and Oceanic Technology 17 (11), 1483–1492.

Ombadi M., Nguyen P., Sorooshian S. & Hsu K. L. 2021 How much information on precipitation is contained in satellite infrared imagery? Atmospheric Research 256, 105578.

Qi Y. 2015 Quality Control of 11-Year Hourly Rain Gauge Data Over CONUS Based on Radar and Atmospheric Environmental Data. In: AGU Fall Meeting Abstracts, pp. H51I-1512.

Qi Y. & Zhang J. 2013 A real-time automated quality control of rain gauge data based on multiple sensors. In: AGU Fall Meeting Abstracts, pp. H41I-1359.

Qi Y., Martinaitis S., Zhang J. & Cocks S. B. 2016 A real-time automated quality control of hourly rain gauge data based on multiple sensors in MRMS system. Journal of Hydrometeorology 17 (6), 1675–1691.

Ren Z. H., Zhao P., Zhang Q., Zhang Z. F. & Chen Z. 2010 Quality control procedures for hourly precipitation data from automatic weather stations in China. Meteorological Monthly 36 (7), 123–132.

Schneider U., Becker A., Finger P., Meyer-Christoffer A., Ziese M. & Rudolf B. 2014 GPCC's new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle. Theoretical and Applied Climatology 115, 15–40.

Sha Y., Gagne D. J., West G. & Stull R. 2021 Deep-learning-based precipitation observation quality control. Journal of Atmospheric and Oceanic Technology 38 (5), 1075–1091.

Shi C. X., Pan Y., Gu J. X., Xu B., Han S., Zhu Z., Zhang L., Sun S. & Jiang Z. W. 2019 A review of multi-source meteorological data fusion products. Acta Meteorologica Sinica 77, 774–783.

Thiruvengadam P., Indu J. & Ghosh S. 2021 Radar reflectivity and radial velocity assimilation in a hybrid ETKF-3DVAR system for prediction of a heavy convective rainfall. Quarterly Journal of the Royal Meteorological Society 147 (737), 2264–2280.

Upton G. J. G. & Rahimi A. R. 2003 On-line detection of errors in tipping-bucket raingauges. Journal of Hydrology 278 (1–4), 197–212.

WMO 2021 Guide to Instruments and Methods of Observation WMO-No.8. World Meteorological Organization. Available from: https://library.wmo.int/idurl/4/41650 (accessed 22 January 2023).

Xiao Y. J., Liu L. P. & Shi Y. 2008 Study of methods for three-dimensional multiple-radar reflectivity mosaics. ACTA Meteorologica Sinica 22 (3), 351–361.

Yang D. Q., Kane D., Zhang Z. P., Legates D. & Goodison B. 2005 Bias corrections of long-term (1973–2004) daily precipitation data over the northern regions. Geophysical Research Letters 32 (19), 312–321.

Yao K., Fan S., Wang Y., Wan J., Yang D. & Cao Y. 2022 Anomaly detection of steam turbine with hierarchical pre-warning strategy. IET Generation, Transmission & Distribution 12, 16.

Yuan Y. & Hu X. 2015 Bag-of-words and object-based classification for cloud extraction from satellite imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8 (8), 4197–4205.

Zahumenský I. 2004 Guidelines on quality control procedures for data from automatic weather stations. World Meteorological Organization, Switzerland 955, 2–6.

Zhang Y., Song C., Zhang K., Cheng X. & Zhang Q. 2014 Spatial–temporal variability of terrestrial vegetation productivity in the Yangtze River Basin during 2000–2009. Journal of Plant Ecology 7 (1), 10–23.

Zhang D. W., Cong Z. T. & Ni G. H. 2016 Snowfall changes in China during 1956–2010. Journal of Tsinghua University(Science and Technology) 56 (4), 381–386.

Zhao B., Dai Q., Zhuo L., Mao J., Zhu S. & Han D. 2022 Accounting for satellite rainfall uncertainty in rainfall-triggered landslide forecasting. Geomorphology 398, 108051.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).