The teleconnection modeling of hydro-climatic events is a complex problem with highly uncertain circumstances. In contrast to the classic fuzzy logic methods, by using the Z-number in addition to the constraint of information, and by evaluating the data reliability, it is possible to characterize the degree of ambiguity of data. In this regard, this study investigates the performance of the Z-number-based model (ZBM) in prediction of classified monthly precipitation (MP) events of two synoptic stations in Iran (up to five months in advance). To this end, the sea surface temperature (SST) of adjacent seas was used as a predictor. The suggested model, by using Z-number directly and applying fuzzy Hausdorff distance to determine weights of if-then rules, predicted MP events of both the stations with over 70% confidence. Analysis of the results in the test step showed that the ZBM compared to the traditional fuzzy approach improved the results by 69% for Kermanshah and 112% for Tabriz. Overall, the Z-number concept by assessing events reliability can be used in various sectors of water resources management such as decision-making and drought monitoring.

  • In this study, the performance of the Z-number-based model in teleconnection modeling is investigated.

  • In contrast to classic fuzzy logic, Z-numbers consist of both restraint and reliability of data.

  • The performance of the Z-number-based model and the conventional fuzzy model are compared.

  • The Z-number-based model can predict classified monthly precipitation based on SST variations.

Oceanic-atmospheric teleconnection patterns can affect hydro-climatic events over large distances across the world. The accurate prediction of hydro-climatic events (such as maximum precipitation or drought events) can help decision-makers to improve planning, mitigate adverse impacts, and take advantage of beneficial conditions (Dhanya & Nagesh Kumar 2009; Moser & Hart 2015). Since the early 1900s, various climatic and oceanic parameters have been used as predictors of hydro-climatic events. Thus, if the association of hydro-climatic events with climatic and oceanic parameters is identified, it can be used to design an effective risk management system for coping with extremes and their adverse impacts (Webster et al. 1998).

Given the significance of the hydro-climatic events, previous studies have looked into the effects of large-scale ocean-atmospheric factors on these events. For example, the influence of persistent positive phases of the North Atlantic Oscillation (NAO) on Romania's drought was reported by Stefan et al. (2004). In the research by Ghasemi & Khalili (2008), the wet conditions in Iran were found to be characterized by a negative SST anomaly in the Mediterranean and the Black Sea, while dry conditions were found to be characterized by a positive SST anomaly in the Mediterranean and the Black Sea. The substantial link between southwest Iran's streamflow and the Mediterranean Sea's sea surface temperature (SST) was reported by Meidani & Araghinejad (2014). The findings of this study revealed that utilizing SST (as a predictor of streamflow in southwest Iran) produced improved outcomes compared to using other indices like NAO. The influence of regional SST variations on tropic rainfall was recently reported by Ying et al. (2019). In such studies, the ocean-atmospheric factors have been found to be important in coping with hydro-climatic occurrences.

Traditional approaches (e.g., linear and nonlinear regression or correlation) were employed in the above-mentioned studies (as well as in the majority of prior efforts) merely to uncover possible teleconnections between hydro-climatic parameters, and almost no prediction was made. However, to deal with large-scale hydro-climatic events under highly uncertain circumstances (e.g., maximum monthly precipitation) and to predict their long-term states, fuzzy logic approaches might be a good alternative to traditional methods (Dhanya & Nagesh Kumar 2009; Nourani et al. 2021). Teleconnection patterns among hydro-climatic factors are complicated, and precise forecasting of their future conditions is challenging; in such situations, fuzzy logic can partially represent the uncertainty. Fuzzy logic has been increasingly used to describe complex systems in recent decades because of its ability to cope with system uncertainty, which is prevalent in hydro-climatic problems (e.g., see Ashrafi et al. 2019; Malik et al. 2019). To model with fuzzy logic methods, it is essential to determine the if-then rules. Constructing these rules is difficult because of the intricacy and uncertainty of long-distance teleconnection mechanisms. In this regard, data mining (e.g., association mining) can be an appropriate way of extracting the patterns and constructing the if-then rules (Tadesse et al. 2004). For example, Dadaser-Celik et al. (2013) used association mining to investigate the connections between streamflow and meteorological factors for the Kızılırmak River Basin in Turkey.

The main flaw of the traditional fuzzy-based technique is that it struggles to deal with the ambiguous situations that are common in real-world scenarios (Aliev et al. 2016; Glukhoded & Smetanin 2016; Zadeh 2011). Because traditional fuzzy techniques only express restrictions and carry no measure of reliability, the reliability of the studied data needs to be addressed explicitly. In this regard, researchers have become interested in the Z-number, proposed by Zadeh in 2011. A Z-number is an ordered pair of fuzzy numbers, denoted Z = (A, B). The first element, A, sets a constraint on the uncertain variable X. The second element, B, is a degree of reliability of the first element. Most known approaches for mathematical computations on such linguistic variables have focused on converting Z-numbers into conventional fuzzy numbers (Glukhoded & Smetanin 2016; Kang et al. 2018). However, such conversions may lose valuable information, and they might not be applicable to all fuzzy numbers. In this study, by applying the concept of Z+-numbers, a Z-number is used directly for computation (Zadeh 2011; Aliev et al. 2016). Also, the concept of fuzzy Hausdorff distance is applied to allocate weights to the rules (Aliev et al. 2016). This method reduces the information loss, but it is more difficult than traditional fuzzy logic and needs non-linear optimization procedures. As a result, a comprehensive comparison between the suggested Z-number-based model (ZBM) and the classical fuzzy logic method is needed to confirm the effectiveness and validity of the suggested model. Regarding contributions and innovations, this research, by evaluating the data reliability, investigated the use of SSTs as predictors of MP events up to five months in advance. To this end, association mining was used to extract (explain) the teleconnection pattern between SSTs and MP events. In the suggested ZBM, a Z-number was used directly for computation, with the fuzzy Hausdorff distance applied to weight the rules.

In this study, a system was designed to predict the long-term MP events of two stations in northwestern Iran (up to five months in advance). In this regard, the ZBM was employed and a comparison study was performed with the conventional fuzzy model.

Study area and data

MP data from the two synoptic stations in northwestern Iran, situated at 38.05°N, 46.17°E (Tabriz) and 34.21°N, 47.90°E (Kermanshah), as well as SSTs from the adjacent seas (Black, Mediterranean, and Red), were used to apply the suggested approach (see Figure 1). The Tabriz and Kermanshah stations were selected mainly because of their locations and the length of their available records, which allow both temporal and spatial assessment of the results. The distance between the cities of Tabriz and Kermanshah is approximately 422 km, and they have different climatic regimes. The elevations of these cities are 1361 m and 1318.6 m above sea level, respectively. The mean monthly precipitation at the Kermanshah synoptic station is about 12 mm higher than at the Tabriz synoptic station, and the mean monthly temperature at Kermanshah is about 2.5 °C higher than at Tabriz. Tabriz has a semi-arid climate with regular seasons, whereas Kermanshah's climate is heavily influenced by the proximity of the Zagros Mountains and is classified as a hot dry-summer Mediterranean climate.

Figure 1

Overview of the study region, adjacent seas, and Tabriz and Kermanshah synoptic stations.

The MP time series were obtained from the Iran Meteorological Organization, and the monthly SST data were downloaded from the National Oceanic and Atmospheric Administration, NOAA website (http://www.esrl.noaa.gov/psd/cgi-bin/data/timeseries/timeseries1.pl) where the monthly SST values are available for 1° grid squares of the seas. The SST as the fundamental physical parameter in the earth's climate system could be used for the long-term precipitation modeling. The modeling approach included time series spanning 65 years, from 1955 to 2019. 75% of the values were used for training, while the last 25% of values (from 2003 to 2019) were used for the test (see Table 1).

Table 1

An overview of the data's statistical analysis (for 1955–2019)

| Statistical parameter | Black Sea SST (°C), train | Black Sea SST (°C), test | Mediterranean Sea SST (°C), train | Mediterranean Sea SST (°C), test | Red Sea SST (°C), train | Red Sea SST (°C), test | Tabriz Station MP (mm), train | Tabriz Station MP (mm), test | Kermanshah Station MP (mm), train | Kermanshah Station MP (mm), test |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 11.598 | 12.659 | 19.864 | 20.661 | 25.194 | 26.045 | 24.278 | 21.504 | 37.587 | 34.754 |
| Maximum | 24.814 | 26.102 | 27.435 | 28.028 | 31.340 | 31.754 | 128.400 | 91.300 | 295.400 | 163.600 |
| Minimum | −1.587 | −0.635 | 14.171 | 14.975 | 16.754 | 17.973 | 0.000 | 0.000 | 0.000 | 0.000 |
| Standard deviation | 7.417 | 7.823 | 4.010 | 4.191 | 4.105 | 4.136 | 23.972 | 21.393 | 44.081 | 38.726 |
| Coefficient of variation (dimensionless) | 0.639 | 0.618 | 0.202 | 0.203 | 0.163 | 0.159 | 0.987 | 0.995 | 1.173 | 1.114 |

A threshold of T = 35% (65th percentile) was applied to classify the MP data into two categories, high (H) and low (L); other threshold values may also be used. The threshold of T = 35%, or 65th percentile, is the lowest value that is greater than 65% of the MP values of all months, so the months above it form the highest 35% of the sorted series. The resulting threshold precipitation values, T35% = 27 mm for Tabriz and T35% = 47.3 mm for Kermanshah, were applied to the data.
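The classification step can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the helper name `classify_mp` and the sample values are hypothetical and are not station data, and the interpolation rule used by `np.percentile` may differ slightly from the one applied in the study.

```python
import numpy as np

def classify_mp(mp_mm, top_fraction=0.35):
    """Label monthly precipitation as 'H' (highest 35%) or 'L' (hypothetical helper)."""
    mp = np.asarray(mp_mm, dtype=float)
    # T = 35% corresponds to the 65th percentile of the series
    threshold = np.percentile(mp, 100.0 * (1.0 - top_fraction))
    labels = np.where(mp > threshold, "H", "L")
    return threshold, labels

# Illustrative values only, not observed data
threshold, labels = classify_mp([0.0, 5.2, 12.0, 27.5, 48.1, 3.3, 60.4, 19.9, 0.0, 33.0])
```

With real station series, the returned threshold would play the role of T35% (27 mm for Tabriz and 47.3 mm for Kermanshah in the text).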

Proposed methodology

This study's suggested technique comprises four phases (data pre-processing, association rule mining, modeling with ZBM and traditional fuzzy tools, and lastly comparing and assessing the results). Figure 2 depicts the suggested methodology's schematic approach.

Figure 2

Schematic representation of the modeling process with suggested ZBM and conventional fuzzy method.

First, the monthly SST data were divided into five categories: very high (VH), high (H), medium (M), low (L) and very low (VL), with boundaries at μ ± iσ (where μ and σ are the mean and standard deviation of the data, respectively, and i = 0.5, 1.5). The rules derived from association mining depend on these categories, and appropriate rules (in terms of the confidence and support criteria) may not be obtained if the categorization is inappropriate. The categorization was based on expert judgment (together with trial and error) and on earlier work; the number of categories for the inputs and outputs could be lower or higher. Different options were explored to determine the optimal thresholds. Following the literature (e.g., see Tadesse et al. 2004; Danandeh Mehr et al. 2017; Nourani et al. 2021), different values of i were tried, including i = 1, 1.5 (5 categories), i = 0.5, 1.5 (5 categories), and i = 0.5, 1, 1.5 (7 categories). The choice i = 0.5, 1.5 generated rules of better quality. Similarly, the reliability was divided into classes; based on performance analysis, seven categories were used for the degree of reliability. In addition, the MP time series were categorized into H and L classes (binary classification). There is usually no exact threshold for defining extreme events. For example, Rahimikhoob (2010) considered T = 25% (the 75th percentile) as the threshold for extreme events, while Danandeh Mehr et al. (2017) examined different thresholds (15, 25 and 35%) for precipitation monitoring. In this regard, given the arid to semi-arid conditions of western Iran, the threshold of T = 35% (the 65th percentile) was determined and applied to the data (but other threshold values may also be tried within the suggested methodology).
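As a rough illustration of the μ ± iσ categorization, the sketch below derives the four class boundaries for i = 0.5, 1.5 and maps an SST value to one of the five classes. The function names are hypothetical; the mean and standard deviation are the Black Sea training statistics from Table 1, and the resulting boundaries agree with the Black Sea classes in Table 2 to within rounding.

```python
import numpy as np

LABELS = ["VL", "L", "M", "H", "VH"]

def sst_class_bounds(mean, std, multipliers=(0.5, 1.5)):
    """Return boundaries mu - 1.5s, mu - 0.5s, mu + 0.5s, mu + 1.5s (hypothetical helper)."""
    inner, outer = multipliers
    return [mean - outer * std, mean - inner * std, mean + inner * std, mean + outer * std]

def classify_sst(value, bounds):
    """Map one SST value to VL/L/M/H/VH given ascending boundaries."""
    # searchsorted counts how many boundaries lie at or below the value
    return LABELS[int(np.searchsorted(bounds, value, side="right"))]

# Black Sea training mean and standard deviation from Table 1
bounds = sst_class_bounds(11.598, 7.417)
print([round(b, 2) for b in bounds])
print(classify_sst(14.365, bounds))
```

Changing `multipliers` to `(1.0, 1.5)` gives the alternative five-class split mentioned in the text; the seven-class variant would need a third multiplier and two more labels.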

In the second phase, the association rule mining approach was utilized to find patterns between the SST and MP categories. To this end, the teleconnection patterns between the binary MP(t) and SST(t-i) data (SST at different lags) were discovered using the training data set. After generating the patterns, the confidence measure of the association rule was calculated to assess the degree of rules reliability (for the consequent part). In addition, the degree of the reliability for the antecedent part of the rules was determined with the help of the probability of the occurrence of SST categories (according to past events).
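The support and confidence of a candidate rule "lagged SST classes → MP class" can be computed by simple counting. The sketch below is only an illustration of these two standard measures, not the mining tool used in the study; the function name and the five sample records are hypothetical.

```python
from collections import Counter

def rule_metrics(antecedents, consequents, target="H"):
    """Support and confidence of rules 'antecedent -> target' from paired records."""
    n = len(antecedents)
    antecedent_counts = Counter(antecedents)
    joint_counts = Counter(a for a, c in zip(antecedents, consequents) if c == target)
    metrics = {}
    for pattern, joint in joint_counts.items():
        support = joint / n                              # share of all records matching the rule
        confidence = joint / antecedent_counts[pattern]  # share of antecedent matches with the target
        metrics[pattern] = (support, confidence)
    return metrics

# Hypothetical lagged SST classes (Black, Mediterranean, Red) paired with MP(t) classes
sst_lagged = [("M", "M", "VL"), ("M", "M", "VL"), ("L", "M", "M"), ("M", "M", "VL"), ("L", "M", "M")]
mp_now = ["H", "H", "L", "L", "H"]
metrics = rule_metrics(sst_lagged, mp_now)
# Keep only patterns whose confidence exceeds an illustrative 0.6 cut-off
strong = {p: m for p, m in metrics.items() if m[1] > 0.6}
```

In this toy data the pattern (M, M, VL) precedes a high MP class in two of its three occurrences, so its confidence is about 0.67 and it passes the cut-off.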

In the third phase, the traditional fuzzy model and the ZBM were run. To this end, if-then rules were created using the patterns discovered in the second phase. The rules were then weighted using the fuzzy Hausdorff distance, and the chosen rules were aggregated.

Finally, depending on the efficiency criteria employed, the outcomes of both approaches were reviewed and compared.

The major component of the suggested technique (i.e., Z-number) is briefly described in the next sub-section, and brief descriptions of association rules, the addition of discrete Z-numbers and efficiency measures are provided in Appendix A.

Description of Z-number concept

The concept of a Z-number relates to the reliability of information and is used to carry out computations with information that is not fully reliable. A Z-number has two components, Z = (A, B). The first component, A, is a restriction (constraint) on the values that a real-valued uncertain variable, X, is allowed to take. The second component, B, is a measure of the reliability (certainty) of the first component. The Z-number definitions are briefly discussed below; for additional information, readers can refer to Aliev et al. (2016) and Zadeh (2011).

Discrete Z-number

A fuzzy subset A of the real line R with a convex membership function μA: R → [0, 1] is a discrete fuzzy number if its support is finite; that is, there exist x1, …, xs ∈ R with x1 < x2 < … < xs such that supp(A) = {x1, …, xs}. A discrete Z-number is an ordered pair Z = (A, B) of discrete fuzzy numbers A and B. A plays the role of a fuzzy constraint on the values that a random variable X may take. B is a discrete fuzzy number with a membership function μB: {b1, …, bs} → [0, 1], {b1, …, bs} ⊂ [0, 1], playing the role of a fuzzy constraint on the probability measure of A, that is, P(A) ∈ supp(B).

The definition of a discrete Z+-number is similar to that of the discrete Z-number. The Z+-number, Z+ = (A, R), is a pair consisting of a fuzzy number, A, and a random number, R, where A plays the same role as it does in a discrete Z-number and R plays the role of the probability distribution for B (Aliev et al. 2016).

Z-valued if-then rules-based reasoning

Zadeh (2011) tackled the issue of Z-interpolation, which is the interpolation of Z-rules. An extension of fuzzy rule interpolation (Kóczy & Hirota 1991) is the key to this problem. The interpolation method is based on the notion that the distance between the resulting output and the consequent components is equivalent to the distance between the current input and the antecedent components (Kóczy & Hirota 1991). This means that for Z-rules, the output is calculated as:
(1)
where,
(2)
(3)
(4)
(5)
where,
(6)
(7)
where Zy is the Z-number-valued consequent of the jth rule, the coefficients of linear interpolation are indexed by j = 1, …, n, and n is the number of Z-rules. D denotes the distance between the current ith Z-number-valued input and the ith Z-number-valued antecedent of the jth rule. Accordingly, ρ measures the distance between the current input vector and the vector of antecedents of the jth rule. The rules are weighted (based on Equations (3)–(7)) and only the best rules are applied in Equation (1). The chosen rules should be re-weighted based on Equation (2) to conform to the superposition principle. Although a single low-weight rule has only a small negative influence on performance, adding many low-weight rules at the same time can have considerable negative effects. For this reason, only high-weight rules (rules with a weight of not less than 0.8 of the maximum weight) were used in this study.
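The selection and re-weighting of rules can be illustrated as follows. This is a sketch under a stated assumption: the weights are taken to be inversely proportional to the distance ρ, in the spirit of fuzzy rule interpolation (Kóczy & Hirota 1991), and the exact form of Equations (2)–(7) in the study may differ. The function name and the distances are hypothetical; only the 0.8 ratio comes from the text.

```python
def select_and_reweight(distances, keep_ratio=0.8):
    """Weight rules by inverse distance, keep those near the best, and renormalize."""
    eps = 1e-12  # guards against division by zero when an input matches a rule exactly
    raw = [1.0 / (d + eps) for d in distances]  # assumed inverse-distance weighting
    total = sum(raw)
    weights = [r / total for r in raw]
    cutoff = keep_ratio * max(weights)
    kept = {j: w for j, w in enumerate(weights) if w >= cutoff}
    kept_total = sum(kept.values())
    # Re-weight so that the coefficients of the selected rules sum to one
    return {j: w / kept_total for j, w in kept.items()}

# Hypothetical distances between the current input and four rules
weights = select_and_reweight([0.50, 0.55, 2.0, 4.0])
```

Here the two closest rules survive the 0.8 cut-off and share the total weight; the two distant rules are dropped rather than contributing small weights.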

The Z-numbers of the selected rules are then multiplied by their weights. It is worth noting that multiplying a Z-number Zx = (Ax, Bx) by a scalar λ gives Zy = λ·Zx = (λ·Ax, Bx). As a result, multiplying by λ has no effect on Bx (for more details, see Appendix A).

In this study, by developing a MATLAB code and extracting the teleconnection patterns between the SSTs of the surrounding seas (Black, Mediterranean, and Red Seas) and MP events, a novel method based on Z-number theory is suggested.

Accordingly, the SSTs were categorized into interval sets such as VL, VH, etc. (see Table 2). The degree of reliability was categorized into seven categories (VL, L, low medium (LM), high medium (HM), H, VH, and extremely high (EH)), and the MP time series into two (H or L). Then, after dividing the data into training and testing sets, association rule mining was applied to the training set to find patterns between the SSTs and MP events and to create Z-rules. Figure 3 shows the codebook (fuzzy sets), which was created by expert judgment. The intervals with rigid limits were transformed into fuzzy sets, which were used to create the if-then rules. Finally, the considered models were run, their performances were investigated, and the findings were compared with each other.

Table 2

The classes of monthly SSTs

| Sea | VL | L | M | H | VH |
|---|---|---|---|---|---|
| Black Sea temperature (°C) | < 0.46 | [0.46–7.89] | [7.89–15.31] | [15.31–22.73] | > 22.73 |
| Mediterranean Sea temperature (°C) | < 13.84 | [13.84–17.86] | [17.86–21.87] | [21.87–25.85] | > 25.85 |
| Red Sea temperature (°C) | < 19.03 | [19.03–23.14] | [23.14–27.25] | [27.25–31.36] | > 31.36 |

Figure 3

The codebook for antecedents, consequences and degrees of reliability: (a) the Black Sea, (b) the Mediterranean Sea, (c) the Red Sea, (d) reliability of Z-numbers, (e) Tabriz precipitation and (f) Kermanshah precipitation.

Results of suggested ZBM and conventional fuzzy method

It is worth noting that the association rule technique treats the linguistic terms as intervals, but these interval sets must be transformed to fuzzy sets for fuzzy logic-based modeling. In this regard, +5% and −5% (95% confidence) were applied to the minimum and maximum values of each interval, respectively. Furthermore, trapezoidal membership functions were used for the minimum and maximum fuzzy numbers, while triangular membership functions were used for the other fuzzy numbers (see Figure 3). Figure 3 illustrates the codebook for antecedents, consequents, and degrees of reliability. For example, according to the rigid bounds in Table 2, the interval [13.84, 17.86] for the Mediterranean Sea is considered low (L); after conversion to a fuzzy set, L is represented by a triangular fuzzy number.
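For readers unfamiliar with the two membership shapes, a minimal sketch is given below. The function names are hypothetical, the parameters are assumed to satisfy a < b < c (triangular) and a < b ≤ c < d (trapezoidal), and the ±5% adjustment of the interval ends is not modelled. Placing the peak of L at the interval midpoint is an assumption made only for this illustration; the actual codebook is the one in Figure 3.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy number with support [a, c] and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy number with support [a, d] and core [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Mediterranean "L" interval from Table 2, with the peak assumed at the midpoint
low, high = 13.84, 17.86
peak = (low + high) / 2
membership_at_peak = triangular(peak, low, peak, high)
membership_halfway = triangular((low + peak) / 2, low, peak, high)
```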

The teleconnection patterns between the MP(t) and SSTs data were discovered by association mining, and then if-then rules were created. In Z-rules, for the consequent part, the confidence measure of each association rule was determined to specify the degree of rule reliability. In addition, for the antecedent part, the degree of reliability was determined with the probability of occurrence of SST categories. So, the degrees of reliability for SSTs categories of all three seas were determined similarly as:

  • I.

    If the class of SST is VL or VH, the degree of reliability is VL.

  • II.

    If the class of SST is L or H, the degree of reliability is L.

  • III.

    If the class of SST is M, the degree of reliability is HM.

To discover the most dominant lags between SSTs and MP events, numerous input combinations with different lagged inputs were explored to calibrate and verify the models. The dominant lags were first sought using cross-correlation functions (CCFs) between the MP(t) and SST(t-i) time series. In addition, seasonal differencing (SST(t) − SST(t-12)) was employed to eliminate trends from the SST series, and various combinations of the de-trended SST lags were used in the simulation. However, no significant rule was extracted for these lags by association mining. Because the CCF is a linear measure, there is no guarantee that the lags it identifies are the best options for capturing non-linear connections. The comparative analysis showed that modeling with SSTs without pre-processing (and with the same time delays for all three seas) might lead to better outcomes than modeling with de-trended data. This might be owing to the nonlinear filters used in the ZBM (e.g., membership functions), which eliminate the need for other data pre-processing approaches (e.g., de-trending). Therefore, by trial and error, the best delays were identified as 1 to 5 months and were used in the simulation.
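The two pre-processing tools mentioned here, seasonal differencing and lagged cross-correlation, can be sketched with NumPy as follows. The function names are hypothetical and the series are synthetic (a noisy annual cycle, with MP constructed to follow SST by three months), so the result only illustrates the mechanics and says nothing about the real data.

```python
import numpy as np

def seasonal_difference(series, period=12):
    """Return x(t) - x(t - period), dropping the first `period` values."""
    x = np.asarray(series, dtype=float)
    return x[period:] - x[:-period]

def lagged_correlation(sst, mp, lag):
    """Pearson correlation between SST(t - lag) and MP(t) for lag >= 1."""
    sst = np.asarray(sst, dtype=float)
    mp = np.asarray(mp, dtype=float)
    return float(np.corrcoef(sst[:-lag], mp[lag:])[0, 1])

# Synthetic monthly series for illustration: MP follows SST with a 3-month delay
rng = np.random.default_rng(0)
months = np.arange(240)
sst = 20 + 5 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 0.3, months.size)
mp = np.roll(sst, 3) + rng.normal(0, 0.3, months.size)

ccf = {lag: lagged_correlation(sst, mp, lag) for lag in range(1, 6)}
best_lag = max(ccf, key=ccf.get)
sst_detrended = seasonal_difference(sst)
```

As the text notes, such a linear measure can miss non-linear links, which is why the lags were finally chosen by trial and error.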

To construct the Z-rules, the potential of the patterns extracted via association mining was investigated. If the extracted patterns were acceptable in terms of the confidence and support criteria (i.e., patterns with confidence > 0.6), the Z-rules required for modeling with Z-numbers were constructed. The Z-rules were created using the previously established linguistic variables (see Figure 3). Examples of the examined rules are given in Table 3.

Table 3

Examples for Z if-then rules for Kermanshah precipitation station, lag 3

| Rule No. | If Black Sea (t − 3) temp is | If Mediterranean Sea (t − 3) temp is | If Red Sea (t − 3) temp is | Then Kermanshah Precipitation (t) is |
|---|---|---|---|---|
| 1 | (M,HM) | (M,HM) | (VL,VL) | (H,EH) |
| 2 | (L,L) | (M,HM) | (M,HM) | (H,EH) |
| 3 | (M,HM) | (VH,VL) | (H,L) | (H,EH) |
| 4 | (VH,VL) | (M,HM) | (H,L) | (H,EH) |
| 5 | (L,L) | (M,HM) | (L,L) | (H,H) |
| 6 | (M,HM) | (H,L) | (M,HM) | (H,H) |
| 7 | (H,L) | (VH,VL) | (M,HM) | (H,H) |
| 8 | (L,L) | (M,HM) | (VL,VL) | (H,HM) |
| 9 | (M,HM) | (M,HM) | (L,L) | (H,HM) |
| 10 | (L,L) | (L,L) | (VL,VL) | (H,HM) |
| 11 | (VL,VL) | (L,L) | (L,L) | (H,LM) |
| 12 | (H,L) | (H,L) | (H,L) | (H,LM) |
| 13 | (VH,VL) | (H,L) | (H,L) | (H,LM) |
| 14 | (VH,VL) | (VH,VL) | (H,L) | (H,LM) |
| 15 | (M,HM) | (M,HM) | (M,HM) | (H,LM) |
| 16 | (VL,VL) | (L,L) | (VL,VL) | (H,LM) |
| 17 | (M,HM) | (L,L) | (L,L) | (H,L) |
| 18 | (H,L) | (H,L) | (M,HM) | (H,L) |
| 19 | (H,L) | (VH,VL) | (H,L) | (H,L) |
| 20 | (L,L) | (L,L) | (L,L) | (H,L) |
| 21 | (M,HM) | (L,L) | (M,HM) | (H,VL) |
| 22 | (M,HM) | (M,HM) | (H,L) | (H,VL) |
| 23 | (L,L) | (L,L) | (M,HM) | (H,VL) |
| 24 | (H,L) | (M,HM) | (H,L) | (H,VL) |
| 25 | (H,L) | (M,HM) | (M,HM) | (H,VL) |
| 26 | (VL,VL) | (M,HM) | (VL,VL) | (H,VL) |

In this work, a Mamdani fuzzy inference system (FIS) with min implication, max aggregation, and centroid defuzzification was employed (Jayawardena et al. 2014). In general, with n inputs of m classes each, the number of rules in traditional fuzzy logic approaches may be up to m^n (in this example, up to 5^3 = 125). However, owing to association rule mining, only around 26 rules were evaluated and analyzed for each model in this study (e.g., see Table 3). In such a case (a sparse rule base), classical reasoning procedures cannot generate an outcome for a sample that is covered by no rule (Aliev et al. 2016). In this study, inference by weighting the Z-rules (the Z-interpolation method) was used to conduct approximate reasoning in the absence of matching rules.

For verification and comparison purposes, the ZBM output as a Z-number (a pair of fuzzy numbers) must be transformed to a single value. Based on its reliability, the Z-number output can be transformed to an interval value. According to the codebook, if the middle value of the reliability part (in this case, a triangular fuzzy number) is greater than 0.49, the first part of the Z-number is accepted. For instance, if the middle value of the output's reliability part is 0.7, Tabriz precipitation is predicted as high (0.7 > 0.49), whereas if it is 0.3, Tabriz precipitation is predicted as low (0.3 < 0.49).
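The decision rule can be written as a small function. This is a sketch under an assumption: when the reliability peak does not exceed 0.49, the opposite class is returned, which matches the high/low example in the text but is an interpretation rather than a stated rule. The function name and the left and right ends of the triangular numbers are hypothetical.

```python
def z_output_to_class(constraint, reliability_tfn, cutoff=0.49):
    """Accept the constraint part if the reliability peak exceeds the cutoff, else flip it."""
    _, peak, _ = reliability_tfn  # triangular fuzzy number given as (left, peak, right)
    if peak > cutoff:
        return constraint
    # Assumed behaviour: a rejected 'H' is reported as 'L', and vice versa
    return "L" if constraint == "H" else "H"

# Peaks 0.7 and 0.3 follow the example in the text; the spreads are illustrative
accepted = z_output_to_class("H", (0.6, 0.7, 0.8))
rejected = z_output_to_class("H", (0.2, 0.3, 0.4))
```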

Here, the prediction of the future state with the suggested ZBM is illustrated with an example. Assume the calculation of Kermanshah precipitation at time t according to values of SSTs at time t-3 (Black Sea = 14.365 °C, Mediterranean Sea = 24.881 °C, Red Sea = 27.805 °C) observed in October 2015:

  • I.

    Based on SST's categorization and Figure 3, the numerical inputs are transformed to fuzzy numbers. As a result, the numbers (14.365, 24.881, 27.805) are transformed to (M, H, H).

  • II.

    By evaluating the reliability of fuzzy sets, the (M, H, H) are transformed to Z-numbers ((M,HM), (H,L), (H,L)).

  • III.

    By weighting the rules, the most appropriate rules are chosen (based on Equations (1)–(7)). In this case, three rules are selected.

  • IV.

    Now, using the approach given in Appendix A, the consequents of the selected rules (as Z-numbers) are aggregated by taking into account their estimated weights. The consequents of the three selected rules are (precipitation = H, reliability = EH), (precipitation = H, reliability = H) and (precipitation = H, reliability = VL), and the output Z-number is computed from them. According to the codebook (Figure 3), this output, with very high reliability, indicates that the Kermanshah precipitation in January 2016 will be high.
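Steps I and II of this example can be reproduced with a short script. It is a simplified sketch: it uses the crisp class boundaries of Table 2 rather than the fuzzy codebook of Figure 3, and the names `to_z_number`, `BOUNDS` and `RELIABILITY` are hypothetical. Steps III and IV (rule weighting and aggregation) are not covered.

```python
import bisect

LABELS = ["VL", "L", "M", "H", "VH"]

# Class boundaries from Table 2 (degrees Celsius)
BOUNDS = {
    "Black": [0.46, 7.89, 15.31, 22.73],
    "Mediterranean": [13.84, 17.86, 21.87, 25.85],
    "Red": [19.03, 23.14, 27.25, 31.36],
}

# Reliability assigned to each SST class for the antecedent part
RELIABILITY = {"VL": "VL", "VH": "VL", "L": "L", "H": "L", "M": "HM"}

def to_z_number(sea, sst):
    """Return the (class, reliability) pair for one SST value."""
    label = LABELS[bisect.bisect_right(BOUNDS[sea], sst)]
    return (label, RELIABILITY[label])

# SSTs observed in October 2015 (time t-3 in the worked example)
inputs = {"Black": 14.365, "Mediterranean": 24.881, "Red": 27.805}
z_inputs = [to_z_number(sea, sst) for sea, sst in inputs.items()]
print(z_inputs)  # [('M', 'HM'), ('H', 'L'), ('H', 'L')]
```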

To fully explain how to apply the suggested ZBM to model MP, an example has been presented in Appendix A.

In this research as a binary type modeling, two types of output could be achieved (i.e., high or low MP). So the results of the modeling could be expressed as:

  • I.

    True High (TH) is the number of high predictions that are correct.

  • II.

    True Low (TL) is the number of low predictions that are correct.

  • III.

    False Low (FL) is the number of low predictions that are incorrect, that is, the number of high observations predicted as low.

  • IV.

    False High (FH) is the number of high predictions that are incorrect, that is, the number of low observations predicted as high.

As shown in Table 4, these four possible results form a matrix called the confusion matrix (Danandeh Mehr et al. 2017). Its rows represent the observed data, while its columns represent the predicted data. In addition, f indicates the total number of predictions.

Table 4

Confusion matrix for binary modeling

| f | Predicted as High | Predicted as Low |
|---|---|---|
| Observed High | TH | FL |
| Observed Low | FH | TL |

To evaluate the performances of the ZBM and the traditional fuzzy method (as a benchmark model), the total accuracy (TA) and Heidke Skill Score (HSS) were computed and are presented in Table 5. Because they require numerical (crisp) values, traditional assessment measures such as the determination coefficient cannot be employed in binary classification (e.g., see Sharghi et al. 2018). The TA (ranging from 0 to 100%) and HSS (ranging from −∞ to 1) are suitable alternatives in this case (Nourani et al. 2021). HSS = 1 denotes a perfect model, 0 means no skill, and negative values indicate that a chance forecast is better (see Appendix A for more details about the efficiency criteria used). It is worth noting that the TA criterion alone may not accurately reflect the model's performance. With the given threshold (T = 35%), the number of H data (only 35% of the total) is lower than the number of L data, so biased predictions may affect the TA. For example, the TA for low events, as the majority group, is expected to always be greater than that for H events, as the minority group. For this reason, the HSS was used to compare the suggested ZBM with the traditional fuzzy model. To this end, the confusion matrices for both stations and all considered lags were prepared (see Table 6, as an example). The most common outcome, as shown in Table 6, is a correctly predicted low event. This indicates that TA findings can be biased, so the TA criterion should be used in conjunction with the HSS to assess the ability of the models.
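Both criteria follow directly from the four cells of the confusion matrix. The sketch below uses the standard 2×2 form of the HSS, which is assumed to be the one given in Appendix A; the function names are hypothetical. The counts are the Kermanshah training values for lag 1 from Table 6.

```python
def total_accuracy(th, fl, fh, tl):
    """Share of correct predictions, in percent."""
    return 100.0 * (th + tl) / (th + fl + fh + tl)

def heidke_skill_score(th, fl, fh, tl):
    """Heidke Skill Score for a 2x2 confusion matrix (standard form)."""
    numerator = 2.0 * (th * tl - fh * fl)
    denominator = (th + fl) * (fl + tl) + (th + fh) * (fh + tl)
    return numerator / denominator

# Kermanshah, lag 1, training counts from Table 6: TH, FL, FH, TL
ta = total_accuracy(156, 48, 79, 293)
hss = heidke_skill_score(156, 48, 79, 293)
print(round(ta, 2), round(hss, 2))  # 77.95 0.53
```

A model that always predicted the majority class L would still reach a fairly high TA but an HSS of zero, which is the bias the paragraph above warns about.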

Table 5

The performance results of the ZBM and traditional fuzzy model (TA in %)

| Station | Lag No. | Z-number TA (Train) | Z-number TA (Test) | Z-number HSS (Train) | Z-number HSS (Test) | Traditional fuzzy TA (Train) | Traditional fuzzy TA (Test) | Traditional fuzzy HSS (Train) | Traditional fuzzy HSS (Test) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kermanshah | 1 | 77.95 | 70.27 | 0.53 | 0.37 | 61.98 | 57.81 | 0.32 | 0.28 |
| Kermanshah | 2 | 70.49 | 69.79 | 0.35 | 0.29 | 56.94 | 53.13 | 0.25 | 0.19 |
| Kermanshah | 3 | 72.74 | 70.31 | 0.36 | 0.26 | 53.47 | 50.52 | 0.21 | 0.18 |
| Kermanshah | 4 | 74.31 | 69.79 | 0.45 | 0.34 | 55.73 | 46.88 | 0.24 | 0.14 |
| Kermanshah | 5 | 72.92 | 70.06 | 0.46 | 0.36 | 65.80 | 58.33 | 0.37 | 0.27 |
| Tabriz | 1 | 76.14 | 75.25 | 0.44 | 0.42 | 60.59 | 60.49 | 0.30 | 0.29 |
| Tabriz | 2 | 74.83 | 72.64 | 0.35 | 0.30 | 65.10 | 56.99 | 0.34 | 0.24 |
| Tabriz | 3 | 74.83 | 73.40 | 0.39 | 0.38 | 69.44 | 65.05 | 0.38 | 0.29 |
| Tabriz | 4 | 76.53 | 75.37 | 0.37 | 0.35 | 48.26 | 46.24 | 0.13 | 0.12 |
| Tabriz | 5 | 69.27 | 68.97 | 0.38 | 0.37 | 55.90 | 43.55 | 0.23 | 0.10 |

Table 6

The confusion matrices of ZBM for the Kermanshah station

| Lag No. | Observed MP | Train (f = 576, note a): predicted High | Train: predicted Low | Test (f = 192, note b): predicted High | Test: predicted Low |
| --- | --- | --- | --- | --- | --- |
| 1 | High | 156 | 48 | 40 | 22 |
| 1 | Low | 79 | 293 | 33 | 97 |
| 2 | High | 116 | 89 | 29 | 33 |
| 2 | Low | 81 | 290 | 25 | 105 |
| 3 | High | 95 | 110 | 23 | 39 |
| 3 | Low | 47 | 324 | 18 | 112 |
| 4 | High | 140 | 64 | 38 | 25 |
| 4 | Low | 84 | 288 | 33 | 96 |
| 5 | High | 168 | 35 | 51 | 12 |
| 5 | Low | 121 | 252 | 50 | 79 |

a Denotes the number of train samples.

b Denotes the number of test samples.
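The relation between a confusion matrix and the two criteria can be illustrated with a minimal sketch. It assumes the standard 2x2 definition of the HSS (the paper's own definition is in Appendix A, which is not reproduced here) and treats "High" as the event class; the function names are hypothetical. The first example uses the first training matrix of Table 6, and the second is a hypothetical forecast that always predicts "Low", included only to show why TA alone can mislead.

```python
def total_accuracy(hits, misses, false_alarms, correct_negatives):
    """Percentage of correctly classified months (TA, 0 to 100%)."""
    n = hits + misses + false_alarms + correct_negatives
    return 100.0 * (hits + correct_negatives) / n


def heidke_skill_score(hits, misses, false_alarms, correct_negatives):
    """Standard 2x2 Heidke Skill Score: 1 is perfect, 0 is no skill over chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    numerator = 2.0 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


# Kermanshah, first matrix of Table 6, training step (576 samples).
# "High" is treated as the event class (an assumption of this sketch).
zbm_cells = dict(hits=156, misses=48, false_alarms=79, correct_negatives=293)
ta = total_accuracy(**zbm_cells)
hss = heidke_skill_score(**zbm_cells)
print(f"ZBM: TA = {ta:.2f}%, HSS = {hss:.2f}")

# Hypothetical forecast that always predicts "Low" for the same observations:
# all 204 observed High months are missed, all 372 Low months are correct.
always_low = dict(hits=0, misses=204, false_alarms=0, correct_negatives=372)
ta_low = total_accuracy(**always_low)
hss_low = heidke_skill_score(**always_low)
print(f"Always low: TA = {ta_low:.2f}%, HSS = {hss_low:.2f}")
```

Under these assumptions the first matrix gives TA = 77.95% and HSS = 0.53, in line with the first Kermanshah training row of Table 5, while the always-low forecast still reaches a TA of about 65% with an HSS of zero.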

Comparing the obtained results

According to the HSS criterion (Table 5), at the test step for unseen data, the ZBM outperformed the traditional fuzzy model by an average of 69% for Kermanshah and 112% for Tabriz. This indicates that the ZBM, in addition to providing the reliability of the output, outperforms the traditional fuzzy model even after its output is transformed to an interval (H or L). For example, a Z-number output with restriction H and reliability EH indicates high MP events with extremely high reliability for Tabriz (see Figure 3).
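The averaging rule behind these percentages is not stated in the text. As a rough check, the sketch below assumes the figure is the mean over the five lags of the relative HSS gain at the test step, using the two-decimal values from Table 5; the function and variable names are hypothetical.

```python
# Test-step HSS values for lags 1 to 5, copied from Table 5.
hss_test = {
    "Kermanshah": {
        "zbm": [0.37, 0.29, 0.26, 0.34, 0.36],
        "fuzzy": [0.28, 0.19, 0.18, 0.14, 0.27],
    },
    "Tabriz": {
        "zbm": [0.42, 0.30, 0.38, 0.35, 0.37],
        "fuzzy": [0.29, 0.24, 0.29, 0.12, 0.10],
    },
}


def mean_relative_improvement(zbm, fuzzy):
    """Mean over lags of the percentage HSS gain of the ZBM over the benchmark.

    Assumed averaging rule: per-lag relative gains are averaged with equal weight.
    """
    gains = [100.0 * (z - f) / f for z, f in zip(zbm, fuzzy)]
    return sum(gains) / len(gains)


improvement = {
    station: mean_relative_improvement(values["zbm"], values["fuzzy"])
    for station, values in hss_test.items()
}
for station, value in improvement.items():
    print(f"{station}: {value:.1f}%")
```

With the rounded table values, this assumed rule gives about 112% for Tabriz, in line with the reported figure, and about 61% for Kermanshah, somewhat below the reported 69%; the gap may come from rounding in Table 5 or from a different averaging rule.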

From a practical standpoint, the ZBM can therefore be a credible alternative for prediction, especially for unseen data, owing to its aforementioned benefits in dealing with the lack of rules and evaluating the data reliability. At the test phase, the maximum HSS values were 0.42 for Tabriz (delay 1) and 0.37 for Kermanshah (delay 1). The TA values for these delays, as the best ZBMs, were 75% for Tabriz and 70% for Kermanshah. Thus, the ZBM described H and L occurrences of MP by simultaneously using SSTs from the Black, Mediterranean, and Red Seas, with over 70% confidence.

The most effective lag of the SST time series for the ZBM may be found by comparing the HSS values obtained using SSTs with distinct delays. Table 5 indicates that, for both stations, the optimum ZBM was obtained at lag 1. This may be because the overall distances between the stations and the adjacent seas are nearly the same (5200 km). As a result, the obtained findings not only show the dynamic teleconnections between MP events and SSTs of adjacent seas but also confirm the validity of the chosen delays (1–5 delays).
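The lag selection described above amounts to picking the delay with the highest test-step HSS. A minimal sketch, using the ZBM values from Table 5 and a hypothetical helper name, could look like this:

```python
# Test-step HSS of the ZBM for lags 1 to 5, copied from Table 5.
zbm_test_hss = {
    "Kermanshah": [0.37, 0.29, 0.26, 0.34, 0.36],
    "Tabriz": [0.42, 0.30, 0.38, 0.35, 0.37],
}


def best_lag(scores):
    """Return the 1-based lag with the highest HSS (the first one on ties)."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1


best = {station: best_lag(scores) for station, scores in zbm_test_hss.items()}
print(best)
```

For both stations this selects lag 1, as stated in the text.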

SST is a very important variable in the earth's climate system. Being at the interface of the ocean and the atmosphere, SST is critical to both, and to the exchanges of heat, moisture, momentum, and gases between the two (O'Carroll et al. 2019). According to the rules derived for lag 1 with reliability greater than HM (H, VH, and EH), high MP events at both stations occurred with combinations of the M, L, and VL categories of SSTs, and no H or VH SST category appeared in the derived rules. This confirms previous results (e.g., Ghasemi & Khalili 2008) that wet conditions in Iran are often accompanied by a negative SST anomaly in the Mediterranean and Black Seas. For Tabriz station, the frequencies of high MP events (with T = 35%) were 32% for winter, 40% for spring, 2.5% for summer and 25.5% for autumn. For Kermanshah station, the corresponding frequencies were 49% for winter, 17.5% for spring, 0% for summer and 33.5% for autumn. In summer, when H and VH SSTs occur, high MP events were infrequent at both stations, which is why no rule with an H or VH SST category was derived and why high MP events were associated only with the M, L and VL SST classes in the rule antecedents. Note that, under the binary assumption of precipitation in this study (L or H), a rule whose consequent is precipitation = ‘H’ with reliability = ‘VL’ is equivalent to precipitation = ‘L’ with reliability = ‘EH’. Furthermore, as high MP events (the MP values higher than 90th percentile) were infrequent compared to L events, the rules with high reliability for L precipitation events were far more numerous than those for high MP events.

In terms of geographical analysis, the ZBM for Tabriz performed better than that for Kermanshah for all examined lags (see Table 5). Higher values of the coefficient of variation and standard deviation for Kermanshah compared to Tabriz indicate that the Kermanshah data are more irregular (see Table 1), which resulted in better modeling performance for Tabriz. Also, as shown in Table 5, the increase in modeling efficiency using the Z-number compared to the traditional fuzzy model was larger for Tabriz (an average of 112%) than for Kermanshah (an average of 69%). This might be related to the linear correlation between SSTs and MP, which is greater in Kermanshah (∼ 0.6) than in Tabriz (∼ 0.4). The ZBM improves model performance by using strong nonlinear filters, and this gain was larger for Tabriz, where the linear correlation is weaker.

The Zagros and Alborz Mountains, the two large mountain chains of Iran, are situated in the northwest, west, and north of the country. Precipitation fluctuations in most regions of Iran are not significant, and such fluctuations can be found only in western Iran (Raziei et al. 2009). In addition, only the west of Iran is affected by SSTs of the Red, Mediterranean and Black Seas. For this reason, in this study, the data of Tabriz station (in northwestern Iran) and Kermanshah station (close to the center of western Iran), which have data of appropriate quality and quantity, were used as representative of western Iran. Previous studies (e.g., Nazemosadat et al. 2006; Raziei et al. 2009; Dezfuli et al. 2010; Hosseinzadeh Talaee et al. 2014) have mostly focused on the correlation between Iran's regional climate and the El Niño-Southern Oscillation (ENSO) and NAO. However, Iran is surrounded by seas, and using their SSTs in modeling might enhance the forecasts. The results of this study confirmed this hypothesis and indicated that SSTs, even without any other indices, are appropriate predictors of future states of MP events in western Iran. This may be because approximately 70% of Iran's precipitation originates in either the Black Sea or the Mediterranean Sea. The other 30% originates in North Africa and the Red Sea and comes to Iran via Saudi Arabia and the Persian Gulf (Kendrew 1922; Ghasemi & Khalili 2008).

In this research, as a binary type of modeling, two output classes were considered (i.e., high or low MP). However, the output could be classified into more classes if needed. In this study, the MP events of only two stations were investigated; to further explore the ability of the ZBM for teleconnection modeling between MP events and SSTs, it is recommended to apply it to multiple precipitation series with diverse characteristics over the whole country. In addition, this study applied only the SSTs as predictors, but other indices such as NAO, ENSO, etc., could be used for teleconnection modeling of hydro-climatic events.

Due to the uncertainties of hydro-climatic systems, fuzzy logic has been increasingly utilized to describe the ambiguity of such systems. The classic fuzzy logic methods do not consider the reliability of the information. However, by using the Z-number, in addition to the constraint of information, it is possible to characterize the degree of reliability of data. Predicting MP events (H or L), as a complex natural process, is associated with high uncertainty. It seems necessary to develop models that can control this uncertainty, especially in regions such as Iran, which is categorized as arid to semi-arid. In this regard, by developing the ZBM, monthly SSTs of the Black, Mediterranean, and Red Seas were used to predict the classified MP of the two stations in western Iran. The derived outcomes were compared to the outcomes of the classic fuzzy approach using the TA and HSS criteria.

According to the obtained results, teleconnection parameters such as the SSTs of surrounding seas at different lags could be applied as predictors of MP events. The results indicated that, even for test data (unseen data), by evaluating the data reliability and assigning weights to the if-then rules, the ZBM improved the results compared to the conventional fuzzy model by 69% for Kermanshah and 112% for Tabriz. In addition, the performance of the ZBM for Tabriz was better than for Kermanshah because of the distinct precipitation patterns over these regions. Therefore, the ZBM can be an effective prediction tool, especially for unseen test data when matching rules are incomplete or missing, because it uses inference techniques for approximate reasoning.

Consequently, the Z-number concept, by assessing event reliability, can be used in various sectors of water resources management, and it improves the modeling efficiency.

All relevant data are available from online repositories (SST data: https://psl.noaa.gov/cgi-bin/data/timeseries/timeseries1.pl; monthly precipitation data: https://www.irimo.ir/).

Aliev R. A., Pedrycz W., Huseynov O. H. & Eyupoglu S. Z. 2016 Approximate reasoning on a basis of Z-number-valued if–then rules. IEEE Transactions on Fuzzy Systems 25, 1589–1600.

Dadaser-Celik F., Celik M. & Dokuz A. 2013 Associations between stream flow and climatic variables at Kizilirmak river basin in Turkey. Global NEST Journal 14, 354–361. https://doi.org/10.30955/gnj.000881.

Danandeh Mehr A., Nourani V., Hrnjica B. & Molajou A. 2017 A binary genetic programing model for teleconnection identification between global sea surface temperature and local maximum monthly rainfall events. Journal of Hydrology 555, 397–406. https://doi.org/10.1016/j.jhydrol.2017.10.039.

Dezfuli A. K., Karamouz M. & Araghinejad S. 2010 On the relationship of regional meteorological drought with SOI and NAO over southwest Iran. Theoretical and Applied Climatology 100, 57–66. https://doi.org/10.1007/s00704-009-0157-2.

Dhanya C. T. & Nagesh Kumar D. 2009 Data mining for evolution of association rules for droughts and floods in India using climate inputs. Journal of Geophysical Research: Atmospheres 114, 1–15.

Glukhoded E. A. & Smetanin S. I. 2016 The method of converting an expert opinion to Z-number. Proceedings of the Institute for System Programming of the RAS 28, 7–20. https://doi.org/10.15514/ISPRAS-2016-28(3)-1.

Hosseinzadeh Talaee P., Tabari H. & Sobhan Ardakani S. 2014 Hydrological drought in the west of Iran and possible association with large-scale atmospheric circulation patterns. Hydrological Processes 28, 764–773.

Jayawardena A. W., Perera E. D. P., Zhu B., Amarasekara J. D. & Vereivalu V. 2014 A comparative study of fuzzy logic systems approach for river discharge prediction. Journal of Hydrology 514, 85–101.

Kang B., Deng Y., Hewage K. & Sadiq R. 2018 A method of measuring uncertainty for Z-number. IEEE Transactions on Fuzzy Systems 27, 731–738.

Kendrew W. G. 1922 The Climates of the Continents. Oxford University Press, Oxford, UK.

Kóczy L. T. & Hirota K. 1991 Rule interpolation by α-level sets in fuzzy approximate reasoning. J. BUSEFAL, Automne, URA-CNRS 46, 115–123.

Nazemosadat M. J., Samani N., Barry D. A. & Molaii Niko M. 2006 ENSO forcing on climate change in Iran: precipitation analysis. Iranian Journal of Science and Technology Transaction B: Engineering 30, 555–565.

Nourani V., Najafi H., Sharghi E. & Roushangar K. 2021 Application of Z-Numbers to monitor drought using large-scale oceanic-atmospheric parameters. Journal of Hydrology 598, 126198.

O'Carroll A. G., Armstrong E. M., Beggs H. M., Bouali M., Casey K. S., Corlett G. K., Dash P., Donlon C. J., Gentemann C. L. & Høyer J. L. 2019 Observational needs of sea surface temperature. Frontiers in Marine Science 6, 420.

Rahimikhoob A. 2010 Forecasting of maximum monthly precipitation of Ilam using data mining techniques. Iranian Journal of Soil and Water Research 42, 1–7 (In Persian).

Raziei T., Saghafian B., Paulo A. A., Pereira L. S. & Bordi I. 2009 Spatial patterns and temporal variability of drought in western Iran. Water Resources Management 23, 439.

Sharghi E., Nourani V., Najafi H. & Molajou A. 2018 Emotional ANN (EANN) and Wavelet-ANN (WANN) approaches for markovian and seasonal based modeling of rainfall-runoff process. Water Resources Management 32, 3441–3456. https://doi.org/10.1007/s11269-018-2000-y.

Stefan S., Ghioca M., Rimbu N. & Boroneant C. 2004 Study of meteorological and hydrological drought in southern Romania from observational data. International Journal of Climatology 24, 871–881.

Tadesse T., Wilhite D. A., Harms S. K., Hayes M. J. & Goddard S. 2004 Drought monitoring using data mining techniques: a case study for Nebraska, USA. Natural Hazards 33, 137–159.

Webster P. J., Magana V. O., Palmer T. N., Shukla J., Tomas R. A., Yanai M. U. & Yasunari T. 1998 Monsoons: processes, predictability, and the prospects for prediction. Journal of Geophysical Research: Oceans 103, 14451–14510.

Zadeh L. A. 2011 A note on Z-numbers. Information Sciences 181, 2923–2932. https://doi.org/10.1016/j.ins.2011.02.022.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
