Abstract
This paper studies, through the principles of fuzzy set theory, groundwater response to meteorological drought in the case of an aquifer system located in the plains at the southeast of Xanthi, NE Greece. Meteorological drought is expressed through standardized Reconnaissance Drought Index (RDISt) and Standardized Precipitation Index (SPI), which are calculated for various reference periods. These drought indices are considered as independent variables in multiple fuzzy linear regression based on Tanaka's model, while the observed water table regarding two areas is used as a dependent variable. The fuzzy linear regression of Tanaka is characterized by the inclusion constraints where all the observed data must be included in the produced fuzzy band. Hence, each fuzzy output can get an interval of values where a membership degree corresponds to each of them. A modification of the Tanaka model by adding constraints is proposed in order to avoid irrational behavior. The results show that there was a significant influence of the meteorological drought of the previous hydrological year, while geology plays an important role. Furthermore, the use of RDISt improves the results of fuzzy linear regressions in all cases. Two suitability measures and a measure of comparison between fuzzy numbers are used.
HIGHLIGHTS
The fuzzy linear regression of Tanaka is applied between meteorological drought indices and groundwater level in an aquifer system of the Xanthi plain.
Two suitability measures are used. The first one includes the objective function of Tanaka's model, while the second one is a fuzzified measure.
A ranking measure is used in order to compare the fuzzified measures of suitability.
INTRODUCTION
Drought is a recurrent phenomenon caused by natural perturbations of precipitation and is associated with moisture deficits. During droughts, the availability of water resources is below the standard conditions of the system for a considerable time period, a situation which can extend over a significant area (Rossi et al. 1992; Tsakiris et al. 2013). The normal conditions of a system differ from region to region and between climatic zones; thus, drought must be considered not as an absolute, but as a relative condition that may occur in both high- and low-rainfall areas (Wilhite et al. 2014). Drought is characterized as a creeping phenomenon, since it affects systems slowly and cumulatively. Therefore, the accurate determination of its onset and termination is not simple (Wilhite et al. 2014). Apart from its creeping trend, the multidimensional character of the phenomenon and the process by which it propagates through the land phase of the hydrological cycle increase its complexity (Peters et al. 2006; Van Loon & Laaha 2015). Compared to other natural disasters, drought has the most hazardous characteristics for environmental and anthropogenic systems (Bryant 1991; Wilhite 2000; Mishra & Singh 2010). Generally, drought is divided into meteorological, hydrological and agricultural drought. In addition, scientists have identified socioeconomic drought by taking into account its economic effects on human societies. Analysis of these types of drought is based mainly on the use of drought indices. These indices make information on climatic anomalies more comprehensible and may be used to determine the severity and the temporal and spatial extent of drought. These three elements constitute the three dimensions of drought (Guttmann 1998; Hayes 2000; Nohegar & Heydarzadeh 2016).
The effect of drought on groundwater is of particular importance, given that groundwater is the main water source during droughts due to the depletion of surface water resources (Barkey & Bailey 2017). Shallow aquifers, compared with deeper ones, are affected more by local than by regional climate conditions (amount of rainfall in an area and temperature) (Apaydin 2009), while rainfall and temperature are the two main meteorological drought parameters. Moreover, in lowland areas, unconfined aquifers often meet irrigation needs and therefore are under additional stress.
In recent years, a growing number of studies have investigated the effect of meteorological drought on groundwater through various approaches. These studies focus on exploring the relationship between meteorological drought parameters and hydrogeological ones. For instance, Tallaksen et al. (2006) use time series of rainfall, streamflow and potential evapotranspiration in order to simulate groundwater recharge, head and discharge of an unconfined aquifer (in most parts) by using SWAP (Soil Water Atmosphere Plant) and MODFLOW. Then, all the above hydrometeorological parameters are utilized in order to determine drought events in time and space on the basis of the methodology of Peters et al. (2006), where the outputs of the model are drought duration, average area in drought and average deficit volume.
The investigation of the relationship between meteorological drought and groundwater may also be conducted through drought indices. The well-known SPI (Standardized Precipitation Index) (McKee et al. 1993) is correlated with hydrological and hydrogeological variables (Chamanpira et al. 2014; Bloomfield et al. 2015; Lorenzo-Lacruz et al. 2017), which can be expressed in terms of indices (Mendicino et al. 2008; Bloomfield & Marchant 2013). In the research of Soltani et al. (2013), SPI is directly correlated and related to the observed groundwater-level data from 236 wells for various time scales. Another drought index with growing use in assessing the effects of meteorological drought on groundwater is the Reconnaissance Drought Index (RDI) suggested by Tsakiris & Vangelis (2005), while a more comprehensive description of it was presented in other publications (e.g. Tsakiris et al. 2007). Due to the fact that RDI takes into account both precipitation (P) and potential evapotranspiration (PET), it is a suitable index particularly for arid and semi-arid regions where high temperature and precipitation deficit occur. It is a useful and robust index as far as meteorological drought severity is concerned and independent from the PET calculation method (Vangelis et al. 2013). RDI can be calculated through three different forms: the initial, the normalized and the standardized. The standardized nature of RDI allows comparison between regions with different drought conditions.
In the research of Safioela et al. (2015), both RDI and SPI indices were used in order to examine the drought impacts on groundwater recharge of Belgian aquifers. Their methodology includes trend analysis on a long time series of rainfall and temperature for those two drought indices, while statistical tests are used to evaluate potential trends. Analysis is conducted in annual, semester and trimester time scales, while the study area is divided into cells of 10 × 10 km. In another research, Nohegar & Heydarzadeh (2016) analyze meteorological drought by using RDI and SPI and evaluate the effects of drought on groundwater resources for several reference periods (3, 6, 9, 12, 18 and 24 months). For the purposes of the analysis, the observations of groundwater level, which were measured every 6 months (August–February) from 1991 to 2009, were utilized in the case of both shallow and deeper aquifers. Shallow aquifers range between 5 and 35 m, while the deeper ones fluctuate between 80 and 440 m approximately. Then, RDI and SPI values are correlated with water table (WT) records where, according to the results, the greatest correlation, in almost the entire study region, is related to a long-term period. However, among several examined plains, only in the case of shallow aquifer of Jaghin to Kahur plain, groundwater level is highly correlated to the indices of the 3-month period.
All these studies significantly contribute to a better understanding of how meteorological drought may influence the groundwater. In all cases, analysis is carried out at a catchment or a regional scale, while long-term series of hydrometeorological variables are available. This research aims to investigate the effect of meteorological drought, through fuzzy linear regression analysis, on WT at a local scale in the case of lack of data. During the last two decades, fuzzy set theory has been applied in the modeling and management of hydrological problems by using a variety of autonomous or hybrid fuzzy methodologies (Makropoulos & Butler 2004; Mpimpas et al. 2008; Fu & Kapelan 2013; Theodoridou et al. 2017). Fuzzy logic has the ability to model the vagueness of operation mechanism of a system, which may arise either due to the lack of precise knowledge of its operation or due to the lack of available data or measurement errors, and thus, the uncertainty of complex systems can be incorporated through the analysis (Rajabi & Ataie-Ashtiani 2016). Particularly, fuzzy regression methodologies have been applied in the analysis of both qualitative and quantitative parameters (Bardossy et al. 1990; Guan & Aral 2004; Mathon et al. 2008; Chachi et al. 2014; Uddameri et al. 2014; Khan & Valeo 2015; Tzimopoulos et al. 2016a; Evangelides et al. 2017; Spiliotis et al. 2017; Tzimopoulos et al. 2018), and furthermore, where there is a lack of data (e.g. Ganoulis 2004). The system uncertainty is incorporated into regression coefficients, which are considered fuzzy numbers.
Other fuzzy techniques are based on systems of fuzzy IF-THEN rules. Inference with fuzzy rules provides a way for the elicitation of knowledge of experts (Ruan & Kerre 2000). Fuzzy inference systems can cooperate with artificial neural networks, developing hybrid methods which have the potential to combine the advantages of both in a single framework (e.g. Dixon 2005). Such a method is the ANFIS (Adaptive Neuro-Fuzzy Inference System), which has been widely applied in hydrologic problems (e.g. Dahiya et al. 2007; Jalalkamali 2015). Analysis with ANFIS or similar methods requires many data because these models must be trained. The majority of the applications of fuzzy sets and logic in hydrology are focused on the use of ANFIS, which is a hybrid fuzzy-neural approach, and other machine learning techniques. However, even if the errors remain within an acceptable range, sometimes the rational and logical basis of the application is erroneous (Sen 2010; Spiliotis et al. 2018a, 2020). Nevertheless, these approaches are very useful, since hydrogeological systems are complex and, hence, physically based models are very difficult to apply.
Fuzzy methodologies have also wide application in problems of multicriteria nature, often in combination with GIS techniques, as they can incorporate uncertainty and imprecision in the decision process (e.g. Kazakis et al. 2018; Kourgialas et al. 2018; Spiliotis et al. 2020). In addition, the conflict of the criteria creates a gray region during the decision. Several applications can be found in this topic, either based on the fuzzy multicriteria analysis (e.g. Kazakis et al. 2018; Spiliotis et al. 2020) or based on fuzzy multicriteria optimization (e.g. Tsakiris & Spiliotis 2006), since the problem of water resources management (a) involves several criteria and stakeholders and (b) there is an inherent uncertainty in the water cycle itself. Another important challenge is the application of fuzzy pattern recognition and its connection with multicriteria categorization methods (Xuesen et al. 2009; Spiliotis et al. 2020), since each point (e.g. alternative) could belong to several categories to some degree.
In this research, fuzzy multiple linear regressions based on Tanaka's approach (Tanaka 1987) are conducted in order to determine a fuzzy linear relationship between the standardized form of RDI (RDISt) and SPI, both of which are calculated for various reference periods, and WT. Fuzzy linear regression analysis is preferred in this application because fuzzy regression methods can be effective even when the sample size is very small (Bardossy et al. 1990; Kim et al. 1996; Kim & Bishu 1998). The fuzzy regression of Tanaka results in a constrained optimization problem where all data must be included in the produced fuzzy zone, whose width is minimized. Observations of WT, rainfall and temperature are utilized. The area under investigation is an aquifer system in the plain region of Xanthi Prefecture (Greece).
CASE STUDY
The area under investigation is the aquifer system of the agricultural plain, at the southeast of the city of Xanthi in the Prefecture of Xanthi, NE Greece (Figure 1). From a geological perspective, the wider region of the study area belongs to the Tertiary extension of Rhodope massif, and borders with the Tertiary trough of Vistonida Basin (Pisinaras et al. 2012; Pliakas et al. 2015). The bedrock of the plain region appears in the north mountain massifs where the Tertiary granite has intruded the Rhodope metamorphic rocks. The aquifers are located in recent Quaternary deposits which transition from fine up to silty clay as approaching the lake (Sakkas et al. 1998). The study area is characterized by a significant variability of its alluvial composition, which results in a considerable variation of the hydrogeological characteristics of the aquifers. According to the thorough investigations of Sakkas et al. (1998), Pliakas et al. (2003) and Pliakas et al. (2005), the area around Nea Kessani is composed of an upper clayey sand formation with a thickness of 8–80 m of low permeability, an intermediate permeable formation with a thickness of 10–70 m consisting of gravel sand and a lower impermeable clayey silt formation extended to a depth of 30–90 m. The aquifers of the upper formation are unconfined and become semi-confined up to confined at the intermediate and deeper formation. While approaching Vafeika settlement, the area is divided into a shallow unconfined up to semi-confined aquifer system with a thickness of 30–40 m and an underlying confined aquifer system hosted in Plio-Pleistocene clayey sediments.
In general, the aquifers located in the upper geological formations of the study area are unconfined up to semi-confined, and become less permeable in an SE or E direction, with increasing distance from the Kosynthos river and decreasing distance to Vistonida Lake. Hydraulic conductivity (K) within a zone varies (up to 180 cm depth) from 10⁻⁴ to 10⁻⁶ m/s (Figure 2), while outside the zone, the hydraulic conductivity decreases down to 10⁻⁹ m/s (Pliakas et al. 2015); transmissivity (T) ranges from 10⁻⁴ to 10⁻² m²/s and storage coefficient (S) ranges between 10⁻² and 10⁻⁴ (Sakkas et al. 1998; Pliakas et al. 2003).
Monthly precipitation (P) and monthly temperature data (T0) are derived from the meteorological station of Genisea Agricultural Research, which was functioning until 2000. After 2000, the meteorological station of the Hydraulic Works Section of the Civil Engineering Department (Democritus University of Thrace, Greece) was installed at the same place. Drought indices are calculated based on monthly precipitation and temperature data available for the time period from 1996–1997 to 2015–2016 (in hydrological years, i.e. October–September). Then, the values of WT, for the time period from 1998–1999 to 2006–2007, are related to the values of drought indices of the common time period. The WT values have been recorded by two WT loggers installed near Nea Kessani settlement (altitude 10 m) and Vafeika settlement (altitude 19.5 m), which are considered representative points of the wider area (Pliakas et al. 2015).
MATERIALS AND METHODS
Meteorological drought indices
The normalization of SPI makes the index suitable for both wetter and drier climates and reliable to monitor wet as well as dry periods. Positive SPI values indicate greater than median precipitation, while negative values of index indicate less than median precipitation (Angelidis et al. 2012). Table 1 below presents the characterization of meteorological drought based on SPI values. For instance, extreme droughts (SPI ≤−2) have an event probability of 2.3% (Lloyd-Hughes & Saunders 2002). The same values are utilized in the case of using RDISt (Nalbantis & Tsakiris 2009).
SPI/RDISt values | Characterization of drought conditions | Probability (%) |
---|---|---|
SPI ≥ 2.00 | Extreme wet | 2.3 |
1.50 ≤ SPI ≤ 1.99 | Severely wet | 4.4 |
1.00 ≤ SPI ≤ 1.49 | Moderately wet | 9.2 |
0.00 < SPI ≤ 0.99 | Mildly wet | 34.1 |
−0.99 ≤ SPI ≤ 0.00 | Mild drought | 34.1 |
−1.49 ≤ SPI ≤ −1.00 | Moderate drought | 9.2 |
−1.99 ≤ SPI ≤ −1.50 | Severe drought | 4.4 |
SPI ≤ −2.00 | Extreme drought | 2.3 |
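The classification of Table 1 can be sketched as a simple lookup. This is only an illustrative helper: the function name `classify_drought` is hypothetical, and the classes are assumed to be contiguous (the two-decimal gaps between tabulated bounds, e.g. between 0.99 and 1.00, are assigned to the lower class).

```python
def classify_drought(index_value: float) -> str:
    """Map an SPI or RDISt value to the drought class listed in Table 1.

    Assumption: class boundaries are treated as contiguous thresholds,
    so a value such as 0.995 falls into "Mildly wet".
    """
    if index_value >= 2.0:
        return "Extreme wet"
    if index_value >= 1.5:
        return "Severely wet"
    if index_value >= 1.0:
        return "Moderately wet"
    if index_value > 0.0:
        return "Mildly wet"
    if index_value > -1.0:
        return "Mild drought"      # includes index_value == 0.0, as in Table 1
    if index_value > -1.5:
        return "Moderate drought"  # includes index_value == -1.0
    if index_value > -2.0:
        return "Severe drought"    # includes index_value == -1.5
    return "Extreme drought"       # index_value <= -2.0
```

For example, `classify_drought(-1.2)` falls into the moderate drought class, while a value of exactly 0.0 is assigned to mild drought, following the inequalities of the table.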
Fuzzy linear regression analysis
Fundamentals of fuzzy logic and sets
Fuzzy logic is a logic to circumscribe vagueness. It can be used in various domains in which information is ambiguous and incomplete. The principles of fuzzy logic and sets are to simulate the way of human thinking where truth may appear as an inference from inaccurate or partial knowledge. It is a many-valued logic in which there are not only two truth values, as it happened in Aristotle's logical calculus, and thus, it can be considered a generalization of Boolean logic. In fuzzy set theory, the membership of an element x in a set is gradually assessed, which means that an element x can take all the membership values defined in unit interval [0,1]. In contrast, in classical set theory, an element x either belongs {1} or does not belong {0} to a given set. There are plenty of methodologies based on the principles of fuzzy logic and sets, which can be either autonomous (e.g. a set of fuzzy rules based on the Mamdani approach (e.g. Tayfur & Brocca 2015)) or hybrid where the uncertainty of complex issues can be incorporated through analysis. The fundamentals of fuzzy logic and sets applied in this research are presented briefly in the following text.
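A fuzzy number $\tilde{A}$ defined on $\mathbb{R}$ is a special kind of fuzzy set with the following properties (Klir & Yuan 1995):
$\tilde{A}$ is a normal set, meaning that $\exists\, x \in \mathbb{R}$ such that $\mu_{\tilde{A}}(x) = 1$.
The α-cut $A_{\alpha}$ must be a closed interval $\forall\, \alpha \in (0, 1]$.
The strong zero-cut, $A_{0^{+}}$, which is called the support set of $\tilde{A}$, must be bounded.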
The second property implies that a fuzzy number is a convex set as well. According to Equation (8) above, the strong 0-cut is an open interval and does not contain the boundaries. To obtain a closed interval that contains the boundaries, Hanss (2005) introduced the worst-case interval W, which is the union of the strong 0-cut and the boundaries. The total fuzziness is taken into account when the strong 0-cut is used.
From the theorem of global existence for the maxima and minima of functions with many variables, it is known that if the domain of a real function is closed and bounded and the real function is continuous, then the function will have its absolute minimum and maximum values at some points in the domain (Marsden & Tromba 2003). Based on this theorem, it is evident that the α-cut for any real continuous function with real variables in this domain can be determined, given that the inputs are fuzzy triangular numbers (Tsakiris & Spiliotis 2016; Saridakis et al. 2020).
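As an illustration of these definitions, the following minimal sketch represents a symmetric triangular fuzzy number by its central value and semi-width and returns its membership degree and α-cut. The class and method names are hypothetical; the α-cut for α = 0 is taken here as the closed worst-case interval in the sense of Hanss (2005).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SymmetricTriangularFuzzyNumber:
    """Symmetric triangular fuzzy number: central value and semi-width (>= 0)."""
    center: float
    semi_width: float

    def membership(self, x: float) -> float:
        """Membership degree of x, a value in [0, 1]."""
        if self.semi_width == 0.0:
            # Degenerate (crisp) number: only the center belongs to the set.
            return 1.0 if x == self.center else 0.0
        return max(0.0, 1.0 - abs(x - self.center) / self.semi_width)

    def alpha_cut(self, alpha: float) -> Tuple[float, float]:
        """Closed interval of values whose membership degree is >= alpha.

        For alpha = 0 this returns the worst-case interval, i.e. the
        strong 0-cut together with its boundaries.
        """
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        half = (1.0 - alpha) * self.semi_width
        return (self.center - half, self.center + half)
```

With a hypothetical number of central value 7.0 and semi-width 0.5, the α-cut shrinks linearly from the interval [6.5, 7.5] at α = 0 to the single point 7.0 at α = 1.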
Fuzzy linear regression based on Tanaka's model
The level h denotes that each observation of WT is contained in the support set of the corresponding fuzzy estimate with a membership degree greater than h (Spiliotis et al. 2018b).
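The inclusion idea can be sketched in a few lines. The code below is not the optimization model itself; it only evaluates a fuzzy linear model with symmetric triangular coefficients for crisp inputs and checks whether each observation lies inside the h-cut of its fuzzy estimate. All function names are hypothetical, and the total spread is an assumed, commonly used form of the minimized quantity rather than the exact objective function of this paper.

```python
from typing import List, Sequence, Tuple

Coefficient = Tuple[float, float]  # (central value, semi-width)


def fuzzy_output(coeffs: Sequence[Coefficient], x: Sequence[float]) -> Tuple[float, float]:
    """Central value and semi-width of the fuzzy estimate for crisp inputs x.

    coeffs[0] is the constant term; coeffs[i] multiplies x[i - 1].
    """
    center = coeffs[0][0] + sum(a * xi for (a, _), xi in zip(coeffs[1:], x))
    width = coeffs[0][1] + sum(c * abs(xi) for (_, c), xi in zip(coeffs[1:], x))
    return center, width


def h_cut(center: float, width: float, h: float) -> Tuple[float, float]:
    """Interval of outputs whose membership degree is at least h."""
    half = (1.0 - h) * width
    return center - half, center + half


def satisfies_inclusion(coeffs: Sequence[Coefficient],
                        xs: Sequence[Sequence[float]],
                        ys: Sequence[float],
                        h: float = 0.0) -> bool:
    """True if every observation lies inside the h-cut of its fuzzy estimate."""
    for x, y in zip(xs, ys):
        lo, hi = h_cut(*fuzzy_output(coeffs, x), h)
        if not lo <= y <= hi:
            return False
    return True


def total_spread(coeffs: Sequence[Coefficient], xs: Sequence[Sequence[float]]) -> float:
    """Sum of the semi-widths of all fuzzy estimates (assumed form of the spread)."""
    return sum(fuzzy_output(coeffs, x)[1] for x in xs)
```

As a usage note, a hypothetical observation of 5.3 paired with hypothetical index values (−1.0, −1.5) can be tested against any candidate set of fuzzy coefficients: a larger h narrows the admissible interval, so an observation accepted at h = 0 may be rejected at a higher h.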
Checking the suitability of the fuzzy linear regression model
It is proved (Ubale & Sananse 2016; Papadopoulos et al. 2019) that the above measure G takes into account both the Euclidean distance between the central value and the left-right boundaries, and the (double) Euclidean distance between the central value and the observation for all points in time. It is highlighted that in the fuzzy least-squares model suggested by Diamond (1988), the measure G is used as the function to be minimized. Diamond applies his model when both the independent variables of the fuzzy regression and the dependent ones are fuzzy numbers. Tzimopoulos et al. (2016b), using STFNs, among others apply Diamond's model in the case of a pair of experimental data, which means that both independent and dependent variables are crisp numbers. The research of Tzimopoulos et al. (2016b) shows that the possibilistic model of Tanaka works better than Diamond's model in the case of crisp experimental data, while when a pair of fuzzy data is used, the fuzzy least-squares model works very well. It is mentioned that in all cases, the output of the fuzzy linear regression models is a fuzzy number. Papadopoulos et al. (2019) use G as an objective function in the fuzzy linear regression model of Tanaka (1987), thus additionally taking into account the inclusion constraints.
The closer the fuzzified suitability measure is to unity, the better the fuzzy linear regression model. Since this measure is a fuzzy number, it can be described by its left-right boundaries and its central value.
It can be proved that the right boundary of the fuzzified measure will always be equal to unity (for α ≤ h) when the possibilistic model of Tanaka (1987) is used. Because of the inclusion constraints, each observation will be included in the support of the corresponding fuzzy estimate for a selected h-level.
Hence, the result of each algebraic operation of Equation (19) concerning the includes, for α ≤ h, the corresponding . The right boundary of is determined by the maximum value (Equation (20)), which can be obtained when Equation (19) is performed. The maximum value of Equation (19) is the unit, since , which is always possible for α ≤ h, as aforementioned, according to the inclusion constraints of the Tanaka model. It is noted that Papadopoulos et al. (2019) use a similar suitability measure by taking into account the Euclidean distance between the observed value and the values of the produced fuzzy output (left-right boundaries and central value) in the case of strong zero-cut, and thus, the measure is a crisp number.
Another difference between the fuzzified suitability measure and R² is the use of the sample median in the denominator of the former. In the current study, the authors chose the sample median because the sample of discrete values is small. Besides, the use of the mean presupposes an assumption of a probability distribution, and thus, such an assumption is avoided. In addition, the sample median seems to describe the center of the observations better than the sample mean. This is because of the relatively high variance of the historical sample due to the occurrence of extreme values (WT values during drought). Hence, the sample mean, being affected by the extreme values, is slightly lower than the sample median. As presented in the following section ‘Results of the applied methodology’, the use of the sample median slightly improves the values of the fuzzified measure.
Alternatively, multiple fuzzy linear regression, based on SPI, will produce another fuzzy output, and hence, another fuzzy measure will be estimated. To select the most suitable fuzzy regression model, a comparison (in terms of smaller or greater) between these two fuzzy measures (fuzzy numbers) has to be carried out. Therefore, a computationally efficient method of comparing fuzzy numbers (comparison measure S) (Nguyen 2017) is presented in Supplementary Material, Appendix B. The measure S is also used in order to compare the fuzzified measures produced by using either the sample median or the sample mean. This measure is also used in the research of Papadopoulos et al. (2021) in order to develop other fuzzified suitability measures.
RESULTS OF THE APPLIED METHODOLOGY
This study proposes a multiple fuzzy linear regression-based methodology for relating RDISt and SPI values to the WT in order to assess the groundwater response during or after drought periods in the aquifer system of the case study. After the estimation of RDISt and SPI, a drought period was indicated during the hydrological years from 1999–2000 to 2001–2002. Analysis focuses on the groundwater behavior during and after drought. The WT values, chosen as dependent variables for the fuzzy linear regression model, correspond to the hydrological years from 1998–1999 to 2006–2007 (i.e. m = 9). It is noted that there is no WT record for the hydrological year 2003–2004 from the WT logger of Nea Kessani, and hence, this year is not taken into account in the fuzzy linear regression for this area (i.e. m = 8). The steps of the proposed methodology are the following:
1. Calculation of monthly PET through the Thornthwaite (1948) method based on the historical sample.
2. Performance of the Kolmogorov–Smirnov test for checking the goodness of fit of the ratio P/PET and of P under the assumption of a log-normal distribution and a Gamma distribution, respectively.
3. Based on Equations (1)–(6), SPI and RDISt are calculated for the reference periods k = 1, 2, 3 and 4 (RDISt is additionally calculated for 24 months).
4. Multiple fuzzy linear regressions based on Tanaka's model (Equation (12)) are applied between the RDISt/SPI values and the observed WT values both for Nea Kessani and for Vafeika.
5. Calculation of suitability measure G based on Equation (18) and of the fuzzified suitability measure based on Equations (19) and (20).
6. Based on comparison measure S (Supplementary Material, Appendix B), the fuzzified suitability measures, which correspond to the two fuzzy linear regressions obtained with RDISt and SPI, respectively, are compared.
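The index calculation of step 3 can be sketched for RDISt as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the commonly cited standardized form of RDI (the logarithm of the ratio of accumulated P to accumulated PET, standardized over the years, consistent with the log-normal assumption of step 2), it assumes that each year is given as 12 monthly values ordered from October to September, and it assumes that the reference period k covers the first 3k months of the hydrological year. The function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def rdi_initial(p_monthly: Sequence[float], pet_monthly: Sequence[float], k: int) -> float:
    """Initial RDI: accumulated P over accumulated PET for the first 3k months.

    Monthly values are assumed to be ordered October..September.
    """
    months = 3 * k
    return sum(p_monthly[:months]) / sum(pet_monthly[:months])


def rdi_standardized(p_by_year: Sequence[Sequence[float]],
                     pet_by_year: Sequence[Sequence[float]],
                     k: int) -> List[float]:
    """Standardized RDI of every hydrological year for reference period k.

    Assumes ln(P/PET) is approximately normal; a period with zero
    accumulated precipitation cannot be handled by this sketch.
    """
    y = [math.log(rdi_initial(p, pet, k)) for p, pet in zip(p_by_year, pet_by_year)]
    y_mean = mean(y)
    y_std = stdev(y)  # sample standard deviation (an assumption of this sketch)
    return [(yi - y_mean) / y_std for yi in y]
```

Calling `rdi_standardized(p_by_year, pet_by_year, k=4)` would give one annual value per hydrological year, while `k=1` would give the value of the October–December period; negative values indicate conditions drier than the long-term central tendency.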
The results of Equations (21) and (22), including the suitability measures (G and the fuzzified measure) and the comparison measure (S) for the areas of Nea Kessani and Vafeika, are presented in Tables 2 and 3, respectively.
Table 2 (Nea Kessani)
Drought index | A0 | A1 | A2 | Objective function J | Suitability measure G | Fuzzy measure D, left boundary | Fuzzy measure D, central value | Fuzzy measure D, right boundary | Comparison measure S |
---|---|---|---|---|---|---|---|---|---|
RDISt | 6.9421 ± 0.1455 | 0.0935 ± 0.0935 | 1.1312 ± 0.1773 | 3.2700 | 5.1627 | 0.6974 | 0.9335 | 1.0000 | 0.9046 |
SPI | 6.9903 ± 0.2646 | 0.2246 ± 0.1151 | 1.0754 ± 0.0770 | 3.5037 | 5.7808 | 0.6536 | 0.9209 | 1.0000 | 0.8890 |
Table 3 (Vafeika)
Drought index | A0 | A1 | A2 | Objective function J | Suitability measure G | Fuzzy measure D, left boundary | Fuzzy measure D, central value | Fuzzy measure D, right boundary | Comparison measure S |
---|---|---|---|---|---|---|---|---|---|
RDISt | 14.5670 ± 0.2486 | 0.2820 ± 0.2820 | 0.5642 ± 0.0000 | 4.4985 | 9.3845 | 0.1674 | 0.8114 | 1.0000 | 0.7345 |
SPI | 14.5876 ± 0.2707 | 0.3510 ± 0.3268 | 0.5552 ± 0.0000 | 4.8502 | 10.5409 | 0.0654 | 0.7939 | 1.0000 | 0.7053 |
In Tables 2 and 3, fuzzy regression coefficient A1 (central value ± semi-width) corresponds to the RDISt and the SPI of the reference period k = 1 (October–December) of the examined hydrological year i, respectively. Likewise, fuzzy coefficient A2 corresponds to the annual RDISt and the annual SPI of the previous hydrological year (i − 1), while fuzzy coefficient A0 corresponds to the constant term. The three values reported for the fuzzy measure D denote its left boundary, central value and right boundary, respectively. As aforementioned, comparison measure S is used in order to compare two fuzzy numbers (Supplementary Material, Appendix B). The last column of Tables 2 and 3 presents the S-value of the fuzzy measure of each multiple fuzzy linear regression (Equations (21) and (22)). The higher S-value denotes the greater fuzzy measure and, therefore, the more suitable fuzzy linear regression.
It is worth noting that if the constraints of Equation (17) were not established, then the semi-width of the coefficient A1 (which is related to the drought index of the reference period k = 1) would have a value slightly greater than the central value, which indicates overtraining behavior, since it leads to irrational monotonicity.
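The effect of such a constraint can be illustrated with a short check. The exact form of Equation (17) is not reproduced here; the sketch only assumes that the constraint keeps the semi-width of a coefficient from exceeding the absolute value of its central value, so that the support of the coefficient does not contain values of both signs. The function names are hypothetical.

```python
from typing import Tuple


def support(center: float, semi_width: float) -> Tuple[float, float]:
    """Closed support [center - semi_width, center + semi_width] of a coefficient."""
    return center - semi_width, center + semi_width


def sign_is_unambiguous(center: float, semi_width: float) -> bool:
    """True if the support does not contain both negative and positive values.

    When this holds, the fuzzy coefficient implies the same direction of
    influence (increase or decrease of WT with the index) over its whole support.
    """
    lo, hi = support(center, semi_width)
    return lo >= 0.0 or hi <= 0.0
```

For a coefficient whose semi-width equals its central value, the support starts exactly at zero and the check passes; a hypothetical semi-width slightly larger than the central value would make the support straddle zero, so the same drought index could be read as both raising and lowering the water table.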
The results of Equations (21) and (22), regarding Nea Kessani and Vafeika, are presented in both two and three dimensions. For illustration purposes, 2D graphs (Figures 4(a)–4(d) and 5(a)–5(d)) represent the fuzzy relationship between the dependent variable and independent variables separately. The 2D figures show clearly that all the primary WT data are included in the produced fuzzy band, while conventional regression is also depicted. 3D graphs (Figures 4(e), 4(f) and 5(e), 5(f)) illustrate the left-right boundaries (red-colored layers) and central value (blue-colored layer) of the fuzzy outputs.
The results of Tables 2 and 3 show that the fuzzy regression coefficients of the annual (k = 4) drought indices of the previous hydrological year (i − 1) are significantly higher than those of the drought indices for the reference period k = 1 (October–December) of the examined hydrological year i, although the latter are not negligible. In addition, objective function J and suitability measure G indicate that the multiple fuzzy linear regression based on RDISt values as independent variables performs better than the corresponding one based on SPI values, since both take their lowest value (bold values) for Nea Kessani as well as for Vafeika. Based on the fuzzified suitability measure, the fuzzy linear regression model that uses RDISt values as independent variables is once again more accurate, since this measure takes its highest value in both areas. This is shown by comparison measure S, the highest values of which (bold values) imply the greater fuzzified measure. The comparison between the two fuzzified measures for Nea Kessani (Table 2) is shown in Figure 6(a). It is stated that in the case of the area of Nea Kessani, the coefficient of determination regarding crisp linear regression between WT–RDISt (Equation (21)) is equal to , and the adjusted coefficient of determination is equal to . As far as the area of Vafeika is concerned, the corresponding values are and .
The value of the fuzzified measure is improved when, in its denominator, the sample median rather than the sample mean is used. Table 4 presents the resulting values of the fuzzified measure with the use of the sample mean, while their S-values are also presented. For space-saving reasons, this comparison (between the S-values of Table 4 and the corresponding ones (bold values) of Tables 2 and 3) is depicted only in the case of fuzzy regression between WT–RDISt with regard to WT data of the area of Nea Kessani (Figure 6(b)).
Table 4 (fuzzy measure D with the use of sample mean)
Fuzzy linear regression (area) | Fuzzy measure D, left boundary | Fuzzy measure D, central value | Fuzzy measure D, right boundary | S-value |
---|---|---|---|---|
Nea Kessani | 0.6891 | 0.9317 | 1.0000 | 0.9020 |
Vafeika | 0.1068 | 0.7977 | 1.0000 | 0.7151 |
DISCUSSION POINTS
The authors select the use of fuzzy regression of Tanaka (1987) instead of conventional regression methodologies due to lack of data on the one hand and in order to incorporate the type of uncertainty caused by vagueness, and not by randomness, on the other hand. Tanaka's model is characterized by the inclusion constraints where all the observed data must be included in the produced fuzzy band. Hence, each fuzzy output can get intervals where one membership degree corresponds to each of them. The spread of this fuzzy band is minimized by the minimization of the objective function J.
WT data of two groundwater-level loggers (Pliakas et al. 2015) are related to RDISt and SPI values, which are calculated for reference periods of 3 and 12 months. Summarizing the results of the methodology's implementation, the following key points can be formulated:
Calculation of RDISt and SPI, based on monthly precipitation and temperature data for the hydrological years from 1996–1997 to 2015–2016, indicates a period of consecutive drought years from 1998–1999 to 2001–2002. Therefore, the data include some drought years.
The fuzzy regression coefficient of the annual RDISt of the previous hydrological year takes the highest values in all cases, implying a significant influence of the meteorological drought of the previous hydrological year on the WT values of January. However, the influence of the first trimester of the examined hydrological year i is not negligible. These results may be justified by the geology of the area. According to Pliakas et al. (2015), the upper unconfined to semi-confined aquifers are located in an alluvial plain consisting of formations of fine sediments, which means that the natural recharge of groundwater takes place at a slower pace. Pliakas et al. (2015), through the use of statistical techniques, found a time lag ranging from 2 to 4 months in the groundwater-level response to precipitation. The influence of drought, as a creeping phenomenon, is more cumulative, and it is related to the conditions of the system being below its standard condition.
In both areas (Nea Kessani and Vafeika), RDISt values as independent variables are better related to WT records than SPI values, since the fuzziness of the multiple linear regression (values of the objective function J and suitability measure G) decreases and the fuzzified suitability measure increases (according to its S-value) with respect to the corresponding ones produced based on SPI (Tables 2 and 3). Therefore, these results are in accordance with the assumption that RDISt is a suitable drought index for a semi-arid climate, in which significant variability of temperature occurs within the hydrological year and precipitation is below the potential evapotranspiration.
According to the objective function J and the suitability measures (Tables 2 and 3), the multiple fuzzy linear regressions perform better in the area of Nea Kessani than in Vafeika. This can also be justified by the geology of the study area, since, according to Pliakas et al. (2015), the alluvial composition and thickness of the upper formations of the plain vary significantly. In particular, the area of Nea Kessani consists of a clayey sand formation of low permeability, while the geological formations become more permeable when approaching the Kosynthos River in an N-NW direction.
The use of the sample median instead of the sample mean in the denominator of the fuzzified suitability measure is preferred for this application, in which discrete variables are used and the sample size is small. Thus, the assumption that the historical sample of WT follows a probability distribution is avoided. Furthermore, the sample median is considered more representative of the center of the data set, since the observations of WT show high variability due to extreme values. The D-values are improved when the sample median is used (Figure 6(b)).
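The effect of the reference value in the denominator can be sketched with a crisp analogue. The snippet below assumes a measure of the form 1 − SSE/SST, as in the ordinary coefficient of determination; it is not the fuzzified measure of this paper, and the water-table values are hypothetical. Because the sum of squared deviations is smallest about the sample mean, taking it about the median cannot make the denominator smaller, so a measure of this form cannot decrease.

```python
from statistics import mean, median
from typing import Sequence


def r2_like(observed: Sequence[float], fitted: Sequence[float], centre: float) -> float:
    """1 - SSE/SST, with the total variation SST taken about `centre`."""
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - centre) ** 2 for y in observed)
    return 1.0 - sse / sst


# Hypothetical water-table values with one extreme year; not data from the study.
wt = [4.0, 4.2, 4.1, 4.3, 7.5]
fit = [4.1, 4.1, 4.2, 4.4, 7.0]

r2_mean = r2_like(wt, fit, mean(wt))      # reference value: sample mean
r2_median = r2_like(wt, fit, median(wt))  # reference value: sample median

print(r2_mean, r2_median)
```

In this example the extreme value pulls the mean away from the bulk of the data, which is the situation in which the two reference values differ most.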
The right boundary of the proposed measure of suitability is equal to one for every α ≤ h, where h is the selected level of the fuzzy linear regression model. This holds in the case of Tanaka's approach (Tanaka 1987) because of inclusion constraints and by taking into account the theorem of global existence for the maxima and minima of functions.
As pointed out, the case study takes place at an agricultural plain region, and therefore, every year there is a pumping volume that cannot be considered stable given the changes in conditions. This aspect could be taken into account in a future investigation.
CONCLUSIONS
The proposed research uses a fuzzy linear regression methodology based on Tanaka's model in order to produce a fuzzy linear relationship between meteorological drought and WT of an aquifer system located in an alluvial plain at the southeast of the city of Xanthi, Prefecture of Xanthi, NE Greece.
Drought analysis is carried out with the aid of the RDISt and SPI drought indices, which indicate a period of four successive drought years. WT data of two areas (Nea Kessani and Vafeika) are utilized to study the groundwater response, during and after the drought period, by relating the independent variables (i.e. RDISt and SPI values) to a dependent variable (i.e. WT). The results highlight the significant influence of the meteorological drought of the previous hydrological year on the WT value of January of the examined hydrological year. The geology of the study area, comprising various sections of fine sediments, seems to have a key role in this long-term response, especially in the area of lower permeability (i.e. Nea Kessani), for which the multiple fuzzy linear regression is improved. Therefore, the produced fuzzy linear relationship between WT and meteorological drought, which is best achieved for the WT values of January, concerns only the local climate and local hydrogeological conditions. A modified fuzzy regression model, motivated by the physical problem, is proposed in this article. Specifically, the central value of each independent-variable coefficient must be greater than its semi-width. Since the problem reduces to a linear programming problem, these constraints can be easily added.
To evaluate the solution, several suitability measures are used. In addition, a fuzzified measure similar to the coefficient of determination, R², is proposed in this article. The properties of the new measure were discussed accordingly.
By applying the proposed fuzzy regression, based on the aforementioned suitability measures, the RDISt-based model seems to be more suitable for drought analysis of local conditions in a semi-arid climate, since the use of RDISt as an independent variable improves the multiple fuzzy linear regression model. In addition, the fuzzy regression coefficients of the annual (k = 4) drought indices of the previous hydrological year (i − 1) are significantly higher than those of the drought indices for the reference period k = 1 (October–November–December) of the examined hydrological year i. Furthermore, the uncertainty increases in the case of extremely dry and extremely wet years.
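For orientation, the linear programming problem mentioned here can be written in a commonly used form of Tanaka's model with symmetric triangular fuzzy coefficients. The notation is an assumption and may differ from the numbered equations of the paper: a_j denotes the central values and c_j the semi-widths of the coefficients, x_{ij} the drought-index values, y_i the observed WT, m the number of observations, n the number of independent variables, and h the selected level. The last line expresses the additional constraint described in this section, written as a non-strict inequality so that the problem remains a linear program.

```latex
\begin{aligned}
\min_{a,\,c}\quad & J = \sum_{i=1}^{m}\Bigl(c_0 + \sum_{j=1}^{n} c_j\,|x_{ij}|\Bigr) \\
\text{s.t.}\quad
& a_0 + \sum_{j=1}^{n} a_j x_{ij} + (1-h)\Bigl(c_0 + \sum_{j=1}^{n} c_j\,|x_{ij}|\Bigr) \ge y_i, \\
& a_0 + \sum_{j=1}^{n} a_j x_{ij} - (1-h)\Bigl(c_0 + \sum_{j=1}^{n} c_j\,|x_{ij}|\Bigr) \le y_i,
  \qquad i = 1,\dots,m, \\
& c_j \ge 0, \qquad j = 0,\dots,n, \\
& a_j \ge c_j, \qquad j = 1,\dots,n.
\end{aligned}
```

Both the objective and all constraints are linear in the unknowns a_j and c_j, so the added condition contributes only n further rows to the constraint matrix.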
CONFLICTS OF INTEREST
The authors declare no conflict of interest.
FUNDING
This research is co-financed by Greece and the European Union (European Social Fund – ESF) through the Operational Programme ‘Human Resources Development, Education and Lifelong Learning’ in the context of the project ‘Strengthening Human Resources Research Potential via Doctorate Research’ (MIS-5000432), implemented by the State Scholarships Foundation (ΙΚΥ).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.