Abstract
One of the weaknesses of water resources management is the neglect of the nonstructural aspects that involve the most important relationships between water resources and socioeconomic parameters. Particularly, socioeconomic evaluation for different regions is crucial before implementing water resources management policies. To address this issue, 14 countries in the world that have continuous increasing trends of using renewable water per capita (RWPC) during 1998–2017 were used for the estimation of eight socioeconomic parameters associated with four key indicators (i.e., economy, demographics, technology communication, and health sanitation) by using four different data-driven methods, including artificial neural networks, support vector machines (SVMs), gene expression programming (GEP), and wavelet-gene expression programming (WGEP). The performances of the models were evaluated by using correlation coefficient (R), root-mean-square error (RMSE), and mean absolute error (MAE). It was found that the WGEP model had the best performance in estimating all parameters. The mathematical expressions for these socioeconomic parameters were explored and their potential to be expanded in different spatial and temporal dimensions was assessed. The derived equations provide a quantitative means for the future estimation of the socioeconomic parameters in the studied countries.
HIGHLIGHTS
The relationships between water resources and socioeconomic parameters were evaluated.
The mathematical equations of the hydro-socioeconomic parameters were explored.
Different data-driven methods were compared in the estimation of hydro-socioeconomic parameters and the best ones were determined.
INTRODUCTION
One of the main goals of all engineering disciplines is to create more prosperity for communities. Therefore, any decision that departs from the needs and interests of society loses its value. All communities are dependent on water; water is needed for agricultural production, energy generation, health services, and industrial manufacturing. The sustainability of communities depends directly or indirectly on the quantity, quality, reliability, and affordability of water. Water resources and socioeconomic systems are closely interconnected. On the one hand, decisions on water resources can create social challenges; on the other hand, social behaviors can change the status of water systems.
A wide range of human sciences is needed to solve water issues, including economics (Langer 2020), behavioral and perceptual studies, decision-making, social values, community psychology, and politics. In many places, basic human water needs cannot be met. On the other hand, plenty of water is available in some places for human needs and industrial use. Both cases pose challenges for water resources management. Over many decades, efforts have been made to benefit society through effective water resources management.
In the last decade, data-driven soft computing methods such as artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), gene expression programming (GEP), multivariate adaptive regression splines (MARS), M5 tree model, support vector machines (SVMs), random forest (RF), multi-linear regression (MLR), and hybrid wavelet methods have been successfully employed to address both water quality and quantity issues. Shabani et al. (2016) forecasted water demand of the City of Kelowna (CKD), Canada, using intelligent soft computing models and found that the GEP models were more sensitive to data classification, genetic operators, and optimum lag time than other intelligent soft computing models. Based on a review of 43 papers about the applications of the ANN method, Maier & Dandy (2000) concluded that ANNs have been increasingly used for the prediction of water resources. Mohammadrezapour et al. (2019) estimated monthly potential evapotranspiration in an arid region by using the SVM, ANFIS, and GEP models in Sistan and Baluchestan Province, Iran, and indicated that the SVM, GEP, and ANFIS models took the first, second, and third places, respectively, in the estimation of monthly potential evapotranspiration. Roboredo et al. (2016) used an aggregate index of social-environmental sustainability to evaluate the social-environmental quality for a watershed in the southern Amazon. Soil, water, vegetation, socioeconomic, and social organization qualities were considered as indicators in their study. Pande & Sivapalan (2017) examined the human impacts on water resources and found that technology, economy, and trade were closely relevant to water sustainability. Li et al. (2019) demonstrated how socioeconomic development affected water quality in Tai Lake by analyzing population, per capita gross domestic product, and sewage discharge and their relationships with water quality.
Using various data-driven methods, Najafzadeh et al. (2018) estimated scour depth under clear water conditions in rectangular channels. Kisi et al. (2019) modeled the separation (transition) zone using the GEP, MARS, M5T, and DENFIS techniques. Surono et al. (2022) forecasted the air quality by using genetic algorithm-fuzzy k-medoids clustering (GA-FKM) and fuzzy k-medoids clustering particle swarm optimization (FKM-PSO). In addition, ZamanZad-Ghavidel et al. (2021) applied GEP models to 14 countries to determine the appropriate hydro-socioeconomic index (HSEI) for the evaluation of the sustainability of water resource systems. To improve the estimation of socioeconomic parameters for those 14 countries, different data-driven methods are used in this study. The main goal of this study is to determine the best data-driven methods to estimate the socioeconomic parameters for the future since few studies have been conducted to address this issue (Dong et al. 2019; Zhang et al. 2020). Artificial intelligence methods have been widely used in the field of water resources (Bozorg-Haddad et al. 2017) and some efforts have been made to use these methods to address the hydro-socioeconomic issues. Nowadays, it is particularly imperative to understand and determine the complex relationships between water resources and socioeconomic factors/parameters for future water resources management.
METHODOLOGY
Selection of key hydro-socioeconomic indicators and parameters
Interdisciplinary approaches are generally needed for managing water systems. The uncertainties in the status of future water resources and the response of a community to them make management more difficult. Given the interactions between the physical water system and the socioeconomic dynamics, effective water resources management is a complex process. For example, some of the questions that need an interdisciplinary answer are as follows (Sivapalan et al. 2012):
How do social systems relate to water resources systems?
How do water resources decisions affect socioeconomic parameters?
Countries | Average RWPC (cubic meters) | Average GDP (US dollars) | Average PD (people per sq. km) |
---|---|---|---|
Albania | 10.16 | 3,044.45 | 108.56 |
Belarus | 6.00 | 4,330.60 | 47.60 |
Bosnia | 10.14 | 3,587.80 | 72.29 |
Bulgaria | 2.82 | 5,153.35 | 69.54 |
Croatia | 24.16 | 10,801.65 | 78.09 |
Estonia | 9.50 | 12,696.00 | 31.78 |
Georgia | 15.63 | 2,434.45 | 71.16 |
Hungary | 10.37 | 10,879.15 | 111.43 |
Latvia | 16.15 | 10,033.15 | 34.91 |
Lithuania | 7.69 | 10,240.20 | 51.12 |
Poland | 1.59 | 9,811.60 | 124.61 |
Romania | 10.16 | 6,387.40 | 90.90 |
Serbia | 22.13 | 4,359.55 | 83.84 |
Ukraine | 3.75 | 2,262.50 | 80.85 |
Introduction to the selected socioeconomic parameters
In this study, RWPC is considered as an indicator of water resources status (i.e., hydro), while the socioeconomic parameters include GDPC, II, EI, HDI, PD, IU, MR, and RPPW (Figure 1). The distribution of the population in each country varies according to its natural parameters and characteristics. Therefore, access to water resources plays an important role in PD. With knowledge of the water resources of each area, the facilities needed for residents can be estimated. The HDI is an indicator for the social evaluation of a society, which consists of life expectancy, education index, and II. This index is dependent on several main factors, including water resources (Sinha & Sengupta 2019). MR is an index for measuring the number of deaths. One of the causes of disease is the lack of adequate water resources or their pollution, which has a great impact on the health of the people who live in such areas. Keshavarz et al. (2013) highlighted the great impact of water scarcity on the health of people who lived in two villages of Shiraz Province, Iran. The GDP is the total value of all finished goods and services produced in a country over a specific period, indicating the overall economic condition of the country. Another economic index used in this study is the II, which is obtained by dividing the gross national income (GNI) by the population of the country. Both economic indicators depend on the amount of water resources in the area. EI is another economic parameter used in this study, which is also dependent on the water resources. The number of people with IU is important in this regard because, as an educational tool, the internet can have a significant impact on people's awareness of water issues (Aerts et al. 2018). Table 2 lists the abbreviations and units of the selected socioeconomic parameters.
Parameters | Abbreviations | Units |
---|---|---|
Renewable water per capita | RWPC | Cubic meters |
GDP per capita | GDP | US dollars |
Income index | II | Score |
Exports and imports | EI | US dollars |
Human development index | HDI | Score |
Population density | PD | People per sq. km |
Internet users | IU | Percent |
Mortality rate | MR | Deaths per 1,000 live births |
Population served with piped water | RPPW | Percent |
Different data-driven methods
Artificial neural network
Artificial neural networks are computational systems that are inspired by biological neural networks. ANN approaches include three main layers (i.e., input, output, and hidden layers). The Levenberg–Marquardt (LM) algorithm is one of the faster and more reliable back propagation (BP) algorithms. The detailed theory of ANNs can be found in Haykin (1998).
The characteristics of the ANN models can be summarized as follows:
Applied algorithm: The LM algorithm with three layers was applied for training of the ANN estimation models.
Activation functions: The logsig, tansig, and purelin functions were applied at the nodes as needed.
Determination of the neuron number: The trial-and-error method is the best way to determine the optimal number of neurons in the hidden layer of the ANN models (Barzegar et al. 2016). The ANN program code was written in MATLAB in the current study.
Support vector machines
SVMs are powerful data-driven methods introduced by Vapnik (1995). The major advantages of SVMs over ANNs include their improved generalization ability, unique and globally optimal architectures, and the ability to be rapidly trained.
Gene expression programming
The GEP model is based on Darwin's theory of natural selection. The fundamental steps of this model include (1) selecting the terminal dataset; (2) selecting the function set; (3) selecting the indicators of model evaluation; (4) determining the control components; and (5) determining the requirements/criteria to stop the program run. The GEP model has many advantages. One of the most important is that it generates an expression tree and an explicit mathematical formulation, which can be very useful in engineering fields (Ferreira 2006).
The characteristics of the GEP model developed in this study can be summarized as follows:
The functions set (F): Different mathematical functions are applied to compare and evaluate the estimation models:
The terminal set (T): The terminal set includes RWPC. Other characteristic parameters used in the GEP model include number of chromosomes = 30, head length h = 7, and genes per chromosome = 3 (function set defined in GeneXproTools). Additional values were selected to link the sub-trees. In this study, GeneXproTools 4.0 was used to estimate the socioeconomic parameters.
Wavelet analysis
The augmented Dickey–Fuller (ADF) test (Dickey & Fuller 1979) is a unit root test for stationarity of a time series. The null hypothesis is defined as the presence of a unit root (i.e., non-stationary). In general, a p-value less than 0.05 implies that the null hypothesis can be rejected and that the time series is stationary. In this study, the ADF test is first performed on the time series of the existing parameters using the EVIEWS software. EVIEWS supports various types of information criteria. In this study, the Schwarz information criterion (SIC) is used for the ADF test. Moreover, to consider the 20-year data for each country in the macroeconomic series, a 20-year interval is defined for each country (i.e., 1–20 for the first country, 21–40 for the second country, and so on). Therefore, the lag length is 20. For instance, the first year of the second country must be defined for the software analysis as a new start of the data so that the pattern is recognized correctly. Since the main purpose of this study is to predict socioeconomic parameters, the stationarity of data is very important. Thus, stationary time series have been used, as verified by the results of the ADF test. According to the p-values shown in Table 3, all the time series of data used in this study have a p-value less than 0.05.
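The stacking scheme described above (positions 1–20 for the first country, 21–40 for the second, and so on) can be sketched as a simple index mapping. This is only an illustration: the helper name `locate` is hypothetical, and the country order is assumed to follow the alphabetical order of Table 1, which the text does not confirm.

```python
# Assumption: countries are stacked in the order listed in Table 1.
COUNTRIES = [
    "Albania", "Belarus", "Bosnia", "Bulgaria", "Croatia", "Estonia", "Georgia",
    "Hungary", "Latvia", "Lithuania", "Poland", "Romania", "Serbia", "Ukraine",
]
FIRST_YEAR = 1998   # study period is 1998-2017
N_YEARS = 20        # one 20-year block per country


def locate(position):
    """Map a 1-based position in the stacked series to (country, year)."""
    total = len(COUNTRIES) * N_YEARS
    if not 1 <= position <= total:
        raise ValueError(f"position must be in 1..{total}")
    block, offset = divmod(position - 1, N_YEARS)
    return COUNTRIES[block], FIRST_YEAR + offset


print(len(COUNTRIES) * N_YEARS)  # 280 stacked records
print(locate(1))                 # ('Albania', 1998)
print(locate(21))                # ('Belarus', 1998), start of a new block
print(locate(280))               # ('Ukraine', 2017)
```

Positions 1, 21, 41, and so on mark the first year of a new country, which is where the series has to be treated as a fresh start.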
Parameter | p-value |
---|---|
RWPC | <0.001 |
GDP | 0.0018 |
II | 0.0215 |
EI | 0.0179 |
HDI | 0.0038 |
PD | <0.001 |
IU | 0.0228 |
MR | 0.0058 |
RPPW | 0.0186 |
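As a quick check of the decision rule, the tabulated p-values can be compared with the 5% significance level. The sketch below only re-applies the threshold to the published numbers (it does not rerun the ADF test); entries reported as "<0.001" are stored as their upper bound 0.001, which is an assumption made for the comparison.

```python
ALPHA = 0.05  # 5% significance level for rejecting the unit-root null

# p-values transcribed from the table; "<0.001" is stored as 0.001 (upper bound).
P_VALUES = {
    "RWPC": 0.001, "GDP": 0.0018, "II": 0.0215, "EI": 0.0179, "HDI": 0.0038,
    "PD": 0.001, "IU": 0.0228, "MR": 0.0058, "RPPW": 0.0186,
}


def is_stationary(p_value, alpha=ALPHA):
    """Reject the unit-root null (treat the series as stationary) when p < alpha."""
    return p_value < alpha


non_stationary = [name for name, p in P_VALUES.items() if not is_stationary(p)]
largest = max(P_VALUES, key=P_VALUES.get)

print(non_stationary)              # [] -> every series passes at the 5% level
print(largest, P_VALUES[largest])  # IU 0.0228
```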
Assessment of model performance
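The three criteria used throughout (R, RMSE, and MAE) can be computed as in the following minimal sketch. It assumes the usual textbook definitions (Pearson correlation for R); the function names and the sample values are illustrative and are not taken from the study.

```python
import math


def pearson_r(observed, estimated):
    """Correlation coefficient R between observed and estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated))
    return cov / (spread_o * spread_e)


def rmse(observed, estimated):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


# Illustrative values only (not study data).
obs = [0.0, 0.5, 1.0]
est = [0.1, 0.5, 0.9]
print(round(pearson_r(obs, est), 3), round(rmse(obs, est), 3), round(mae(obs, est), 3))
```

Higher R and lower RMSE and MAE indicate a better model, which is how the comparisons in the results are read.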
RESULTS AND DISCUSSION
ANN, SVM, and GEP models
The three data-driven approaches (i.e., ANN, SVM, and GEP) were applied to estimate eight different socioeconomic parameters with consideration of economy, demographics, technology communication, and health sanitation. All the selected output parameters showed significant correlations with RWPC. For example, PD decreased with increasing RWPC because there was no need to concentrate the population in a particular area to use water resources, and the population was spread across different parts of the country. MR had an inverse relationship with RWPC: access to adequate water resources strengthens the agricultural sector and can help reduce many diseases. As a result, increasing the quantity of available water resources could improve people's health and reduce the number of deaths due to water-related diseases. Other selected parameters showed a direct relationship with RWPC. The ANN with the LM algorithm and one hidden layer was applied, and the number of neurons in the hidden layer, ranging from 1 to 10, was determined by trial and error. The numbers of neurons in the hidden layer of the models were 3, 2, 4, 4, 3, 3, 4, and 2 for GDPC, II, EI, HDI, PD, IU, MR, and RPPW, respectively. The activation function of the hidden nodes was the tangent sigmoid for all parameters. The activation function of the output nodes was the tangent sigmoid for HDI and IU, and a linear function for GDPC, II, EI, PD, MR, and RPPW.
Aspects | Parameters | WGEP R | WGEP RMSE | WGEP MAE | ANN R | ANN RMSE | ANN MAE | SVM R | SVM RMSE | SVM MAE | GEP R | GEP RMSE | GEP MAE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Economy | GDPC | 0.857 | 0.188 | 0.133 | 0.706 | 0.259 | 0.195 | 0.739 | 0.249 | 0.192 | 0.763 | 0.24 | 0.177 |
Economy | II | 0.872 | 0.16 | 0.119 | 0.785 | 0.205 | 0.161 | 0.793 | 0.201 | 0.158 | 0.803 | 0.195 | 0.147 |
Economy | EI | 0.876 | 0.167 | 0.108 | 0.677 | 0.259 | 0.191 | 0.709 | 0.243 | 0.175 | 0.745 | 0.227 | 0.156 |
Demographics | HDI | 0.918 | 0.135 | 0.097 | 0.885 | 0.16 | 0.117 | 0.889 | 0.157 | 0.117 | 0.895 | 0.153 | 0.114 |
Demographics | PD | 0.999 | 0.011 | 0.008 | 0.999 | 0.016 | 0.012 | 0.999 | 0.015 | 0.011 | 0.999 | 0.014 | 0.011 |
Technology communication | IU | 0.934 | 0.172 | 0.134 | 0.831 | 0.238 | 0.187 | 0.867 | 0.235 | 0.187 | 0.877 | 0.228 | 0.174 |
Health sanitation | MR | 0.931 | 0.175 | 0.136 | 0.856 | 0.218 | 0.169 | 0.877 | 0.215 | 0.158 | 0.89 | 0.212 | 0.158 |
Health sanitation | RPPW | 0.936 | 0.158 | 0.132 | 0.888 | 0.194 | 0.161 | 0.893 | 0.192 | 0.16 | 0.901 | 0.189 | 0.152 |
The bold values represent the best values for each criterion among the different methods.
Wavelet analysis
The one-dimensional Daubechies-4 (db4) wavelet was used to decompose the data into subseries. The Daubechies-4 wavelet has been applied in many studies (e.g., Barzegar et al. 2016). In the current study, the number of data points is 280, so the wavelet decomposition level is 2. The discrete db4 wavelet decomposed the GDPC, II, EI, HDI, PD, IU, MR, RPPW, and RWPC parameters at level 2. The ranges of the A2, D2, and D1 subseries are shown in Table 5. For example, the values of the low-frequency approximation A2 at level 2 for the economy-related parameters GDPC, II, and EI vary from −0.033 to +1.036, from −0.026 to +1.080, and from −0.041 to +1.058, respectively. The values of the high-frequency details D2 and D1, which contain the signal details, range from −0.206 to +0.152 and from −0.367 to +0.375, respectively, for GDPC.
Parameters | A2 min | A2 max | D2 min | D2 max | D1 min | D1 max |
---|---|---|---|---|---|---|
GDPC | −0.033 | 1.036 | −0.206 | 0.152 | −0.367 | 0.375 |
II | −0.026 | 1.080 | −0.168 | 0.175 | −0.368 | 0.383 |
EI | −0.041 | 1.058 | −0.346 | 0.343 | −0.455 | 0.516 |
HDI | −0.012 | 1.082 | −0.154 | 0.181 | −0.363 | 0.383 |
PD | −0.110 | 1.024 | −0.157 | 0.152 | −0.432 | 0.366 |
IU | −0.059 | 1.086 | −0.163 | 0.186 | −0.349 | 0.353 |
MR | −0.077 | 1.071 | −0.145 | 0.189 | −0.368 | 0.349 |
RPPW | −0.032 | 1.119 | −0.172 | 0.225 | −0.341 | 0.352 |
RWPC | −0.031 | 1.112 | −0.161 | 0.152 | −0.367 | 0.432 |
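The idea of splitting a series into one approximation (A2) and two detail subseries (D2, D1) that add back up to the original can be illustrated with a dependency-free sketch. Two assumptions are made and should be read as such: the level is obtained from the rule of thumb L = int(log10(N)), which reproduces the stated level of 2 for N = 280 but is not named in the text, and the Haar wavelet is swapped in for db4 so that the example needs no external library. The numbers it produces are therefore not comparable with the tabulated db4 ranges.

```python
import math


def decomposition_level(n):
    """Rule of thumb L = int(log10(n)); an assumption, the text gives only the result."""
    return int(math.log10(n))


def block_mean(series, size):
    """Replace each consecutive block of `size` values by the block mean."""
    out = []
    for start in range(0, len(series), size):
        block = series[start:start + size]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_mra_level2(series):
    """Level-2 Haar multiresolution split: series = A2 + D2 + D1 (Haar, not db4)."""
    a1 = block_mean(series, 2)                   # level-1 approximation
    a2 = block_mean(series, 4)                   # level-2 approximation
    d1 = [x - a for x, a in zip(series, a1)]     # finest detail
    d2 = [p - q for p, q in zip(a1, a2)]         # coarser detail
    return a2, d2, d1


# Illustrative values only (not study data).
series = [0.10, 0.30, 0.20, 0.60, 0.50, 0.90, 0.70, 1.00]
a2, d2, d1 = haar_mra_level2(series)
rebuilt = [a + b + c for a, b, c in zip(a2, d2, d1)]

print(decomposition_level(280))  # 2
print(max(abs(r - s) for r, s in zip(rebuilt, series)) < 1e-12)  # True
```

The additive property is what allows a separate model to be fitted to each subseries and the three estimates to be summed afterwards.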
WGEP models
Ranking | ANN | SVM | GEP | WGEP |
---|---|---|---|---|
1 | PD | PD | PD | PD |
2 | HDI | HDI | HDI | HDI |
3 | RPPW | RPPW | RPPW | RPPW |
4 | II | II | II | II |
5 | MR | MR | MR | EI |
6 | IU | IU | EI | IU |
7 | EI, GDPC | EI | IU | MR |
8 | ___ | GDPC | GDPC | GDPC |
Table 7 lists all the mathematical equations used in the models to estimate the socioeconomic parameters. The performances of the models for all selected socioeconomic parameters for the studied countries follow the order WGEP > GEP > SVM > ANN (Table 6). The WGEP model outperformed its simple form (i.e., GEP) for all eight parameters. Compared with the GEP model, the WGEP model improved the performance by 22, 18, 26, 12, 22, 25, 18, and 16% for GDPC, II, EI, HDI, PD, IU, MR, and RPPW, respectively.
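The text does not say which criterion the improvement percentages refer to. Under the assumption that they are relative RMSE reductions, 100 × (RMSE_GEP − RMSE_WGEP) / RMSE_GEP, the three-decimal RMSE values from the model-comparison table reproduce the reported figures to within about one percentage point. The sketch below carries out that check; the function name is illustrative.

```python
# RMSE values transcribed from the model-comparison table (GEP and WGEP columns).
RMSE = {
    "GDPC": {"GEP": 0.240, "WGEP": 0.188},
    "II":   {"GEP": 0.195, "WGEP": 0.160},
    "EI":   {"GEP": 0.227, "WGEP": 0.167},
    "HDI":  {"GEP": 0.153, "WGEP": 0.135},
    "PD":   {"GEP": 0.014, "WGEP": 0.011},
    "IU":   {"GEP": 0.228, "WGEP": 0.172},
    "MR":   {"GEP": 0.212, "WGEP": 0.175},
    "RPPW": {"GEP": 0.189, "WGEP": 0.158},
}

# Improvement percentages as reported in the text.
REPORTED = {"GDPC": 22, "II": 18, "EI": 26, "HDI": 12,
            "PD": 22, "IU": 25, "MR": 18, "RPPW": 16}


def rmse_reduction_percent(gep_rmse, wgep_rmse):
    """Relative RMSE reduction of WGEP with respect to GEP, in percent."""
    return 100.0 * (gep_rmse - wgep_rmse) / gep_rmse


for name, values in RMSE.items():
    computed = rmse_reduction_percent(values["GEP"], values["WGEP"])
    print(f"{name}: computed {computed:.1f}%, reported {REPORTED[name]}%")
```

Small differences (for example PD and MR) are expected because the tabulated RMSE values are themselves rounded.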
Parameters | Subseries | Equations |
---|---|---|
GDPC, II, EI, HDI, PD, IU, MR, RPPW | A2, D2, D1 | [one equation per subseries; equation bodies missing from this copy] |
For all parameters | Final equation | Equation of A2 + Equation of D2 + Equation of D1 |
Hts, GDPC, II, UR, EI, HDI, PD, IU, RPPW, and MR denote renewable water per capita (Hydro), GDP per capita, income index, unemployment rate, exports and imports, human development index, population density, internet users, proportion of rural population served with piped water, and mortality rate (under five years old), respectively.
According to Table 7, various operators have been used to increase the accuracy of the models, and these relations have been applied to quantify the dependence between socioeconomic parameters and water resources. In the GEP models, it is also possible to select simpler mathematical equations to reduce the number of operators, although doing so may reduce the accuracy of the proposed models (Bagatur & Onen 2018).
In addition, the results from this study highlighted the importance of examining the relationships between the status of water resources and socioeconomic parameters. This study indicated that water resources parameters had significant impacts on socioeconomic parameters (Sivapalan et al. 2012). WGEP had the best performance among all the data-driven models used for predicting the socioeconomic parameters in this study. In fact, the socioeconomic conditions of a country can be a good indicator of the status of its water resources, and the two are mutually related, which is very important for decision-making in integrated water resources management.
CONCLUSIONS
Water resources are important in terms of production and social, economic, and environmental values for a country. Socioeconomic considerations are needed to cope with the decreasing trend of water resources in many countries in recent decades and the increasing demand for water resources. The new contributions of this study include the following: (1) To the best of our knowledge, this is the first effort to jointly apply various data-driven methods, including artificial neural networks, SVMs, GEP, and WGEP, for analyses of linked hydrologic and socioeconomic systems. (2) Different socioeconomic parameters, including GDPC, II, EI, HDI, PD, IU, MR, and population served with piped water (RPPW), were estimated by using RWPC as a representative parameter of water resources. (3) The potential to expand the mathematical relationships in different spatial and temporal dimensions was assessed. In this study, the relationship between water resources and socioeconomic parameters was modeled by data-driven methods, and their performances were compared and assessed. This study indicated that the hybrid data-driven models based on the wavelet theory improved the performances of GEP models. It was demonstrated that the WGEP models had the best performance and the ANN models showed the poorest performance. Thus, it is possible to assess the socioeconomic status of a region/country by developing such models before implementing major water projects. The methods developed in this study can significantly improve the related water resources planning and management and also provide useful information for socioeconomic development. The main limitation of this study is the unavailability of data on all socioeconomic parameters that are likely to be strongly correlated with water resources. In future research, other data mining models can be used to characterize the relationship between water resources and socioeconomic parameters. Other socioeconomic parameters that account for different environmental and/or political dimensions, such as the health index, happiness index, and employment rate, can also be studied.
ACKNOWLEDGEMENTS
The authors thank Iran's National Science Foundation (INSF) for its financial support for this research.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.