Groundwater quality assessments become easier and more effective when the number of parameters included in water quality indices is reduced. A total of 20 data sets for Quaternary loose rock pore water and Tertiary clastic rock cranny pore water from Jilin City, China, were used as basic data, and 10 water quality parameters were selected for reduction using rough set theory together with a statistical analysis of groundwater quality. Results showed that the quality of confined water was better than that of phreatic water in the study area. Confined water was of good quality and met the permissible limits of the Quality Standards for Groundwater of China, with the exception of NH4+ and F. For phreatic water, five parameters (total dissolved solids, NH4+, NO2, Fe, and F) exceeded the permissible limits, with NH4+ and Fe having exceedance rates of 70% and 40%, respectively. Water quality evaluations before and after attribute reduction were consistent, which suggests that rough set theory eliminated redundant parameters from the indices and reduced the complexity of the calculation while preserving the accuracy of water quality classification. Rough set theory provides a convenient and appropriate way to manage large amounts of water quality data.

INTRODUCTION

Water is essential for life, and the safety of drinking water is directly related to human health (Viala 2008). Water quality monitoring and assessment are important components of groundwater development and utilization (Sañudo-Fontaneda et al. 2014). Currently used methods of groundwater quality assessment include the fuzzy theory method (Dahiya et al. 2007; Pathak & Hiratsuka 2011), neural networks (Faruk 2010; Maiti et al. 2013), the support vector machine method (Safavi & Esmikhani 2013; Khader & McKee 2014), principal component analysis (Pathak & Limaye 2011), and factor analysis (Pathak & Limaye 2011). Because of the large number of water quality indicators and the diverse sources of pollution, any of these methods has difficulty processing the large amount of water quality monitoring data. When all indicators and pollution sources are considered, the identification of water quality can become both inefficient and ineffective. Therefore, methods that quickly and effectively streamline the data, while maintaining the accuracy of water quality assessment results, have become the focus of many environmental studies (Beenen et al. 2011; Talalaj 2014). Attribute reduction and dimensionality reduction are established approaches to this problem, and rough set theory is one of the effective methods for data reduction (Pawlak 1997).

Rough set theory was first proposed by the Polish scientist Zdzisław Pawlak in 1982. It is a mathematical tool for handling incomplete and uncertain data (Deng et al. 2008). The theory uses algebraic equivalence relations and set operations (Wu & Gou 2011) to define the information system (IS) and the discernibility matrix; after attribute reduction, decision rules are derived and knowledge classification is achieved. Because of its capacity to solve fuzzy multi-classification problems and its distinctive view of data analysis, it has acquired a high profile (Pai & Lee 2010; Sakai et al. 2012). The theory has been successfully used in data feature selection, feature extraction (Wang et al. 2012a), decision support and analysis (Kim et al. 2011; Yi et al. 2012), machine learning (Sakai et al. 2012), and data mining (Zhang et al. 2014), but there have been fewer applications in water resource quality assessments (Pai & Lee 2010; Li et al. 2012; An et al. 2014). Rational selection of quality indices for use in groundwater quality assessment is especially critical, and it is therefore necessary to introduce rough set theory for the attribute reduction of assessment indices. Groundwater sources of drinking water are limited in Jilin City, and industrial waste, garbage, pesticides, and fertilizer pollution pose potential threats to water quality (Wu et al. 2012). It is therefore necessary to evaluate the quality of groundwater in the study area. The methods currently used to assess groundwater quality for Jilin City include gray correlation analysis (Liu et al. 2013), hierarchical analysis (Liu et al. 2013), fuzzy comprehensive evaluation based on GIS (Fang et al. 2011), and the single index method (Wang et al. 2012b). These methods require large numbers of indices and samples, and the assessment is therefore neither quick nor efficient. The use of rough set theory is therefore proposed for data reduction, so that an effective water quality assessment can be achieved.

MATERIALS AND METHODS

Study area

Jilin City is located in Jilin province, northeast China, and a large number of hydrogeological boreholes have been drilled around its urban area. After considering the distribution of 76 boreholes, 10 representative boreholes were selected for use in this study. The sampling locations are shown in Figure 1. Data for Quaternary loose rock pore water (phreatic water, Q4) and Tertiary clastic rock cranny pore water (confined water, Ns) were available for each borehole, giving 20 data sets in total. Samples were analyzed based on the Water Quality Monitoring Standards of China (GB/T5750-2006). The pH values were measured on site using a portable pH meter, and the other indices were analyzed by inductively coupled plasma mass spectrometry (ICP-MS, Agilent 7500A), atomic absorption spectrometry (PerkinElmer Lambda750), and ultraviolet visible range spectrophotometry (Varian Cary 50) in the Experimental Center of Testing Science. Ten representative parameters were selected for water quality assessment: pH, total hardness (TH), total dissolved solids (TDS), Cl, SO42−, NH4+, NO2, NO3, Fe, and F (Table 1).

Figure 1

Location of the study area and samples.

Table 1

Selected chemical analysis results for groundwater samples

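| Samples | Layer | Layer bottom depths (m) | pH | TH (mg/l) | TDS (mg/l) | Cl (mg/l) | SO42− (mg/l) | NH4+ (mg/l) | NO2 (mg/l) | NO3 (mg/l) | Fe (mg/l) | F (mg/l) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | Q4 | 22.67 | 8.30 | 45.00 | 513.0 | 7.09 | 0.00 | 0.97 | 0.000 | 0.00 | 0.00 | 0.14 |
| A | Ns | 130.2 | 7.78 | 152.5 | 486.0 | 31.9 | 26.4 | 0.12 | 0.006 | 7.00 | 0.03 | 0.14 |
| B | Q4 | 25.09 | 7.40 | 162.5 | 1,203 | 3.55 | 5.00 | 0.02 | 0.000 | 0.08 | 0.11 | 0.32 |
| B | Ns | 110.3 | 8.40 | 37.50 | 492.6 | 14.2 | 25.0 | 0.07 | 0.000 | 0.00 | 0.20 | 1.40 |
| C | Q4 | 29.73 | 8.20 | 70.00 | 222.9 | 1.77 | 2.40 | 0.04 | 0.052 | 0.00 | 0.20 | 0.00 |
| C | Ns | 103.6 | 7.70 | 107.5 | 222.9 | 8.15 | 56.4 | 0.00 | 0.000 | 0.56 | 0.30 | 0.00 |
| D | Q4 | 28.32 | 7.09 | 137.5 | 194.8 | 8.15 | 0.00 | 0.67 | 0.000 | 0.00 | 1.92 | 0.00 |
| D | Ns | 122.2 | 7.79 | 77.50 | 194.8 | 14.2 | 2.40 | 0.03 | 0.000 | 0.00 | 0.15 | 0.00 |
| E | Q4 | 26.00 | 7.50 | 210.0 | 262.0 | 3.55 | 12.0 | 0.67 | 0.000 | 0.00 | 0.50 | 0.20 |
| E | Ns | 151.0 | 7.80 | 245.0 | 334.0 | 1.77 | 7.20 | 0.30 | 0.000 | 0.00 | 0.14 | 0.12 |
| F | Q4 | 24.50 | 7.75 | 267.5 | 397.4 | 3.55 | 4.80 | 0.30 | 0.000 | 0.00 | 0.40 | 0.04 |
| F | Ns | 100.5 | 7.50 | 210.0 | 290.0 | 3.55 | 9.61 | 0.30 | 0.000 | 0.23 | 0.24 | 0.10 |
| G | Q4 | 24.90 | 7.30 | 340.0 | 446.0 | 86.9 | 24.0 | 0.97 | 0.014 | 0.00 | 0.26 | 0.04 |
| G | Ns | 105.1 | 7.60 | 240.0 | 316.1 | 0.00 | 4.80 | 0.10 | 0.000 | 0.00 | 0.00 | 0.30 |
| H | Q4 | 16.40 | 7.45 | 157.5 | 250.0 | 8.86 | 4.80 | 1.45 | 0.000 | 0.00 | 0.00 | 0.02 |
| H | Ns | 100.6 | 7.75 | 145.0 | 325.2 | 8.86 | 7.20 | 0.02 | 0.009 | 0.00 | 0.12 | 0.00 |
| J | Q4 | 20.30 | 8.00 | 162.5 | 666.0 | 144 | 50.8 | 1.40 | 0.004 | 2.35 | 0.00 | 3.00 |
| J | Ns | 91.23 | 7.60 | 150.0 | 500.0 | 145 | 21.6 | 2.33 | 0.009 | 0.00 | 0.10 | 1.60 |
| K | Q4 | 13.60 | 7.00 | 57.50 | 196.7 | 7.09 | 14.4 | 0.00 | 0.000 | 0.00 | 0.54 | 0.50 |
| K | Ns | 83.58 | 7.60 | 55.00 | 262.0 | 7.09 | 16.8 | 0.04 | 0.000 | 0.00 | 0.05 | 0.10 |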

Principle of rough set theory

Rough set theory (Pawlak 1997) uses an IS to represent knowledge, which is usually expressed as a data table (Pawlak et al. 1995; Pawlak & Skowron 2007). The IS is expressed as
$$ IS = (U, A, V, f) \tag{1} $$
where $U$ expresses the universe (a finite non-empty set with $n$ elements, $U = \{x_1, x_2, \ldots, x_n\}$); $A$ expresses the attribute set (a non-empty finite set with $m$ attributes, $A = \{a_1, a_2, \ldots, a_m\}$), and is divided into condition attributes (set $C$) and decision attributes (set $D$), $A = C \cup D$, $C \cap D = \varnothing$ (Swiniarski & Skowron 2003); $V = \bigcup_{a \in A} V_a$, and $V_a$ expresses a non-empty set of values of attribute $a$; $f: U \times A \to V$ expresses an information function that maps an object in $U$ to exactly one value in $V$ for each attribute.

Indiscernibility relation and approximation set

When certain objects in the IS cannot be distinguished accurately due to a lack of knowledge, the relation between them is referred to as an indiscernibility relation, which is in essence an equivalence relation. Let $B \subseteq A$ and $x_i, x_j \in U$; we say that $x_i$ and $x_j$ are indiscernible by the set of attributes $B$ in the IS if $f(x_i, b) = f(x_j, b)$ for every $b \in B$ (Dimitras et al. 1999), which corresponds to an indiscernibility relation denoted as $\mathrm{Ind}(B)$. $[x]_{\mathrm{Ind}(B)}$ represents the B-elementary set containing the object $x$.

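The core concept of rough set theory is the use of an upper and a lower approximation to describe a set under the indiscernibility relation. Let $X \subseteq U$ and $B \subseteq A$. The B-lower approximation of $X$, denoted by $\underline{B}X$, and the B-upper approximation of $X$, denoted by $\overline{B}X$, are defined as (Pawlak & Skowron 2007):
$$ \underline{B}X = \{x \in U : [x]_{\mathrm{Ind}(B)} \subseteq X\} \tag{2} $$
$$ \overline{B}X = \{x \in U : [x]_{\mathrm{Ind}(B)} \cap X \neq \varnothing\} \tag{3} $$

Set $X$ can be defined when $\underline{B}X = \overline{B}X$; otherwise, set $X$ cannot be defined in $U$, and is called a rough set.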
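The two approximations can be illustrated with a minimal Python sketch. The toy table, the sample names `s1` to `s4`, and the helper names `elementary_sets` and `approximations` are hypothetical and serve only to illustrate formulas (2) and (3); they are not taken from the study data.

```python
from collections import defaultdict

def elementary_sets(table, attrs):
    """Group objects that share the same values on every attribute in attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return list(groups.values())

def approximations(table, attrs, target):
    """Return the (lower, upper) approximation of the object set `target`."""
    lower, upper = set(), set()
    for block in elementary_sets(table, attrs):
        if block <= target:      # block lies entirely inside the target set
            lower |= block
        if block & target:       # block overlaps the target set
            upper |= block
    return lower, upper

# Hypothetical toy IS: four samples described by two discretized attributes.
toy = {
    "s1": {"TH": "I", "Fe": "II"},
    "s2": {"TH": "I", "Fe": "II"},
    "s3": {"TH": "II", "Fe": "I"},
    "s4": {"TH": "II", "Fe": "III"},
}
X = {"s1", "s3"}
lower, upper = approximations(toy, ["TH", "Fe"], X)
print(lower, upper)
```

Because `s1` and `s2` are indiscernible on these attributes, the lower and upper approximations of `X` differ, so `X` is a rough set in this toy example.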

Discernibility matrix and discernibility function

For an IS, the cardinal number of the partition induced by the indiscernibility relation for attribute set $A$ in universe $U$ is expressed as $n = \mathrm{card}(U|\mathrm{Ind}(A))$, the discernibility matrix is of the order $n \times n$, and any element in the matrix is represented as follows:
$$ c_{ij} = \{a \in A : f(x_i, a) \neq f(x_j, a)\}, \quad i, j = 1, 2, \ldots, n \tag{4} $$

Obviously, the discernibility matrix is symmetric, so usually only the lower triangular part is considered.
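As a hedged illustration, the lower triangular part of a discernibility matrix can be built in a few lines of Python. The three rows below are hypothetical discretized samples, and `discernibility_matrix` is an illustrative helper name rather than part of the original Matlab program.

```python
def discernibility_matrix(rows, attrs):
    """Lower-triangular discernibility matrix as {(i, j): set of attributes}, i > j."""
    matrix = {}
    for i in range(len(rows)):
        for j in range(i):
            # Keep every attribute on which objects i and j take different values.
            matrix[(i, j)] = {a for a in attrs if rows[i][a] != rows[j][a]}
    return matrix

# Hypothetical discretized samples (class labels as in a QSGC-style ranking).
attrs = ["a1", "a2", "a3"]
rows = [
    {"a1": "I", "a2": "II", "a3": "I"},
    {"a1": "I", "a2": "I", "a3": "II"},
    {"a1": "I", "a2": "II", "a3": "III"},
]
m = discernibility_matrix(rows, attrs)
for pair, entry in sorted(m.items()):
    print(pair, sorted(entry))
```

Only pairs with i > j are stored, which mirrors the convention of listing the lower triangle only.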

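The discernibility function is a Boolean function formed as the Boolean conjunction over the non-empty elements of the discernibility matrix. Let $c_{ij} \neq \varnothing$; if $c_{ij} = \{a_1, a_2, \ldots, a_k\}$, then the Boolean disjunction of its attributes is expressed as $a_1 \vee a_2 \vee \cdots \vee a_k$, denoted by $\bigvee c_{ij}$, and the discernibility function is defined as
$$ f(A) = \bigwedge_{c_{ij} \neq \varnothing} \left( \bigvee c_{ij} \right) \tag{5} $$

An element contributes the value 1 when attribute set $c_{ij} = \varnothing$.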

Core and reduction

Let $a_i \in A$; if $\mathrm{Ind}(A - \{a_i\}) = \mathrm{Ind}(A)$, attribute $a_i$ is redundant for attribute set $A$; otherwise, $a_i$ is necessary in $A$ (Pawlak 1982). Let $B \subseteq A$; when Ind(B) = Ind(A) and, at the same time, B is independent (no attribute in B is redundant), then B is a reduction of A and can be used to describe A. Every attribute in B is therefore necessary, and B retains the indiscernibility relation, thereby reducing data redundancy. There may be multiple reductions, and the collection of all reductions is recorded as red(A).

The collection of all necessary attributes forms a core, denoted by core(A), and core(A) = ∩red(A).
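The relation between the core and the reductions can be sketched in Python under the usual reading that a subset B preserves Ind(A) exactly when it shares at least one attribute with every non-empty discernibility matrix element. The brute-force search, the helper names `core` and `reducts`, and the three toy matrix elements are assumptions for illustration; the search is exponential in the number of attributes and is only practical for small attribute sets.

```python
from itertools import combinations

def core(entries):
    """Attributes that appear alone in some matrix element cannot be dropped."""
    return {next(iter(e)) for e in entries if len(e) == 1}

def reducts(entries, attrs):
    """All minimal attribute subsets that intersect every non-empty element."""
    entries = [e for e in entries if e]
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            candidate = set(subset)
            if any(r <= candidate for r in found):
                continue  # contains a smaller reduction, so it is not minimal
            if all(e & candidate for e in entries):
                found.append(candidate)
    return found

# Hypothetical non-empty matrix elements over three attributes.
entries = [{"a2", "a3"}, {"a3"}, {"a1", "a2"}]
attrs = ["a1", "a2", "a3"]

red = reducts(entries, attrs)
print("reductions:", [sorted(r) for r in red])
print("core:", sorted(core(entries)))
```

In this toy case the intersection of the two reductions equals the core, which is the identity core(A) = ∩red(A).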

RESULTS AND DISCUSSION

Statistical analysis

To determine the basic groundwater quality in the study area, a statistical analysis method (Li et al. 2012) was used to analyze the data as shown in Tables 2 and 3. The zero values were considered to be concentrations below the detection limits, and were used in the calculation of the mean and other coefficients.

Table 2

Statistical analysis results for confined water

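| Indices | Maximum | Minimum | Mean | Detection limits | Std Dev | Coefficient of variability | Permissible limits | Samples beyond limits (numbers) | Samples beyond limits (in %) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pH | 8.4 | 7.5 | 7.8 | 0.01 | 0.249 | 0.032 | 6.5–8.5 | 0 | 0 |
| TH (mg/l) | 245.0 | 37.5 | 142.0 | 1.00 | 73.681 | 0.519 | ≤450 | 0 | 0 |
| TDS (mg/l) | 500.0 | 194.8 | 342.4 | 1.00 | 112.724 | 0.329 | ≤1,000 | 0 | 0 |
| Cl (mg/l) | 145.35 | 0.00 | 23.50 | 0.15 | 43.759 | 1.862 | ≤250 | 0 | 0 |
| SO42− (mg/l) | 56.42 | 2.40 | 17.75 | 0.75 | 16.094 | 0.907 | ≤250 | 0 | 0 |
| NH4+ (mg/l) | 2.33 | 0.00 | 0.33 | 0.02 | 0.712 | 2.145 | ≤0.2 | 1 | 10 |
| NO2 (mg/l) | 0.009 | 0.000 | 0.002 | 0.003 | 0.004 | 1.643 | ≤0.02 | 0 | 0 |
| NO3 (mg/l) | 7.00 | 0.00 | 0.78 | 0.02 | 2.193 | 2.816 | ≤20 | 0 | 0 |
| Fe (mg/l) | 0.30 | 0.00 | 0.13 | 0.03 | 0.095 | 0.712 | ≤0.3 | 0 | 0 |
| F (mg/l) | 1.60 | 0.00 | 0.38 | 0.01 | 0.601 | 1.598 | ≤1 | 2 | 20 |
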
Table 3

Statistical analysis results for phreatic water

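| Indices | Maximum | Minimum | Mean | Detection limits | Std Dev | Coefficient of variability | Permissible limits | Samples beyond limits (numbers) | Samples beyond limits (in %) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pH | 8.3 | 7.0 | 7.6 | 0.01 | 0.449 | 0.059 | 6.5–8.5 | 0 | 0 |
| TH (mg/l) | 340.0 | 45.0 | 161.0 | 1.00 | 93.573 | 0.581 | ≤450 | 0 | 0 |
| TDS (mg/l) | 1,202.6 | 194.8 | 435.1 | 1.00 | 311.590 | 0.716 | ≤1,000 | 1 | 10 |
| Cl (mg/l) | 143.57 | 1.77 | 27.40 | 0.15 | 48.227 | 1.760 | ≤250 | 0 | 0 |
| SO42− (mg/l) | 50.83 | 0.00 | 11.83 | 0.75 | 15.601 | 1.319 | ≤250 | 0 | 0 |
| NH4+ (mg/l) | 1.45 | 0.00 | 0.65 | 0.02 | 0.550 | 0.848 | ≤0.2 | 7 | 70 |
| NO2 (mg/l) | 0.052 | 0.000 | 0.007 | 0.003 | 0.016 | 2.321 | ≤0.02 | 1 | 10 |
| NO3 (mg/l) | 2.35 | 0.00 | 0.24 | 0.02 | 0.742 | 3.050 | ≤20 | 0 | 0 |
| Fe (mg/l) | 1.92 | 0.00 | 0.39 | 0.03 | 0.574 | 1.460 | ≤0.3 | 4 | 40 |
| F (mg/l) | 3.00 | 0.00 | 0.43 | 0.01 | 0.919 | 2.157 | ≤1 | 1 | 10 |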

It can be seen from Table 2 that among all the parameters considered for confined water, the coefficient of variation for NO3 was the largest, followed by NH4+, which indicates that NO3 and NH4+ had a higher degree of dispersion and that their concentrations were non-uniform in the region. Cl, NO2, F, TH, TDS, SO42−, and Fe displayed a medium coefficient of variation; the dispersion of the data was relatively small, with more uniform concentrations throughout the region. The minimum coefficient of variation was observed for pH, and consequently its degree of dispersion was the lowest of all parameters investigated. The results revealed that concentrations of NH4+ and F exceeded the permissible limits set in the Quality Standards for Groundwater of China (QSGC) (GB/T 14848-93), with rates of exceedance of 10% and 20%, respectively. The groundwater in the study area was clearly polluted to some degree. Moreover, the mean concentration of NH4+ was 0.33 mg/l, about 1.6 times the drinking water limit of 0.2 mg/l, and therefore pretreatment is required when the groundwater is used for direct human consumption.
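The summary statistics behind this discussion can be sketched in Python for one parameter. The values are the NH4+ concentrations of the ten confined water (Ns) samples in Table 1; the use of the sample (n − 1) standard deviation is an assumption, chosen because it comes close to the Table 2 entries, and small differences from the tabulated figures are to be expected because the inputs are rounded.

```python
import statistics

# NH4+ concentrations (mg/l) of the ten confined water (Ns) samples, from Table 1.
nh4_confined = [0.12, 0.07, 0.00, 0.03, 0.30, 0.30, 0.10, 0.02, 2.33, 0.04]

mean = statistics.mean(nh4_confined)
std_dev = statistics.stdev(nh4_confined)   # sample standard deviation (n - 1); an assumption
cv = std_dev / mean                        # coefficient of variation = Std Dev / Mean

print(f"mean = {mean:.2f} mg/l, std dev = {std_dev:.3f}, coefficient of variation = {cv:.3f}")
```

The zero value is kept in the calculation, in line with the treatment of values below the detection limit described above.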

For phreatic water, the three largest coefficients of variation were found for NO3, NO2, and F, which all displayed a large degree of dispersion and non-uniform characteristics. TH, TDS, Cl, SO42−, NH4+, and Fe all displayed a medium coefficient of variation; the degree of dispersion of the data was relatively small, with more uniform concentrations throughout the region. As with confined water, pH was the parameter with the lowest coefficient of variation, indicating that the degree of dispersion of pH was the smallest of all parameters investigated and that it was more evenly distributed in the region. The parameters of phreatic water that exceeded the drinkable limit were NH4+, Fe, TDS, NO2, and F. The rates of exceedance of NH4+ and Fe were 70% and 40%, respectively, while TDS, NO2, and F each had a 10% rate of exceedance. The mean concentrations of TDS, NO2, and F were within the permissible limits, which indicates that groundwater mixing could reduce the concentrations of these materials, ensuring that the water was drinkable. For NH4+ and Fe, the mean concentrations were 0.65 and 0.39 mg/l, which were 3.3 and 1.3 times the drinkable limit, respectively. Groundwater mixing could reduce these concentrations slightly but would not bring them within the permissible limits.

The results of the statistical analysis revealed differences between confined water and phreatic water. Overall, the quality of confined water was slightly better than that of phreatic water. Two indices were exceeded in confined water, whereas five were exceeded in phreatic water. Although the concentrations of several materials (TH, TDS, NH4+, etc.) in phreatic water were higher than in confined water, the concentrations of other materials (Cl, SO42−, etc.) were higher in confined water than in phreatic water. It is unreasonable to judge water quality as good or bad purely by using ionic concentrations; a scientific classification method should be employed.

Attribute reduction

The steps involved in rough set attribute reduction can be summarized as follows:

  • (1) Set up an IS using the data sets.

  • (2) Discretize the data based on the QSGC.

  • (3) Build the discernibility matrix.

  • (4) Solve the discernibility function and carry out attribute reduction using Matlab software.

Initially an IS was built to support the water quality assessment. The parameters used in the water indices formed attribute set A in the system (i.e., a1, a2, a3… as pH, TH, TDS, etc.). V was the collection of corresponding values for the parameters. The samples, described by these parameters and their values, formed the universe U of the IS.

Then water quality levels in terms of ion concentrations (Table 1) were ranked according to the QSGC (GB/T 14848-93). Groundwater was classified into five classes from good to poor (I–V) according to the standards (Table 4). Classes I to III indicate good water quality that is drinkable, class IV is poor quality water and is not suitable for human consumption without treatment, and class V is extremely poor quality water. Values were discretized (Tables 5 and 6), for use in constructing the discernibility matrix.
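The discretization step can be sketched in Python as a threshold lookup. The upper bounds are copied from Table 4 for four of the ten parameters; the function name `qsgc_class` and the dictionary layout are illustrative assumptions, and pH is left out because its classes are defined by ranges rather than by a single upper bound.

```python
from bisect import bisect_left

CLASSES = ["I", "II", "III", "IV", "V"]

# Upper bounds of classes I to IV (mg/l) from Table 4; values above the last bound are class V.
QSGC_LIMITS = {
    "TH": [150, 300, 450, 550],
    "TDS": [300, 500, 1000, 2000],
    "NH4": [0.02, 0.02, 0.2, 0.5],
    "Fe": [0.1, 0.2, 0.3, 1.5],
}

def qsgc_class(parameter, value):
    """Return the best class whose upper bound is not exceeded by `value`."""
    # bisect_left finds the first bound that is greater than or equal to the value.
    return CLASSES[bisect_left(QSGC_LIMITS[parameter], value)]

# Confined water sample E from Table 1: TH 245.0, TDS 334.0, NH4+ 0.30, Fe 0.14 mg/l.
sample_e = {"TH": 245.0, "TDS": 334.0, "NH4": 0.30, "Fe": 0.14}
print({p: qsgc_class(p, v) for p, v in sample_e.items()})
```

Where two classes share the same bound, as for NH4+ in classes I and II, the lookup assigns the better class.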

Table 4

Water quality classes based on QSGC

| Indices | Class I | Class II | Class III/Permissible limits | Class IV | Class V |
| --- | --- | --- | --- | --- | --- |
| pH | 6.5–8.5 | 6.5–8.5 | 6.5–8.5 | 5.5–6.5 or 8.5–9.0 | <5.5 or >9.0 |
| TH (mg/l) | ≤150 | ≤300 | ≤450 | ≤550 | >550 |
| TDS (mg/l) | ≤300 | ≤500 | ≤1,000 | ≤2,000 | >2,000 |
| Cl (mg/l) | ≤50 | ≤150 | ≤250 | ≤350 | >350 |
| SO42− (mg/l) | ≤50 | ≤150 | ≤250 | ≤350 | >350 |
| NH4+ (mg/l) | ≤0.02 | ≤0.02 | ≤0.2 | ≤0.5 | >0.5 |
| NO2 (mg/l) | ≤0.001 | ≤0.01 | ≤0.02 | ≤0.1 | >0.1 |
| NO3 (mg/l) | ≤2.0 | ≤5.0 | ≤20 | ≤30 | >30 |
| Fe (mg/l) | ≤0.1 | ≤0.2 | ≤0.3 | ≤1.5 | >1.5 |
| F (mg/l) | ≤1.0 | ≤1.0 | ≤1.0 | ≤2.0 | >2.0 |

Table 5

Discretized IS for confined water according to QSGC classes

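| Samples | pH (a1) | TH (a2) | TDS (a3) | Cl (a4) | SO42− (a5) | NH4+ (a6) | NO2 (a7) | NO3 (a8) | Fe (a9) | F (a10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | I | II | I | I | I | III | II | III | I | I |
| B | I | I | II | I | I | III | I | I | II | IV |
| C | I | I | I | I | II | I | I | I | III | I |
| D | I | I | I | I | I | III | I | I | II | I |
| E | I | II | II | I | I | IV | I | I | II | I |
| F | I | II | I | I | I | IV | I | I | III | I |
| G | I | II | II | I | I | III | I | I | I | I |
| H | I | I | II | I | I | I | II | I | II | I |
| J | I | I | II | II | I | V | II | I | I | IV |
| K | I | I | I | I | I | III | I | I | I | I |
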
Table 6

Discretized IS for phreatic water according to QSGC classes

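| Samples | pH (a1) | TH (a2) | TDS (a3) | Cl (a4) | SO42− (a5) | NH4+ (a6) | NO2 (a7) | NO3 (a8) | Fe (a9) | F (a10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | I | I | II | I | I | V | I | I | I | I |
| B | I | II | IV | I | I | I | I | I | II | I |
| C | I | I | I | I | I | III | IV | I | II | I |
| D | I | I | I | I | I | V | I | I | V | I |
| E | I | II | I | I | I | V | I | I | IV | I |
| F | I | II | II | I | I | IV | I | I | IV | I |
| G | I | III | II | II | I | III | III | I | III | I |
| H | I | II | I | I | I | V | I | I | I | I |
| J | I | II | III | II | II | V | II | II | I | IV |
| K | I | I | I | I | I | I | I | I | IV | I |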

When considering the water quality classification by using the individual parameters as indices of water quality, for confined water, NH4+ was class IV in samples E and F, and F was class IV in samples B and J. Both could be used as drinking water after treatment. NH4+ was class V in sample J, and was not drinkable. In all other samples the parameters were class III or better, and therefore the water was considered suitable for drinking. For phreatic water, TDS indicated class IV in sample B, NH4+ was class IV in sample F, NO2 was class IV in sample C, Fe was class IV in samples E, F, and K, and F was class IV in sample J. All of these samples were drinkable after treatment. NH4+ was class V in samples A, D, E, H, and J, and Fe was class V in sample D, all of which were unsuitable for human consumption.

In the next stage a discernibility matrix of the discretized IS, based on formula (4), was established (Tables 7 and 8). The matrix was a symmetric 10 × 10 matrix, shown with entries only in the lower triangular section. The numbers in each entry identify the attributes whose discretized values differ between the samples in the corresponding row and column; these attributes can be used to distinguish the two samples. The discernibility function was calculated using formula (5) with an attribute reduction program based on Matlab R2011b software. The results of the calculations revealed that {a2, a3, a6, a7, a9} was one of the best reductions for the confined water IS, i.e., TH, TDS, NH4+, NO2, and Fe were the core parameters for confined water quality, while {a2, a3, a4, a6, a7, a9} was one of the best reductions for the phreatic water IS, i.e., TH, TDS, Cl, NH4+, NO2, and Fe were the core parameters for phreatic water quality. The other parameters could be omitted because they represented information that was unnecessary for determining groundwater quality.
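One property of the reported confined water reduction can be checked directly: a reduction has to share at least one attribute with every non-empty element of the discernibility matrix. The Python sketch below transcribes the non-empty elements of Table 7 as printed; the `|` separators between cells and the helper name `parse_cell` are introduced here for illustration only. It tests this covering property and makes no claim about how the original Matlab program selected the reduction.

```python
def parse_cell(cell):
    """Turn a printed element such as '2–4, 6' into the attribute set {2, 3, 4, 6}."""
    attrs = set()
    for part in cell.split(","):
        part = part.strip()
        if "–" in part:                       # en dash marks a range such as 7–10
            low, high = part.split("–")
            attrs.update(range(int(low), int(high) + 1))
        else:
            attrs.add(int(part))
    return attrs

# Non-empty elements of Table 7, one matrix row per line ('|' added as a cell separator).
TABLE_7 = """
2, 3, 7–10
2, 5–9 | 3, 5, 6, 9, 10
2, 7, 8, 9 | 3, 10 | 5, 6, 9
3, 6, 7, 8, 9 | 2, 6, 10 | 2, 3, 5, 6, 9 | 2, 3, 6
6, 7, 8, 9 | 2, 3, 6, 9, 10 | 2, 5, 6 | 2, 6, 9 | 3, 9
3, 7, 8 | 2, 9, 10 | 2, 3, 5, 6, 9 | 2, 3, 9 | 6, 9 | 3, 6, 9
2, 3, 6, 8, 9 | 6, 7, 10 | 3, 5, 7, 9 | 3, 6, 7 | 2, 6, 7 | 2, 3, 6, 7, 9 | 2, 6, 7, 9
2–4, 6, 8, 10 | 4, 6, 7, 9 | 3–7, 9, 10 | 3, 4, 6, 7, 9, 10 | 2, 4, 6, 7, 9, 10 | 2–4, 6, 7, 9, 10 | 2, 4, 6, 7, 10 | 4, 6, 9, 10
2, 7, 8 | 3, 9, 10 | 5, 6, 9 | 2, 3, 6, 9 | 2, 6, 9 | 2, 3 | 3, 6, 7, 9 | 3, 4, 6, 7, 10
"""

cells = [parse_cell(c) for line in TABLE_7.strip().splitlines() for c in line.split("|")]
reduction = {2, 3, 6, 7, 9}                   # TH, TDS, NH4+, NO2, Fe
uncovered = [c for c in cells if not (c & reduction)]
print(f"{len(cells)} elements checked, {len(uncovered)} not covered by the reduction")
```

An empty `uncovered` list means that every pair of samples distinguishable with all 10 parameters remains distinguishable with the five retained ones.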

Table 7

Discernibility matrix for confined water

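| Samples | A | B | C | D | E | F | G | H | J | K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | | | | | | | | | | |
| B | 2, 3, 7–10 | | | | | | | | | |
| C | 2, 5–9 | 3, 5, 6, 9, 10 | | | | | | | | |
| D | 2, 7, 8, 9 | 3, 10 | 5, 6, 9 | | | | | | | |
| E | 3, 6, 7, 8, 9 | 2, 6, 10 | 2, 3, 5, 6, 9 | 2, 3, 6 | | | | | | |
| F | 6, 7, 8, 9 | 2, 3, 6, 9, 10 | 2, 5, 6 | 2, 6, 9 | 3, 9 | | | | | |
| G | 3, 7, 8 | 2, 9, 10 | 2, 3, 5, 6, 9 | 2, 3, 9 | 6, 9 | 3, 6, 9 | | | | |
| H | 2, 3, 6, 8, 9 | 6, 7, 10 | 3, 5, 7, 9 | 3, 6, 7 | 2, 6, 7 | 2, 3, 6, 7, 9 | 2, 6, 7, 9 | | | |
| J | 2–4, 6, 8, 10 | 4, 6, 7, 9 | 3–7, 9, 10 | 3, 4, 6, 7, 9, 10 | 2, 4, 6, 7, 9, 10 | 2–4, 6, 7, 9, 10 | 2, 4, 6, 7, 10 | 4, 6, 9, 10 | | |
| K | 2, 7, 8 | 3, 9, 10 | 5, 6, 9 | 9 | 2, 3, 6, 9 | 2, 6, 9 | 2, 3 | 3, 6, 7, 9 | 3, 4, 6, 7, 10 | |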

Note: the numbers 1, 2, 3…represent a1, a2, a3 ….

Table 8

Discernibility matrix for phreatic water

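| Samples | A | B | C | D | E | F | G | H | J | K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | | | | | | | | | | |
| B | 2, 3, 6, 9 | | | | | | | | | |
| C | 3, 6, 7, 9 | 2, 3, 6, 7 | | | | | | | | |
| D | 3, 9 | 2, 3, 6, 9 | 6, 7, 9 | | | | | | | |
| E | 2, 3, 9 | 3, 6, 9 | 2, 6, 7, 9 | 2, 9 | | | | | | |
| F | 2, 6, 9 | 3, 6, 9 | 2, 3, 6, 7, 9 | 2, 3, 6, 9 | 3, 6 | | | | | |
| G | 2, 4, 6, 7, 9 | 2, 3, 4, 6, 7, 9 | 2, 3, 4, 9 | 2, 3, 4, 6, 7, 9 | 2, 3, 4, 6, 7, 9 | 2, 4, 6, 7, 9 | | | | |
| H | 2, 3 | 3, 6, 9 | 2, 6, 7, 9 | 2, 9 | 9 | 3, 6, 9 | 2, 3, 4, 6, 7, 9 | | | |
| J | 2–5, 7, 8, 10 | 3–10 | 2–10 | 2–5, 7–10 | 3, 4, 5, 7–10 | 3–10 | 2, 3, 5–10 | 3–5, 7, 8, 10 | | |
| K | 3, 6, 9 | 2, 3, 9 | 6, 7, 9 | 6, 9 | 2, 6, 9 | 2, 3, 6, 9 | 2, 3, 4, 6, 7 | 2, 6, 9 | 2–10 | |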

Note: the numbers 1, 2, 3…represent a1, a2, a3 ….

Assessment of water quality

With the core parameters of the groundwater quality indices identified, the improved Nemerow index method (Yang et al. 2012) was used to evaluate groundwater quality both with and without attribute reduction, as shown in Table 9. For both confined and phreatic water, the assessment results with and without attribute reduction were consistent, which indicates that applying rough set theory to reduce the parameters used in the indices does not change the results. After attribute reduction the number of water quality indices was substantially reduced, yet the classification ability was fully retained, which ensured the accuracy of the water quality assessment results while effectively reducing the complexity of the calculations.
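The improved Nemerow formulation of Yang et al. (2012) is not reproduced in the text, so the Python sketch below uses the classical Nemerow index as a stand-in and should not be read as the method behind Table 9. In the classical form, each concentration is divided by its permissible limit and the index combines the maximum and the mean of these ratios. The function name `nemerow_index` is an assumption; the limits are the class III values from Table 4 and the concentrations are those of confined water sample J in Table 1, restricted to the five parameters kept after reduction.

```python
from math import sqrt

def nemerow_index(concentrations, limits):
    """Classical Nemerow index: sqrt((max ratio^2 + mean ratio^2) / 2)."""
    ratios = [concentrations[p] / limits[p] for p in concentrations]
    mean_ratio = sum(ratios) / len(ratios)
    return sqrt((max(ratios) ** 2 + mean_ratio ** 2) / 2)

# Class III permissible limits from Table 4 (mg/l) for the retained confined water parameters.
limits = {"TH": 450, "TDS": 1000, "NH4": 0.2, "NO2": 0.02, "Fe": 0.3}

# Confined water sample J from Table 1 (mg/l).
sample_j = {"TH": 150.0, "TDS": 500.0, "NH4": 2.33, "NO2": 0.009, "Fe": 0.10}

index_j = nemerow_index(sample_j, limits)
print(f"classical Nemerow index for sample J: {index_j:.2f}")
```

Because the maximum ratio enters the index directly, a single strongly exceeded parameter such as NH4+ dominates the result, which is why removing parameters that never approach their limits tends to leave this kind of index largely unaffected.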

Table 9

Assessment results of improved Nemerow index method

Water type, Situation of reduction, Samples A B C D E F G H J K
Confined water With attribute reduction II III II II III III II IV II 
Without attribute reduction II III II II III III II IV II 
Phreatic water With attribute reduction IV III IV III II IV IV IV 
Without attribute reduction IV III IV III II IV IV IV 

Based on the assessment results and the coordinates of each sample, Kriging interpolation was used to map the five quality classes across the study area. The groundwater quality was visualized using ArcGIS 10 software, with the results presented in Figures 2 and 3. It can be seen that 90% of the confined water samples were of good quality (10% class I, 50% class II, and 30% class III) and suitable for human consumption without any treatment. However, sample J was class IV and required pretreatment before drinking. For phreatic water, only 40% of the samples were suitable for drinking (20% class I or II, and 20% class III). Fifty percent of the samples were class IV and required pretreatment before drinking. Sample D was class V and was unsuitable for drinking. As seen from the two figures, the quality of the confined water is superior to that of the phreatic water.

Figure 2

Confined water classification maps.

Figure 3

Phreatic water classification maps.

CONCLUSIONS

By using rough set theory to reduce the number of parameters included in groundwater indices, and to spatially evaluate and analyze confined and phreatic water quality in Jilin City, the following conclusions were obtained.

Numerous chemicals affect groundwater quality, but some are redundant in water quality indices. Rough set theory reduced the number of chemical parameters used in the indices from 10 to five for confined water and from 10 to six for phreatic water. Results with attribute reduction were the same as those without attribute reduction, which suggests that rough set theory eliminated redundant indices and reduced the complexity of the calculation while preserving the accuracy of water quality classification. Rough set theory has obvious advantages in solving multi-classification problems and greatly reduces the amount of unnecessary work undertaken in assessments. Large amounts of data are difficult to evaluate: processing complex information requires considerable manpower and time, burdens neural network and machine simulation calculations, and may even affect the findings. Therefore, it is more convenient and efficient to use rough set theory to reduce the number of parameters included in water quality indices.

Confined water was mainly class II and III, which meets the permissible limits, with a small number of class IV samples that required pretreatment to make the water drinkable. Phreatic water included less class II and III water, whereas class IV water was widely distributed, and some class V water was also identified. This indicates that most phreatic water needed pretreatment before being used as drinking water, and some was unsuitable for drinking.

ACKNOWLEDGEMENTS

This work received financial support from the National Natural Science Funds (41172205) and the Scientific and Technological Project of Jilin Province (20100452). Financial support from the Water Resources Project of Jilin Province (0773-1441GNJL00390) and the Natural Science Funds of Jilin Province (20140101164JC) is also greatly appreciated.

REFERENCES

Beenen A. S., Langeveld J. G., Liefting H. J., Aalderink R. H. & Velthorst H. 2011 An integrated approach for urban water quality assessment. Water Science and Technology 64 (7), 1519–1526.
Dahiya S., Singh B., Gaur S., Garg V. K. & Kushwaha H. S. 2007 Analysis of groundwater quality using fuzzy synthetic evaluation. Journal of Hazardous Materials 147 (3), 938–946.
Deng J., Du Q., Mao Z. & Yao C. 2008 Classification algorithm based on rough set and support vector. Journal of South China University of Technology (Natural Science Edition) 36 (05), 123–127 (in Chinese).
Dimitras A. I., Slowinski R., Susmaga R. & Zopounidis C. 1999 Business failure prediction using rough sets. European Journal of Operational Research 114 (2), 263–280.
Fang C. S., Meng H., Shan Y. S., Dong D. M. & Wang J. 2011 Fuzzy comprehensive evaluation of groundwater quality based on GIS of Jilin province. Journal of Jilin University (Earth Science Edition) 41 (1), 293–297 (in Chinese).
Faruk D. O. 2010 A hybrid neural network and ARIMA model for water quality time series prediction. Engineering Applications of Artificial Intelligence 23 (4), 586–594.
Kim I., Chu Y. Y., Watada J., Wu J. Y. & Pedrycz W. 2011 A DNA-based algorithm for minimizing decision rules: a rough sets approach. IEEE Transactions on NanoBioscience 10 (3), 139–151.
Liu B., Xiao C. L., Tian H. R. & Qiu S. W. 2013 Application of method combining grey relation with analytic hierarchy process for groundwater quality evaluation in Jilin City. Water Saving Irrigation 2013 (1), 26–29 (in Chinese).
Maiti S., Erram V. C., Tiwari R. K., Kulkarni U. D. & Sangpal R. R. 2013 Assessment of groundwater quality: a fusion of geochemical and geophysical information via Bayesian neural networks. Environmental Monitoring and Assessment 185 (4), 3445–3465.
Pai P. F. & Lee F. C. 2010 A rough set based model in water quality analysis. Water Resources Management 24 (11), 2405–2418.
Pathak H. & Limaye S. N. 2011 Study of seasonal variation in groundwater quality of Sagar City (India) by principal component analysis. Journal of Chemistry 8 (4), 2000–2009.
Pawlak Z. 1982 Rough sets. International Journal of Computer & Information Sciences 11 (5), 341–356.
Pawlak Z. 1997 Rough set approach to knowledge-based decision support. European Journal of Operational Research 99 (1), 48–57.
Pawlak Z. & Skowron A. 2007 Rudiments of rough sets. Information Sciences 177 (1), 3–27.
Pawlak Z., Grzymala-Busse J., Slowinski R. & Ziarko W. 1995 Rough sets. Communications of the ACM 38 (11), 88–95.
Sakai H., Wu M., Nakata M. & Slezak D. 2012 Rough sets-based machine learning over non-deterministic data: a brief survey. Advanced Machine Learning Technologies and Applications 322, 3–12.
Sañudo-Fontaneda L. A., Charlesworth S. M., Castro-Fresno D., Andres-Valeri V. C. A. & Rodriguez-Hernandez J. 2014 Water quality and quantity assessment of pervious pavements performance in experimental car park areas. Water Science and Technology 69 (7), 1526–1533.
Swiniarski R. W. & Skowron A. 2003 Rough set methods in feature selection and recognition. Pattern Recognition Letters 24 (6), 833–849.
Wang J., Hedar A. R., Wang S. Y. & Ma J. 2012a Rough set and scatter search metaheuristic based feature selection for credit scoring. Expert Systems with Applications 39 (6), 6123–6128.
Wang X. Z., Zhang H. Z., Jia Z., Bao Y. Z. & Jiang W. H. 2012b Application of single index method and fuzzy comprehensive method in water quality evaluation of Jilin City section of Songhua River. Environmental Science and Management 37 (9), 184–187 (in Chinese).
Wu S. Z. & Gou P. Z. 2011 Attribute reduction algorithm on rough set and information entropy and its application. Computer Engineering 7, 56–58, 61 (in Chinese).
Wu Y. H., Chen S. Y., Zhang T. F. & Kang J. 2012 Investigation and analysis of groundwater drinking water source protection areas in Jilin City. Ground Water 34 (1), 66 (in Chinese).
Yang L. L., Lu W. X., Huang H. & Chu H. B. 2012 Application of improved Nemerow pollution exponential method and fuzzy comprehensive evaluation method used in water quality assessment. Water Resources and Power 6, 41–44 (in Chinese).
Yi W. G., Lu M. Y. & Liu Z. 2012 Variable precision rough set based decision tree classifier. Journal of Intelligent and Fuzzy Systems 23 (2), 61–70.
Zhang J., Li T. & Chen H. 2014 Composite rough sets for dynamic data mining. Information Sciences 257, 81–100.