This study assesses surface water quality ranking by combining two multivariate statistical techniques: Kohonen's self-organizing maps (SOM) and the Hasse diagram technique (HDT). The object of the study is the Mudan River in the Mudanjiang city region, China. Samples were collected on a regular monthly basis in 2007–2011 from all sampling sites along the river and analyzed for six major water quality parameters. Grouping the water quality parameters and clustering the sampling events with SOM pre-processed the data for application of the HDT, which then ordered the resulting clusters of sampling events. The water quality was ranked against norms established by the Ministry of Environmental Protection of the People's Republic of China in order to assess in detail the water quality of the whole river system. The resulting spatial and temporal changes in water quality at each sampling site were mapped with ArcGIS.

INTRODUCTION

Water quality is considered to be a key contributor to both health and the state of disease in humans (Zhao et al. 2012). Surface water quality ranking provides important information for developing efficient water quality management strategies. The assessment of surface water quality ranking is a complex problem which requires intelligent data analysis on various levels – analytical, spatial and temporal (Voyslavov et al. 2013). Most common methods for water quality ranking do not consider the uncertainties associated with water resources systems. In addition, these methods only assign the water quality monitoring stations to classes based on existing water quality, and do not provide any information about the quality of the water within each class (Nikoo & Mahjouri 2013). To assess surface water quality ranking reliably, multivariate statistical treatment of long-term monitoring data sets has been applied (Olkowska et al. 2014). The application of different multivariate statistical techniques, such as self-organizing maps (SOM) and the Hasse diagram technique (HDT), facilitates the interpretation of complex data matrices to better understand the status of the water quality in the water systems being studied (Varol et al. 2012). The SOM, which is based on an unsupervised neural network algorithm, is a powerful pattern analysis and clustering technique and provides excellent visualization capabilities (López García & Machón González 2004). SOM uses nonlinear dimensionality reduction to deal with nonlinearities, noisy or irregular data and highly dispersed data, making it desirable when site-measured data are examined (Kohonen 2001a; Choi et al. 2014). The HDT is a different approach that endows a set with a partial order relation without hiding descriptor effects (Restrepo 2008). It is a multi-criteria evaluation method which can be used as a tool to rank objects (Voigt et al. 2006).

The intention of this work was to integrate the advantages of the SOM and HDT techniques to assess the surface water quality ranking of the Mudan River. SOM is a reliable method of identifying the relationships between the water quality parameters. Its visual output clearly shows the grouping of parameters and the clusters of sampling events. Hasse diagrams analyze ranking relations between objects described by a given number of parameters (Kudłak et al. 2014). HDT assesses the relationships between the sampling events for various clusters and ranks them, revealing the best and worst qualities of the water in the samples, and the inconsistencies among them due to different information content. The resulting spatial and temporal changes in the water quality at each sampling site are mapped with ArcGIS.

MATERIAL AND METHODS

Study area

The Mudan River is situated in southeastern Heilongjiang Province, China. It is the second largest tributary of the Songhua River, with a length of 725 km. Its source is on Moran Hill of Changbai Mountain, in Dunhua County, of Jilin Province. It flows into Lake Jingpo, and then continues north, flowing by Ning'an and Mudanjiang City, and into the Songhua River at Yilan County, Heilongjiang Province. The river has about 700 tributaries, the major ones including Hailang, Wusihun, Zhuerduo and Huangni. The Mudan River has a humid continental climate, with long, bitter winters and warm summers. The average temperatures range from a minimum of −22.6 °C in January to a maximum of 27.9 °C in July. Annual precipitation is 537 mm, most of it falling between May and September.

The area of the Mudan River Basin is surrounded by two ridges, one mountain and one plain: Zhangguangcai Ridge to the west, Laoye Ridge to the south and southeast, Wanda Mountain to the north, and Muxing Plain to the east. The elevation of most of the Mudan River Basin is between 600 and 1000 m, with the highest point at the top of Laotudingzi Hill on Zhangguangcai Ridge, where the elevation is 1686.9 m. The local soil in the basin is dark brown forest soil, which presents a vertical distribution of soil changing with terrain. The land-use types are mainly forest land, cultivated land, and built-up land. Cultivated land is distributed mainly in the middle and upper reaches, whereas built-up land is distributed mainly in the middle reaches, and the rest is mostly forest land.

The Mudan River is the main source of industrial, agricultural and domestic water on both sides of the river. In recent years, with the development of industry and agriculture in Mudanjiang Region, the population has increased. The Mudan River ecological environment suffers from serious industrial and domestic waste water pollution, accelerating the deterioration of river water quality. The main pollutants come from petrochemical, sugar refining, paper-making, metallurgy, machining, rubber and wood processing industries, among others.

The study area is shown in Figure 1.

Figure 1

Location of the sample sites on the Mudan River.

Water samples

The 10 sample sites, which cover a large proportion of the Mudan River in the Mudanjiang Region, constitute the water quality monitoring network (Figure 1). Water samples were collected between 2007 and 2011. A frequency analysis of the water quality was carried out on sampling data collected on a regular monthly basis except in March, April, November and December, when ice floes on the river prevented sampling. The data were divided into three major periods: dry (January and February), normal (May, June and October) and wet (July–September).

Coding of the sampling events for analysis was separated into two parts: sampling site and sampling time. The 10 sampling sites were code-named I to X; sampling times (year and month) were coded as four figures, yymm. Thus 'I1106' refers to the samples collected at the Laogulazi site in June 2011. The sample sites are described in Table 1, and located as shown in Figure 1.
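The coding scheme can be illustrated with a short parser. This is a minimal sketch, not part of the original study: the names `parse_event_code`, `SamplingEvent` and `PERIOD_BY_MONTH` are hypothetical, and the two-digit year is assumed to fall in the 2000s because sampling ran from 2007 to 2011. The period assignment follows the dry, normal and wet months given in the text.

```python
import re
from typing import NamedTuple, Optional

# Site codes I to X, as listed in Table 1.
SITES = ("I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X")

# Period of each sampled month; months without sampling are absent.
PERIOD_BY_MONTH = {1: "dry", 2: "dry",
                   5: "normal", 6: "normal", 10: "normal",
                   7: "wet", 8: "wet", 9: "wet"}


class SamplingEvent(NamedTuple):
    site: str
    year: int
    month: int
    period: Optional[str]


def parse_event_code(code: str) -> SamplingEvent:
    """Split a code such as 'I1106' into site, year, month and period."""
    match = re.fullmatch(r"([IVX]+)(\d{2})(\d{2})", code)
    if match is None or match.group(1) not in SITES:
        raise ValueError(f"not a valid sampling-event code: {code!r}")
    site = match.group(1)
    year = 2000 + int(match.group(2))   # assumption: all years are 20yy
    month = int(match.group(3))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in code: {code!r}")
    return SamplingEvent(site, year, month, PERIOD_BY_MONTH.get(month))
```

For example, `parse_event_code("I1106")` yields site I, year 2011, month 6 and the normal period.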

Table 1

Description of the sample sites

| Code | Name of sample site |
|---|---|
| I | Laogulazi |
| II | Dianshita |
| III | Guoshuchang |
| IV | Hailang |
| V | Jiangbin great bridge |
| VI | Chaihe great bridge |
| VII | Qunli |
| VIII | Sandao |
| IX | Daba |
| X | Hualiangou |

After chemical analysis performed in accordance with standard analytical procedures, six water quality parameters were selected: dissolved oxygen (DO); permanganate index (CODMn); total phosphorus (TP); total nitrogen (TN); hydrogen potential (pH); and ammonia nitrogen (NH3-N). The water quality norms were taken from the environmental quality standards for surface water issued by the Ministry of Environmental Protection of the People's Republic of China (MEPC 2002). Basic statistics and surface water quality norms are listed in Table 2.

Table 2

Basic statistics and surface water quality norms

| Parameter | Units | Min | Max | Mean | SD | Norm, Grade 1 | Norm, Grade 2 | Norm, Grade 3 | Norm, Grade 4 | Norm, Grade 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| pH | (−) | 6.04 | 8.87 | 7.22 | 0.51 | 6–9 | 6–9 | 6–9 | 6–9 | 6–9 |
| DO | mg/L | 2.67 | 14.11 | 8.26 | 1.50 | ≥ 7.5 | ≥ 6 | ≥ 5 | ≥ 3 | ≥ 2 |
| CODMn | mg/L | 3.28 | 13.03 | 5.89 | 1.33 | ≤ 2 | ≤ 4 | ≤ 6 | ≤ 10 | ≤ 15 |
| NH3-N | mg/L | 0.013 | 1.935 | 0.411 | 0.327 | ≤ 0.15 | ≤ 0.5 | ≤ 1 | ≤ 1.5 | ≤ 2 |
| TN | mg/L | 0.13 | 6.65 | 1.01 | 0.63 | ≤ 0.2 | ≤ 0.5 | ≤ 1 | ≤ 1.5 | ≤ 2 |
| TP | mg/L | 0.002 | 0.594 | 0.072 | 0.066 | ≤ 0.02 | ≤ 0.1 | ≤ 0.2 | ≤ 0.3 | ≤ 0.4 |

Self-organizing maps

The SOM is an automatic data analysis method, developed by Teuvo Kohonen in 1982 and sometimes called a Kohonen map (Kohonen 2001a). It was originally applied to image and sound analysis, but is now widely applied to clustering problems and data exploration in industry, finance, natural sciences and linguistics (Kohonen 2013). SOM can be considered as a nonlinear mapping technique without the rigid assumptions of linearity or normality required by traditional statistical techniques. Unlike other neural networks, which are trained using supervised learning, the SOM approach does not need any specific expected output from the learning process (Voyslavov et al. 2012). SOM is a special class of neural network that is trained using unsupervised learning to convert high-dimensional input into a low-dimensional 'discrete map' (Elghazel & Benabdeslem 2014), which consists of components ('nodes') arranged so that the most similar nodes are close together, whereas less similar nodes are further apart. The SOM consists of two layers: the input layer, connected to each vector of the data set, and the output layer, consisting of a two-dimensional network of units (Voyslavov et al. 2012; Russo et al. 2014). Data pass directly from the input layer to the output layer. Each node in the input layer is connected to every node in the output layer by a weight between 0 and 1. Weight values are initialized randomly (Kohonen 2001b). These weights establish a link between the input units and their associated output units (Park et al. 2014).

The SOM algorithm is based on the 'winner-take-all' rule (Voyslavov et al. 2012), which can be described as follows: when an input vector is chosen at random from the data set and presented to the network, the nodes in the output layer compete to be selected as the winning node, which is the node with the minimum Euclidean distance (Equation (1)) between its weight vector and the input vector. According to the sequential learning rules of SOM, the winning node and its neighborhood (the width of the neighborhood function is calculated by Equations (2) and (3)) update their weight vectors. Each node in the winning node's neighborhood has its weights adjusted to become more like the input vector, as defined by Equations (4)–(6). The closer a node is to the winning node, the more its weights are altered (Kohonen 2001b; Buckland 2003). The learning process is continued until a stopping criterion is met; this typically occurs when the weight vectors stabilize or when a particular number of iterations has been completed (Park et al. 2014).
[Equations (1)–(6): formula images not reproduced in this text.]

where V is the current input vector; W is the weight vector of the node; n is the number of weights; σ0 is the radius of the map; λ is the time constant; t is the current time-step; L is a small variable called the learning rate, which decreases with time; and Θ represents the amount of influence that a node's distance from the winning node has on its learning.
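The competitive learning step described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the SOM Toolbox implementation used in the study: it follows the exponential-decay formulation commonly given in tutorials such as Buckland (2003), and the function names, grid size, iteration count, initial learning rate `lr0` and the choice of time constant are all hypothetical.

```python
import numpy as np


def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns weights (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    weights = rng.random((rows, cols, data.shape[1]))  # random init in [0, 1)
    # Grid coordinates of every output node, used for neighborhood distances.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0        # initial neighborhood radius
    lam = n_iter / np.log(sigma0)         # time constant (assumed choice)

    for t in range(n_iter):
        v = data[rng.integers(len(data))]             # random input vector V
        # Winning node: minimum Euclidean distance between W and V.
        dist = np.linalg.norm(weights - v, axis=2)
        winner = np.unravel_index(np.argmin(dist), dist.shape)
        sigma = sigma0 * np.exp(-t / lam)             # shrinking neighborhood
        lr = lr0 * np.exp(-t / lam)                   # decaying learning rate L
        # Gaussian influence (Theta) of grid distance to the winning node.
        d2 = np.sum((grid - np.array(winner)) ** 2, axis=2)
        theta = np.exp(-d2 / (2.0 * sigma ** 2))
        # Move each node toward V in proportion to influence and learning rate.
        weights += (theta * lr)[..., None] * (v - weights)
    return weights


def quantization_error(weights, data):
    """Mean distance from each sample to its best-matching node."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(np.asarray(data, dtype=float)[:, None, :] - flat[None, :, :],
                       axis=2)
    return d.min(axis=1).mean()
```

Because every update is a convex combination of the old weight and the input vector, weights trained on data scaled to [0, 1] stay within [0, 1], which matches the weight range mentioned in the text.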

SOM was used to characterize the spatial distribution patterns of water quality in the sample sites. SOM approximates the probability density function of the input data and can be used for efficient clustering and visualization of high-dimensional data in a two-dimensional lattice (Kohonen 2001a; Park et al. 2014). In the present study, which used SOM to group water quality parameters and cluster sampling events, all SOM calculations were performed using the free SOM Toolbox 2.0 software (SOM-Toolbox 2005).

Data standardized calculation models

The standardization algorithm was carried out in two steps:
  • (1) Assigning a weight to every grade of the surface water quality norms (MEPC 2002) and to every group of water quality parameters using the Delphi method (Linstone & Turoff 1975).

  • (2) Calculating the average values of every cluster and standardizing them, using the product of excess multiple and excess weight as the basic rule for the standardized calculation, with different equations for different parameters. The five grades of water quality norms have the same interval value of 6–9 for the pH parameter; in the standardized calculation, the neutral value of pH (7) was the norm and the weight of Grade 1 was the excess weight. The standardized equation of pH is shown in Equation (7). Unlike the other parameters, a higher concentration of DO indicates better water quality; therefore the reciprocals of the average value and the normal value were used to calculate DO. The standardized expression for DO is given by Equation (8). The standardized equation for the other parameters is Equation (9).

 
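$$ S_{\mathrm{pH}} = \frac{|V - 7|}{7}\, w_1 \tag{7} $$

$$ S_{\mathrm{DO}} = \frac{1/V - 1/N_g}{1/N_g}\, w_g \tag{8} $$

$$ S = \frac{V - N_g}{N_g}\, w_g \tag{9} $$

where S is the standardized value; V is the value of the parameter; N_g is the normal value of the grade of exceeded norm; w_g is the weight of the grade of exceeded norm; and w_1 is the weight of Grade 1. A parameter that meets the Grade 1 norm has S = 0.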
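The rule described above (excess multiple times excess weight) can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, the grade limits are assumed from the cited surface water standard (MEPC 2002), and the "exceeded" grade is taken to be the highest-numbered grade whose limit the value fails to meet. Under these assumptions the sketch reproduces several entries of Table 6 from the cluster means in Table 3 to within rounding.

```python
# Grade 1-5 upper limits in mg/L. ASSUMPTION: values taken from the cited
# surface water standard (MEPC 2002); check them against Table 2 before reuse.
UPPER_LIMITS = {
    "CODMn": (2.0, 4.0, 6.0, 10.0, 15.0),
    "NH3-N": (0.15, 0.5, 1.0, 1.5, 2.0),
    "TN": (0.2, 0.5, 1.0, 1.5, 2.0),
    "TP": (0.02, 0.1, 0.2, 0.3, 0.4),
}
DO_LOWER_LIMITS = (7.5, 6.0, 5.0, 3.0, 2.0)    # DO must be at or above these
GRADE_WEIGHTS = (0.01, 0.04, 0.15, 0.3, 0.5)   # weights listed in Table 4


def standardize_ph(value, neutral=7.0):
    """Relative deviation from neutral pH, weighted by the Grade 1 weight."""
    return abs(value - neutral) / neutral * GRADE_WEIGHTS[0]


def standardize_do(value):
    """DO: higher is better, so the excess is computed on reciprocals."""
    failed = [g for g, limit in enumerate(DO_LOWER_LIMITS) if value < limit]
    if not failed:
        return 0.0                      # meets Grade 1
    g = failed[-1]                      # highest-numbered grade not met
    limit = DO_LOWER_LIMITS[g]
    return (1.0 / value - 1.0 / limit) / (1.0 / limit) * GRADE_WEIGHTS[g]


def standardize(parameter, value):
    """CODMn, NH3-N, TN, TP: relative excess over the exceeded grade limit."""
    limits = UPPER_LIMITS[parameter]
    exceeded = [g for g, limit in enumerate(limits) if value > limit]
    if not exceeded:
        return 0.0                      # meets Grade 1
    g = exceeded[-1]                    # highest-numbered grade exceeded
    return (value - limits[g]) / limits[g] * GRADE_WEIGHTS[g]
```

For instance, the mean DO of cluster C4 (6.93 mg/L) fails only the Grade 1 limit, so `standardize_do(6.93)` gives about 0.00082, the value listed for C4 in Table 6.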

Hasse diagram technique

HDT is named after the German mathematician Helmut Hasse, who popularized this visualization technique, although it was already in use before him. It is an approach based on partially ordered sets (posets) that preserves important elements of the evaluation and decision-making processes (Voigt et al. 2006). A detailed description can be found in several publications (Münzer et al. 1994; Brüggemann & Carlsen 2006); only the relevant concepts are mentioned in this paper.

A Hasse diagram compares pairs of objects x, y belonging to a poset. The elements E (a, b, c, d, e, …) are described by a certain number of attributes q1(x), q2(x), …, qn(x) and q1(y), q2(y), …, qn(y), respectively. The order of objects x, y fulfills the following requirements: if qi(x) ≤ qi(y) for all i = 1, 2, …, n, this is written as x ≤ y (Voigt et al. 2005); then y is located at the top and x at the bottom in a vertical plane and they are connected by a straight line. If qi(x) ≤ qi(y) holds for some but not all i, with qj(x) > qj(y) for at least one j, this is written as x || y; x and y are then said to be incomparable, and they are not connected by a line. This is repeated for every ordered pair – that is, for all pairs of two objects for which the ≤ relationship holds (Brüggemann & Patil 2011). In our applications the circles near the top of the graph indicate objects that are the 'best' objects according to the criteria used to rank them: these objects are not covered by other objects and are called maximal elements, or 'maximals' (Voigt et al. 2005).
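The ordering rule can be made concrete with a small sketch. It is an illustration only, not the DART software: the function names and the four toy objects are hypothetical, and objects with identical attribute vectors (equivalent objects) are simply left unconnected. The sketch computes the product order, keeps only the cover relations that a Hasse diagram draws as lines, and lists the maximal elements.

```python
from itertools import permutations


def leq(x, y):
    """Product order: x <= y when every attribute of x is <= that of y."""
    return all(a <= b for a, b in zip(x, y))


def cover_relations(objects):
    """Return the (lower, upper) pairs drawn as lines in a Hasse diagram."""
    names = list(objects)
    # Strict order: x <= y on all attributes and the vectors differ.
    less = {(a, b) for a, b in permutations(names, 2)
            if leq(objects[a], objects[b]) and objects[a] != objects[b]}
    # Keep a < b only when no third object c lies strictly between them.
    return sorted((a, b) for a, b in less
                  if not any((a, c) in less and (c, b) in less for c in names))


def maximal_elements(objects):
    """Objects that are not below any other object."""
    lower = {a for a, _ in cover_relations(objects)}
    return sorted(name for name in objects if name not in lower)


# Hypothetical objects, each described by two attributes.
toy = {"a": (1, 1), "b": (2, 1), "c": (1, 3), "d": (3, 3)}
print(cover_relations(toy))   # [('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'd')]
print(maximal_elements(toy))  # ['d']
```

In the toy set, b and c are incomparable because b is larger on the first attribute and c on the second, and the pair (a, d) is not drawn because b and c lie between them.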

In the present work, HDT was used to rank the surface water quality for all sampling events. All calculations concerning HDT were performed using the decision analysis software DART (DART 2012).

RESULTS AND DISCUSSION

The data set used for this study consists of 400 objects which included 150 objects (samples) from the normal period, 150 objects from the wet period and 100 objects from the dry period. Each object is described by six variables derived on a seasonal basis (Table 2).

All component planes and the U-matrix of the input data set are shown in Figure 2. Each variable plane is displayed with a color scale, and the relationships between objects are easily read from the distances between nodes in the U-matrix plane. The grouping of the component planes is defined by their position and color scale (Voyslavov et al. 2012). Using the component planes, the water quality parameters are placed in well-defined groups, as confirmed by the component-plane ordering (Figure 3), in which the similarities between planes are readily seen. There are four well-defined groups. The first (G1) contains NH3-N, TP and TN, for which the objects having high values are located in the lower part of the SOM plane. The second group (G2) consists of only one parameter, CODMn, for which the high-value objects are located in the lower left-hand part of the SOM plane. The third group (G3) also contains one parameter, DO, for which the high-value objects are located in the upper right-hand part of the SOM plane; and the fourth group (G4) contains only the water quality parameter pH, for which the high-value objects are located in the upper left-hand part of the SOM plane.

Figure 2

SOM for all sampling points and all quality parameters in 2007–2011.

Figure 3

Component-plane ordering for all quality parameters in 2007–2011.

According to the Davies–Bouldin index, the clusters of all sampling events are formed in every period; these are shown in Figure 4. The interpretation of the combined Figures 2 and 4 is that the sampling events fall into 10 clusters:
  • (1) Cluster 1 (C1) – this cluster comprises 82 sampling events, predominantly in the normal and wet periods for the whole river, from 2007 to 2011. G4 is the primary division rule for this cluster. The characteristics of this cluster are that the river water is alkaline and the values of pH are higher than in any other clusters; the other five parameters have low values.

  • (2) Cluster 2 (C2) – this cluster comprises 44 sampling events: 10 in 2007 at sites I and II in May and X for the whole year; 12 events in the middle or upper reaches of the river in the normal and wet periods of 2009; 17 events for the whole river in the normal and wet periods of 2010; and only five events at sites III, VII, VIII and X in the normal period of 2011. G3 and G4 are the main divisions for this cluster. This cluster exhibits certain phenomena: the river water is alkaline and pH is lower than in C1 but higher than in the other clusters; the concentrations of DO are higher, but lower than in C3; the other four parameters have low values.

  • (3) Cluster 3 (C3) – this cluster comprises 29 sampling events: six events predominantly in the upper reaches of the river in August and at site X in the normal and wet periods of 2008; eight events in the lower reaches of the river in the dry and normal periods of 2009; 14 events in the upper reaches in the dry and wet periods and in the lower reaches in the dry period of 2010; and only one event at site X in June 2011. G3 and G4 are the principal divisions for this cluster, with G3 predominating. The common characteristics of the sampling events in this cluster are: the river water is neutral; the concentration of DO is higher than in any other clusters; and the other four parameters have low values.

  • (4) Cluster 4 (C4) – 47 sampling events, predominantly in the middle or lower reaches of the Mudan River in the normal and wet periods from 2007 to 2011. G3 and G4 are the main divisions for this cluster. The cluster has three characteristics: the river water is neutral; unlike C3, the concentration of DO is lower than in all other clusters; and the other four parameters have low values.

  • (5) Cluster 5 (C5) – 62 sampling events: 29 at all sites except VI in 2007; 16 events across the whole river in 2008; nine events in the upper reaches in the dry and normal periods of 2009; two events at sites VIII and IX in August 2010; and six events in the upper reaches in 2011. G1 is the main division for this cluster. The distinguishing characteristics of this cluster are: the concentrations of NH3-N, TN and TP are lower than in any other clusters; the water is acidic; DO and CODMn have slightly low values.

  • (6) Cluster 6 (C6) – 41 sampling events: six events in the middle reaches in the normal period of 2007; only two events at sites II and III in September 2008; 14 events in the middle and lower reaches in 2009; six events, predominantly in the dry period in the middle and lower reaches, in 2010; and 13 events at all sites except VII in 2011. G2 is the main division for this cluster. This cluster has distinctive features: the concentration of CODMn is lower than in any other clusters; the river water is predominantly neutral; the DO has a slightly high value, but lower than in C3. NH3-N, TN and TP have low values.

  • (7) Cluster 7 (C7) – 29 sampling events: 16 events at all sites except III, IV and VI in the dry and normal periods of 2008; only two events at sites in February and May 2009; three events in the middle reaches in 2010; and eight events in the middle and lower reaches in the dry period of 2010. G1, G3 and G4 are the main divisions for this cluster. The common points in this cluster are: the water has a lower pH (is more acidic) than in all other clusters; as for C2 and C6, the concentration of DO is slightly higher, but lower than in C3; NH3-N, TN and TP are slightly higher, but lower than in C8 and C10; CODMn has low values.

  • (8) Cluster 8 (C8) – 18 sampling events, predominantly in the dry and normal periods in the lower reaches of the river. G1 and G2 are the main divisions for this cluster. The common characteristics of this cluster are: the concentrations of CODMn and TP are higher than in any other clusters; concentrations of NH3-N and TN are slightly higher, but lower than in C10; the river water is neutral; DO has low values.

  • (9) Cluster 9 (C9) – 30 sampling events, predominantly in the middle and lower reaches of the river. As for C8, G1 and G2 are the main divisions for this cluster. The characteristics are: the concentrations of NH3-N, TN and TP are slightly higher, but lower than in C8 and C10; the concentration of CODMn is slightly higher, but lower than in C8; the water is acidic; DO has low values.

  • (10) Cluster 10 (C10) – 18 sampling events, predominantly in the middle and lower reaches of the river. G1 is the main division for this cluster. The common points are: the concentrations of NH3-N and TN are higher than in any other cluster; TP is slightly higher, but lower than in C8; the river water is acidic; the other two parameters have low values.
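The Davies–Bouldin index used to choose the clustering can be sketched in a few lines. This is a generic NumPy illustration under stated assumptions, not the routine from the software used in the study: the function name is hypothetical, cluster scatter is taken as the mean Euclidean distance of members to their centroid, and lower index values indicate more compact, better separated clusters.

```python
import numpy as np


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelled set of points (lower is better)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([points[labels == k].mean(axis=0) for k in ids])
    # Scatter: mean distance of cluster members to their centroid.
    scatter = np.array([
        np.linalg.norm(points[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, centroids)
    ])
    # Pairwise separation between centroids; zero distances (the diagonal)
    # are replaced by infinity so the corresponding ratio becomes 0.
    sep = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    ratios = (scatter[:, None] + scatter[None, :]) / np.where(sep > 0, sep, np.inf)
    np.fill_diagonal(ratios, 0.0)
    # For each cluster take its worst (largest) ratio, then average.
    return ratios.max(axis=1).mean()
```

In practice the index would be evaluated for several candidate numbers of clusters, and the partition with the lowest value retained.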

Figure 4

Clustering of sampling events according to the water quality parameters in 2007–2011.

The basic average values of the water quality parameters of each cluster are listed in Table 3.

Table 3

Average values for the water quality parameters of each cluster

| Cluster | pH | DO (mg/L) | CODMn (mg/L) | NH3-N (mg/L) | TN (mg/L) | TP (mg/L) |
|---|---|---|---|---|---|---|
| C1 | 7.78 | 7.67 | 5.41 | 0.299 | 0.86 | 0.048 |
| C2 | 7.55 | 9.07 | 5.87 | 0.239 | 0.73 | 0.035 |
| C3 | 7.30 | 11.61 | 5.16 | 0.199 | 0.72 | 0.059 |
| C4 | 7.42 | 6.93 | 6.08 | 0.398 | 0.93 | 0.098 |
| C5 | 6.76 | 7.54 | 6.14 | 0.234 | 0.72 | 0.033 |
| C6 | 7.14 | 8.76 | 4.63 | 0.342 | 0.77 | 0.061 |
| C7 | 6.61 | 9.21 | 5.59 | 0.417 | 1.64 | 0.112 |
| C8 | 7.22 | 7.38 | 9.73 | 0.995 | 1.83 | 0.206 |
| C9 | 6.83 | 8.01 | 7.19 | 0.728 | 1.20 | 0.078 |
| C10 | 6.78 | 8.17 | 5.37 | 1.358 | 2.46 | 0.173 |

In accordance with the result of the SOM, the average values for water quality parameters of each cluster were standardized using the data standardization calculation model. The weight of every grade of the water quality norms and all water quality parameters are listed in Tables 4 and 5. The standardized values for the water quality parameters in each cluster are shown in Table 6.

Table 4

Weight value for each grade of water quality norms

| Grade of water quality | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 |
|---|---|---|---|---|---|
| Weight value | 0.01 | 0.04 | 0.15 | 0.3 | 0.5 |

Table 5

Weight value for each water quality parameter

| Group | Weight for group | Parameter | Weight for parameter |
|---|---|---|---|
| G1 | 0.4 | TN | 0.15 |
|  |  | TP | 0.15 |
|  |  | NH3-N | 0.1 |
| G2 | 0.3 | CODMn | 0.3 |
| G3 | 0.2 | DO | 0.2 |
| G4 | 0.1 | pH | 0.1 |

Table 6

Standardized values for the water quality parameters of each cluster

| Cluster | pH | DO | CODMn | NH3-N | TN | TP |
|---|---|---|---|---|---|---|
| C1 | 0.00111 | 0.00000 | 0.01406 | 0.00995 | 0.02905 | 0.01391 |
| C2 | 0.00078 | 0.00000 | 0.01869 | 0.00592 | 0.01820 | 0.00725 |
| C3 | 0.00043 | 0.00000 | 0.01158 | 0.00329 | 0.01774 | 0.01974 |
| C4 | 0.00060 | 0.00082 | 0.00212 | 0.01651 | 0.03408 | 0.03880 |
| C5 | 0.00035 | 0.00000 | 0.00357 | 0.00563 | 0.01739 | 0.00630 |
| C6 | 0.00020 | 0.00000 | 0.00630 | 0.01280 | 0.02194 | 0.02055 |
| C7 | 0.00056 | 0.00000 | 0.01588 | 0.01780 | 0.02714 | 0.00498 |
| C8 | 0.00032 | 0.00016 | 0.09313 | 0.03963 | 0.06509 | 0.00483 |
| C9 | 0.00025 | 0.00000 | 0.02974 | 0.01822 | 0.03070 | 0.02880 |
| C10 | 0.00032 | 0.00000 | 0.01368 | 0.05371 | 0.11613 | 0.02931 |

Water quality was then ranked by HDT using the results of SOM and the standardized data calculation model. The Hasse diagram for the 10 clusters is shown in Figure 5. The diagram has three levels: level 1 has three clusters (C3, C5 and C6) in which the water quality is better than on the other levels; level 2 has four clusters (C1, C2, C7 and C9) of intermediate water quality; and level 3 also has three clusters (C4, C8 and C10) of the worst water quality.

Figure 5

Hasse diagram for the 10 clusters. Equivalent classes with more than one object (C1; C2; C7), (C3; C5; C6).

From the results of SOM and HDT, the resulting map was obtained by ArcGIS (Figure 6), which reflected the water quality level for every site in each month from 2007 to 2011. The interpretation of Figure 6 is as follows:

  • (1) Overall, the best water quality was in the upper reaches nearest the source of the Mudan River, the middle quality was in the lower reaches near the outlet of the river, and the worst water quality was in the middle reaches. The reason for this was that the middle reaches have more industrial enterprises and serious river pollution.

  • (2) For the dry period, the best ranking was in 2010 and the worst in 2008. The main polluted river sections were in the middle and lower reaches, and the influencing factors on the worst water quality were TN, NH3-N and TP in 2008; for the normal period, the water quality was best in 2009 and worst in 2011. The main pollution period occurred in October, and the influencing factors on the worst water quality were DO and CODMn in 2011; for the wet period, the water quality was best in 2007 and worst in 2011. The main pollution period occurred in September, and the influencing factors on the worst water quality were DO, CODMn and TP.

  • (3) In 2007, the lowest ranking of water quality was in January and February of the dry period and the highest ranking was in July of the wet period. In 2008, the best water quality was in September and the worst was in February. In 2009, the best water quality was in January of the dry period and the worst was in the wet period. In 2010, the water quality was best in the dry period, moderate in the normal period and worst in the wet period. In 2011, the best water quality was in June of the normal period and worst in September and October.
    Figure 6

    The ranking of sampling events in every month in 2007–2011.

  • (4) According to the results obtained here, the main type of pollution source changed between 2007 and 2011. In 2007 and 2008 the water quality of the wet period was better than that of the dry period, which indicates that point source pollution was the more serious problem in those years. From 2009 to 2011 the water quality of the dry period was better than that of the wet period, which indicates that remediation of point source pollution had achieved some initial success and that non-point source pollution had become the main pollution source. Non-point source pollution control should therefore now take priority in combating water quality deterioration, especially in the middle–upper reaches of the Mudan River Basin, where cultivated land is concentrated. In this region, the protected forest area should be increased, and the amounts of pesticide and fertilizer applied should be strictly controlled to reduce the quantity of pollutants entering the river through surface runoff. At the same time, point source pollution controls cannot be relaxed.

CONCLUSIONS

Integration of SOM and HDT was successfully applied to assess the ranking of surface water quality along the Mudan River. SOM is a reliable method of identifying the relationships between water quality parameters. Its visual presentation of information clearly showed the grouping of water quality parameters and the clusters of sampling events. Standardization of the water quality parameters in the clusters followed the water quality norms, using a data standardization calculation model. HDT assessed the relationships between the sampling events in the various clusters and ranked them. The resulting spatial and temporal changes in the water quality at each sampling site were mapped with ArcGIS.

ACKNOWLEDGEMENTS

This research has been funded by the Major Science and Technology Program for water pollution control and treatment in China (2009ZX07207-008-5-2) and the State Key Laboratory of Urban Water Resources and Environment (Harbin Institute of Technology) (No. 2013DX09).

REFERENCES

Brüggemann, R. & Patil, G. P. 2011 Partial order and Hasse diagrams. In: Ranking and Prioritization for Multi-indicator Systems. Springer, Heidelberg, Germany, pp. 13–23.
Brüggemann, R. & Carlsen, L. 2006 Introduction to partial order theory exemplified by the evaluation of sampling sites. In: Partial Order in Environmental Sciences and Chemistry, Brüggemann, R. & Carlsen, L. (eds), Springer, Heidelberg, Germany, pp. 61–110.
Buckland, M. 2003 Kohonen's Self Organizing Feature Maps. http://www.ai-junkie.com/ann/som/som1.html.
Choi, B.-Y., Yun, S.-T., Kim, K.-H., Kim, J.-W., Kim, H. M. & Koh, Y.-K. 2014 Hydrogeochemical interpretation of South Korean groundwater monitoring data using self-organizing maps. Journal of Geochemical Exploration 137, 73–84.
DART 2012.
Elghazel, H. & Benabdeslem, K. 2014 Different aspects of clustering the self-organizing maps. Neural Processing Letters 39 (1), 97–114.
Kohonen, T. 2001a The basic SOM. In: Self-Organizing Maps. Springer, Heidelberg, Germany, pp. 105–176.
Kohonen, T. 2001b Software tools for SOM. In: Self-Organizing Maps. Springer, Heidelberg, pp. 311–328.
Kohonen, T. 2013 Essentials of the self-organizing map. Neural Networks 37, 52–65.
Kudłak, B., Tsakovski, S., Simeonov, V., Sagajdakow, A., Wolska, L. & Namieśnik, J. 2014 Ranking of ecotoxicity tests for underground water assessment using the Hasse diagram technique. Chemosphere 95, 17–23.
Linstone, H. A. & Turoff, M. 1975 The Delphi Method: Techniques and Applications. Addison-Wesley, Reading, MA.
López García, H. & Machón González, I. 2004 Self-organizing map and clustering for wastewater treatment monitoring. Engineering Applications of Artificial Intelligence 17 (3), 215–225.
Ministry of Environmental Protection of the People's Republic of China (MEPC) 2002 Quality Standard for Surface Water. Standards Press of China, Beijing.
Nikoo, M. R. & Mahjouri, N. 2013 Water quality zoning using probabilistic support vector machines and self-organizing maps. Water Resources Management 27 (7), 2577–2594.
Olkowska, E., Kudłak, B., Tsakovski, S., Ruman, M., Simeonov, V. & Polkowska, Z. 2014 Assessment of the water quality of Kłodnica River catchment using self-organizing maps. Science of the Total Environment 476–477, 477–484.
Restrepo, G. 2008 Assessment of the Environmental Acceptability of Refrigerants by Discrete Mathematics: Cluster Analysis and Hasse Diagram Technique. Dr. rer. nat. dissertation, Environmental Chemistry and Ecotoxicology, University of Bayreuth, Germany.
Russo, T., Scardi, M. & Cataudella, S. 2014 Applications of self-organizing maps for ecomorphological investigations through early ontogeny of fish. PLoS One 9 (1), e86646.
Voigt, K., Brüggemann, R., Pudenz, S. & Scherb, H. 2005 Environmental contamination with endocrine disruptors and pharmaceuticals: an environmetrical evaluation approach. EnviroInfo Brno 99, 858–862.
Voigt, K., Brüggemann, R. & Pudenz, S. 2006 A multi-criteria evaluation of environmental databases using the Hasse diagram technique (ProRank) software. Environmental Modelling & Software 21 (11), 1587–1597.
Voyslavov, T., Tsakovski, S. & Simeonov, V. 2012 Surface water quality assessment using self-organizing maps and Hasse diagram technique. Chemometrics and Intelligent Laboratory Systems 118, 280–286.
Voyslavov, T., Tsakovski, S. & Simeonov, V. 2013 Hasse diagram technique as a tool for water quality assessment. Analytica Chimica Acta 770, 29–35.
Zhao, Y., Xia, X. H., Yang, Z. F. & Wang, F. 2012 Assessment of water quality in Baiyangdian Lake using multivariate statistical techniques. Procedia Environmental Sciences 13, 1213–1226.