## Abstract

Planting trees on a floodplain along a river is a practical and ecological method for embankment protection. Optimization of wave break forests is also a new concept in wave attenuation studies. In this study, we carried out physical experiments to obtain fundamental data and proposed the Cluster Structure Preserving Based on Dictionary Pair for Unsupervised Feature Weighting model (CDUFW) for multi-objective wave break forest design. The physical experiments considered the effects of different planting configurations on wave attenuation in three scenarios: (1) the equilateral triangle arrangement with different row spacings; (2) different arrangements with the same density; (3) different tree shapes with the same row spacing. The experimental conditions were defined according to field research in the study area. A multi-objective weighting model for wave break forest design optimization was then built on the scheme set of physical experiment outputs using the proposed CDUFW model. The physical experiments showed that different arrangement modes are advantageous for wave attenuation at different forest widths. The CDUFW model performed well in finding an effective, economical and reasonable scheme. The proposed model is well suited to data mining and classification, and can be applied to many decision-making and evaluation fields.

## INTRODUCTION

Many ancient civilizations were born on the banks of rivers. Humans inevitably rely on water resources, but their excessive use also raises concerns. People have been paying more attention to protecting the ecological environment of rivers rather than making unlimited demands on them. Since the 1950s, runoff restoration has received increasing attention and construction of ecological runoff has become an international trend. Concrete dams and artificial revetments, which may lead to the loss of biodiversity and productivity, have been demolished in many countries (Auerbach *et al.* 2014; Wohl *et al.* 2015; Huq *et al.* 2017) and replaced by environmentally friendly ecologically specimen banks (ESBs) (Everaert *et al.* 2013; Küster *et al.* 2015; Movalli *et al.* 2017). Planting wave break forests on the floodplain is a type of ecological embankment which effectively combines wave prevention and ecosystem creation. It is generally believed that wave break forests should be planted in coastal zones. However, more and more studies focus on planting wave break forests on river floodplains, such as the Huaihe River in China (Zhang & Song 2013) and Noordwaard in the Netherlands (Reinout *et al.* 2010). Planting wave break forests on floodplains can not only reduce the damage to levees caused by wind and waves and protect levees from landslides and bank collapse, but also protect soil from erosion, maintain the balance of land and water ecosystems, and improve the environment of levees (Duarte *et al.* 2013; Temmerman *et al.* 2013).

The study of wave break forests has attracted great interest and covers various topics, including the hydrodynamics of vegetated channels (Nepf 2012; Kumar 2014; Kalra *et al.* 2017), the relationship among flow alteration, riparian vegetation and ecosystems (Shafroth *et al.* 2017; Datta 2018), and wave attenuation by vegetation (Knutson *et al.* 1982; Kobayashi *et al.* 1993; Nepf 2012; Silinski *et al.* 2018). Wave attenuation by vegetation has been investigated through both physical experiments and numerical models (Reinout *et al.* 2010; Anderson & Smith 2014; Peruzzo *et al.* 2018). Most of these studies focus on individual characteristics such as vegetation width, vegetation density (i.e. the amount of vegetation per square meter), vegetation flexibility, or water depth. It has been widely accepted that wave attenuation by vegetation is related to plant characteristics (width, density, and stiffness) and hydrodynamic conditions (water depth, wave period, and wave height). Wave heights are attenuated exponentially in vegetated regions, and the fastest reduction occurs in the first few meters, up to about 10 m, of the plant belt (Knutson *et al.* 1982). The effect of unsubmerged plants on wave propagation is greater than that of submerged plants: the flow velocity in the upper part of the water column is very high, and the stem of an unsubmerged plant occupies the whole water column, so it influences the upper water body much more than a submerged plant does (Augustin *et al.* 2009). Higher vegetation density and higher plant stiffness lead to higher wave energy dissipation (Möller *et al.* 2014; Silinski *et al.* 2018).

The configuration of vegetation can also influence the attenuation effect (Maza *et al.* 2015). The equilateral triangle arrangement, which is widely used according to our field investigation of Nenjiang River, has seldom been studied and compared with other configurations. The difference in wave attenuation among these arrangements needs further study. It is necessary to conduct specific studies for different engineering construction areas rather than only studying the mechanism through experiments (Alexander & Allan 2006; Everaert *et al.* 2013). Furthermore, the available land area and funds differ for each project, and there is a conflict among the available investment, the available land area and the wave attenuation effect. Therefore, a specific vegetation optimization scheme should be proposed for each project.

Compared with subjective weighting and the objective weighting method based on entropy theory, some machine learning theories, such as spectral clustering and dictionary learning, may be more suitable for wave break forest scheme optimization. There are few examples of research on the layout of wave break forests; hence, subjective weighting methods, such as expert weighting, may lead to great subjective arbitrariness. Moreover, the method of mutual information (MI) and its variants, as a development of information entropy, have been widely used in many fields (Alfonso *et al.* 2013; Vergara & Estévez 2014; Taormina & Chau 2015). The drawback of methods based on MI theory is that they tend to prefer the attribute which has more possible values (Quinlan 1986) or which has fewer possible values (Quinlan 1993). However, the scheme set to be optimized is an incomplete, artificially designed scheme set, and the numbers of values of the different indicators differ. These methods may therefore cause a large deviation from reality. To circumvent the above problems, we introduced a novel method based on dictionary learning and spectral clustering. Dictionary learning can capture the intrinsic feature space of data by learning an over-complete dictionary (Zhu *et al.* 2016) and spectral clustering has been demonstrated (Zhao *et al.* 2010; Liu *et al.* 2016) to be an effective approach to classify data by exploring the clustering structure. It is widely accepted that the information from different views is relevant and complementary. Clustering structure information and the intrinsic feature space characterize the data from different perspectives. To describe the data more accurately, we explore the complementary information between different views by combining spectral clustering and dictionary learning. Based on this motivation, we proposed the Cluster Structure Preserving Based on Dictionary Pair for Unsupervised Feature Weighting model (CDUFW).

The purpose of this paper is to put forward a method that can be used to optimize the design scheme of a wave break forest under multi-objective conditions. The physical experiments were based on the Baidajie Dike in the Nenjiang River basin of China, and the CDUFW model was applied to the optimization of the wave break forest design for the basin.

## EXPERIMENT SETUP

Physical experiments were established to determine the influence of vegetation configuration and tree shape on wave attenuation. Experiments simulated the situation of waves striking the wave break forest along the transverse direction of the river. Experiment elements such as water depth, wave height, and tree dimensions were all scaled to the experiment channel based on the actual situation of the research area.

### Study area

Baidajie Dike is located on the mainstream of Nenjiang River, the largest tributary of Songhua River. Nenjiang River (45°27′–51°38′ N, 119°52′–126°30′ E) is the boundary river between Heilongjiang Province and both Inner Mongolia Autonomous Region and Jilin Province in north-eastern China. It flows 1,370 km with a drainage area of 2.97 × 10^{5} km^{2}. The precipitation of Nenjiang River is mainly concentrated during June–September, accounting for 82% of the annual precipitation, which leads to a wide water surface, with the water level rising over the floodplain (Li *et al.* 2012). The mainstream of the Nenjiang River forks 10 km upstream of Baidajie Dike and rejoins 10 km downstream of it. Baidajie Dike is on the left bank of the left distributary of Nenjiang River, named Tuoli River, in Tailai County, Qiqihar City. This segment of the river meanders, with wetlands and oxbow lakes on the floodplain, and the average inclination of the slopes is 0.6‰. The flood prevention standard of Baidajie Dike was a 30-year recurrence interval before 2016 and is expected to be raised to a 50-year recurrence interval in the next few years. The sketch of Baidajie Dike is shown in Figure 1 and the hydrologic characteristics and wave features of Baidajie Dike are shown in Table 1. The water depth of the floodplain in Table 1 is the average water depth on the floodplain for the design flood with 2% frequency. The average wave height and the average wave period were calculated by the Putian experimental station formula, which is recommended by Specification JTS145-2-2013 of China. Following this specification, the one-tenth wave height and the significant wave height were then calculated from the average wave height and the average wave period. Poplars and willows are discretely planted on the floodplain along Baidajie Dike. The total length of vegetation along the direction of the river on the floodplain of Baidajie Dike is 2 km and the current vegetation area on the floodplain is 8–14 m wide. According to our field investigation, the spacing between trees is about 2.5 m, in an equilateral triangle configuration.

| Water depth of floodplain (m) | Average wave height (m) | One-tenth wave height (m) | Significant wave height (m) | Average period (s) |
|---|---|---|---|---|
| 2.83 | 0.57 | 1.157 | 0.91 | 3.01 |


### Initial condition validation

*i*th measured wave height value. For irregular wave runs, the significant wave height is calculated from the statistics of the wave heights, as described by the following expression:

$$H_{1/3} = \frac{3}{N}\sum_{j=1}^{N/3} H_j$$

where $H_j$ is the *j*th largest wave height counted by the zero-up crossing method and $N$ is the total number of counted waves.
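The significant wave height calculation can be sketched as follows. This is a minimal illustration, assuming the usual definition of the significant wave height as the mean of the highest one-third of the individual wave heights; the function names and the synthetic record are hypothetical and are not taken from the experiment.

```python
import math


def zero_up_crossing_heights(eta):
    """Split a surface-elevation record at zero up-crossings and
    return the crest-to-trough height of each individual wave."""
    mean = sum(eta) / len(eta)
    z = [e - mean for e in eta]  # remove the mean water level
    # Indices where the signal passes from below zero to zero or above.
    ups = [i for i in range(1, len(z)) if z[i - 1] < 0 <= z[i]]
    heights = []
    for start, end in zip(ups, ups[1:]):
        segment = z[start:end]
        heights.append(max(segment) - min(segment))
    return heights


def significant_wave_height(heights):
    """Mean of the highest one-third of the wave heights."""
    ranked = sorted(heights, reverse=True)
    n_third = max(1, len(ranked) // 3)
    return sum(ranked[:n_third]) / n_third


# Synthetic regular record: amplitude 0.05 m, 20 samples per period.
record = [0.05 * math.sin(2 * math.pi * i / 20) for i in range(100)]
print(significant_wave_height(zero_up_crossing_heights(record)))
```

For a regular wave train every individual height is the same, so the result equals twice the amplitude; for irregular runs the ranking step is what separates the significant wave height from the average wave height.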

### Experiment scheme

After the incident wave condition was validated, the wave attenuation effects of different schemes were studied. Experiments were conducted to investigate separately the effects of row spacing, arrangement and tree shape on the wave attenuation of the wave break forest. Four kinds of simulation tree with different tree heights, trunk radii and crown radii were made based on the actual condition of vegetation on the floodplain of Baidajie Dike, taking into account the changes caused by plant growth (Table 2). Planting costs for each tree shape are also listed in Table 2.

| Tree shape | Trunk height (m) | Crown height (m) | Trunk radius (m) | Crown radius (m) | Planting cost C (CNY/individual plant) |
|---|---|---|---|---|---|
| T1 | 0.26 | 0.35 | 0.025 | 0.25 | 100 |
| T2 | 0.26 | 0.35 | 0.015 | 0.25 | 50 |
| T3 | 0.26 | 0.35 | 0.015 | 0.17 | 50 |
| T4 | 0.16 | 0.08 | 0.007 | 0.13 | 3 |


Experiment scenarios with different row spacings (Figure 2) were carried out with the vegetation width ranging from 1 to 5 m to determine the wave attenuation effect of the equilateral triangle arrangement. Trees of type T3 were used as the experimental tree. If the results of the two runs of an experiment scheme differed by more than 5%, a third run was made and the average of the two closest values was taken as the final value.
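The repeat-run rule can be written as a small helper. This is a sketch under one assumption that the text does not specify: the 5% difference is taken relative to the mean of the two runs. The function name `final_value` and the sample numbers are hypothetical.

```python
def final_value(run1, run2, run3=None, tolerance=0.05):
    """Combine repeated runs of one scheme into a single value.

    Assumption: the difference between two runs is measured
    relative to their mean.
    """
    def rel_diff(a, b):
        return abs(a - b) / ((a + b) / 2)

    # Two runs that agree within the tolerance are simply averaged.
    if rel_diff(run1, run2) <= tolerance:
        return (run1 + run2) / 2
    if run3 is None:
        raise ValueError("runs differ by more than the tolerance; a third run is needed")
    # Otherwise average the two closest of the three values.
    pairs = [(run1, run2), (run1, run3), (run2, run3)]
    a, b = min(pairs, key=lambda p: abs(p[0] - p[1]))
    return (a + b) / 2


print(final_value(0.30, 0.31))        # runs agree, plain average
print(final_value(0.30, 0.34, 0.33))  # third run used, closest pair averaged
```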

Then, experiments with three different vegetation configurations (i.e. equilateral triangle, square and quincunx) were carried out, again using T3 as the experimental tree. The sketch of the different configurations is shown in Figure 3. The spacing between two trees was 25 cm in the equilateral triangle and square configurations and 30 cm in the quincunx configuration. Under these three configurations, 18, 16 and 18 trees, respectively, can be planted in a square of 1 × 1 m. This makes the vegetation density approximately equal under the three arrangements. Because rigid plants are much larger than flexible plants, it is almost impossible to make the density of rigid vegetation in different arrangements completely consistent, as is done in studies of flexible plants. Each configuration was measured with vegetation widths ranging from 1 to 5 m to determine the differences among them. Each scheme was run twice and the average value was taken if the deviation between the two measured values was within 5%. Another run was made if the deviation between the two measured values exceeded 5%.
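The tree counts of 18 and 16 quoted for the equilateral triangle and square configurations match the areal density of an unbounded lattice with 25 cm spacing, expressed per square metre. The sketch below shows that arithmetic; it ignores edge effects of a finite plot, and the function names are hypothetical. The quincunx count is not reproduced because it depends on the exact layout in Figure 3.

```python
import math


def density_square(spacing_m):
    """Trees per square metre on a square lattice: one tree per s x s cell."""
    return 1.0 / spacing_m ** 2


def density_equilateral_triangle(spacing_m):
    """Trees per square metre on an equilateral triangle lattice.

    Each tree owns a rhombic cell of area (sqrt(3) / 2) * s**2.
    """
    return 2.0 / (math.sqrt(3) * spacing_m ** 2)


print(density_square(0.25))                # 16 trees per square metre
print(density_equilateral_triangle(0.25))  # about 18.5 trees per square metre
```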

The influence of the different kinds of simulation tree was then tested under the equilateral triangle configuration with a plant spacing of 25 cm. Forest widths ranging from 1 to 5 m were again considered, and at least two runs were made for each case to ensure the quality of the experiment.

## OPTIMIZATION MODEL BASED ON CDUFW

The physical experiments above provide a certain wave attenuation coefficient for a certain configuration with certain forest width and tree shape in each scheme. On the one hand, these schemes vary in investment and land use. On the other hand, the provided investment and the expected wave attenuation effect are different in each project. So, it is necessary to carry out a multi-objective optimization according to the actual situation. Multi-objective evaluation usually gives weight to different indexes first, and then gives each scheme a score by some weighted calculation method. An adaptive weighting model that learns the importance of data features based on dictionary learning and clustering is introduced in this section. This method can be used to obtain the importance of different factors, so as to select the optimal solution.

### Preliminaries of CDUFW

The samples are partitioned into *k* clusters, and the clustering model is formulated under the framework of matrix factorization using the prevalent relaxation calculation method (Tang & Liu 2012). The model involves a representation matrix and a cluster indicator matrix, in which each entry indicates whether a sample belongs to a cluster. The cluster indicator matrix reveals the discriminative information of data.

**A**.

### Algorithms of CDUFW

Clustering structure can indicate the affiliation relations of samples, and dictionary learning can reflect the data distribution with fewer constraints. The label space and the sample space are different descriptions of the same data set. Coupled dictionary learning, based separately on the samples and on the pseudo-labels generated by spectral clustering, was used to explore the intrinsic data information. In the coupled dictionary learning, the error between the two dictionary learning processes was minimized iteratively. The projection matrix, which reveals the importance of each feature, can then be obtained. On this basis, a novel unsupervised feature weighting algorithm, CDUFW, was proposed.

In the model, *p* is the size of the intrinsic feature space, two representation dictionaries describe the data distribution, and *k* is the number of clusters into which the instances are partitioned. It can be seen that the model developed by Equation (8) consists of two parts: dictionary learning and spectral clustering.

There are four variables involved in Equation (8) to be optimized. An alternating minimization strategy was used to solve this problem. To keep each subproblem of the objective function in Equation (6) convex, only one variable was updated at a time while the others were fixed. The updating process is then iterated until the model converges. The iteration steps are shown in Appendix B (available with the online version of this paper).

The projection matrix **E**, which reveals the importance of each feature, was calculated as follows: where is a small-valued variable to avoid anomalies in the inversion of the matrix and is defined as $10^{-6}$. The weight of each feature can then be defined as: where is the *i*-th row of **E**.
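A minimal sketch of turning the projection matrix into feature weights. It assumes that the weight of feature *i* is the ℓ2-norm of the *i*-th row of **E**, divided by the sum of all row norms so that the weights add up to one. Both the choice of norm and the normalization are assumptions made for illustration, and `feature_weights` is a hypothetical helper name.

```python
import math


def feature_weights(E):
    """Normalized row norms of a projection matrix E (list of rows).

    Assumption: feature importance is the Euclidean norm of the
    corresponding row, rescaled so that all weights sum to one.
    """
    norms = [math.sqrt(sum(v * v for v in row)) for row in E]
    total = sum(norms)
    return [n / total for n in norms]


# Hypothetical 3-feature projection matrix with row norms 5, 5 and 10.
E = [[3.0, 4.0], [0.0, 5.0], [6.0, 8.0]]
print(feature_weights(E))
```

A row with a larger norm means the corresponding feature contributes more to the learned representation, which is why it receives a larger weight.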

The details about CDUFW above are summarized as follows:

**Algorithm 1** Cluster Structure Preserving Based on Dictionary Pair for Unsupervised Feature Weighting (CDUFW).

**Input:** A data matrix **X**, the size of the intrinsic feature space *p*, the loss variation ratio threshold, the maximum number of iterations *N* and the model parameters;

**Output:** The weight array **w**;

1: Initialize all variables.

2: **do**

3: Update the two representation dictionaries.

4: Update **A**.

5: Update **F**.

6: **until** the maximum number of iterations is reached or the loss variation ratio falls below its threshold

7: Calculate **E**.

8: Calculate **w**.

9: **return w**.

The convergence of each variable guarantees the convergence of the objective function, which has been intensively studied (Gorski *et al.* 2007). The time complexity of the proposed function (Equation (6)) was biquadratic.
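The update-one-variable-at-a-time loop and its two stopping criteria can be illustrated on a toy problem. The sketch below is not the CDUFW objective: it fits a rank-1 model M ≈ u vᵀ, which is convex in u for fixed v and vice versa, and stops when the maximum number of iterations is reached or the relative change of the loss falls below a threshold. All names and the sample matrix are hypothetical.

```python
def alternating_minimization(M, max_iter=100, ratio_tol=1e-6):
    """Toy rank-1 fit M ~ u v^T by alternating exact least-squares updates.

    Assumes M is not the zero matrix and has nonzero row sums, so the
    first update does not produce an all-zero vector.
    """
    rows, cols = len(M), len(M[0])
    u = [1.0] * rows
    v = [1.0] * cols

    def loss():
        return sum((M[i][j] - u[i] * v[j]) ** 2
                   for i in range(rows) for j in range(cols))

    prev = loss()
    cur = prev
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Update u with v fixed (closed-form least squares).
        vv = sum(x * x for x in v)
        u = [sum(M[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
        # Update v with u fixed.
        uu = sum(x * x for x in u)
        v = [sum(M[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
        cur = loss()
        # Stop on an exact fit or a small relative change of the loss.
        if cur == 0 or abs(prev - cur) / prev < ratio_tol:
            break
        prev = cur
    return u, v, cur, iteration


u, v, final_loss, n_iter = alternating_minimization([[2.0, 4.0], [3.0, 6.0]])
print(final_loss, n_iter)
```

Because each update solves its subproblem exactly, the loss never increases from one iteration to the next, which is the property the convergence argument relies on.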

### Pre-process and post-process of the CDUFW

The magnitude of values under different factors varies greatly. Therefore, the raw data obtained by experiment and calculation need to be processed to form the data matrix before being brought into the CDUFW model. The membership functions were used to normalize the raw data under each indicator in the pre-process. The pre-process steps are as follows:

- (1) Determine the features of interest and the original data matrix **O**. As mentioned above, decision makers should consider the wave elimination effect, the economic input and the land investment when choosing a wave break forest scheme. These can be transformed into three features: the wave attenuation coefficient *K*, the width of the forest, and the cost (the number of plants multiplied by the planting cost per tree). The original data matrix is formed from these features, each entry being the value of scheme *j* for feature *i*.
- (2) Calculate the relation degree and form the data matrix **X**. The wave attenuation coefficient belongs to the benefit type, while the others belong to the cost type. For the benefit type a larger feature value is preferred, whereas for the cost type a lower value is preferred. The relation degree is calculated by Equations (12) and (13) for the benefit type and the cost type, respectively. It gives the relative membership of scheme *j* to the optimal value of feature *i*, based on the minimum and the maximum of the values of all schemes for feature *i*. The relation degrees of all schemes form the data matrix **X**.
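A minimal sketch of the pre-process, assuming a min-max form of the relation degree: (value − min) / (max − min) for the benefit type and (max − value) / (max − min) for the cost type. The exact form of Equations (12) and (13) may differ, and the function name and the three sample schemes are hypothetical.

```python
def relation_degree(O, benefit):
    """Normalize an original data matrix O into relation degrees.

    O[i][j] is the value of scheme j for feature i; benefit[i] is True
    when a larger value of feature i is better.
    Assumption: min-max normalization per feature.
    """
    X = []
    for row, is_benefit in zip(O, benefit):
        lo, hi = min(row), max(row)
        span = hi - lo
        if span == 0:
            # All schemes tie on this feature.
            X.append([1.0] * len(row))
        elif is_benefit:
            X.append([(v - lo) / span for v in row])
        else:
            X.append([(hi - v) / span for v in row])
    return X


# Hypothetical schemes: attenuation coefficient K, forest width (m), cost (CNY).
O = [
    [0.2, 0.3, 0.4],        # K, benefit type
    [1.0, 2.0, 5.0],        # forest width, cost type
    [100.0, 300.0, 200.0],  # cost, cost type
]
X = relation_degree(O, benefit=[True, False, False])
print(X)
```

After this step every feature lies in [0, 1] with 1 as the preferred end, so features measured in very different units become comparable.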

After the data matrix was obtained, the weight of each feature can be worked out by CDUFW with Algorithm 1 mentioned above, then post-processes were needed to choose a better scheme. The distance between scheme and best goal along with the distance between scheme and worst goal were taken as a criterion to evaluate each scheme. The post-process steps are as follows:

- (1) Define the relative best scheme and the relative worst scheme. In the multi-objective optimization, the comparison is limited to the chosen *n* schemes, so the best and the worst are relative concepts. Among the *n* alternative schemes, the relative best scheme **G** and the relative worst scheme **B** were defined according to their relation degree (Zhou *et al.* 2007).
- (2) Calculate the relative optimum membership degree of each scheme. The higher the relative optimum membership degree, the better the scheme. The scheme with the highest value was selected as the best one.
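The post-process can be sketched as follows. The sketch makes several assumptions that are not confirmed by the text: the relative best and worst schemes take the per-feature maximum and minimum of the relation degrees, distances are weighted Euclidean distances, and the relative optimum membership degree has the form 1 / (1 + (d_best / d_worst)²). The function name and sample data are hypothetical.

```python
import math


def relative_membership(X, w):
    """Score each scheme by its distance to the best and worst goals.

    X[i][j] is the relation degree of scheme j for feature i and w[i]
    is the weight of feature i. All formulas here are assumptions.
    """
    G = [max(row) for row in X]  # relative best scheme
    B = [min(row) for row in X]  # relative worst scheme
    m, n = len(X), len(X[0])
    scores = []
    for j in range(n):
        d_best = math.sqrt(sum((w[i] * (G[i] - X[i][j])) ** 2 for i in range(m)))
        d_worst = math.sqrt(sum((w[i] * (X[i][j] - B[i])) ** 2 for i in range(m)))
        if d_worst == 0:
            # The scheme coincides with the relative worst scheme.
            scores.append(0.0)
        else:
            scores.append(1.0 / (1.0 + (d_best / d_worst) ** 2))
    return scores


X = [[1.0, 0.0, 0.5], [1.0, 0.0, 0.5]]
scores = relative_membership(X, w=[0.5, 0.5])
best = max(range(len(scores)), key=lambda j: scores[j])
print(scores, best)
```

A scheme that coincides with the relative best scheme scores 1, one that coincides with the relative worst scheme scores 0, and the scheme with the highest score is selected.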

## RESULTS AND DISCUSSION

### Results of physical experiments

In the section regarding physical experiment, three factors, including space of equilateral triangle arrangement, different arrangements and different tree shapes, were taken into consideration. The results of these three factors are shown in Figures 4–6 respectively.

As shown in Figures 4–6, the trends of the experimental results under regular and irregular wave conditions are consistent, and in all experiments the wave attenuation coefficient increases with forest width, which is in line with previous research results (Reinout *et al.* 2010; Hoque *et al.* 2018).

Considering its extensive use in practical engineering, the equilateral triangle configuration was studied in particular (Figure 4). The wave reduction effect decreases as plant spacing increases, which is consistent with the conclusion of previous studies that the wave elimination effect increases with vegetation density (Möller *et al.* 2014; Peruzzo *et al.* 2018). However, in practical engineering, specifying plant spacing rather than density is more easily accepted by workers and is more convenient for construction.

The comparison of the three arrangements (Figure 5) shows that the curve of the square configuration, which is commonly used in experiments, turns sharply after 3 m, beyond which the growth of the wave attenuation coefficient slows. This could be because the projected area of the trees in the direction of the incoming wave is smaller in the square configuration than in the others. It is apparent from Figure 3 that the projected area of the trunks and crowns in the quincunx and equilateral triangle arrangements is twice that in the square arrangement. The wave attenuation effect of the quincunx arrangement appears to be better than that of the equilateral triangle arrangement, but the slope of the equilateral triangle curve is greater than that of the quincunx curve beyond 3 m. Thus, it could be speculated that the equilateral triangle arrangement gives the best wave attenuation when the forest width is more than 5 m on the laboratory scale (i.e. 50 m in reality). The stationary point of the curves of the square and quincunx arrangements appears at a forest width of 3 m, while the curve of the equilateral triangle arrangement has not yet reached a stationary point, so more experiments are needed in future studies.

According to Figure 6, the wave dissipating effect clearly differs among tree shapes. The wave attenuation coefficients of T1 and T2 are both much higher than those of T3 and T4. If these four kinds of trees are divided into two categories, one being T1 and T2 and the other T3 and T4, the main difference between the two categories is the crown radius (Table 2). It is conceivable that, under the same water level, the principal factor influencing the wave attenuation effect is the crown radius of the tree. The difference between T1 and T2 is rather small, so it could be speculated that the contribution of trunk radius to wave attenuation is very small. The diameters of the tree trunk and tree crown correspond to the area per unit height of each vegetation stand normal to the wave direction, as described in a previous study (Kobayashi *et al.* 1993). Figure 6 shows that the effects of tree trunks and canopy on wave attenuation differ greatly, so it is not recommended to simplify rigid plants to uniformly rigid cylinders. When the rigid plant density is defined as the number of trees per unit area, special consideration should be given to changes in crown radius. Moreover, it is interesting that the initial difference between T3 and T4 is small, but the gap grows as the forest width increases, because the wave attenuation effect of T4 hardly increases along the course. The total height of a T4 tree is 0.24 m while the water depth in this experiment is 0.28 m, which means that trees of type T4 are completely submerged. It can be inferred that when the vegetation is completely submerged, the wave attenuation effect hardly increases with forest width. This phenomenon is in accordance with the previous study by Augustin *et al.* (2009). Generally speaking, the branches and stems produce turbulence and the vortices around the stems draw energy from the mean flow (Nepf 2011). The orbital velocity of water particles decreases gradually from the water surface to the bed. When the vegetation is completely submerged, it cannot impede the top portion of the water column, so the reduction in wave energy is smaller.
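The decay of orbital velocity with depth can be illustrated with linear wave theory, which is an assumption introduced here and not a method stated in the study. The sketch solves the dispersion relation ω² = g k tanh(kh) and evaluates the horizontal velocity amplitude relative to the surface, cosh(k(z + h)) / cosh(kh). The laboratory period of 0.95 s is also an assumption, obtained by Froude scaling of the 3.01 s prototype period at the 1:10 geometric scale implied by the forest widths; the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def wavenumber(period_s, depth_m, tol=1e-12, max_iter=200):
    """Solve omega^2 = g k tanh(k h) for k by fixed-point iteration."""
    omega = 2 * math.pi / period_s
    k = omega ** 2 / G  # deep-water first guess
    for _ in range(max_iter):
        k_new = omega ** 2 / (G * math.tanh(k * depth_m))
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k


def velocity_ratio(z_m, period_s, depth_m):
    """Horizontal orbital velocity amplitude at elevation z relative to
    the still-water surface (z = 0 at the surface, z = -depth at the bed)."""
    k = wavenumber(period_s, depth_m)
    return math.cosh(k * (z_m + depth_m)) / math.cosh(k * depth_m)


depth = 0.28   # water depth in the flume (m)
period = 0.95  # assumed laboratory wave period (s)
print(velocity_ratio(-0.04, period, depth))  # at the top of a 0.24 m tall T4 tree
print(velocity_ratio(-depth, period, depth))  # at the bed
```

Under these assumptions the ratio is below one everywhere beneath the surface, so a canopy that ends below the water surface interacts only with the slower part of the orbital motion.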

### Results of CDUFW

To make the decision more realistic, all valid physical experiment results under irregular wave condition were chosen to form a design set. The design set consisting of 40 schemes is presented in the form of a bubble diagram (Figure 7). This diagram comprehensively reflects the three aspects of each scheme. It expresses the cost of a scheme by its bubble size, the wave attenuation coefficient by the horizontal axis, and the forest width by the vertical axis. It is obvious that several schemes with different costs and forest widths can achieve similar wave attenuation effects, and it can be inferred that schemes represented by solid black bubbles are better than the hollow circles.

A bubble diagram (Figure 7) consisting of the 40 schemes was built to verify the rationality of the results. The matrix generated by the design set was taken as the original data matrix **O** and the calculation was carried out as described above under ‘Algorithms of CDUFW’. The specific steps are shown in Appendix B. The calculated weights of forest width, cost and wave attenuation coefficient were 0.3892, 0.3298 and 0.2810, respectively. The results suggest that, when deciding on the planting layout of a wave break forest, one should not insist on pursuing an excessive wave attenuation coefficient, which leads to a great increase in planting area and planting cost. Schemes with an acceptable attenuation coefficient and lower cost were preferred. The best scheme chosen by the proposed model is T2 in an equilateral triangle arrangement with a spacing of 0.25 m and a forest width of 2 m. The chosen scheme is exactly one of these three black solid bubbles whose wave attenuation coefficient is 0.3417. The best scheme chosen by the model is exactly one of the four possible optimal schemes we inferred from Figure 7, which supports the validity of the proposed model. As mentioned earlier, the scheme set is based only on the experiments already carried out, so the best is a relative concept here. As research on the mechanism of wave breaking deepens, more diverse and more efficient schemes can be proposed. Even if the number of schemes and the number of indicators considered increase, the model can still give reasonable choices.

## CONCLUSIONS

To study the influence of different factors on the wave attenuation effect in a practical engineering background, we carried out physical experiments. In total, 40 schemes were tested and the wave attenuation effect was evaluated by the wave attenuation coefficient *K*. The laboratory study focused on the vegetation effect under the equilateral triangle configuration, a comparison of the vegetation effect among three different configurations and a comparison among four different tree shapes. The trend of wave reduction of the equilateral triangle arrangement and its change under different densities are similar to those of the other arrangements, but different arrangement modes are advantageous for wave attenuation at different forest widths. The effects of water depth, canopy and tree trunk on the wave attenuation effect of vegetation were verified in the experiments with different tree shapes.

With the aim of choosing the best among various wave break forest layout design schemes, an objective weighting model (CDUFW) was proposed, which is, to our knowledge, the first such study in the field of wave break forests. Dictionary learning and spectral clustering were combined to reflect the data structure information, and a norm was used to quantify the significance of features. The proposed model can overcome the mentioned defects of the entropy weighting method. It is effective even when there is no correlation between indicators. The model was applied to the physical experimental data set and obtained a reasonable optimization result. In future research, a more comprehensive data set can be acquired as input to the model so that better optimization schemes can be obtained.

The experimental design in this study was more closely related to engineering practice, which is of great significance to the design of river vegetation in other regions. The optimization model of vegetation design scheme is also proposed according to the actual demand, so as to achieve the objective effect of wave dissipation effectively, which can be used for reference by other river management decision makers. The CDUFW model is superior in its ability of intrinsic information mining and its convergence and computational complexity, and it can be applied in other multi-objective decision-making fields of water resources management.

## ACKNOWLEDGEMENTS

This work was supported by the Applied Technology Research and Development Program of Heilongjiang Province under Grant No. GZ16B031 and No. GZ16B035.