## Abstract

2D non-uniform polygonal meshes allow representation of the impact of landscape elements and small infrastructures on water flows. The initial vectorial mesh, derived from the intersection of several geographical information systems' layers, can have highly non-convex or sliver polygons. These bad-shaped elements compromise accurate numerical flow computation. We propose a flexible divide-and-conquer strategy to decompose polygons into physiographical meaningful parts using shape descriptors to better represent the surface terrain and hydrologic connectivity. We use the convexity index (*CI*) and the form factor (*FF*) to consider convex and square like optimum shapes. The strategy was applied to two peri-urban areas whose hydrologic response was simulated using distributed modeling. Good-quality meshes were generated with threshold values of *CI*≈0.8 and *FF*≈0.2, and *CI*≈0.95 and *FF*≈0.4 for undeveloped and highly urbanized zones, respectively. We concluded that the mesh segmentation facilitates the representation of the spatially distributed processes controlling not only the lumped response of the catchment, but also the spatial variability of water quantity and fluxes within it at medium and small scales.

## GLOSSARY

**Simple polygon:** polygon delimited by a continued and close polyline in which each pair of edges only intersect at one vertex. The number of vertexes equals the number of edges.

**Regular polygon:** an n-sided simple polygon with equilateral edges and equiangular interior angles.

**Irregular polygon:** an n-sided simple polygon with non-equilateral edges or non-equiangular interior angles.

**Non-uniform mesh:** a mesh composed of irregular polygons of variable size.

**Sliver polygon:** an elongated polygon with small area. In this article, a sliver polygon usually represents roads, footpaths, median strips, or narrow green areas.

**Well-shaped polygon:** a simple polygon that fulfills geometrical constraints given by threshold values of one or more shape descriptors such as the convexity index or the form factor.

**Bad-shaped polygon:** a polygon that is not well-shaped.

**Initial mesh:** output mesh from the initial segmentation step.

**Initial segmentation step:** procedure in which the main GIS layers are intersected to generate the initial mesh.

**Good-quality mesh:** mesh composed only of well-shaped polygons. The concept of good-quality depends on geometrical constraints. In our case, the geometrical constraints are given by numerical restrictions of the distributed hydrological model, and by a good representation of hydrological connectivity (i.e., a representative drainage network).

**Pseudo-convex polygon:** a simple polygon with a convexity index value larger than a defined threshold.

**Pseudo-square polygon:** a simple polygon with a form factor value larger than a defined threshold.

**Pseudo-regular polygon:** a simple polygon with a shape metric value larger than a defined threshold.

**Meaningful parts:** the accepted notion of meaningful parts relies on human perception, and is based on the observation that human vision defines part boundaries along negative minima of principal curvatures (Krayevoy & Sheffer 2006). Examples of meaningful parts are the hands and neck in a human body shape.

**Meaningful physiographic unit:** any natural or urban feature with homogeneous properties (i.e., same land use and pedo-topo-geological properties).

**Shape factor:** dimensionless metric used to describe the shape of a 2D polygon, which is independent of its size.

**Hydrological response unit (HRU):** a simple polygon with hydrological homogeneous properties (i.e., hydrological properties whose variation within an HRU is small compared to that among neighboring HRUs (Flügel 1995)).

**Urban hydrological elements (UHE):** a simple polygon composed of a cadastral parcel and its corresponding portion of street (Rodriguez *et al.* 2008).

**Drainage network:** connectivity structure among the hydrological elements within the catchment contributing to the channelized system. It includes overland path flows, streams, ditches, and sewers.

## INTRODUCTION

The resolution of spatial discretization in hydrological models is directly governed by the objectives of the modeling, the hydrological processes to be represented, and data quality and availability (Dehotin & Braud 2008). For planning, analysis, and design of sustainable urban and peri-urban hydro-landscapes at small scales, high resolutions are highly desirable. Moreover, hydrological models designed to study the hydrology of urban and peri-urban catchments require irregular meshes representing spatial features of highly variable sizes, such as gardens, infiltration strips, and detention ponds (Lagacherie *et al.* 2010; Abily *et al.* 2013; Jankowfsky *et al.* 2014).

Landscape features such as agricultural and urban boundaries, streets, buildings, tree lines and hedges, as well as the main channelized infrastructure, can be represented by any 2D vectorial mesh. Normally, hydrological models applied in urban environments use 2D vectorial meshes composed of sub-catchments, such as SWMM (Rossman 2009) and StormCAD (Haestad Methods 1995). Moreover, 2D hydraulics models for computing flooding areas use regular grid and triangular (e.g., MIKE 21 and MIKE 21 FM (DHI 2007a, 2007b)), or non-uniform hex-dominant meshes (3D FVM (Versteeg & Malalasekera 2007)). The intersection of vectorial layers leads to an initial model mesh composed of polygons with homogeneous hydrological properties (i.e., with the same land use and pedo-topo-geological properties) that are representative of the terrain. This mesh preserves all the artificial boundaries at small scales, but contains lots of polygons with irregular shapes that must be improved prior to implementing the hydrological model, as they may cause some numerical problems (Moussa *et al.* 2002; Zundel *et al.* 2002; Rodriguez *et al.* 2008; Jankowfsky *et al.* 2014). Indeed, some equations in hydrological models are based on specific geometrical characteristics that may have erroneous values when computed using bad-shaped polygons (i.e., polygons that do not fulfill certain geometrical constraints). Examples of these equations are the ones used in the PUMMA model (Jankowfsky *et al.* 2014) to estimate: (1) lateral fluxes between two polygons using Darcy's law; (2) fluxes between hydrological units (represented by polygons) and rivers (represented by polylines) considering the Dupuit–Forchheimer hypothesis (Miles 1985); and (3) surface fluxes between polygons using the Manning equation, where the flux depends on the distance between the centroids of polygons and/or polylines that may be erroneous for non-convex polygons (see examples in Figure 1). All these drawbacks have motivated the development of new technologies to represent the terrain using good quality meshes. Similar advances have been implemented to discretize complex geometries in other Earth sciences areas, such as earthquakes (Shewchuk 1996) and hydrogeology (Panday *et al.* 2013).

### Appropriate meshes for representation of varying size elements

Building an appropriate mesh for representing spatial domains and obtaining meaningful results when using numerical models is essential. The main differences among commonly used meshes are related to the type of elements that can be used and the definition of a well- or bad-shaped element. For finite elements' methods, triangulations and quadrilateral meshes are the most common (Kim & Chung 2015; Russo *et al.* 2015). The classical triangulation fulfills the Delaunay condition and has been implemented in widely used algorithms such as Triangle in 2D (Shewchuk 1996) and TetGen in 3D (Si 2006). Typical quality criteria used in these algorithms are that the minimum angle of each triangle must be greater than – or the edge-ratio must be smaller than – a threshold value. In the case of quadrilateral meshes (Lee *et al.* 2003), common approaches are based on off-setting techniques, while classical quality criteria consider the convexity of the quadrilaterals and a maximum possible angle. In recent years, the polygonal finite element method based on polygonal meshing has become more popular (Panday *et al.* 2013). Polygonal meshing allows the generation of a mesh of variable resolution, in which high and low point densities are combined using graded transitions. The most used polygonal mesh is that composed of polygons defined by the Voronoi regions, which can be generated by building the dual-graph of a Delaunay mesh. The Voronoi regions are convex polygons and commonly have six sides on average (O'Rourke 1998). The constrained Delaunay triangulation (DT) algorithm respects the polygon boundary without inserting new points. The Delaunay condition may not be accomplished by the triangles containing the boundary of the initial polygon. On the other hand, the conforming DT algorithm inserts points so that all triangles fulfill the Delaunay condition. This method may be used together with some geometrical criteria involving angles, area, etc.

Triangulations (Tucker *et al.* 2001), quadrilateral (grid-cells) meshes (Downer *et al.* 2006), and polygonal meshes based on Voronoi regions (Collon *et al.* 2015) can also be used for terrain representation and hydrological modeling. However, regular polygons such as square grid cells are not adequate to represent the original shape of hydrological elements such as residential lots, agriculture fields, or green areas, nor for representing variable shape boundaries of urban and peri-urban features (i.e., streets, footpaths, hedges). They can only be used with elements bigger than the regular polygon size (i.e., features smaller than the regular raster's cell size cannot be explicitly represented). Other regular polygons (e.g., equiangular and equilateral sides polygons) may represent such complex environments, but the number of final polygons can increase heavily. High resolution 2D runoff modeling based on triangles requires mesh refinement to improve the representation of urban environments' features. This is a demanding task and it is difficult to build models with discretization refinement using standard tools, as time-consuming hand-made operations are still required (Abily *et al.* 2013). In addition, such models can be deployed on small areas of a few hectares, but are computationally too expensive for catchments of a few square kilometers. There is therefore an interest to rely on bad-shaped polygons to better represent the complexity of urban and peri-urban catchments, in order to reduce significantly the number of elements in the final mesh and the computational time of the hydrological models. Nevertheless, following such an approach requires the final mesh to respect criteria ensuring numerical stability and meaningfulness of the geometrical characteristics. Recently, such non-uniform meshes composed of units representing terrain elements have been used in distributed hydrological models such as MHYDAS (Moussa *et al.* 2002), URBS (Rodriguez *et al.* 2008) and PUMMA (Jankowfsky *et al.* 2014), which solve surface, sub-surface, and river flows. Furthermore, GIS-tools such as Geo-MHYDAS (Lagacherie *et al.* 2010), GRIDGEN (Lien *et al.* 2015), LumpR (Pilz *et al.* 2017), and Geo-PUMMA (Sanzana *et al.* 2017) have been developed for the implementation of these meshes, which are the focus of the present paper.

### Current technologies in polygonal decomposition

The decomposition of 2D shapes into meaningful parts is a key process in human image understanding, and geometrical computing areas (Lien & Amato 2005; Liu *et al.* 2014). The accepted notion of meaningful parts relies on human perception, and is based on the observation that human vision defines part boundaries (e.g., hands and neck in a human body shape) (Krayevoy & Sheffer 2006). Any non-convex polygon can be decomposed into approximately convex sub-set parts, which are commonly associated with meaningful parts. Several efficient strategies to decompose polygons into approximatively convex pieces, such as the DuDe (Liu *et al.* 2014) and ACD (Lien & Amato 2005) algorithms, have been developed and applied to decompose silhouette-based features. Unfortunately, the convexity criterion may not be enough for terrain representation using irregular meshes for hydrological models. Many models use a terrain representation based on hydrological response units (HRUs) (Flügel 1995) and urban hydrological elements (UHEs) (Rodriguez *et al.* 2008), meaningful physiographic units of the natural and urban landscapes with homogeneous properties, respectively. These units can be highly irregular and bad-shaped, and they eventually need to be decomposed into meaningful small pieces. For example, there can be sliver polygons (i.e., elongated polygons with small area), perfectly convex (i.e., long streets, linear hedges, footpaths, median strips, or narrow green areas) but not suitable for flow routing. A sliver polygon can work as an artificial wall restricting flow routing, because its elevation – which may not be representative of the entire area involved – can be such that flow from lower adjacent polygons is restricted (Jankowfsky 2011). Hence, the criteria of decomposition of HRUs must consider not only convexity but other factors such as form factor or shape factor, circularity, elongation ratios, and compactness coefficient. These criteria can be implemented through tools using a divide-and-conquer approach able to adapt to a variety of geometrical criteria. An example of such a tool is Geo-PUMMA (Sanzana *et al.* 2017), a semi-automatic vectorial toolbox to represent urban and peri-urban small catchments (0.1–10 km^{2}) and the hydrological connectivity among their components.

### Contribution

This paper presents and assesses in detail a flexible divide-and-conquer strategy decomposition for 2D polygons for terrain representation and hydrological routing. The strategy is based on methods partially described previously in the literature. In particular, Sanzana *et al.* (2013) developed tools to improve HRUs considering (1) the high heterogeneity in HRUs' properties derived from a digital elevation model, (2) the existence of concave polygons or polygons with holes inside, (3) the existence of very large polygons, and (4) bad estimations of HRUs' perimeters and distances. Recently, Sanzana *et al.* (2017) developed Geo-PUMMA, a GIS-based tool to generate good quality polygonal meshes composed of HRUs and UHEs for urban and peri-urban catchments. These meshes are composed of the smallest number of properly interconnected polygons (i.e., without spurious pits or unrealistic flow paths), with homogeneous hydrological properties that are representative of the terrain and ensure the efficient application of hydrological models. Based on these previous results, the main contributions of this paper are as follows:

The generalization of the work of Sanzana

*et al.*(2017) through the proposition and description of a flexible divide-and-conquer strategy for decomposing bad-shaped 2D polygons into smaller meaningful components. This strategy is tested using both urban and peri-urban terrains and silhouette-based features, which exemplifies its potential application beyond the field of hydrology.The assessment of the computational complexity of the algorithms involved in the strategy.

An in-depth analysis of the impact of threshold values proposed for the involved shape descriptors, and recommendations of empirical values for these descriptors to be used in the decomposition of bad-shaped polygons.

A specific assessment of the effect of 2D polygon decomposition in hydrological models.

## BASIC CONCEPTS

The drainage network is a primary input for hydrological models. Generally, this concept refers to the network of pipes and streams conveying flow in a catchment to the outlet. In this paper, this concept refers to the whole connectivity structure among the hydrological elements within the catchment contributing to the channelized system (streams, ditches, and sewers). The drainage network is extracted from a hydrological model mesh representative of the terrain, and the natural and urban features. From now on, we follow the approach by Sanzana *et al.* (2017), which considers a hydrological model mesh to be composed of HRUs and UHEs.

The initial mesh is obtained from the direct intersection of GIS layers, such as land use, soil type, geology, sub-catchment, UHEs, and channelized network. The resulting hydrological mesh is formed by bad-shaped polygons, which are more suitable than regular grids for representing human-made features affecting significantly the hydrological processes in a catchment (Lagacherie *et al.* 2010). However, vector-based HRU model meshes also come with specific constraints, depending on the distributed hydrological model to be applied. On the other hand, due to the urban design of the lots, typical UHEs are polygons with regular shape (normally pseudo convex or not sliver). Nevertheless, any UHE with irregular shape can be decomposed with the tools described in this paper.

In distributed hydrological meshes, bad-shaped HRUs cause problems when defining the topology of the drainage network (Jankowfsky 2011). Typically, the distance between the polygons' centroids and their boundaries is used to calculate flowpath lengths among HRUs, and overland and subsurface flows. This distance is meaningless if the centroid is outside of the polygon, and a modification of the HRUs becomes necessary. Moreover, the HRUs can vary greatly in size, which creates problems in drainage network extraction and the stability of numerical schemes (for instance, by inducing unrealistic water depths on small polygons downstream of larger ones). Thus, the segmentation of very large polygons is recommended to obtain a mesh composed of similar size polygons and avoid flows from big polygons to small ones in hydrologic modeling.

Good-quality meshes are composed of well-shaped polygons that are physiographically homogeneous. This representation allows the identification of the hydrologic connectivity among them defined by the terrain. In this paper, a well-shaped HRU is defined as a hydrologically homogeneous, not-sliver and pseudo-convex polygon. Good-quality meshes do not result in the problems previously identified, as they are composed of convex elementary units with the same number of vertexes and boundaries, whose centroids are located inside the polygon. Figure 1 shows examples of bad-shaped HRUs from Estero El Guindo, a peri-urban catchment located in the piedmont of Santiago, Chile (Sanzana *et al.* 2017), and the Mercier and Chaudanne, two peri-urban catchments located in Lyon, France (Sanzana *et al.* 2013). Irregular shapes in this figure include sliver HRUs representing streets (Figure 1(a)), median strips (Figure 1(b)), walking paths (Figure 1(c) and 1(e)), a hedgerow (Figure 1(d)), highly non-convex HRUs corresponding to lots' division (Figure 1(f), 1(h) and 1(i)), and a particular green area (Figure 1(g)). Each feature in Figure 1 plays a relevant role in the water balance and hydrological routing, so preserving their original shape while segmenting them in pseudo-regular pieces becomes relevant.

### Shape descriptors

Several basin shape descriptors or metrics have been proposed in the literature to characterize the geomorphology of basins. Either these metrics or the geometric properties used in their calculation have been used to estimate the hydrological response of a basin from its geomorphology or its time of concentration, which corresponds to a relevant response parameter used in hydrologic modeling. Examples of these metrics calculated from the main stream length (*L*) and the basin area (*A*) and perimeter (*P*) include (Karamouz *et al.* 2012): form factor (*FF _{w}*), basin shape factor (

*B*), elongation ratio (

_{s}*R*), circularity ratio (

_{e}*R*) and compactness coefficient (

_{c}*C*) (Table 1).

_{c}Watershed shape descriptor | Expression | Range |
---|---|---|

Form factor | FF = _{w}A/L ^{2} | (0,1) |

Basin shape factor | B = _{s}L ^{2}/A | (1, ∞) |

Elongation ratio | R = 1.128_{e}A ^{0.5}/L | (0,1] |

Circularity ratio | R = 12.57_{c}A/P ^{2} | (0,1] |

Compactness coefficient | C = 0.2821_{c}P/A ^{0.5} | [1,∞) |

Watershed shape descriptor | Expression | Range |
---|---|---|

Form factor | FF = _{w}A/L ^{2} | (0,1) |

Basin shape factor | B = _{s}L ^{2}/A | (1, ∞) |

Elongation ratio | R = 1.128_{e}A ^{0.5}/L | (0,1] |

Circularity ratio | R = 12.57_{c}A/P ^{2} | (0,1] |

Compactness coefficient | C = 0.2821_{c}P/A ^{0.5} | [1,∞) |

These descriptors are related to the hydrologic response. Thus, larger peak flows are related to large values of *FF _{w}*,

*R*, and

_{e}*R*and small values of

_{c}*B*and

_{s}*C*(Karamouz

_{c}*et al.*2012). All these shape descriptors have been used in basins with areas larger than 1 km

^{2}; however, their direct use at urban lot scales (<1 ha) is not possible, as the main channel is typically not well-defined to identify the basin length. Conversely, there are shape descriptors used in image processing which can be used at small and medium scales (Russ 2002): form factor based on circle (

*FF**), form factor based on square (

*FF***), roundness (

*R*), aspect ratio (

*A*), elongation (

_{r}*E*), convexity index (

*CI*), solidity index (

*SI*), and compactness (

*C*) (Table 2).

Image shape descriptor | Expression | Range |
---|---|---|

Form factor based on circle | FF* = 4πA/P^{2} | (0,1] |

Form factor based on square | FF** = 16A/P^{2} | (0,1] |

Roundness | R= 4A/(πD_{max}) | (0,1] |

Aspect ratio | A = _{r}D_{max}/D_{min} | [1, + ∞] |

Elongation | E=L_{max}/L_{min} | [1, + ∞] |

Convexity index | CI=P_{convex}/P | (0,1] |

Solidity index | SI=A/A_{convex} | (0,1] |

Compactness | C= ((4/π)A)^{0.5} | (0,1/π] |

Image shape descriptor | Expression | Range |
---|---|---|

Form factor based on circle | FF* = 4πA/P^{2} | (0,1] |

Form factor based on square | FF** = 16A/P^{2} | (0,1] |

Roundness | R= 4A/(πD_{max}) | (0,1] |

Aspect ratio | A = _{r}D_{max}/D_{min} | [1, + ∞] |

Elongation | E=L_{max}/L_{min} | [1, + ∞] |

Convexity index | CI=P_{convex}/P | (0,1] |

Solidity index | SI=A/A_{convex} | (0,1] |

Compactness | C= ((4/π)A)^{0.5} | (0,1/π] |

*Note:* D_{max}: maximum diameter, D_{min}: minimum diameter, L_{max}, length of the major axis, L_{min}: length of the minor axis; L: main stream length; P_{convex}: convex hull perimeter, A_{convex}: convex hull area (modified from Russ (2002) and Jiao *et al.* (2012)).

At scales less than 1 ha, these descriptors have been used to describe the main features in land use, such as roads, cultivated lands, settlements, rivers, ponds, and forests and grass lands (Jiao *et al.* 2012).

The more relevant shape descriptors for the present work are those that allow the identification of non-convex and sliver HRUs. Highly non-convex units characterized by values of *CI**=**P*_{convex}/*P* ≪ 1 typically have their centroid outside of the polygon and non-smoothed contours (*P _{convex}* is the convex hull perimeter). On the other hand, values of

*CI*≈1 ensure the centroid is inside the polygons and non-smoothed boundaries. In image processing, the form factor can be defined in different ways, depending on whether a circle or square is used as the reference shape (

*FF** and

*FF***). Sliver units have values of

*FF*** ≪ 1, and

*FF***≈1 for well-shaped polygons with a square or pseudo-square shape. From now on,

*FF*** will be simply referred to as

*FF*(

*FF*

*=*4

*πA*/

*P*). To illustrate the use of

^{2}*CI*and

*FF*, Table 3 presents their values for the irregular shape units shown in Figure 1. Note how sliver polygons and non-convex polygons have low values of

*FF*and

*CI*, respectively.

Unit | Physiographic meaning | Area (m^{2}) | CI | FF |
---|---|---|---|---|

a | Segmented street | 1,085 | 0.580 | 0.057 |

b | Segmented street | 3,439 | 0.830 | 0.072 |

c | Footpath | 497 | 0.940 | 0.073 |

d | Green area | 680 | 0.750 | 0.053 |

e | Segmented street | 2,664 | 0.950 | 0.017 |

f | Lot partition | 4,921 | 0.700 | 0.241 |

g | Green area | 8,594 | 0.620 | 0.204 |

h | Lot partition | 8,9351 | 0.590 | 0.130 |

i | Green area | 9,894 | 0.730 | 0.126 |

Unit | Physiographic meaning | Area (m^{2}) | CI | FF |
---|---|---|---|---|

a | Segmented street | 1,085 | 0.580 | 0.057 |

b | Segmented street | 3,439 | 0.830 | 0.072 |

c | Footpath | 497 | 0.940 | 0.073 |

d | Green area | 680 | 0.750 | 0.053 |

e | Segmented street | 2,664 | 0.950 | 0.017 |

f | Lot partition | 4,921 | 0.700 | 0.241 |

g | Green area | 8,594 | 0.620 | 0.204 |

h | Lot partition | 8,9351 | 0.590 | 0.130 |

i | Green area | 9,894 | 0.730 | 0.126 |

*Note:* Critical shape descriptor of *CI**<***0.80** and *FF**<***0.20** are in italics and bold font.

Jiao *et al*. (2012) computed statistics for the main shape metrics of several land-use classes. Forest and grass lands were the most non-convex land use classes, while the most convex class is ponds. This is expected as green areas grow irregularly, whereas pond units such as retention basins are associated with regular shape units. Furthermore, despite their high *CI* values, roads are the most sliver polygons. Thus, convexity cannot be the only criterion to improve polygonal meshes, and must be complemented by the *FF* criterion, which allows the decomposition of important units into smaller elements (i.e., small streets, footpaths, backyards, etc.).

## DIVIDE-AND-CONQUER STRATEGY TO DECOMPOSE BAD-SHAPED POLYGONS

To improve bad-shaped units, we use a divide-and-conquer strategy. The first step is to select bad-shaped elements from the initial mesh using the shape descriptors. The polygons are then segmented into smaller triangles with either a constrained or conforming DT algorithm. The resulting triangles are convex elements, but large bulks of relatively small triangles are usually created. To dissolve small triangles, we propose the following algorithms that respect shape descriptors and the overall physiographical-oriented character of the mesh units: (1) an algorithm for non-convex polygons, (2) an algorithm for large polygons, and (3) an algorithm for sliver polygons. These algorithms are implemented in Geo-PUMMA (Sanzana *et al.* 2017) with Python using GRASS functions. The scripts *p.convexity_segmentation.py* and *p.form_factor.py* were modified to apply them not only to GIS features, but also computer graphics' vectorial images. The updated Geo-PUMMA scripts and the dataset used in this paper are available at https://forge.irstea.fr/projects/geopumma. Note these algorithms can be easily modified to apply other constraints based on the watershed shape descriptors such as the ones previously identified. What follows is a definition of the different decompositions considered in the strategy (i.e., convex, pseudo-convex, and pseudo-square decomposition), and an explanation of the general notation. In this explanation, polygons are marked with bold font and scalar values with italic font. The following auxiliary indexes and polygons' notation are defined:

*k*: index assigned to number of vertexes*l*: index assigned to any polygon*m*: index assigned to any polygon different than polygon*l**i*: step index for each iteration*j*: index assigned to the final components*n*: total number of triangles obtained in the triangulation step*r*: index used for the biggest triangle not used in previous steps*s*: index used for the biggest neighbor triangle**T**: bigger triangle not used in the iteration step_{r}**T**: bigger neighbor triangle_{s}

A simple polygon **P** without holes is composed of a set of segments *δ P*

**=**{

*δ*

**P**

_{0},

*δ*

**P**

_{1}, … ,

*δ*

**P**

_{n}_{-1}}, where

*δ*

**P**

*is the segment between two adjacent vertexes, V*

_{k}*and V*

_{k}

_{k}_{+1}, with

*k*

*=*0, … ,

*n*− 1. The area and perimeter of the polygon

**P**are defined as

*area*(

**P**) and

*peri*(

**P**), respectively, whereas the

*CI*and

*FF*of the polygon

**P**are defined as

*CI*(

**P**) and

*FF*(

**P**). Pseudo-convex polygons are classified according to their value of

*CI*, and pseudo-square polygons are classified using the

*FF*criterion. The threshold values to create well-shaped polygons for

*CI*and

*FF*are referred as to

*CIT*and

*FFT*, respectively.

The following polygons and functions are defined:

A polygon

**C**is a component of**P**if**C**⊂**P****H**is the convex hull of a polygon_{P}**P**, i.e., the convex polygon with the smallest area containing**P****P**is said to be convex if=*P***H**_{P}**P**is said to be well-shaped pseudo-convex if*CI*(**P**) >*CIT*, i.e., perimeter(**H**/peri(_{P})**P**) >*CIT***S**is the equivalent square polygon of_{P}**P**if*area*(**S**) =_{P}*area*(**P**)**P**is said to be square if*P*= S_{P}**P**is said to be well-shaped pseudo-square if*FF*(**P**) >*FFT*, i.e., 4·*π*·area(**P**)/peri(**P**)^{2}>*FFT*U

means union of polygons with sub-index_{l}*l*∩ means intersection between two polygons

/ means subtraction of two polygons

means empty set.

**C**

*} is a decomposition of*

_{l}**P**,

**D**(

**P**), if their union is

**P**and all

**C**

*are interior disjoint (Lien & Amato 2005), i.e., {*

_{l}**C**

*} must satisfy: A convex decomposition of*

_{l}**P**,

**CD**(

**P**), is a decomposition of

**P**(Lien & Amato 2005) that contains only convex components: A well-shaped pseudo-convex decomposition of

**P**,

**PCD**(

**P**), is a decomposition of

**P**that contains only convex components or pseudo-convex components: A square decomposition of

**P**,

**SD**(

**P**), is a decomposition of

**P**that contains only square components: A well-shaped pseudo-square decomposition of

**P**,

**PSD**(

**P**), is a decomposition of

**P**that contains only squares or pseudo-squares components:

### Algorithm for non-convex polygons

The pseudo-convex decomposition algorithm applied to a polygon or HRUs is illustrated in Figure 2(a) and 2(b). In the first step, the bad-shaped polygon is segmented into triangles (**T*** _{i}*) using a constrained DT implemented within the software Triangle

^{®}(Shewchuk 1996). In an iterative manner, the dissolving procedure starts from the largest triangle with the neighboring triangles (Figure 2(a)). The final set

**PCD**(

**P**) contains only components with a

*CI*(

**C**

*) <*

_{i}*CIT*. All the final segmentation lines of all polygons,

*δ*

**P**

*, are connected to the initial vertexes without inserting new vertexes in the initial boundaries (Figure 3(b)).*

_{k}### Algorithm for big polygons

The pseudo-convex decomposition algorithm is applied to polygons whose area is larger than a threshold or maximum area *A _{max}*. Polygons with

*area*(

**P**) >

*A*can be part of the original mesh, or be obtained after applying the previously described pseudo-convex decomposition algorithm to improve the

_{max}*CI*values. Normally, values of

*A*= 1 to 2 ha have been recommended at urban and peri-urban scales (Jankowfsky 2011; Sanzana

_{max}*et al.*2013, 2017). The segmentation of big polygons first considers the implementation of a conforming DT using Triangle

^{®}(Figure 2(c)). Similarly to the iterative implementation depicted in a previous subsection, the largest triangle and the second largest neighboring triangle are dissolved (Figure 2(c) and 2(d)). The final set

**PCD**(

**P**) contains only components with a

*CI*(

**C**

*) <*

_{i}*CIT*and

*area*(

**C**

_{i}) <

*A*.

_{max}### Algorithm for sliver polygons

The pseudo-square decomposition algorithm applied to a polygon or HRUs is illustrated in Figure 3(a) and 3(b). In a first step, the initial boundaries are segmented into new ones with maximum separation length of *d _{max}* (Figure 3(b)). For peri-urban meshes, where sliver polygons are generally associated with streets, paths, and river banks, an empirical value of

*d*

_{max}*=*5 m is recommended to add additional vertexes along the perimeter. In a second step, the sliver polygon is segmented into triangles (

**T**

*) using a conforming DT without inserting new vertexes inside the polygon (Figure 3(a)). In an iterative manner, the dissolving procedure starts from the biggest triangle associated with the neighboring triangles (Figure 3(a) and 3(b)). The final set*

_{i}**PSD**(

**P**) contains only components for which

*FF*(

**C**

*) <*

_{i}*FFT*.

### Generalization of the divide-and-conquer approach to other shape descriptors

**T**

*) using a conforming or constrained DT. Therefore, any SHApe DEscriptor (*

_{i}*SHADE*), such as those presented by Karamouz

*et al.*(2012) and Russ (2002), can then be used to create a new decomposition of an initial polygon,

**D**(

**P**). For each

*SHADE*, a shape descriptor threshold

*SHADET*can be proposed, considering any regular polygon selected by the modeler. Algorithms previously described consider the convexity, area, and form factor as shape constraints, but any other shape descriptor can be selected. Some new vertexes can eventually be added on the boundaries of the polygon (as in the case of pseudo-square decomposition for sliver polygons), or inside the polygon (as in the case of maximum area restriction for bigger polygons). Figure 4 shows the steps to create a pseudo-regular decomposition,

**PRD**(

**P**), of

**P**. This new sub-set contains only regular or pseudo-regular components:

In the Appendix (available with the online version of this paper), we present the theoretical and empirical order of the proposed pseudo-convex decomposition strategy.

## RESULTS

### Segmentation of bad-shaped HRUs

The bad-shaped HRUs shown in Figure 1 were decomposed into pseudo-regular polygons using Triangle^{®} and the algorithms *p.convexit.py* and *p.form_factor.py* in Geo-PUMMA. The different panels in Figure 5 illustrate the effect of several *FFT* and *CIT* values on the number of elements in the final segmentation of the corresponding HRUs. Restrictive (i.e., high) values of these indexes increase the final number of elements for each HRU. In fact, the final number of elements in which sliver polygons are decomposed increases exponentially when using the *FF* criterion (Figure 5(a)–5(e)), and up to 100 final elements are obtained with values of *FFT* > 0.9. Interestingly, for values of *FFT* ∼ 0.8, a maximum number of elements is reached for two curves (i.e., 18 and 113 elements in Figure 5(c) and 5(e), respectively), which shows that the performance of the *p.form_factor.py* algorithm depends on the polygon shape and the presence of narrowing regions, abrupt discontinuities, etc.

As expected, the number of elements also increases with the *CIT* value, and up to 14 elements are created for values of *CIT* = 0.99. This increase is less continuous than for the *FFT* case, as the number of elements does not change within certain ranges of *CIT*; nonetheless, the convexity of each final element always improves with larger values of *CIT*. In general, non-strictly increasing curves in Figure 5(f)–5(i) are explained by the existence of irregular narrowing zones in the initial polygons, hence, more than the quantity of vertexes in the initial polygon, their location is what explains the obtained results.

Low values of *FF* are typical of land use classes with several small irregular-shaped elements (i.e., roads composed of street, or green areas composed of trees or small shrubs) (Jiao *et al*. 2012). Low values of *CI* are associated with green areas such as hedgerows. From an empirical perspective, we propose values of *FFT* = 0.20–0.40, and *CIT* = 0.80–0.95 to decompose the bad-shaped HRUs. These values allow the generation of well-shaped HRUs composed of small meaningful parts (e.g., road median, backyard, small set of shrubs or trees) without substantially increasing the final number of elements to be eventually used in a hydrological model. Thus, the minimum of the proposed range of threshold corresponds to a value needed to improve the representation of bad-shaped polygons into a new set of well-shaped HRUs, whereas the maximum of the range corresponds to a limit value that avoids increasing the final number of well-shaped HRUs. Figure 6 shows the polygons of Figure 1 decomposed using these minimum and maximum recommended thresholds.

### Application to images in computer graphics

In disciplines such as human vision or computer graphics, the decomposition of non-convex elements is an active research field. We now evaluate the ability of our approach to decompose non-convex graphical computer images typical of these disciplines, in order to demonstrate its applicability to other fields beyond hydrology. Several pseudo-convex decompositions of these computer images were generated using the *p.convexity.py* script and different *CI* thresholds as is shown in Figure 7. First, a baby's footprints with initial values of *CI _{initial}* = 0.83 and 0.86 (Figure 7(a) and 7(b), respectively) were decomposed with threshold values of

*CIT*

*=*0.982 (Figure 7(c)) and

*CIT*

*=*0.985 (Figure 7(d)). These values allow the separation of the toes from the sole of each foot. Then, a yoga position silhouette with an initial value of

*CI*

_{initial}*=*0.66 (Figure 7(e)) was decomposed into a pseudo-regular subset using a value of

*CIT*

*=*0.985 (Figure 7(f)). In this case, the chosen

*CIT*value allows the identification of body parts (i.e., head, arm, torso, waist, etc.). Thus, high values of

*CIT*may be needed to ensure the automatic decomposition of complex non-convex images. The metrics %pieces∣

*CI*

_{i}>

*CIT*(i.e., percentage of pieces for which the

*CI*value exceeds

*CIT*) is not 100% because the dissolution of too-small elements deforms the shape of well-shaped elements in some cases (i.e., %pieces∣

*CI*>

_{i}*CIT*

*=*80%, Figure 7(c); %pieces∣

*CI*>

_{i}*CIT*

*=*50%, Figure 7(d); %pieces∣

*CI*>

_{i}*CIT*

*=*82%, Figure 7(f)).

## HYDROLOGICAL ROUTING ASSESSMENT

### Qualitative hydrological routing assessment for mesh quality improvement

In the case of meshes generated to represent hydro-landscapes, the segmentation adopted affects directly the hydrological connectivity among the elements. This connectivity is given by the flow paths obtained using the overland routing routine *p.olaf.py* (Sanzana *et al.* 2017). This script routes the flow between HRUs and the drainage network following topography towards the lowest neighbor HRU or drainage reach. As an example, Figure 8 presents two catchments, the Riou (Lyon, France) and El Guindo (Santiago, Chile), which were adopted as case studies to analyze the effect of the segmentation over the hydrologic response. This figure shows the elevation contours, the initial and segmented model meshes (Mesh_pre and Mesh_post, respectively), the presence and characteristics of bad-shaped polygons, and the corresponding routing structures calculated by the *p.olaf.py* routine. Note that the initial mesh is segmented using the *CI*, *FF* and maximum area criteria.

In both cases, a visual inspection shows that the hydrological connectivity tends to be more realistic after segmentation, and flow paths crossing rivers or streets are avoided. In both cases, non-convex and sliver elements were segmented without substantially increasing the final number of elements. For the more rural Riou catchment, values of *CIT* = 0.80 and *FFT* = 0.20 were chosen, while for the more urbanized El Guindo catchment, whose streets are better defined, values of *CIT* = 0.95 and *FFT* = 0.40 were adopted. Based on our results, we recommend these pairs of *CIT* and *FFT* values for more natural and urbanized landscapes, respectively. Note these threshold values are higher (i.e., more restrictive) than those presented by Sanzana *et al.* (2013) (*CI**<* 0.75 and *CI**<* 0.88) and Sanzana *et al.* (2017) (*CI**<* 0.75; *FF**<* 0.20).

### Quantitative hydrological assessment for mesh quality improvement

To further illustrate the effects of the decomposition strategy, we simulated the hydrologic response of two catchments represented using both the initial and segmented meshes. Our objective is not to explore in detail the capabilities of the models, but to assess the effects of the segmentation on their outputs. Because land use and soil parameters for the initial and segmented meshes are the same, the effects of the different meshes and their respective hydrologic connectivity on flow routing are well isolated. Although two particular models are used in this assessment, other semi-distributed and fully distributed hydraulic tools can be used with the modeling meshes prepared with the strategy proposed in this work.

We used SWMM (Storm Water Management Model) (Rossman 2009; Gironás *et al.* 2010) to simulate the hydrologic response of the Riou and El Guindo catchments. To compare the effect on the hydrographs' outputs, we simulated synthetic rain events of 5-, 10-, and 25-year return period (T) using the alternating block method. The effect of the segmentation is more notorious when using short events with low return periods. This is shown in Figure 9, which compares for both catchments the resulting hydrographs for a 6 h, 2-year storm. Table 4 compares the resulting peak flows for all the synthetic events.

Riou | El Guindo | |||
---|---|---|---|---|

T (year) | Mesh_pre | Mesh_post | Mesh_pre | Mesh_post |

2 | 110 | 51 | 11 | 10 |

5 | 144 | 69 | 17 | 19 |

25 | 314 | 154 | 59 | 62 |

Riou | El Guindo | |||
---|---|---|---|---|

T (year) | Mesh_pre | Mesh_post | Mesh_pre | Mesh_post |

2 | 110 | 51 | 11 | 10 |

5 | 144 | 69 | 17 | 19 |

25 | 314 | 154 | 59 | 62 |

Interestingly, for the Riou catchment, the peak flows from the most segmented model mesh (Mesh_post) are ∼50% smaller than those simulated using the initial model mesh (Mesh_pre). For the El Guindo catchment, the peak flow is lower in the segmented mesh just for T = 2 years; for the other return periods, the peak flow actually slightly increases. A possible explanation for such a difference in the effect of the mesh segmentation may be the way SWMM infiltrates overland flow in the downstream elements. SWMM is a semi-distributed hydrologic model in which the excess runoff from one element or subcatchment is added homogeneously to the precipitation over the downstream element (Rossman 2009). Hence, the infiltration representation may differ from that of a distributed hydraulic model able to solve the continuity and momentum equations, in which the excess runoff from one element enters the downstream element only through its upstream end. Thus, infiltration is enhanced in SWMM as more subcatchments connected to each other are used to represent the catchment, and is exacerbated when flatter slopes are considered, as is precisely the case with the Riou catchment. Nevertheless, it is important to note that overland routing among many subcatchments is not a common practice in SWMM, as individual subcatchments are typically connected to channelized elements instead. For the sake of comparison, Figure 9 also presents the hydrograph simulated when using the best modeling mesh obtained from the available information (i.e., the reference mesh), which allows the best topographic fidelity with the largest number of units, while avoiding topological problems. More details about the generation of this mesh are provided in Sanzana *et al.* (2017). Interestingly, the hydrographs simulated by this reference mesh and the segmented mesh are very alike, which seems to demonstrate an ultimate infiltration capacity as the mesh becomes finer and finer. Curiously, such behavior is not observed for the Guindo case, for which the reference mesh and the initial mesh produce more similar results. Overall, these results show the impact of the modeling mesh when coupling an infiltration model with a hydrologic overland flow model over the local calculation of the infiltration rate. This issue is also relevant when a 2D hydraulic routing model is used instead (Fernández-Pato *et al.* 2016).

We also used the hydrological model PUMMA (Peri-Urban Model for landscape Management) (Jankowfsky *et al.* 2014) to study the response of the Mercier catchment (Lyon, France) (Figure 10(a)). The model and setup for the initial mesh is described in Sanzana *et al.* (2017) and Fuamba *et al.* (2018), while a second model was built for the segmented mesh. The initial mesh is composed by 1,626 HRUs and 289 UHEs, but only HRUs were segmented. Both models were run for the year 2009, and the resulting spatial distributions of ponding depth were compared. The initial mesh produced artificial and unrealistically large values in some polygons, especially in forested areas in the west (Figure 10(b)), despite the high infiltration capacity in the area (Gonzalez-Sosa *et al.* 2010). Ponding reduces 50% when using the segmented mesh (Figure 10(c)), and polygons with ponding larger than 0.1 m disappear. These improvements are directly related to changes in the number of elements in the mesh. The segmentation not only increases the number of HRUs in 20%, but also all the polyline features (i.e., river segments in 459%; water table interfaces (WTI), in 25%; and water table river interfaces (WTRI), in 47%), as shown in Table 5.

Mesh | River segments | HRUs | WTI | WTRI |
---|---|---|---|---|

Mesh_pre | 128 | 1,626 | 4,763 | 580 |

Mesh_post | 716 | 1,952 | 5,945 | 852 |

Increase (%) | 459% | 20% | 25% | 47% |

Mesh | River segments | HRUs | WTI | WTRI |
---|---|---|---|---|

Mesh_pre | 128 | 1,626 | 4,763 | 580 |

Mesh_post | 716 | 1,952 | 5,945 | 852 |

Increase (%) | 459% | 20% | 25% | 47% |

Note that 12 h of simulation in PUMMA were needed for the initial mesh, while 96 h were used to simulate the post-segmented mesh. Hence, avoiding excessively high threshold values for the shape factors controls the number of segmented HRUs and reduces the computing time used in hydrological modeling.

## CONCLUSION AND FUTURE WORK

This paper presents a flexible divide-and-conquer strategy to create good-quality meshes (composed of simple polygons that fulfill certain geometrical constraints) for distributed hydrological models composed of pseudo-regular elements. Although based on threshold values of the *CI* and form factor (*FF*), the proposed strategy can be generalized to other watershed shape descriptors of relevance in hydrology (e.g., basin shape factor, elongation ratio, or compactness coefficient) or any other shape descriptor commonly used in image processing (i.e., roundness, aspect ratio, elongation, or compactness). The algorithms were applied to small catchments in France and Chile, and further tested with silhouette-based features. We conclude the following:

The number of final elements of the segmented mesh relates directly to the threshold values for the shape factors. This number increases exponentially with the threshold value of

*FF*(*FFT*), and in an irregular manner (i.e., as a step function) with the threshold value of*CI*(*CIT*). Values of*FFT*of 0.2 and 0.4 are recommended for mesh segmentation in undeveloped and highly developed areas, respectively. Conversely, values of*CIT*of 0.8 and 0.95 are recommended for locations with low and high density of green areas, respectively. Very large values of*FFT*or*CIT*can increase significantly the final number of elements in the model mesh, affecting the performance of numerical models using this mesh as a spatial domain.High convexity threshold values (

*CIT*∼ 0.995) allow obtaining meaningful parts when decomposing silhouette-based features. Because it incorporates the form factor, the proposed strategy is more flexible than existing algorithms, which mostly use the convexity to decompose polygons into pseudo-convex components.Overall, mesh segmentation is crucial to avoid bad-shaped elements that affect the realistic representation of hydrological connectivity in distributed hydrological modeling. Moreover, mesh segmentation facilitates the representation of the spatially distributed processes controlling not only the lumped response of the catchment, but the spatial variability of water quantity and fluxes within it.

Studies dedicated to building optimal model meshes (Zundel *et al.* 2002) and testing the performance of modeling tools at small scales (Abily *et al.* 2013) emphasize the importance of mesh refinement due to the time-consuming task involved. The approach here presented addresses this issue as they attempt to improve the construction of irregular model meshes for hydrological modeling. Because the proposed scripts are written in Python using GRASS functions, typical GIS features and formats can be easily utilized in their application. Unfortunately, GRASS functions may increase the running time required to run the scripts. Although not a big issue when processing small urban and peri-urban catchments, this limitation can be relevant when decomposing many complex non-convex features. As an alternative, geometrical libraries can be used to improve the running time of the proposed scripts. Finally, other examples of decomposition of non-convex polygons into ‘approximately convex’ elements have been implemented by Lien & Amato (2005) (ACD algorithm) and Liu *et al.* (2014) (DuDe algorithm). It would be interesting in future to compare these approaches in landscapes features' decomposition, as proposed by Sanzana *et al.* (2017).

## ACKNOWLEDGEMENTS

This work was developed within the framework of Project MAPA (IDRC 107081–001) and Project ECOS-CONICYT C14U02. The work presented was part of the Doctoral thesis funded by the CONICYT PCHA/2014-21140685 and ‘Estadías en el Extranjeros para Tesistas de Doctorado’ VRI-UC grants. Funding from Projects FONDECYT N°1131131 and 1181506, FONDAP 15110020 and IRSTEA-Lyon are also acknowledged. Finally, Jorge Gironás also acknowledges FONDAP 15110017. The assistance in the SWMM model construction from graduate students Alexander Hoch and Tomás Bunster is also appreciated. The Mercier catchment is part of OTHU (Observatoire de Terrain en Hydrologie Urbaine). This work was partially developed within the framework of the Panta Rhei Research Initiative of the International Association of Hydrological Sciences.

## REFERENCES

*) in France*

^{2}*U.S. Geological Survey Techniques and Methods*, book 6, chap. A45, 66 p.