## Abstract

Drinking water utilities and commercial vendors are developing battery-powered autonomous robots for the internal inspection of pipelines. However, these robots require nearby charging stations next to the pipelines of the water distribution networks (WDN). This prompts practical questions about the minimal number of charging stations and robots required. To address the questions, an integer linear programming optimization is formulated, akin to set covering, based on the shortest path of the charging stations to each node along a pipeline. The optimization decisions revolve around designating nodes as charging stations, considering the maximum distance (*δ*_{max}) at which a robot can cover a hard constraint. For optimal placement, two objective formulations are proposed: (i) minimize the total number of stations, representing total cost; and (ii) maximize the total redundancy of the system. The methodology is applied to three WDN topologies (i.e. *Modena, FiveReservoirs, and E-Town*). Results show the influence of topology on the total number of stations, the number of robots, and the redundancy of the charging stations network. A trade-off between *δ*_{max} and total number of stations emphasizes robot battery capacity's significance.

## HIGHLIGHTS

A new method for the selection of locations of optimal charging locations for monitoring robots.

Application of two different formulations for solving the problem, each with its advantages and disadvantages.

It allows utilities to solve business-related questions on the subject.

## INTRODUCTION

Research teams and utilities worldwide are actively developing robots to monitor various types of infrastructure. These robots have diverse applications, including traffic support, agriculture, emergency surveys, land surveying, marine applications, and military surveillance. Within the water industry, some projects focus on developing robots for leakage detection and pipeline condition monitoring. Some of these robots draw inspiration from existing oil and gas pipeline monitoring technology, commonly known as ‘pigging’ (H2O-Techniek 2019). These developments encompass a range of robot types, such as inline robots submerged in the flow stream for later recovery (Wu 2014, 2018; Mittmann 2019), miniature robots (Nguyen *et al.* 2022), and autonomous entities which mimic principles out of biological creatures (Dertien 2014). Such autonomous robots vary in their shape, the number of installed sensors, the telemetry system, Wi-Fi or network connectivity, and control systems. Among the creeping robot types, several research groups have developed their own versions (Wakimoto *et al.* 2003; Trebuňa *et al.* 2016; van Thienen *et al.* 2016; Miro *et al.* 2017; Worley *et al.* 2020). Battery-powered autonomous robots intended for use in Vitens' Water Distribution Networks (WDN) rely on an extensive network of charging stations. This raises questions for utilities considering their deployment, including:

What is the optimal placement of charging stations in the water network to maximize pipe coverage?

What is the minimum number of robots required for affordable and timely condition monitoring?

How many robots are required to schedule effective monitoring?

The allocation of charging stations is analogous to a set covering problem (SCP). An SCP describes that given a universe of connected elements (in this case the WDN) and a collection of sets that cover these elements (charging stations), the goal is to identify the smallest subset of these sets that covers all elements or to minimize the number of charging stations. This problem has practical applications in logistics, resource allocation, and facility location (Johnson 1974) and is known to be NP-hard, attracting significant attention in operations research and computer science (Karp 1972; Hochbaum 1982).

This manuscript presents a robust methodology to answer the questions described earlier, akin to SCP. The focus is on the optimal placement of charging stations for robots within WDN.

## METHODOLOGY

A water distribution network can be conceived as a graph where junctions and sources (tanks and reservoirs) are represented as nodes while links such as pipes, valves, and pumps are represented as edges. Thus, a WDN can be described as a graph , where is the number of edges and is the number of nodes.

Graph theoretical approaches have been successfully applied for optimizing WDN since the 1970s (Chandarshekar & Kesavan 1972; Deo 1974). These approaches have addressed various topics, including partitioning or sectorization (Hajebi *et al.* 2015; Bianchotti *et al.* 2021), energy consumption reduction (Castro-Gama *et al.* 2016) resilience, criticality, redundancy (Yazdani & Jeffrey 2011), pipe diameter design (Zheng *et al.* 2013), optimal sampling (Simone *et al.* 2016) and structure classification (Giustolisi *et al.* 2017).

The topology of a WDN is accessible in the form of the incidence matrix through open-source software such as EPANET (Rossman 2000; Rossman *et al.* 2019). The matrix contains the list of initial nodes () and final nodes () of each edge *p*. In EPANET, the incidence matrix corresponds to the representation of pipes, valves, and pumps of a WDN. However, it is worth noting that the benchmarks used in this study did not include valves or pumps. To obtain the topology data and coordinates of elements of a WDN, one can programmatically extract this information using the EPANET toolkit, or through geographical information systems (GIS) exports depending on the specific case. The authors utilized these tools due to their availability as freeware for model-based optimization applications. It is also relevant that Vitens' advisors use a set of commercial modelling tools as per corporate policy. A new version of this toolkit has been recently released (EPANET 2.2) and is available online (EPA 2020).

### Problem formulation

*et al.*2009). While a variety of approximate solutions exist for this problem (Deo 1974), advancements in computer capabilities allow us to analyse numerous scenarios for real-world conditions and diverse network topologies. Equation (1) presents the weighted version of the SCP (Wolsey 1998):For the purposes of analyses, is the objective function; represents the cost (or weight) of covering location

*i*. Decision variables are binary , indicating whether there is a station at the node or not, respectively. The matrix represents all boundary conditions that are formulated as inequalities, is the value of all possible locations which are present in each inequality and, on the right-hand side, is the required value of the inequality , or the minimum acceptable number of stations to node

*i*. On the other hand, in case some equalities are present in the formulation represents all boundary conditions that are formulated as equalities, is the value of for locations for each equality , and is the required value of the equality constraint.

For each node *i*, it is determined as the set of nodes which are reachable. To do this, a node preselection is performed, extracting the nodes which are closer to each other at a distance lower than the maximum distance (). Here is one-half of the maximum distance which any robot can cover. This is necessary to guarantee that a robot in the worst-case scenario of a long transport main will be able to either travel as far as and come back to the charging station, or travel up to to reach a second charging station. In this regard, different robot configurations can have different autonomous travel distances, and different charging station configurations can lead to different robot needs.

The maximum distance that a robot can travel inside a WDN is directly connected to the battery size. Larger battery sizes often result in a larger maximum distance (), increased robot weight, and greater volume. Conversely, future advancements in battery technology are expected to enhance efficiency, leading to reductions in battery size and associated emissions, though these aspects are not addressed in this article.

### Additional assumptions

In order to determine the optimal locations of the robot charging stations within a WDN certain assumptions must be made to align the problem with real-world conditions.

A WDN contains a limited number of nodes where charging stations can be placed. By setting the optimization problem taking into account only the existing nodes of the topology, it limits the possibility of setting the location of the stations at intermediate locations along the pipes. This means that ‘virtual’ nodes need to be added as feasible locations.

*N*. While it does impact the computational time of the optimization solver negatively, it is necessary to ensure that no pipe is inaccessible for inspection, preventing robots from becoming stranded.

Applying SCP algorithms to real-world problems like charging station location poses additional asset (e.g. such as pipe material, diameter, corrosion, leaching of asbestos, or biofilm) and hydraulic system (e.g. maximum flow, maximum velocity) challenges. A good optimal placement of charging stations will indicate that all nodes (i.e. original or virtual) are reachable. It is not feasible to add these as additional constraints to the algorithm directly. A more practical approach is to add a pre-processing step to the selection of the feasible (real or virtual) nodes/locations of the charging stations along the pipes. It is worth noting that this pre-processing step may result in fragmented subnetworks that are isolated from the rest of the system. While the current manuscript does not delve into this aspect, it is an important consideration during implementation. Ensuring connectivity between different parts of the network is vital for a robot's ability to traverse the WDN.

### Objective formulations

#### Formulation 1 – minimization of total number of stations

*i*. The total cost of a charging station is not known in advance, but it can be conceived as the sum of the fixed costs and specific costs at node

*i*(Equation (3)):

The matrix , represents all boundary conditions that are formulated as inequalities (Equation (4)). This matrix is a sparse square matrix with a rank equal to the number of nodes ( if not interpolated, *N* otherwise). Each row of represents the neighbourhood of node *i*. If the node *j* is close to *i* then is equal to 1, otherwise remains 0 (Equation (4)). No equalities are considered in this SCP formulation.

The variable is the set of all which are present in each inequality and, on the right-hand side, is the required value of the inequality . For example, a valid inequality is that the total number of charging stations near node should be higher or equal than . By default, is set to 1 for all nodes. As a minimum one charging station must be in the neighbourhood of node *i*. The value of can be predetermined based for example on the need to increase the number of stations from where a robot can visit node *i*. This can be needed for special condition monitoring near critical assets. If node *i* can be visited from more than one charging station, then it is said that its redundancy is increased. Redundancy means here the total number of stations *j* which are sufficiently close to reach node *i*.

#### Formulation 2 – maximization of redundancy

In this case, what is maximized is the total redundancy of all nodes , where represents the individual redundancy given by setting a station at location *i*. The redundancy of a node can be interpreted as the sum of all the rows of matrix as previously defined.

In this case, setting the number of stations () imposes equality constraints, satisfying this constraint is part of the optimization process. is estimated with the minimization of Formulation 1. In addition, in order to be able to use the selected optimization solver the objective has been set to negative, making it into a minimization problem.

### Topology constraints

The inline distance between charging stations and each node in the pipes is estimated by taking into account the shortest path along the pipes using Dijkstra (1959) algorithm. This analysis includes all virtual nodes which split pipes into trunks of pipes of a maximum length (). A distance matrix is created of size , and for each node to node , a denomination as close neighbours such that are selected for . The sparseness of the distance matrix really depends on the choice of . Each node (real or virtual) close enough to becomes a charging station, including the node itself.

### Determination of connected components

A subgraph, denoted as , representing the connectivity of the charging stations can be constructed from the distance matrix for a specific . To create this subgraph, we extract the relevant portion of the distance matrix from the main graph *G*. If a robot can travel from station to station , the adjacency in *H* is set to 1, indicating that the distance . If not, it suggests that the robots would become stranded, and further excavation and construction would be necessary to enable extraction. Subsequently, a depth-first search (DFS) algorithm (Tarjan 1972) is employed on the subgraph *H*, to determine the number of connected components within it. This is an important feature as it enables any utility to identify the minimum number of robots required per area or WDN. Additionally, it allows a utility to identify if there are different areas that can be inspected and covered independently by individual robots or by a number of them, hence increasing redundancy.

### Optimization software and other parameters

Calculations are conducted using MATLAB^{®} 2021A. Their Optimization Toolbox™ includes a solver for mixed integer linear programming (MILP) and for integer linear programming (ILP) (MATHWORKS 2021). The optimizer employs various pre-processing methods (Savelsbergh 1994; Andersen & Andersen 1995; Mészáros & Suhl 2003), including the extended simple integer rounding (ZI-round) (Wallace 2009), to prepare a problem. Subsequently, it verifies the feasibility of solutions (Cornuéjols 2008). Once a feasible initial solution is found, the ILP optimization process evolves based on a Branch and Bound (BnB) algorithm (Achterberg *et al.* 2005; Achterberg 2007), or using Gomory cut-plane algorithm (Gilmore & Gomory 1961, 1963; Cornuéjols 2006). The optimization stops when either the time limit is reached (set here at 4 h) or convergence is achieved with a minimum gap (or relative change between iterations) of 0.1%.

### Case studies

** Modena WDN:** Corresponds to a skeletonized version of the WDN of

*Modena*, Italy. This network contains 317 pipes, and 268 nodes, while it is supplied by four sources. The WDN has a total length of pipes of 71.8 km. This WDN has been used for many purposes for the WDN design problem (Bragalli

*et al.*2008, 2011) and the design of monitoring networks (Bragalli

*et al.*2016).

** FiveReservoirs:** This network corresponds to a network of 1,278 pipes, 935 nodes and

*fivereservoirs*. The network has a total length of pipes of 253.7 km. This benchmark network was originally published by Kang & Lansey (2012) and subsequently modified by Zheng & Zecchin (2014).

** E-Town**: The network of

*E-Town*has almost 14,000 pipes and more than 11,000 nodes. The total length of pipes is 873 km and the diameters range from 55.7 to 1,524 mm. An average demand of almost 2.0 m

^{3}/s is supplied from three treatment plants and two pumping stations. The WDN was the subject of an international competition denominated as Battle of Water Networks District Metering Areas (BWNDMA) between the end of 2014 and July 2016. Such a network is reported in several publications (Martínez-Solano

*et al.*2018).

## RESULTS AND DISCUSSION

*Modena*and

*FiveReservoirs*have similar pipe lengths,

*E-Town*has a larger number of short pipes, and as such requires less interpolation.

*Modena*. Figure 5(b) displays the adjacency matrix corresponding to the WDN. If a node is connected to another one a value of one is set (blue), otherwise zero (white) is given. While Figure 5(c) shows the histogram of all such distances. The adjacency matrix corresponds to the matrix with related nodes in the system. Given the distribution of , the maximum distance is 7.7 km, average of 3.2 km. This means that one robot will be able to cover the whole WDN if its is between charging stations and is less than 7.7 km.

### Original WDN and interpolated WDN

*Modena*, this implies an increase of 65% of non-zero (

*nz*) elements in its adjacency, while for a larger WDN such as

*E-Town*the increase of

*nz*is approximately 31% with 100 m and of 7% with 200 m. This is because the individual pipes of

*E-Town*are shorter, on average, than in the case of

*Modena*(Figure 4(c)).

### Optimization runs formulation 1 vs formulation 2

*FiveReservoirs*with = 2,500 m for both formulations (Figure 7, top vs bottom). A comparison is also possible with and without interpolation (Figure 7, left vs right) for each formulation. In this case, the interpolated cases use 100 m. The colour of each node represents the redundancy, or from how many stations is the node reachable. In all cases, is equal to seven. One of the important aspects to consider is the fact that although remain constant, their spatial distribution (nodes) is very variable as a function of the formulation and the decision to perform interpolation (or not).

A comparison of the results of Formulation 1 vs Formulation 2 shows that although the number of stations remains the same, the average redundancy increases. In fact, Formulation 1 obtains a maximum redundancy of four for a couple of nodes without interpolation, while Formulation 2 manages to have some nodes where the maximum redundancy is four. In any case, a prior run of Formulation 1 is needed to find out the optimal to apply on the right-hand side of Equation (2). The average redundancy for Formulation 2 is higher than 1.5 for both with and without interpolation, such a result is not possible with Formulation 1. In addition, the interpolation has an effect reducing for both formulations. The decrease of redundancy for the cases with interpolation in comparison to those without may be related to the positioning of long pipes. Long pipes on the periphery of the network will tend to have a lower redundancy and therefore adding virtual nodes via interpolation to the network will depress the mean redundancy. In this regard, a decision maker may be willing to consider the spatial distribution of charging stations based on redundancy to avoid having stranded robots in the pipelines.

An interesting aspect is the fact that the station located in the southeast for the four solutions presented, in Figure 7, is located on the same node. This indicates that at least one of the possible charging locations is very relevant independent of the formulation or addition of virtual nodes via interpolation. Regarding the two long pipes located in the south and southwest of the WDN, using the formulations without interpolation (left), it is not possible to know whether the pipe can be inspected in its length by a robot. However, after the inclusion of virtual nodes (right), it is verified that any node in those two long pipelines is reachable from at least one charging station. In the case of Formulation 1 interpolated, it is even possible for such pipes to be reached from two stations. This is very important for decision-makers as it portrays that the long transport mains in the WDN can be fully inspected without losing the robot during the inspection.

*H*of connected components among them in the WDN. In principle, each connected component will contain a single robot which will move along the pipelines close to each station at = 5,000 m. Let us examine Figure 7 in the top-right corner. It corresponds to Formulation 1 with interpolation and a = 2,500 m. The figure illustrates that stations numbered 2, 3, and 7 are connected due to a redundancy of three nodes in between, highlighted in purple. An examination of the distance matrix between all nodes should determine whether it is possible to establish connections between stations 2, 3, and 7. Table 1 presents the shortest path distances between nodes corresponding to each charging station in graph

*H*of Figure 7 top-right.

. | 1 . | 2 . | 3 . | 4 . | 5 . | 6 . | 7 . |
---|---|---|---|---|---|---|---|

1 | 0.0 | 3774.3 | 5460.1 | 7206.2 | 9720.1 | 6529.3 | 3613.3 |

2 | 3774.3 | 0.0 | 3654.8 | 4453.2 | 6604.7 | 3234.5 | 4257.7 |

3 | 5460.1 | 3654.8 | 0.0 | 7710.1 | 7262.9 | 3892.7 | 2861.5 |

4 | 7206.2 | 4453.2 | 7710.1 | 0.0 | 4276.8 | 4074.8 | 8335.5 |

5 | 9720.1 | 6604.7 | 7262.9 | 4276.8 | 0.0 | 3370.2 | 8362.2 |

6 | 6529.3 | 3234.5 | 3892.7 | 4074.8 | 3370.2 | 0.0 | 4992.0 |

7 | 3613.3 | 4257.7 | 2861.5 | 8335.5 | 8362.2 | 4992.0 | 0.0 |

. | 1 . | 2 . | 3 . | 4 . | 5 . | 6 . | 7 . |
---|---|---|---|---|---|---|---|

1 | 0.0 | 3774.3 | 5460.1 | 7206.2 | 9720.1 | 6529.3 | 3613.3 |

2 | 3774.3 | 0.0 | 3654.8 | 4453.2 | 6604.7 | 3234.5 | 4257.7 |

3 | 5460.1 | 3654.8 | 0.0 | 7710.1 | 7262.9 | 3892.7 | 2861.5 |

4 | 7206.2 | 4453.2 | 7710.1 | 0.0 | 4276.8 | 4074.8 | 8335.5 |

5 | 9720.1 | 6604.7 | 7262.9 | 4276.8 | 0.0 | 3370.2 | 8362.2 |

6 | 6529.3 | 3234.5 | 3892.7 | 4074.8 | 3370.2 | 0.0 | 4992.0 |

7 | 3613.3 | 4257.7 | 2861.5 | 8335.5 | 8362.2 | 4992.0 | 0.0 |

It is noteworthy that for station 2, the distances to stations 3 and 7 are 3,654.8 and 4,257.7 m, respectively, while stations 3 and 7 are 2,861.5 m apart. The robot should be capable of reaching any location within a radius of 2*, for a round trip. This observation indicates that the required number of robots is, in fact, only 1, as illustrated in the new Figure 8 (top-right).

Formulation 1 without interpolation requires a total of one robot the same as the interpolated case. In the case of Formulation 2 without interpolation, only one robot is required. By comparison, Formulation 2 with interpolation requires also one robot. If the deciding factor for decision makers of a utility is the robot cost for a fixed number of stations, there is an indication that neither Formulation 1 nor 2 produces a smaller amount of robots. In the end, it all depends on the number of connected components of the subgraph *H* as a function of .

### Time of computation

In general, for both problem formulations and the two small case studies (*Modena* and *FiveReservoirs*), a maximum time of computation of 4 h is set. The solutions presented in Figures 7 and 8 typically required only a few minutes to converge. In the case of *E-Town*, some optimization runs reached a minimum gap of 0.1% before the time limit was reached. It is worth noting that only tests conducted on larger WDN (with *nn* > 30,000, not presented here) resulted in some unfeasible solutions.

### Sensitivity to *δ*_{max}

In order to estimate the total number of charging stations as a function of , a set of runs is performed. The was kept constant for the same optimization run. Different runs are set for ranging between 0.5 and 4.0 km. This gives robots autonomy for a total of 1.0–8.0 km, depending on battery settings and utility needs.

*FiveReservoirs*seems to be very similar to the case of

*E-Town*, while the case of

*Modena*always requires additional charging stations for the same equivalent pipe length and .

The results of show (Figure 9(b)) that independent of , the maximum which can be achieved is a redundancy of ∼2. This means that even in the best-case scenario most nodes can be reached from two charging stations, while for most of the scenarios, with variable and independent of the topology of the WDN, all that can be achieved is a redundancy of 2.0. In general, the average redundancy also shows that the smaller , the lower the chances of having additional charging stations nearby some nodes. In the limit case if , then each node will have its own station, so . On the other hand if , then all nodes can be reached from a central charging station, making also .

In fact, if the robots are not to be retrieved from the pipelines, then it is safer to have robots travelling through their own isolated area of the WDN. This effect is not perceived in Figure 9(c), where the number of robots is presented (also in Figure 8 for = 2,500 m, all are equal to 1), but for some configurations, this is the case. Once again there is a trade-off with respect to , and similar to the case of the trend of *FiveReservoirs* and *E-Town* is very similar, *Modena* will require an additional amount of robots deployed for the same 1,000 km of equivalent pipe lengths.

## CONCLUSIONS

This article presents an efficient algorithm for the estimation of several charging stations and a number of autonomous robots required within a WDN. The two formulations cover different aspects of interest for decision-makers and work planners within utilities. The algorithms have been tested with three different topologies, providing answers to the three proposed questions and showcasing the applicability of traditional optimization methods to contemporary and future WDN-related subjects. Recently two new approaches were explored corresponding to a Greedy algorithm for larger WDN, and another one related to the minimization of the number of connected components (or minimization of ).

As Wu (2014) suggests, there are numerous methodologies available for robot planning and scheduling, treating robots either as individual agents or as part of a swarm robot control system. Further research should delve into these methodologies to advance this research field.

## ACKNOWLEDGEMENT

The authors would like to thank Rakesh Chotkan, Ivo de Liefde and Maarten Schellens who were Data Analysts of *Evides Waterbedrijf* (Water Company Southwest Netherlands) for providing input for the early development of Formulation 1 presented in this article.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Constraint Integer Programming*

*Smart Water Network Management with in-Pipe Leak Detection Robots*

*Master's Thesis*

_{2}O Water Matters

*Design and Fabrication of a Maneuverable Robot for in-Pipe Leak Detection*

*Low-cost Soft Sensors and Robots for Leak Detection in Operating Water Pipes*