## Abstract

Recently, urban waterlogging prevention and the treatment of black–odorous rivers have become matters of public concern, and the upgrading of drainage systems and the development of river runoff pollution control projects have accelerated. The use of deep tunnels to upgrade old drainage systems and to meet pollution control design requirements has complicated the operation control of drainage systems. Traditional operation control relies mainly on human experience or model simulation. This study applies machine learning (ML) to the operation control of drainage systems and explores whether operation suggestions for the facilities in such a system can be given in real time by relying only on real-time data, thereby avoiding the complex model simulation process. Herein, five drainage systems were used as examples: the initial water level of a pipeline, the water level and flow at key points, the water level of the pump station forebay, and the water level at the most unfavorable point were selected as relevant variables, and four ML discrimination methods were used to analyze the weir-lowering operation of a deep tunnel. The average error rate of the linear discrimination method was <10%, indicating satisfactory performance. This study provides insights for improving the operation of complex drainage systems.

## HIGHLIGHTS

ML can be used to address the switching problem in a deep tunnel, which is important for the tunnel's functionality.

This study provides insight into improving the operation of complex drainage systems by using ML to give real-time operation suggestions for weirs based on real-time data only.

ML explains physical phenomena from the perspective of probability distribution, providing a new way to solve switching problems.


## INTRODUCTION

For efficient drainage, a deep tunnel (deeply buried storage and drainage tunnel) is buried underground at a depth of >20 m. Deep tunnels usually have a large storage capacity for rainwater or combined sewage storage and transportation. Recently, extensive research has been conducted on the application and development trends of deep drainage tunnel technology, optimization of deep tunnel system control, and analysis of pollution control effects (Wang *et al.* 2016; Tan *et al.* 2018; Liao *et al.* 2019; Liu *et al.* 2019; Wei *et al.* 2019).

The construction of deep tunnel projects has two major objectives (Figure 1). The first objective is flood prevention, i.e., improving the safety of the drainage systems through standard upgrading and waterlogging prevention. The other objective is pollution control, i.e., reducing initial rainwater pollution and combined sewer overflow (CSO) emissions. Because the deep tunnel is embedded in a new drainage system that comprises both the deep tunnel system and the original drainage system, the two must work together to achieve these engineering objectives. To raise the standard of the original rainwater system, the deep tunnel adds inflow points and changes the hydraulic grade line of the original system, thereby reducing the quantity of water that exceeds the system's design standard.

In deep tunnel projects, flood prevention and pollution control are both unified and contradictory. They are unified because the two engineering objectives must be realized in the same project and coordinated with each other. They are contradictory because the requirements and operating modes of the two objectives do not match completely and, in some cases, conflict. For example, under certain conditions, to achieve the pollution control objective of keeping a specified depth of initial rainfall (10 mm) out of the river, the municipal pump is expected to start only after the first 10 mm of rain has entered the deep tunnel system. However, to improve the system's drainage capacity, the storage capacity of the deep tunnel must be reserved for peak shaving, i.e., the deep tunnel system should not be activated until the rain peak arrives. In essence, both objectives are realized through the deep tunnel's storage space, and the limited storage space is the main cause of the contradiction.

A deep tunnel is equipped with weirs at the new inflow points, so that the runoff from a small rainfall event can be lifted to the sewage pipe network by the existing system's interception pump. In this way, the deep tunnel is bypassed and the cost of power consumption is reduced. To control the inflow of a deep tunnel, an adjustable weir with a flexible control form is used.

The adjustment of a weir is critical for realizing the functionality of a deep tunnel. The model's calculation results show that lowering the weir too early or too late may impair this functionality. On the one hand, if the weir is lowered too early, the deep tunnel system may fill up and close prematurely. The deep tunnel system will then be unable to participate in the peak-clipping process, resulting in water accumulation in the system. Furthermore, discharging rainwater into the deep tunnel before peak runoff may result in the incomplete startup of municipal pumps and inadequate use of the shallow system's drainage capacity. On the other hand, if the weir is lowered too late, the inflow weir may be unable to contribute to reducing the rain peak: if the shallow pipe network is already running at maximum load when the rain peak arrives and the inflow weir is not lowered in time, the insufficient flow over the weir will lead to water accumulation in the system. Therefore, it is critical to lower the weir at the appropriate time to increase the flow and peak-clipping capacities. Hence, this study investigates how to establish a relationship between the data obtained from the drainage system's monitoring and the time of lowering the weir.

Machine learning (ML), as a type of artificial intelligence technology, can predict the future based on the large amount of collected data (Zalavadia *et al.* 2021). The ML technology mainly uses algorithms to analyze data and make inferences or predictions based on learning (Bernardelli *et al.* 2020). Given the large amount of continuously updated effective data generated during deep tunnel research and actual operation, ML can analyze the data to determine the time of the weir lowering. Because lowering the weir in the deep tunnel is a simple switching problem (subject to supervised learning) (Ki *et al.* 2018), the data (training data) in the database has a clear judgment result. Based on the aforementioned conditions, the established prediction model is continuously adjusted by comparing the ML prediction results with the actual training data results, to achieve a higher judgment accuracy rate (Fleuren *et al.* 2020).

ML has been widely used in finance, transportation, medicine, and other fields. For example, ML has been used in the prevention, management, and monitoring of new coronaviruses (Rodríguez-Tomàs *et al.* 2021). By analyzing large amounts of data (including medical information, human behavior patterns, and environmental conditions), ML can assist judgment and decision-making. The application of ML technology in drainage systems is now common; in fact, numerous studies have addressed CSO control using ML (Hong *et al.* 2017; Gudaparthi *et al.* 2020). In this study, five drainage systems in Shanghai, China, were used as examples to obtain a training database from each system for ML (Figure 2). Then, the time at which a deep tunnel should lower the weir was determined using different discriminant analysis algorithms. Thus, when a discrimination algorithm identifies the data sequence obtained from the monitoring center as true, a command is sent to the inflow weir to perform the weir-lowering operation, so that the deep tunnel can contribute to raising the standard of the rainwater system.

The features of the five drainage systems are shown in Table 1.

| | System A | System B | System C | System D | System E |
|---|---|---|---|---|---|
| Type of system | Established diversion system | Established diversion system | Diversion system planned to be built | Established diversion system | Established diversion system |
| Service area (km²) | 3.50 | 2.93 | 3.28 | 2.90 | 1.30 |
| Rainstorm return period (year) | 1 | 1 | 3 | 1 | 1 |
| Integrated runoff coefficient (the ratio of surface runoff to rainfall for a certain catchment area) | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 |
| Main trunk pipes of system | The system had two main trunk pipes, and the rainwater in the two trunk pipes merged into a 3500 × 2400 rainwater tank culvert. | The main pipe diameter was DN1800–DN2700. | The main pipe diameter was DN800–DN3000. | Three main pipes | Three main pipes |


## MATERIALS AND METHODS

### Selection of control factors and control methods

#### Selection of control factors

This study predicts the working conditions based on the characteristic changes of relevant variables at key points to guide the control and scheduling of related facilities. The key monitoring points for the relevant variables are as follows: the initial water level, the inflow points of the secondary and tertiary pipelines (simultaneous monitoring of water level and flow), the pumping station forebay (located at the downstream end of the system, where the system's water level can be monitored), and the most unfavorable point (where the water level is monitored). Water is most likely to accumulate in the upper and middle parts of the system, where the terrain is low. This study investigated the relationship between real-time data and the weir-lowering operation based on the data for the above-mentioned factors, which were used for training the ML model.

#### Control method

Control objective: Under the premise of ensuring the safety of flood control (five-year rainfall without ponding), pollution control should be achieved as much as possible (10-mm initial rainfall into a deep tunnel).

Control elements: adjustable weir and municipal pump.

1. After the real-time data were analyzed and identified as ‘true,’ the adjustable weir would be lowered to the bottom.
2. Water accumulation at an unfavorable point (main pipe) triggered the pump immediately.
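The two control rules can be expressed as a small decision function. The following is a minimal Python sketch; the names `Commands`, `control_step`, `classified_true`, and `ponding_at_unfavorable_point` are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Commands:
    """Commands sent to the two control elements (hypothetical structure)."""
    lower_weir: bool   # lower the adjustable weir to the bottom
    start_pump: bool   # start the municipal pump


def control_step(classified_true: bool, ponding_at_unfavorable_point: bool) -> Commands:
    """Map the two control rules to commands for one monitoring time step."""
    return Commands(
        lower_weir=classified_true,               # rule 1: data identified as 'true'
        start_pump=ponding_at_unfavorable_point,  # rule 2: water accumulation detected
    )


print(control_step(classified_true=True, ponding_at_unfavorable_point=False))
```

In this sketch the two rules are kept independent, so the pump can be triggered by water accumulation regardless of the classifier output.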

### Selection of discriminant analysis

To achieve the control objectives, we determined the operation of relevant variables based on discriminant analysis. Discriminant analysis is a method for classifying samples of unknown categories. After classifying the research objects, the discriminant formula and criterion were established on the basis of the extracted samples; subsequently, the categories of the unknown samples were determined.

Discriminant analysis is useful for various applications. For example, in archaeology (Kovarovic *et al.* 2011), the age of a tomb, its identity, and the sex of the owner are identified using unearthed objects. In medicine, the type of disease is determined by analyzing a patient's clinical symptoms and laboratory results (Stühler *et al.* 2011). In the field of pattern recognition, the analysis is used for text recognition, speech recognition, fingerprint recognition, etc.

In this study, four methods were selected to analyze the weir-lowering operation of a deep tunnel: the linear, diagonal linear (linear discriminant with a diagonal covariance matrix), and quadratic discriminants, which belong to the distance discriminant methods, and the naive Bayes discriminant, which belongs to the Bayesian discriminant methods. The calculation principles of the distance and Bayesian discriminant methods are as follows.

1. Distance discriminant method (Mahalanobis distance): A sample is assigned to the population to which its Mahalanobis distance, computed from the population's mean vector and covariance matrix, is the shortest, whereas its Mahalanobis distance to the other populations should be large. For a sample $x$ and a population $G_i$ with mean vector $\mu_i$ and covariance matrix $\Sigma_i$, the squared Mahalanobis distance is

   $$d^2(x, G_i) = (x - \mu_i)^{\mathrm{T}} \Sigma_i^{-1} (x - \mu_i).$$

2. Bayesian discriminant method: People's existing cognition of the research object may affect the result of the judgment; however, the distance discriminant method does not consider this cognition. First, the Bayesian discriminant assumes a prior probability to describe the existing cognition. Then, the sample is used to correct the prior probability to obtain the posterior probability. Finally, the decision is made on the basis of the posterior probability. The calculation principle is as follows:

   Let $G_1, G_2, \ldots, G_k$ be $k$ $p$-dimensional populations with probability density functions $f_1(x), f_2(x), \ldots, f_k(x)$, respectively. Assume that the prior probability of sample $x$ coming from population $G_i$ is $p_i$ ($i = 1, 2, \ldots, k$), so that $\sum_{i=1}^{k} p_i = 1$. Based on Bayes' theorem, the posterior probability of sample $x$ coming from population $G_i$ is

   $$P(G_i \mid x) = \frac{p_i f_i(x)}{\sum_{j=1}^{k} p_j f_j(x)}.$$

   $R_i$ is used to represent the set of all samples that may be classified as $G_i$ ($i = 1, 2, \ldots, k$) according to a certain criterion. Simultaneously, $c(j \mid i)$ ($i, j = 1, 2, \ldots, k$) is used to represent the cost of misclassifying a sample $x$ from $G_i$ as coming from $G_j$, with $c(i \mid i) = 0$. The conditional probability of misjudging a sample $x$ from $G_i$ as coming from $G_j$ is

   $$P(j \mid i) = \int_{R_j} f_i(x)\,\mathrm{d}x.$$

   A sample is classified as $G_i$ if the expected cost of misclassification (ECM) incurred by assigning it to $G_i$ is smaller than that incurred by assigning it to any other population.
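To make the two rules concrete, the following Python sketch evaluates the squared Mahalanobis distance and the Bayesian posterior probability for two-dimensional data. It assumes Gaussian population densities, which the text does not specify, and the means, covariance matrices, and prior probabilities are made-up illustrative values. All function names are hypothetical.

```python
import math


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # (x - mean)^T cov^{-1} (x - mean), with the 2x2 inverse written out explicitly
    return (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det


def gaussian_density(x, mean, cov):
    """Bivariate normal density (an assumed form for the population density)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return math.exp(-0.5 * mahalanobis_sq(x, mean, cov)) / (2.0 * math.pi * math.sqrt(det))


def posteriors(x, populations):
    """Posterior probability of each population; populations = [(prior, mean, cov), ...]."""
    weighted = [p * gaussian_density(x, m, s) for p, m, s in populations]
    total = sum(weighted)
    return [w / total for w in weighted]


identity = ((1.0, 0.0), (0.0, 1.0))
populations = [
    (1.0 / 3.0, (0.0, 0.0), identity),  # first population, hypothetical
    (2.0 / 3.0, (2.0, 0.0), identity),  # second population, hypothetical
]
x = (1.0, 0.0)  # point halfway between the two means

distances = [mahalanobis_sq(x, m, s) for _, m, s in populations]
post = posteriors(x, populations)
print(distances)  # both distances are 1.0, so the distance rule cannot separate them
print(post)       # the posteriors equal the priors, about 1/3 and 2/3
```

In this example the distance rule is indifferent between the two populations, whereas the Bayesian rule favors the population with the larger prior probability, which illustrates the difference between the two methods.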

### Model selection

#### Selection of mathematical model

Urban drainage network models can be divided into three categories: hydrological, hydraulic, and comprehensive models. The hydrological model mainly adopts a black or gray box model to simulate the influence of rainfall on runoff and confluence (Susanna *et al.* 2016). The hydraulic model mainly adopts the microscopic physical laws, such as continuity and momentum equations, to simulate the flow of rainwater and sewage in the slope and pipe networks, especially changes in the values of hydraulic elements such as flow velocity and volumetric flow. The comprehensive model is a combination of the hydrological and hydraulic models as well as a comprehensive application that includes the simulation of discharge and transmission in rainwater and sewage. The current model of urban drainage systems mainly adopts the comprehensive model, and some modules in the comprehensive model belong to the hydrological or hydraulic models. For example, the RUNOFF module in storm water management model (SWMM) belongs to the hydrological model and the TRANSPORT and EXTRAN modules belong to the hydraulic model.

In terms of model origin and development, the United States Environmental Protection Agency proposed SWMM and the Storage, Treatment, Overflow, Runoff Model (STORM) in the early 1970s, and both were continuously updated thereafter. Stormwater hydraulics and quality models such as the Distributed Routing Rainfall-Runoff Model-Quality (DR3M-QUAL), Hydrologic Simulation Program-Fortran, Hydro-works, Wallingford, and Model of Urban Sewers (MOUSE) have emerged internationally since then. Later, Hydro-works evolved into the InfoWorks model: the InfoWorks CS series was launched first, followed by the InfoWorks ICM series in 2011.

Numerous studies have been conducted worldwide on urban drainage network models. The most widely used comprehensive models are SWMM, InfoWorks, and MOUSE (belonging to the MIKE series software). The comparison of the three models is presented in Table 2.

| Factor | SWMM | InfoWorks | MOUSE |
|---|---|---|---|
| Software type | Comprehensive software | Comprehensive software | Comprehensive software |
| Simulation method | Single and continuous event simulation | Single and continuous event simulation | Single and continuous event simulation |
| Water quantity simulation | Yes | Yes | Yes |
| Water quality simulation | Yes | Yes | Yes |
| Inflow mode | Node inflow | Node inflow | Nodal and lateral inflow |
| Rainfall-runoff module | Three types of runoff modules and one type of confluence module | 13 types of runoff modules and nine types of confluence modules | Five types of runoff generation and confluence modules |
| Data interface | Connect with pictures | Connect with AutoCAD and GIS | Connect with AutoCAD and GIS |
| Modalities of property rights | Free | Paid | Paid |
| Software maturity | Sometimes secondary development is needed | Relatively mature | Relatively mature |
| User self-development | Yes | No | No |


SWMM, InfoWorks, and MOUSE are relatively complete in terms of functionality. They can simulate not only the quantity of rainwater but also the water quality of rainwater runoff and drainage networks. As paid software, InfoWorks and MOUSE are more integrated and mature, whereas SWMM may require some secondary development. Furthermore, InfoWorks provides the most choices of runoff generation and confluence modules (Zalavadia & Gildin 2021), which may make it more applicable to different cities and regions. Therefore, InfoWorks was selected for the simulation in this study.

#### Selection of rainfall-runoff model

In this study, three commonly used runoff models were selected for comparison: the integrated runoff coefficient method, fixed runoff coefficient method, and Horton method. Their overview and application scope are as follows:

1. Integrated runoff coefficient method (proportional loss model): It can directly define the proportion of rainfall entering the system, i.e., net rainfall is a fixed proportion of rainfall intensity. Instead of subdividing different land-use types, the entire catchment area adopts a fixed proportion.
2. Fixed runoff coefficient method: This model defines a fixed percentage of net rainfall, which becomes runoff. Different coefficients can be used for different catchment areas.
3. Horton method: This method considers the soil's infiltration capacity and its time-variation (Yang *et al.* 2020).
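For reference, the Horton method is usually written as an infiltration capacity that decays exponentially from an initial rate f0 to a steady rate fc with attenuation coefficient k, i.e., f(t) = fc + (f0 − fc)·exp(−k·t). The Python sketch below evaluates this curve alongside the proportional loss model. The default values are the benchmark values listed in Table 4 and the integrated runoff coefficient listed in Table 1; the function names are hypothetical, and the rainfall excess is simplified to rainfall intensity minus infiltration capacity, which ignores initial losses.

```python
import math


def horton_capacity(t_h, f0=70.0, fc=2.5, k=2.0):
    """Horton infiltration capacity (mm/h) t_h hours after the start of rainfall."""
    return fc + (f0 - fc) * math.exp(-k * t_h)


def horton_excess(intensity_mm_h, t_h):
    """Simplified rainfall excess (mm/h): intensity above the infiltration capacity."""
    return max(intensity_mm_h - horton_capacity(t_h), 0.0)


def proportional_net_rain(intensity_mm_h, coefficient=0.6):
    """Proportional loss model: net rainfall is a fixed share of rainfall intensity."""
    return coefficient * intensity_mm_h


# Capacity and excess for a hypothetical constant rainfall of 30 mm/h
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, round(horton_capacity(t), 2), round(horton_excess(30.0, t), 2))
print(proportional_net_rain(30.0))
```

The contrast is that the proportional loss model yields the same runoff share throughout an event, whereas the Horton curve produces little runoff from permeable surfaces early in the event and more as the soil approaches its steady infiltration rate.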

In this study, two commonly used confluence (concentration) models were selected for further comparison: the Wallingford model and SWMM. Their overview and scope of application are as follows:

1. Wallingford model: Its storage-routing model is based on a dual quasilinear reservoir model. For each surface type, two reservoirs are used in series, with each reservoir having an equivalent storage–output relationship.
2. SWMM: Flow is routed using a single nonlinear reservoir, whose routing coefficient depends on the surface roughness, surface area, ground slope, and catchment width.
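A single nonlinear reservoir of this kind is commonly written as a Manning-type outflow relation combined with a water balance for the ponded depth. The Python sketch below uses an explicit time step for illustration only. The catchment area, width, slope, and rainfall series are hypothetical values, and only the roughness of 0.0175 is taken from Table 4. The function names are hypothetical, and the numerical scheme is a simplification rather than the one used by SWMM or InfoWorks.

```python
def reservoir_outflow(depth_m, width_m, slope, roughness, storage_m=0.0):
    """Manning-type outflow (m3/s) of a nonlinear reservoir, SI units."""
    head = max(depth_m - storage_m, 0.0)  # depth above depression storage
    return (width_m / roughness) * head ** (5.0 / 3.0) * slope ** 0.5


def route(rain_mm_h, area_m2, width_m, slope, roughness, dt_s=60.0):
    """Route a rainfall series through the reservoir with explicit Euler steps."""
    depth = 0.0
    flows = []
    for rain in rain_mm_h:
        q = reservoir_outflow(depth, width_m, slope, roughness)
        flows.append(q)
        inflow = rain / 1000.0 / 3600.0         # mm/h -> m/s
        depth += (inflow - q / area_m2) * dt_s  # water balance on the ponded depth
        depth = max(depth, 0.0)
    return flows


rain = [0.0, 20.0, 40.0, 20.0, 0.0, 0.0]  # hypothetical rainfall (mm/h), one value per minute
flows = route(rain, area_m2=10_000.0, width_m=100.0, slope=0.005, roughness=0.0175)
print([round(q, 4) for q in flows])
```

The sketch shows how all four quantities named above enter the routing: roughness, width, and slope set the outflow for a given depth, and the surface area converts that outflow back into a change in depth.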

First, the three runoff generation models and two confluence models described above were combined to create six rainfall-runoff models; the combinations are shown in Table 3. Then, taking system A as the research object, two typical rainfall processes in June 2015 and August 2015 were selected to compare the consistency between the simulation results (simulated water level curve of the pumping station forebay) and the actual data (measured water level curve of the pumping station forebay). Finally, the optimal combination of the runoff generation and confluence models for the study area was selected.

| No. | Runoff generation model | Confluence model |
|---|---|---|
| 1 | Integrated runoff coefficient method | SWMM |
| 2 | Integrated runoff coefficient method | Wallingford |
| 3 | Fixed runoff coefficient method | SWMM |
| 4 | Fixed runoff coefficient method | Wallingford |
| 5 | Horton method | SWMM |
| 6 | Horton method | Wallingford |


In this study, the parameters of runoff generation and confluence models were benchmark values applicable to the Shanghai area (Table 4), which were selected through normative and literature review (ASCE 1992; McCuen 1996; Rossman 2015; Rossman & Huber 2016).

| Object | Type | Reasonable range | Benchmark |
|---|---|---|---|
| Runoff coefficient | Composite | Specified value | Specified value |
| | Permeable area | 0.85–0.95 | 0.9 |
| | Impervious area | 0.1–0.2 | 0.15 |
| Confluence parameters of SWMM (Manning roughness coefficient) | Composite | 0.03–0.10 | Optimal value by calibration in a reasonable range |
| | Impervious area | 0.01–0.32 | 0.0175 |
| | Permeable area | 0.1–0.3 | 0.2 |
| Confluence parameters of Wallingford | Composite | 5–8 | Optimal value by calibration in a reasonable range |
| | Permeable area | 5–7 | 5 |
| | Impervious area | About 10 | 10 |
| Parameters of Horton model | Initial permeability (mm/h) | 67–120 | 70 |
| | Steady permeability (mm/h) | 0.6–25.4 | 2.5 |
| | Attenuation coefficient (h⁻¹) | About 2 | 2 |
| Initial loss value (mm) | Permeable area | 6–10 | 8 |
| | Impervious area | 0.5–2.5 | 1.5 |


### Input and output

First, we assume that the weir operation is closely related to the initial water level, the water level and flow at the inflow points of the secondary and tertiary pipelines, the water level of the pump station forebay, and the water level at the most unfavorable point. Second, we define the recorded operation categories as 0 and 1. In operation category 1, the monitoring data in the same record indicate that the weir-lowering operation should be performed. By contrast, in operation category 0, the monitoring data in the same record indicate that the weir-lowering operation should not be performed. Then, we record six sets of observations with operation category 1 and twelve sets with operation category 0. Subsequently, we record one further set of data whose operation category is unknown. Finally, we establish a sample matrix (including all the above data), a training matrix (including all the data except the unknown record), and a weir-lowering operation matrix containing the known operation categories (group 1).

Based on the training matrix and group 1, we use different discriminant methods to obtain group 2 of the sample matrix. The difference between group 1 and group 2 is defined as the misjudgment ratio (group 1 will be exactly equal to group 2 only if the sample completely conforms to the overall distribution).

The calculation procedure is shown in Figure 3.
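The procedure can be sketched in Python as follows. The sketch implements only one of the four discriminants, a linear discriminant with a pooled diagonal covariance matrix, written from scratch so that it has no dependencies. It is an illustrative assumption and not necessarily the software or exact algorithm used in this study. The data are the system A records listed in Table 7, and the names `fit`, `predict`, `group1`, and `group2` are hypothetical.

```python
import math

# Table 7, system A: operation category followed by the seven monitored variables
RECORDS = [
    (1, 1.3, 2.06, 1.91, 2.15, 2.17, 6.04, 4.32),
    (1, 1.3, 2.10, 1.93, 2.15, 2.23, 6.45, 4.52),
    (1, 1.3, 2.09, 1.92, 2.08, 2.24, 6.34, 4.46),
    (1, 1.3, 1.94, 1.84, 1.87, 2.08, 4.60, 3.56),
    (1, 0.25, 1.47, 1.15, 1.63, 2.02, 12.25, 7.81),
    (1, 0.25, 1.79, 1.30, 2.23, 2.30, 17.45, 9.87),
    (0, 1.3, 2.25, 2.02, 2.80, 2.32, 8.36, 5.49),
    (0, 0.25, 1.68, 1.24, 2.79, 1.99, 15.58, 8.99),
    (0, 0.25, 1.35, 1.06, 1.77, 1.50, 10.48, 6.65),
    (0, -4.32, -0.40, 0.03, 1.80, 0.80, 24.42, 7.62),
    (0, -4.32, -0.61, -0.11, 1.69, 0.40, 20.54, 5.99),
    (0, -4.32, -0.75, -0.21, 1.62, 0.08, 17.97, 4.76),
    (0, -4.32, -0.99, -0.44, 1.52, -0.37, 14.08, 2.51),
    (0, 1.3, 2.25, 2.05, 2.80, 2.32, 8.36, 5.49),
    (0, 1.3, 2.34, 2.07, 3.07, 2.46, 9.56, 6.13),
    (0, 1.3, 1.98, 1.86, 2.16, 2.00, 5.11, 3.79),
    (0, 1.3, 1.79, 1.73, 1.85, 1.79, 3.08, 2.53),
    (0, 1.3, 1.47, 1.47, 1.48, 1.47, 0.64, 0.62),
]
UNKNOWN = (0.25, 1.08, 0.88, 1.42, 1.15, 6.93, 4.58)  # record 19, category unknown


def fit(X, y):
    """Estimate class means, priors, and pooled per-variable variances."""
    classes = sorted(set(y))
    n, p = len(X), len(X[0])
    means, priors, pooled = {}, {}, [0.0] * p
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / n
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
        for row in rows:
            for j in range(p):
                pooled[j] += (row[j] - means[c][j]) ** 2
    variances = [s / (n - len(classes)) for s in pooled]
    return classes, means, priors, variances


def predict(model, x):
    """Assign x to the class with the highest linear discriminant score."""
    classes, means, priors, variances = model

    def score(c):
        d2 = sum((xj - mj) ** 2 / vj for xj, mj, vj in zip(x, means[c], variances))
        return math.log(priors[c]) - 0.5 * d2

    return max(classes, key=score)


training = [list(r[1:]) for r in RECORDS]       # training matrix
group1 = [r[0] for r in RECORDS]                # recorded operation categories
model = fit(training, group1)
group2 = [predict(model, x) for x in training]  # categories assigned by the discriminant
misjudgment_ratio = sum(a != b for a, b in zip(group1, group2)) / len(group1)
print("misjudgment ratio:", misjudgment_ratio)
print("suggested category for record 19:", predict(model, UNKNOWN))
```

Because the ratio is computed on the same records used for training, it measures how well the discriminant reproduces the known categories rather than its performance on unseen data.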

## RESULTS AND DISCUSSION

### Model simulation

Taking system A as an example, when the runoff generation model used the integrated runoff coefficient method, the curves obtained using the different confluence models (SWMM and the Wallingford model) are shown in Figures 4 and 5, respectively. The shape of the simulated water level curve of the pump station forebay is closer to that of the measured process line when the integrated runoff coefficient method is combined with SWMM as the confluence model.

When the runoff model adopted the fixed runoff coefficient method, the curves obtained from different confluence models are as shown in Figures 6 and 7.

Figures 6 and 7 show that, when the fixed runoff coefficient method was employed as the runoff generation model, the water level process lines obtained with both the SWMM and Wallingford confluence models are similar to the process lines measured in June and August 2015. However, the process line obtained with the Wallingford model shows no forebay water level peak corresponding to the measured rainfall peak at around 9 a.m. on August 24, 2015, whereas the SWMM result does. Although the peak value generated with SWMM is higher than the actual water level, this overestimate errs on the side of engineering safety.

Therefore, the SWMM is more favorable with respect to engineering safety when the runoff generation model adopts the fixed runoff coefficient method.

The curves obtained using different confluence models when the runoff generation model adopts the Horton method (wherein permeable area can be obtained using the Horton method and impervious area can be obtained using the fixed runoff coefficient method) are as shown in Figures 8 and 9.

Figures 8 and 9 show that, when the runoff generation model adopts the Horton method, the shapes of the process lines drawn by the two confluence models are similar to the rainfall process in June 2015. However, the peak water level corresponding to the peak rainfall is reproduced more completely in the curve obtained with SWMM, which is conducive to engineering safety. Furthermore, the process line obtained with the Wallingford model shows no forebay water level peak corresponding to the measured rainfall peak at around 9 a.m. on August 23, 2015, whereas the SWMM result does. As in the preceding analysis, although the peak value generated with SWMM is higher than the actual water level, this overestimate errs on the side of engineering safety.

Therefore, when the runoff generation model adopts the Horton method, it is more beneficial to choose SWMM as the confluence model with respect to engineering safety.

Figures 10 and 11 show the simulation results of all the rainfall runoff models for the two periods of rainfall in June and August 2015, respectively. Considering the shape similarity and engineering safety, the Horton method and SWMM were recommended for the runoff generation and confluence models, respectively. Additionally, in this study, the relative error between the simulated and actual values of each peak water level was estimated to evaluate the degree of conformity between the model and actual processes (Table 5). By observing Table 5, we can draw similar conclusions as Figures 10 and 11.

| Runoff generation model | Confluence model | Relative error, rainfall conditions in June 2015 | Relative error, rainfall conditions in August 2015 |
|---|---|---|---|
| Integrated runoff coefficient method | SWMM | −37% | −27% |
| Integrated runoff coefficient method | Wallingford | 22% | 4% |
| Fixed runoff coefficient method | SWMM | 16% | 6% |
| Fixed runoff coefficient method | Wallingford | −34% | −16% |
| Horton method | SWMM | 4% | 4% |
| Horton method | Wallingford | −38% | −36% |


Similarly, by evaluating the runoff generation and confluence models of other systems, this study also obtained the same conclusions.
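The relative error reported in Table 5 can be computed as in the following Python sketch. The sign convention (simulated minus measured, divided by measured) is an assumption that is consistent with the signed values in the table, and the peak levels in the example are hypothetical, not values from this study.

```python
def relative_error_percent(simulated_peak, measured_peak):
    """Relative error (%) of a simulated peak water level against the measured peak."""
    return (simulated_peak - measured_peak) / measured_peak * 100.0


# Hypothetical peak forebay water levels (m)
print(round(relative_error_percent(2.60, 2.50), 1))  # overestimate gives a positive error
print(round(relative_error_percent(2.00, 2.50), 1))  # underestimate gives a negative error
```

Under this convention, a positive error means the model overestimates the peak, which is the conservative direction for flood safety discussed above.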

The model parameters in five systems were further calibrated and verified using the recommended rainfall-runoff model, and the results are shown in Table 6.

| | System A | System B | System C | System D | System E |
|---|---|---|---|---|---|
| Initial loss value of impervious area (mm) | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| Initial loss value of green area (mm) | 6 | 6 | 6 | 6 | 6 |
| Initial infiltration rate of green space (mm/h) | 70 | 65 | 70 | 70 | 70 |
| Steady infiltration rate of green space (mm/h) | 2.5 | 1 | 2.5 | 2.5 | 2.5 |
| Characteristic width (m) | Default | Default × 0.5 | Default × 0.8 | Default | Default |
| Manning roughness of impervious area | 0.023 | 0.0175 | 0.0175 | 0.023 | 0.023 |
| Manning roughness of green space | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |


Based on the selected rainfall-runoff model and parameters, we could proceed to the next step of research, which was to simulate the inflow point flow, water level, and the judgment result of weir lowering under these design conditions.

### Procedure and results of calculation

#### Original data and standardized conversion

First, system A was taken as an example, and the variables described in the "Selection of control factors" section were selected to form the sample matrix based on the model's simulation results (the operation category of the last record was assumed to be unknown). Table 7 presents the relevant variables of system A.

Number . | Operation category . | Initial water level (m) . | Water level at the inflow point of Road X (m) . | Water level at the inflow point of Road Y (m) . | Water level at the most unfavorable point (m) . | Water level of pumping station forebay (m) . | Inflow point flow of Road X (m^{3}/s)
. | Inflow point flow of Road Y (m^{3}/s)
. |
---|---|---|---|---|---|---|---|---|

1 | 1 | 1.3 | 2.06 | 1.91 | 2.15 | 2.17 | 6.04 | 4.32 |

2 | 1 | 1.3 | 2.1 | 1.93 | 2.15 | 2.23 | 6.45 | 4.52 |

3 | 1 | 1.3 | 2.09 | 1.92 | 2.08 | 2.24 | 6.34 | 4.46 |

4 | 1 | 1.3 | 1.94 | 1.84 | 1.87 | 2.08 | 4.6 | 3.56 |

5 | 1 | 0.25 | 1.47 | 1.15 | 1.63 | 2.02 | 12.25 | 7.81 |

6 | 1 | 0.25 | 1.79 | 1.3 | 2.23 | 2.3 | 17.45 | 9.87 |

7 | 0 | 1.3 | 2.25 | 2.02 | 2.8 | 2.32 | 8.36 | 5.49 |

8 | 0 | 0.25 | 1.68 | 1.24 | 2.79 | 1.99 | 15.58 | 8.99 |

9 | 0 | 0.25 | 1.35 | 1.06 | 1.77 | 1.5 | 10.48 | 6.65 |

10 | 0 | −4.32 | −0.4 | 0.03 | 1.8 | 0.8 | 24.42 | 7.62 |

11 | 0 | −4.32 | −0.61 | −0.11 | 1.69 | 0.4 | 20.54 | 5.99 |

12 | 0 | −4.32 | −0.75 | −0.21 | 1.62 | 0.08 | 17.97 | 4.76 |

13 | 0 | −4.32 | −0.99 | −0.44 | 1.52 | −0.37 | 14.08 | 2.51 |

14 | 0 | 1.3 | 2.25 | 2.05 | 2.8 | 2.32 | 8.36 | 5.49 |

15 | 0 | 1.3 | 2.34 | 2.07 | 3.07 | 2.46 | 9.56 | 6.13 |

16 | 0 | 1.3 | 1.98 | 1.86 | 2.16 | 2 | 5.11 | 3.79 |

17 | 0 | 1.3 | 1.79 | 1.73 | 1.85 | 1.79 | 3.08 | 2.53 |

18 | 0 | 1.3 | 1.47 | 1.47 | 1.48 | 1.47 | 0.64 | 0.62 |

19 | NaN | 0.25 | 1.08 | 0.88 | 1.42 | 1.15 | 6.93 | 4.58 |
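The heading of this subsection refers to a standardized conversion, but the transformation itself is not spelled out in the text. The following is a minimal sketch, assuming that a column-wise z-score (subtract the column mean, divide by the sample standard deviation) is what is meant; the function name `standardize_columns` and the choice of the sample standard deviation are illustrative assumptions, not details taken from the study.

```python
from statistics import mean, stdev


def standardize_columns(rows):
    """Return a copy of `rows` with every column converted to z-scores.

    Assumption: z-score standardization with the sample standard deviation.
    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]
    return [
        [(x - c) / s if s > 0 else 0.0 for x, c, s in zip(row, centers, spreads)]
        for row in rows
    ]


# Hypothetical two-variable sample matrix (a water level in m, a flow in m^3/s).
sample_matrix = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]
standardized = standardize_columns(sample_matrix)
```

After such a conversion, water levels (in m) and flows (in m³/s) are on a comparable scale, so no single variable dominates a distance-based discriminant purely because of its units.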

#### Calculation method and misjudgment rate

In the linear discriminant method, each group was assumed to follow a *p*-variate normal distribution with a common covariance matrix, and the judgment was obtained after a pooled estimate of this covariance matrix was computed from the samples. In the linear diagonal matrix method, the covariance matrix was estimated as a diagonal matrix. In the quadratic discriminant method, each group was again assumed to follow a *p*-variate normal distribution, but the covariance matrices of the groups were not assumed to be the same (Peck *et al.* 1988). In the Bayesian discriminant method, a naive Bayesian classifier was fitted to the samples and then used to predict future samples. The ML results of the four methods are presented in Table 8.

Number . | Linear discriminant method . | Linear diagonal matrix . | Quadratic discriminant method . | Bayes discriminant method . |
---|---|---|---|---|

1 | 1 | 1 | 1 | 1 |

2 | 1 | 1 | 1 | 1 |

3 | 1 | 1 | 1 | 1 |

4 | 1 | 1 | 1 | 1 |

5 | 1 | 1 | 1 | 1 |

6 | 1 | 1 | 1 | 1 |

7 | 0 | 1 | 1 | 1 |

8 | 0 | 1 | 0 | 0 |

9 | 0 | 0 | 0 | 0 |

10 | 0 | 0 | 0 | 0 |

11 | 0 | 0 | 0 | 0 |

12 | 0 | 0 | 0 | 0 |

13 | 0 | 0 | 0 | 0 |

14 | 0 | 1 | 1 | 1 |

15 | 0 | 1 | 0 | 0 |

16 | 0 | 1 | 1 | 1 |

17 | 0 | 1 | 1 | 1 |

18 | 0 | 1 | 0 | 0 |

19 | 0 | 0 | 0 | 0 |
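The linear discriminant rule described before Table 8 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes two groups labelled 0 and 1, equal prior weights for the two groups, and a pooled covariance estimate; the function names (`fit_lda`, `predict_lda`, `solve_linear_system`) and the toy samples are hypothetical and are not taken from Table 7.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, center):
    """Sum of outer products of the deviations from `center`."""
    p = len(center)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - c for x, c in zip(row, center)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_lda(group0, group1):
    """Return (weights, threshold) of the two-group linear discriminant."""
    m0, m1 = mean_vector(group0), mean_vector(group1)
    s0, s1 = scatter_matrix(group0, m0), scatter_matrix(group1, m1)
    dof = len(group0) + len(group1) - 2
    pooled = [[(a + b) / dof for a, b in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    weights = solve_linear_system(pooled, [b - a for a, b in zip(m0, m1)])
    midpoint = [(a + b) / 2 for a, b in zip(m0, m1)]
    threshold = sum(w * c for w, c in zip(weights, midpoint))
    return weights, threshold


def predict_lda(weights, threshold, sample):
    """Assign group 1 if the discriminant score exceeds the threshold."""
    score = sum(w * x for w, x in zip(weights, sample))
    return 1 if score > threshold else 0


# Hypothetical toy samples with two variables per sample.
toy_group0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_group1 = [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]]
toy_weights, toy_threshold = fit_lda(toy_group0, toy_group1)
```

Restricting the pooled matrix to its diagonal would give a diagonal variant of the same rule, and estimating a separate covariance matrix per group leads to a quadratic rule; neither is shown here.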

*P* (*j|i*) (*i*, *j* = 0, 1; *i* ≠ *j*) was used to represent the probability that samples originally belonging to group *i* were misjudged as belonging to group *j*. Taking the linear discriminant method for system B as an example, *P* (1|0) = 0/6 = 0 and *P* (0|1) = 7/12. With equal weights for the two groups, the misjudgment rate is Err = 0.5 *P* (1|0) + 0.5 *P* (0|1) = 0.29, i.e., 29%.
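The misjudgment rate defined above can be computed directly from the actual and predicted operation categories. The sketch below is illustrative: the function name `misjudgment_rate` is hypothetical, and the label lists are transcribed from the 18 labelled samples of system A (operation category in Table 7; linear and quadratic columns in Table 8).

```python
def misjudgment_rate(actual, predicted):
    """Return Err = 0.5 * P(1|0) + 0.5 * P(0|1) for groups labelled 0 and 1.

    P(j|i) is the share of samples that belong to group i but were
    assigned to group j.
    """
    n0 = sum(1 for a in actual if a == 0)
    n1 = sum(1 for a in actual if a == 1)
    if n0 == 0 or n1 == 0:
        raise ValueError("both groups need at least one labelled sample")
    zero_as_one = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    one_as_zero = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return 0.5 * zero_as_one / n0 + 0.5 * one_as_zero / n1


# System A, samples 1-18 (sample 19 has no known category and is excluded).
actual = [1] * 6 + [0] * 12
linear = [1] * 6 + [0] * 12
quadratic = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0]

err_linear = misjudgment_rate(actual, linear)        # 0.0
err_quadratic = misjudgment_rate(actual, quadratic)  # 4/12 * 0.5, about 0.17
```

For these inputs the function gives 0 for the linear discriminant method and about 17% for the quadratic discriminant method, in line with the system A row of Table 9.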

In the same way, the misjudgment rates of the five drainage systems A, B, C, D, and E were calculated; the results are shown in Table 9.

Drainage system . | Misjudgment rate (%), linear discriminant method . | Misjudgment rate (%), linear diagonal matrix . | Misjudgment rate (%), quadratic discriminant method . | Misjudgment rate (%), Bayes discriminant method . |
---|---|---|---|---|

A | 0 | 30 | 17 | 17 |

B | 29 | 46 | 46 | 46 |

C | 4 | 30 | 17 | 25 |

D | 4 | 40 | 40 | 32 |

E | 0 | 8 | 8 | 8 |
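To compare the methods across all five systems, the Table 9 values can be averaged per method. The sketch below simply transcribes Table 9; the variable and function names are hypothetical, and an unweighted mean over the five systems is an assumption about how an "average misjudgment rate" would be formed.

```python
METHODS = (
    "Linear discriminant method",
    "Linear diagonal matrix",
    "Quadratic discriminant method",
    "Bayes discriminant method",
)

# Misjudgment rates (%) per drainage system, in the column order of METHODS.
TABLE_9 = {
    "A": (0, 30, 17, 17),
    "B": (29, 46, 46, 46),
    "C": (4, 30, 17, 25),
    "D": (4, 40, 40, 32),
    "E": (0, 8, 8, 8),
}


def average_by_method(table, method_names):
    """Unweighted mean misjudgment rate (%) of each method over all systems."""
    n = len(table)
    return {
        name: sum(row[k] for row in table.values()) / n
        for k, name in enumerate(method_names)
    }


averages = average_by_method(TABLE_9, METHODS)
```

With these values, the linear discriminant method averages 7.4%, which is the only method average below 10%.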

In system A, the misjudgment rates of the linear, quadratic, Bayesian, and diagonal matrix discriminant methods were 0, 17, 17, and 30%, respectively. The linear discriminant method was the most effective in deciding whether to lower the weir. All the discriminant methods judged correctly when the weir could be lowered, so every misjudgment recommended lowering the weir when it should not have been lowered. When the rainfall level was below the yellow warning (a rainfall of more than 50 mm in 6 h), this type of misjudgment was less harmful. However, when the rainfall level was at the orange (a rainfall of more than 50 mm within 3 h) or red (a rainfall of more than 100 mm within 3 h) warning, such a misjudgment would fill the deep tunnel in advance, making it difficult to achieve the objective of improving the flood prevention standard.

In system B, the discriminant methods, including the linear one, were not ideal for judging whether the weir should be lowered; the misjudgment rates ranged from 29 to 46%. Although the linear discriminant method had the lowest misjudgment rate (29%), its misjudgments occurred in cases where the weir should have been lowered. The other methods judged correctly when the weir should have been lowered but misjudged cases in which the weir did not need to be lowered. These unsatisfactory results may be due to a large deviation between the sample and the overall distribution.

The linear discriminant method performed relatively well in systems C and E. All the judgment methods decided correctly when the weir should have been lowered; all misjudgments occurred in cases where the weir was not to be lowered.

In system D, the misjudgment rates of the linear, quadratic, diagonal matrix, and Bayesian discriminant methods were 4, 40, 40, and 32%, respectively. The linear discriminant method performed better than the other methods. Furthermore, among the other three methods, there were two misjudgments indicating that the weir should not be lowered. In summary, linear discrimination, a distance discrimination method, performed best, whereas the Bayesian method performed poorly.

## CONCLUSIONS

In this study, five drainage systems in Shanghai, China, were taken as examples. After the relevant variables were selected, four discriminant ML methods were used to assist decision-making on key steps of drainage system control. Among them, the linear discriminant method gave the best judgments, with an average misjudgment rate of less than 10%, indicating that it is well suited to auxiliary decision-making. However, the misjudgment rates differed between drainage systems because of the drainage system characteristics, the representativeness of the control factors, the discriminant method, the number of samples, and how closely the samples represented the overall population. Furthermore, for some drainage systems, relying entirely on auxiliary decision-making for operation control carried a high risk. Because the effect of ML depends on the number of training samples, the training set can be continuously expanded by accumulating large amounts of effective data through static simulation and actual operation. The judgment effect should improve as the training samples come to represent the population more closely. Therefore, the application of ML to auxiliary decision-making under complex conditions still has theoretical and practical significance.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## REFERENCES

*Energies* **14**(6), 1765. doi:10.3390/en14061765