Abstract
This work presents an algorithm for real-time fault detection in the SCADA system of a modern water supply system (WSS) in an Italian alpine valley. By means of both hardware and analytical redundancy, the proposed algorithm compares data and isolates faults on sensors through analysis of residuals. Moreover, the algorithm performs a real-time selection of the most reliable measurements for the automated control of the WSS operations. A coupled model of the hydraulic and remote-control system is developed to test the performance of the WSS with and without the proposed algorithm. Simulations show that the occurrence of errors in the sensors causes a significant worsening of the economic, energy and mechanical performance of the infrastructure. In many cases, the operations of the WSS are seriously compromised. The error detection and measurement assessment performed by the proposed algorithm prove to be crucial for the safe control of the WSS.
INTRODUCTION
In north Italian mountainous and hilly areas, water supply (WS) to local communities is usually provided by municipal water supply systems (WSSs) that rely on local sources and operate independently from each other. In these regions, in fact, the availability of water resources is generally not an issue, and the high degree of territorial dispersion favors a decentralized water management. In the event of unexpected breakdowns or droughts (Carrera et al. 2013), however, this fragmentation results in inefficiencies and water crises. In order to increase the resilience of the WS service, a growing trend is the creation of inter-municipal water networks that connect multiple local WSSs over large areas (Massarutto 2000). Coordination in the operations of multiple WSSs and diversification of water sources entail economic, environmental, and water quality advantages (Anghileri et al. 2012; Bel & Warner 2015), and they are in line with the principle of integrated water resources management pursued by the European Water Framework Directive (EC 2000).
Large-scale water infrastructures require an automated regulation aimed at controlling the operations of the entire system, according to a centralized control philosophy. For this reason, they are managed by SCADA (Supervisory Control and Data Acquisition) systems (e.g., Coelho & Andrade-Campos 2014; Meseguer & Quevedo 2017) that (i) acquire and analyze real-time pressure and flow rate measurements provided by sensors in the key points of the network and (ii) remotely control the operations of the regulation devices (e.g., valves, pumps, turbines), according to predefined management rules and as a function of the data measured in real-time throughout the system.
In this framework, faulty sensors may induce the SCADA system to perform wrong regulations and, thus, result in faulty WSS operations. To ensure the robustness of the control operations, control systems have to be integrated with procedures that detect errors in sensors and trigger response actions for safe management. This is crucial for large hydraulic infrastructures that transfer and supply water over large and steep areas. These systems are particularly vulnerable to failures, as the large volumes of water and the high pressures involved can have dangerous consequences when the system regulation is not optimal.
Fault detection and isolation (FDI) methods have been proposed in different fields of engineering and can be roughly classified into signal-based and model-based techniques (Gao et al. 2015).
Basic signal-based methods (e.g., Mourad & Bertrand-Krajewski 2002; Burnell 2003) evaluate data as valid or faulty by assessing the value or the variation of the signal. Advanced signal-based methods (e.g., Frank & Köppen-Seliger 1997; Maki & Loparo 1997) include techniques from machine learning, like neural networks. Signal-based methods are generally used when measurement redundancy is not available in the system, when the number of sensors under analysis is huge, and when a model of the system cannot be developed.
On the other hand, model-based methods validate data by using dependencies between different measurable signals. To this end, either hardware redundancy or analytical redundancy (Hwang et al. 2010; Gertler 2015) can be exploited. In hardware redundancy, measurements of the same physical variable are acquired by redundant sensors and compared. Conversely, analytical redundancy uses a mathematical model of the system as a comparison term (Frank 1996; Isermann 2005). Hardware redundancy is commonly adopted in safety-critical systems (Goupil 2011), but it involves significant extra costs. Moreover, sensors tend to have a similar lifetime, and thus sensors that were installed at the same time in the infrastructure often fail simultaneously. In large-scale systems, analytical redundancy represents a convenient alternative due to the number of sensors and the high cost of their installation, interconnection, and maintenance (Boukhris et al. 2001). Fault detection in model-based methods involves two steps: (i) residual generation and (ii) residual evaluation. A residual is obtained as the difference between two redundant measurements. Its signal is ideally zero when the system is operating correctly and non-zero when faults are present. Residual evaluation is the set of techniques that identify the occurrence of an error starting from the analysis of the residuals (Frank & Ding 1997).
FDI techniques have been widely studied in the field of control engineering, such as in industrial plants (e.g., Gertler 1988; Özyurt & Pike 2004) and in automotive and aerospace engineering (Chen & Patton 1999). In the hydraulic and hydrological field, applications of FDI involved the analysis and reconstruction of hydrological time-series (e.g., Quevedo et al. 2010), leak/burst detection in water distribution networks (e.g., Krause et al. 2008; Casillas et al. 2013), and problems related to water quality monitoring (e.g., Eliades & Polycarpou 2010). Only a few works dealt with the identification of faulty sensors for a safe remote control of WSSs. Among these, Gabrys & Bargiela (1996) determined confidence intervals for flow rate measurements by means of a neural network approach. Ragot & Maquin (2006) proposed a method based on fuzzy logic for measurement fault detection in an urban WSS. Izquierdo et al. (2007) applied a hybrid FDI method based on neural networks and fuzzy theory. Meseguer et al. (2010) and Cugueró-Escofet et al. (2016) used fault signature matrices to identify a faulty sensor by means of the effect that the wrong measurement transmitted by this sensor potentially has on the other variables measured in the system. These methods proved to be efficient, but their implementation in real hydraulic systems requires a certain degree of control theory expertise and a significant computational cost.
Starting from the work of Fellini et al. (2018), the goal of this paper is to present an easy-to-implement and robust algorithm for fault detection in the SCADA system of a modern WSS. To increase the safety and the reliability of the control system, measurements from sensors are assessed in real time. By the analysis of residuals, redundant measurements are compared, and faults are automatically detected. Moreover, in case of errors, the developed algorithm ensures continuity in the control operations and prevents interruptions in the water supply. In addition to the introduction of a new FDI method, tools for the application and validation of the algorithm in typical WSSs are also provided in this work. In this way, the application of the method to real WSSs is facilitated.
The work is organized as follows. After this introduction, the ‘Materials and methods’ section presents first the logic and the structure of the proposed algorithm for fault detection. Then, the application of the algorithm to a modern and multipurpose WSS is illustrated. To conclude the ‘Materials and methods’ section, the developed tools for the validation of the proposed algorithm are described. In the ‘Results’ section, the performances of the WSS before and after the application of the fault detection method are assessed. Finally, the main conclusions obtained from the present work are drawn.
MATERIALS AND METHODS
Algorithm for fault detection in sensors
An easy-to-implement and robust method for real-time fault detection in the SCADA system of a WSS has been developed. This method is based on the redundancy concept and thus can be applied when different measurements of the same physical quantity are available.
Usually, critical sensors in a SCADA system are duplicated. However, double redundancy provides error detection but not recovery. In fact, if two different measurements are read by two redundant sensors, the supervisory system can easily detect the presence of an error but cannot automatically recognize which one is the faulty device. In this case, a technician's on-site intervention is the only solution to identify the problematic sensor, and this may lead to an interruption of the control operations of the WSS. In large-scale water infrastructures, this interruption is extremely unsafe.
To guarantee continuity in real-time control operations, a procedure for the automatic detection and correction of faulty measurements is thus required. This is achieved with an error detection system based on triple redundancy. In this case, if one of the three sensors fails, the other two sensors can isolate the faulty sensor and provide a useful and reliable measurement.
The algorithm we present detects faults by comparing three values of the same physical variable, provided by two redundant gauges (hardware redundancy) and by a hydraulic equation (analytical redundancy). In this way, resilience is achieved and the expensive installation of three redundant sensors is avoided.




Algorithm for fault detection in WSS sensors (Step I) and for the choice of the most reliable measurement to be used in the control operations (Step II).











Besides real-time error detection, the developed algorithm automatically selects the most accurate data to be used in the control operations of the SCADA system (Step II in Figure 1). For each set of three measurements (A, B, C), the minimum residual M is identified (see Figure 1). The two measurements involved in the calculation of M are the closest to each other and are therefore considered the most representative of the real physical value. The algorithm selects one of these two measurements as input for the control operations (see Figure 1). This last choice is based on technical considerations. Generally, priority is given to direct measurements over indirect ones (i.e., those calculated from quantities different from the measured one). For example, in Figure 1, when the minimum residual is the one between A and B, the algorithm selects A as the input for the control operations. Measurement B would have been an equally correct datum, but for technical reasons A is considered more reliable (e.g., A is directly given by a sensor while B is obtained from a hydraulic balance or from a sensor with lower instrumental precision).
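The two steps described above can be sketched as follows. This is a minimal illustration and not the authors' PLC implementation: the function name `assess`, the dictionary-based interface, the priority order A, B, C, and the isolation rule (a source is flagged when both residuals that involve it are out of tolerance while the remaining residual is accepted) are assumptions made for the example. The tolerance of each residual is taken as the sum of the maximum errors of the two measurements involved.

```python
import math
from itertools import combinations

SOURCES = ("A", "B", "C")  # assumed order of decreasing priority


def assess(values, max_errors):
    """Return (faulty_source_or_None, selected_source, selected_value)."""
    residuals, within = {}, {}
    for p, q in combinations(SOURCES, 2):
        r = abs(values[p] - values[q])
        if math.isnan(r):  # a lost signal (NaN) never validates a pair
            r = math.inf
        residuals[(p, q)] = r
        # Tolerance: sum of the maximum errors of the two measurements.
        within[(p, q)] = r <= max_errors[p] + max_errors[q]

    # Step I (assumed rule): a source is isolated as faulty when both
    # residuals involving it are rejected and the remaining one is accepted.
    faulty = None
    for s in SOURCES:
        own = [ok for pair, ok in within.items() if s in pair]
        other = [ok for pair, ok in within.items() if s not in pair]
        if not any(own) and all(other):
            faulty = s

    # Step II: the pair with the minimum residual M is the most
    # representative; its higher-priority member is selected.
    best_pair = min(residuals, key=residuals.get)
    selected = min(best_pair, key=SOURCES.index)
    return faulty, selected, values[selected]


# Tank level in metres: A and B agree, C is off.
print(assess({"A": 3.002, "B": 2.998, "C": 3.400},
             {"A": 0.009, "B": 0.009, "C": 0.020}))
```

In this example the residual between A and B is the smallest, so C is flagged and A is passed to the control logic. If all three residuals are rejected, the sketch isolates no source and still returns the higher-priority member of the closest pair; how such a case should be handled in practice is not specified here.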
The algorithm presented here is designed for real-time applications. In remote-controlled WSSs, programmable logic controllers (PLCs), networked to the SCADA system, receive measurements from the sensors. The fault detection algorithm is implemented in the PLCs and evaluates the reliability of these measurements. Reliable measurements are thus selected by the algorithm and adopted for the management of the control devices (e.g., valves, turbines). The frequency of this measurement check is case specific. It can be lower than the data acquisition frequency but has to be higher than the frequency of the adjustment operations, and therefore higher than the rate of variation of the hydraulic properties in the system.
Application of the algorithm to a real WSS
The proposed algorithm for fault detection and measurement assessment is non-specific and applies to WSSs with three basic requirements: (i) a SCADA communication network collects measurements from sensors throughout the system; (ii) programmable logic controllers (PLCs) operate the control devices (e.g., valves, turbines) according to the dynamics of a number of measured variables (e.g., flow rates, tank levels); (iii) three independent measurements of each control variable can be obtained from redundant sensors or models. A case study is introduced to improve the understanding of the method, to present a real application, and to show some useful tools for its validation.
Case study
The case study is described in detail by Fellini et al. (2017) and involves a modern WSS (Figure 2) located in an Alpine valley in the northwest of Italy. The WSS consists of an 80-km-long water main connecting 20 municipal WSSs. The water main collects high quality water from an alpine reservoir and conveys a maximum flow rate of 500 l/s. The municipal WSSs (see the inset in Figure 2) are self-sufficient under ordinary conditions as they are supplied by local sources (springs and wells). However, they receive water from the water main when the local sources fail, their quality is low, or water treatment in local plants is expensive. Needle valves with electronic actuators regulate the flow delivered from the water main to the municipal tanks. In the upper valley, four inline tanks (S1, S2, S3, and S4) split the water main in order to limit the static water pressure in the pipes. A Pelton turbine with electronically controlled Doble needles adjusts the flow entering each inline tank. In this way, hydropower generation is also performed.
Scheme of the WSS adopted as case study. The water main is split by four main tanks (the rectangles denominated S1–S4) and it supplies 20 municipal tanks (the small white squares). The inset shows a typical municipal water system with the storage tank supplied by mountain springs, local wells, and by the new water main. The flow rate and level sensors involved in the regulation operations are respectively represented as small diamonds and arrows overlaid by a circle. The control devices, i.e., needle valves and turbines, are depicted as ‘bow tie’ symbols and square with cross symbols.
Valves and turbines are the active elements that regulate the flow rate in the entire WSS. Local PLCs control these devices according to (i) predefined management rules, (ii) flow rate and level data measured by sensors installed near the regulation devices, and (iii) data received from the distant sensors networked in the SCADA system. Data transmission in the SCADA system is guaranteed by an optical fiber network.



















The water level in the largest inline tank is instead used to control the flow rate towards the municipal tanks in order to optimize the distribution of water in the whole system. Since this tank has the largest capacity, its level is an indicator of water availability in the system. According to the dynamics of this level and to a predefined priority list (based on technical and economic criteria), the municipal WSSs to be supplied by the water main are selected. Alternatively, the municipal tanks are supplied by the local wells. In both cases, the flow rate to the municipal tank is proportional to the tank water level. As for the inline tanks, an emergency level threshold is set for the municipal tanks. When the water level falls below this threshold, the water main supplies the tank with a predefined constant flow rate. The measurements involved in the control of the needle valves are thus given by the level sensors in the municipal tanks and by the level measured in this inline tank.
Notice that all the municipal WSSs are actually provided with meters. However, the level and flow rate sensors of four of them are not shown in Figure 2. These sensors are not considered in the simulations discussed in the following sections because the management rules in which they are involved are rather specific.
Assessment of hardware and analytical redundancy in the system
The above-presented fault isolation procedure is effective if three measurements A, B, and C of the same physical quantity are available. As introduced in the ‘Case study’ section, the key quantities for the control operations are: (i) water levels in the inline tanks and in the municipal tanks; and (ii) flow rates along the water main and towards the municipal WSSs in the upper valley (see Figure 2 for the corresponding sensors).







(a) Level and flow rate sensors installed in the tanks of the WSS. The system of equations provides three redundant values (A, B, and C) of the tank level. (b) Flow rate sensors installed along the pipes that supply the municipal tanks and along the water main. The system of equations provides three redundant values (A, B, and C) of the flow rate towards the municipal tank. (c) Flow rate sensors installed at the entrance of the inline tanks, along the water main, and along the pipes that supply the municipal tanks. The system of equations provides three redundant values (A, B, and C) of the flow rate into the inline tank.
Notice that this balance is independent from the level measurements sent by the two level sensors. Otherwise, any error in one of the two level sensors would also propagate to C and the error identification method would fail.
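As a minimal sketch of this kind of analytical redundancy, a third level value C can be obtained by integrating a tank mass balance that uses only flow rate readings. This is an illustration and not a reproduction of the paper's equations: the function name `level_from_balance`, the explicit time stepping, the constant tank area, and the externally supplied initial level `h0` are all assumptions.

```python
def level_from_balance(h0, q_in, q_out, area, dt):
    """Estimate tank levels from inflow and outflow readings only.

    h0    : assumed initial level [m], supplied independently of the
            level sensors being checked
    q_in  : sequence of inflow readings [m^3/s]
    q_out : sequence of outflow readings [m^3/s]
    area  : tank plan area [m^2], assumed constant with depth
    dt    : time step [s]
    """
    levels = []
    h = h0
    for qi, qo in zip(q_in, q_out):
        # Volume stored in one step, converted to a level change.
        h += (qi - qo) * dt / area
        levels.append(h)
    return levels


# Net inflow of 0.05 m^3/s into a 500 m^2 tank, one-minute steps.
levels = level_from_balance(3.0, [0.30] * 3, [0.25] * 3, area=500.0, dt=60.0)
```

Each step raises the level by 0.05 × 60 / 500 = 0.006 m. Because the estimate never reads the level sensors, a fault in one of them cannot contaminate C; on the other hand, errors in the flow readings accumulate over time, which is why the uncertainty of such an estimate grows with each step.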





Finally, for the flow rates along the water main (see Figure 2), redundant flow rate sensors are installed at the entrance and at the exit of the inline tanks (Figure 3(c)). The flow rate balance (Equation (5)) is used to compute the third redundant measurement.
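A minimal sketch of how a third flow value could follow from continuity along the water main is given below. The exact form of Equation (5) is not reproduced here; the function name `inflow_from_balance` and the assumption that the inflow to an inline tank equals the flow leaving the upstream section minus the deliveries to the municipal tanks in between are illustrative only.

```python
def inflow_from_balance(q_main_upstream, q_deliveries):
    """Flow reaching the inline tank [l/s], from continuity (assumed form).

    q_main_upstream : flow measured on the water main upstream [l/s]
    q_deliveries    : flows delivered to municipal tanks between the
                      upstream meter and the inline tank [l/s]
    """
    return q_main_upstream - sum(q_deliveries)


# 500 l/s on the main, three municipal deliveries along the reach.
c_value = inflow_from_balance(500.0, [20.0, 35.0, 15.0])  # 430.0 l/s
```

The value obtained in this way can be compared with the two direct readings at the tank entrance, as long as none of the meters entering the balance is one of the two sensors being checked.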
Tolerance intervals for the residuals
In the proposed error detection algorithm, a tolerance interval has to be defined for each residual. Within this interval the residual is accepted. The tolerance interval is obtained from the sum of the maximum precision errors expected for the two measurements involved in the calculation of the residual, i.e., for two of the three measurements A, B, and C (see Equation (2)).













In Equation (9), the uncertainty of C increases with each time step. To avoid this progressive growth of the tolerance range and to keep it significant in the validation process, this uncertainty is reset to a prescribed value when it exceeds a threshold equal to twice the tolerance of the installed level sensors.
Kalman filters (e.g., Piatyszek et al. 2000; Ciavatta et al. 2004) could be used as an alternative to Equations (4) and (9) to produce an estimate of the variable C together with its uncertainty. However, to reduce the uncertainty related to the modeled variable C, Kalman filters require a recursive update of the estimated variable based on the real value measured by a sensor. In the method proposed in this work, instead, C has to be independent from the measured values (i.e., A and B). For this reason, Kalman filters are not implemented here.
Figure 4 shows the three residuals and their tolerance intervals (see Equations (1) and (2)) for the assessment of the level measurement in tank S1. The tolerance intervals of the two residuals that involve C show a trend of interrupted growth, as they are a function of the uncertainty of C. In the proposed example, one of the level sensors is affected by drift (see section ‘Error modeling’). As a consequence, the two residuals that involve this sensor exceed their tolerance intervals and the sensor malfunction is detected.
Residuals (continuous lines) and their tolerance intervals (dashed lines) for level measurement in tank S1. One level sensor is affected by drift. Consequently, the two residuals that involve this sensor present a drifted signal and exceed their respective tolerance ranges. The tolerance intervals of the two residuals that involve C are not constant, as they are a function of the uncertainty of C (see Equation (9)).
In the considered WSS, flow rate is measured by electromagnetic flow meters with a maximum instrumental error (Emax) equal to 0.25% of the flow rate in transit, while levels are measured by ultrasonic level sensors with Emax equal to 0.15% of the full scale (equal to 6 m).
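As a worked example of these figures, the maximum errors and the resulting residual tolerances can be computed as below. The percentages and the 6 m full scale come from the text; the function names and the 500 l/s operating point (the maximum flow of the water main) are chosen only for illustration.

```python
FLOW_REL_ERROR = 0.0025    # 0.25% of the flow rate in transit
LEVEL_REL_ERROR = 0.0015   # 0.15% of the full scale
LEVEL_FULL_SCALE_M = 6.0   # full scale of the level sensors [m]


def flow_max_error(q):
    """Maximum instrumental error of a flow meter reading q (same unit as q)."""
    return FLOW_REL_ERROR * abs(q)


def level_max_error():
    """Maximum instrumental error of a level sensor [m]."""
    return LEVEL_REL_ERROR * LEVEL_FULL_SCALE_M


def residual_tolerance(e_first, e_second):
    """Tolerance of a residual: sum of the two maximum errors."""
    return e_first + e_second


e_level = level_max_error()                      # about 0.009 m (9 mm)
tol_levels = residual_tolerance(e_level, e_level)  # about 0.018 m
e_flow = flow_max_error(500.0)                   # about 1.25 l/s at 500 l/s
tol_flows = residual_tolerance(e_flow, e_flow)   # about 2.5 l/s
```

A residual between the two level sensors is therefore accepted up to about 18 mm, while the tolerance between two flow meters scales with the flow in transit.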
Tools for the validation of the fault detection algorithm
The efficacy of the proposed algorithm in detecting errors in sensors is assessed through numerical simulations. To perform these simulations, a numerical model of the hydraulic and control operations is developed, and realistic errors in sensors are simulated. To assess in a quantitative way the performance of the WSS under the different simulated scenarios, custom performance indexes are introduced.
Hydraulic and control model
A simulation model was developed to analyze the performances of the WSS in different scenarios (Fellini et al. 2017). This model consists of a coupled hydraulic and control model.
The hydraulic model is a system of non-linear equations describing (i) the flow-head loss relation in pipes, at valves and at turbines, (ii) the flow continuity at nodes, and (iii) the boundary conditions at tanks. Time evolution of the system is modeled by a succession of steady states with duration Δt. At each time step, the flow in the pipes and the pressure at nodes are computed. Moreover, the water level in tanks is updated using a mass balance equation.
The control model simulates the supervision and control operations of the SCADA system. A measurement from each of the installed sensors is modeled starting from the value M, i.e., the flow or level computed by the hydraulic model at the exact point where the sensor is located. This value is perturbed with the error ɛ to model different kinds of sensor faults. These measurements are used as input data for the decision-making algorithms that simulate the control actions of the SCADA system, that is, the governance of valves and turbines.
Error modeling
Measurement errors are of different types and originate from different causes. In this work, we simulate errors that usually affect meters in monitoring systems (Balaban et al. 2009; Sharma et al. 2010). These are: random errors, instrumental drifts, sensor breakdowns and interruption of signal transmission (Table 1).
Types of considered errors in sensors and their simulation in the control model of the WSS
Type of error | Subtype | Error simulation |
---|---|---|
Random error | – | Gaussian distribution with μ = 0 and σ = Emax/3 |
Drift | Zero drift | Instrument calibration curve with intercept increasing over time from 0 to 20% of the instrument full scale |
Drift | Sensitivity drift | Instrument calibration curve with angular coefficient increasing over time from 1 to 1.2 |
Breakdown | Minimum constant value | Zero constant value |
Breakdown | Maximum constant value | Constant value stuck at the instrument full scale |
Breakdown | Abrupt oscillations | Binomial distribution taking value 0 or the instrument full scale with equal probability 0.5 |
Loss of signal | – | NaN |
Random errors (also called statistical errors) are intrinsic errors that depend on the precision limitation of the measurement devices (e.g., Fuller 2009). In the simulated control system, random errors are modeled as realizations of a Gaussian distribution with mean equal to zero and with a standard deviation that depends on the instrumental precision. In particular, according to the three-sigma rule (Pukelsheim 1994), we estimate the standard deviation as one-third of the maximum instrumental error (Emax).
Instrumental drift is a progressive bias in the measurement output which increases slowly in time (e.g., Webster 1998). It is caused by various environmental issues and by mechanical wear. In unaltered meters, the relation between the input (i.e., the value of the physical variable) and the output (i.e., the measurement) of the measurement process is a bisector. When this ideal bisector moves vertically, i.e., the instrument output has an offset with respect to the input, the meter is affected by ‘zero drift’. This offset may increase over time. When the angular coefficient of the bisector changes over time, the sensor is affected by ‘sensitivity drift’. In the simulated control model, the calibration curve of the instrument is disturbed by (i) an increase of the offset from zero to 20% of the instrument full-scale occurring over 10 days and (ii) by a 20% increment of the angular coefficient over the same period.
Sensor breakdowns are usually caused by mechanical damage or electrical issues in internal connections. In the event of a breakdown, the operation of the meter is compromised, and the measured quantity deviates completely from the measured physical variable, i.e., the sensor output is usually stuck at a constant value equal to the full-scale of the instrument or to the minimum detectable value. Alternatively, the measure oscillates unstably between these two values. Stuck sensors are simulated in the control model by setting the measured value on the minimum detectable value or on the full scale of the instrument. Unstable oscillations are generated through a random extraction over time between these two extremes.
Loss of signal is a complete loss of sensor data, due to a failure in the sensor or in the transmission network. In this case, the sensor output is simulated as a NaN (i.e., Not a Number).
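The error types of Table 1 can be sketched as simple functions that perturb the value computed by the hydraulic model. This is an illustrative sketch rather than the simulation code used in the study: the function names are hypothetical, and the linear growth of the drift over its duration is an assumption (the text only states the start and end values and the 10-day period).

```python
import math
import random


def random_error(true_value, e_max, rng):
    """Gaussian noise with zero mean and sigma = Emax / 3."""
    return true_value + rng.gauss(0.0, e_max / 3.0)


def _progress(t, duration):
    """Fraction of the drift period elapsed, clipped to [0, 1]."""
    return min(max(t / duration, 0.0), 1.0)


def zero_drift(true_value, t, full_scale, duration):
    """Offset growing (linearly, by assumption) from 0 to 20% of full scale."""
    return true_value + 0.20 * full_scale * _progress(t, duration)


def sensitivity_drift(true_value, t, duration):
    """Slope growing (linearly, by assumption) from 1 to 1.2."""
    return true_value * (1.0 + 0.20 * _progress(t, duration))


def stuck(full_scale, at_maximum):
    """Breakdown: output frozen at zero or at the instrument full scale."""
    return full_scale if at_maximum else 0.0


def abrupt_oscillation(full_scale, rng):
    """Breakdown: output jumps between 0 and full scale, probability 0.5 each."""
    return full_scale if rng.random() < 0.5 else 0.0


def loss_of_signal():
    """Loss of signal: the reading is Not a Number."""
    return math.nan


rng = random.Random(0)           # seeded for repeatability
ten_days = 10 * 24 * 3600.0      # drift period [s]
noisy = random_error(3.0, 0.009, rng)
drifted = zero_drift(3.0, ten_days / 2, full_scale=6.0, duration=ten_days)
```

Halfway through the drift period, a 3.0 m level read by a 6 m full-scale sensor is reported as about 3.6 m under zero drift. The random error can be applied on top of any of the other faults, since in the simulations all measurements are disturbed by instrumental noise.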
Performance indexes














RESULTS
Numerical simulations of the WSS operations are performed to (i) analyze the effects of the transmission of faulty measurements from the meters to the telecontrol system and (ii) verify the efficiency of the error identification algorithm.
First, we simulate the operation of the WSS in the case of errors in the sensors and complete lack of measurement redundancy. Then, the error identification algorithm is introduced into the control model, the same error scenarios are reproduced, and the resilience of the system is investigated.
In these analyses, only the measurements that are essential for the WSS control operations are considered. These measurements are the water levels in the inline and municipal tanks and the flow rates along the water main and towards the municipal tanks of the upper valley. In Figure 2 and in the ‘Case study’ section, details are given about the location of these measurements and how they are involved in the control operations.
For each of these measurements, three different kinds of errors are simulated: (i) drift, (ii) constant minimum value, and (iii) constant full-scale value. Moreover, in all the simulations, the measurements are disturbed by random errors due to instrumental precision. Interruption of signal transmission and abrupt oscillations in the instrument output were also tested. Unlike measurement errors, the lack of signal is easily detected by the control system even without a specific error detection algorithm. Abrupt oscillations are instead quite rare. For these reasons, and for the sake of conciseness, the results of these simulations are not discussed in the following.
Performances of the WSS without application of the validation algorithm
In Figure 5, the performances of the WSS without the application of the fault detection algorithm are shown. In these simulations, one sensor at a time is affected by errors. Three different types of errors are analyzed: (i) drift, (ii) minimum constant value, and (iii) maximum full-scale value. Only the sensors that are essential for the WSS control operations are considered. In Figure 5(a), the normalized time to breakdown is reported for the different scenarios. Level and flow rate meters are ranked on the x axis according to the value of this index. For the simulations with […], the sensors are ranked according to the sum of the three performance indexes. In Figure 5(b), the simulations with […] are reported as a function of the three performance indexes. Since the index values are highly concentrated around 1, a better readability is obtained using logarithmic axes for two of the plotted quantities and a logarithmic scale for the shade of the marker fill that represents the third. In this way, the best scenarios are located near the origin of the axes and present blank markers. The results of the benchmark simulation cannot be visualized on this graph since zero values cannot be represented on a logarithmic scale.
(a) Normalized time to breakdown for the three different types of errors. (b) Visualization of the three performance indexes for the simulations with […]. Triangle and circle markers represent the results for simulations with drift and constant minimum value errors, respectively. Two of the indexes are reported on the log axes of the chart, while a different shade of the marker fill is assigned according to the mechanical index value.
Drift error (first panel in Figure 5(a) and triangle markers in Figure 5(b)) mainly affects level sensors in the municipal tanks. In fact, level sensors to
are characterized by
. Based on the drifted level signal, the water level in the tank is overestimated by the SCADA and thus the water flow to the tank from both the water main or the local wells is under-rated and insufficient to match local water consumptions. Drift errors in the level sensors of the four inline tanks (
to
) are less critical and jeopardize the system operations only in the long term. In particular, sensors
,
, and
are rather resilient and their performances are comparable with the benchmark case (see Figure 5(b)). Regarding flow rate sensors, no criticalities are evident except for
, that is, the flow meter involved in the regulation of the
turbine and thus in the regulation of the water level in
. This tank has the smallest volume among the inline tanks, so it is more vulnerable to errors in flow rate measurements. Considering the performance of the WSS under the effect of drifted flow rate measurements to the municipal tanks of the upper valley (
,
, and
), an overestimated flow rate induces a reduction of the flow discharged by the turbines (see Equation (3)), a lower hydroelectric production and thus a decrease of the
index. Moreover, the flow rate balances that govern the opening and closing operations of the turbines are altered and thus a decrease of
index is also observed.
When the faulty sensor settles to the minimum constant value (second panel in Figure 5(a) and circle markers in Figure 5(b)), the WSS operation is generally not compromised. A wrong minimum level in the municipal water tanks ( to
) entails an excessive water inflow from the local wells and thus a greater energy consumption by the pumping plants and water loss due to water overflowing from the tank. For these reasons, both the
and
indexes greatly decrease (Figure 5(b)). However, critical emptying of the municipal tanks does not occur. On the other hand, inline tanks are more vulnerable to this kind of error since emergency water supply from local wells is not provided. Based on a wrong minimum water level in tanks
(
), the SCADA system forces a complete closure (see the ‘Case study’ section) of turbine
(
), thus impeding the water supply of the downstream tank
(
). Given its size, tank
is emptied when this error arises in sensor
. Similarly, when a minimum level is measured in tank
, turbine
is completely opened and the upstream tank
is critically emptied. For these reasons,
and
show the highest vulnerability to this kind of error. A decrease of
and
indexes is observed for errors in the flow rate measurements that disturb the flow rate balances for the regulation of turbine operations (
,
, and
in Figure 5(b)).
Errors inducing wrong full-scale measurements of level or flow rate in the devices are the most critical for the remote-control system. In municipal tanks, if the level meter indicates that the tank is full, then the tank is not supplied either from the WSS water main or from local sources, and it quickly empties. In inline tanks, based on a wrong maximum level in and
(
), the SCADA forces the complete opening (closure) of turbine
and
(
). Consequently, tanks
,
, and
are rapidly emptied. When a faulty maximum level is measured in
, the water availability in the WSS is overestimated, the municipal tanks are excessively supplied, and thus, the water level in
rapidly decreases. Flow rate sensors are involved in the regulation of the turbine operations and they also exhibit a high vulnerability to this kind of error.
The simulation results shown in Figure 5 highlight the vulnerability of level and flow rate gauges to the most common types of errors. Moreover, the analysis of the performance indexes in Figure 5(b) underlines that, even if the system does not reach breakdown, errors still lead to a significant deterioration in the economic, energy, and mechanical performance of the infrastructure. In particular, two main trends can be observed in the graph. The simulations related to errors in the level sensors are distributed along the bisector, indicating a similar deterioration both for the index and for the
index. The simulations of errors in flow rate sensors, on the other hand, are distributed along a vertical line, indicating that the main consequence is a decrease in energy (and mechanical, if the intensity of the marker fill is observed) performances.
Application of the validation algorithm
After the analysis of the WSS vulnerability to faults in sensors, the above-presented algorithm for real-time fault detection and measurement selection was integrated in the control model of the WSS. First, the available gauges in the different districts of the WSS were checked (hardware redundancy). Second, the balance equations that provide an additional value of the measured physical variables were written (analytical redundancy). Third, for each one of the considered sensors, the algorithm reported in Figure 1 was coded. Finally, different error scenarios were simulated to evaluate the efficiency of the integration in the control model of the error detection algorithm.
To show in detail how the control system operates, Figure 6 reports the results for the assessment of level measurement for the inline tank . Figure 6(a) shows the redundant signals of the tank water level. Figure 6(b) depicts the results of error assessment by the fault detection algorithm.
(a) Level values from level sensors (continuous line) and
(dotted line) and from the balance equation of flow rate into and out of the tank (dashed line). Measurements transmitted by
experience (1) drifts, (2) minimum and (3) full-scale constant values, (4) abrupt oscillations, and (5) transmission interruptions. (b) Error detection and measurement selection by the fault detection algorithm. (c) Zoom of the early detection of the drift error. As in Figure 4, the three residuals
,
, and
are reported with the extremes of their tolerance ranges (dashed lines). (d) Zoom of the abrupt oscillations (4). (e) Zoom of the signal showing errors due to instrumental precision.
As introduced in Figure 3(a), two level sensors are installed in each inline tank ( and
in Figure 3(a)), providing hardware redundancy. A third value of the tank level is obtained using a mass balance equation of the flow rate into and out of the tank, measured by sensors
and
(Figure 3(a)). The combination of hardware and analytical redundancy guarantees three simultaneous measurements of the same physical variable.
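As an illustration of the analytically redundant value, the following minimal sketch integrates a mass balance of the form dh/dt = (Q_in − Q_out)/A with explicit Euler steps. It assumes a prismatic tank with constant plan area; the function name, the parameter names, and the numerical values are hypothetical and are not taken from the case study.

```python
from typing import List, Sequence


def level_from_balance(
    h0: float,
    q_in: Sequence[float],
    q_out: Sequence[float],
    area: float,
    dt: float,
) -> List[float]:
    """Estimate the tank level from inflow and outflow measurements.

    Assumption (not from the paper): the tank is prismatic, so the level
    changes by (q_in - q_out) * dt / area at every time step.
    h0 is the initial level [m], flows are in [m^3/s], area in [m^2], dt in [s].
    """
    if len(q_in) != len(q_out):
        raise ValueError("q_in and q_out must have the same length")
    if area <= 0 or dt <= 0:
        raise ValueError("area and dt must be positive")

    levels = [h0]
    for qi, qo in zip(q_in, q_out):
        # Explicit Euler update of the mass balance.
        levels.append(levels[-1] + (qi - qo) * dt / area)
    return levels


# Hypothetical example: 50 m^2 tank, 100 s time step.
estimated = level_from_balance(2.0, [0.5, 0.5], [0.25, 0.75], area=50.0, dt=100.0)
print(estimated)  # [2.0, 2.5, 2.0]
```

Because the estimate is obtained by integration, errors in the flow meters accumulate over time; in practice the balance would probably need to be re-initialized periodically against a validated level measurement, which this sketch does not do.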
In the example scenario reported in Figure 6, the level sensor (continuous line in Figure 6(a), 6(c), 6(d) and 6(e)) is affected by multiple consecutive errors: (1) drifts, (2) minimum and (3) full-scale constant values, (4) abrupt oscillations, and (5) transmission interruption. The level sensor
(dotted line in Figure 6(a), 6(c), 6(d) and 6(e)) and the flow rate sensors
and
operate correctly instead, and they are only disturbed by slight random errors due to instrumental precision (Figure 6(e)). Note that instrumental errors in the flow rate sensors scarcely affect the precision of the level value (dashed line in Figure 6(e)) obtained from the balance equation.
As shown in Figure 6(a), the level measured by sensor is almost equal to the value resulting from the balance equation. On the other hand, the level acquired by
considerably differs from the other two because of instrumental failures. By means of the residual analysis, the developed algorithm detects these failures with high precision. Figure 6(c) shows in detail the early detection of the drift error. In the first panel, the three curves are very close to one another. However, residuals
and
, in the second and fourth panels, exceed their respective tolerance thresholds, represented by the dashed lines. As a consequence,
is marked as faulty in the last panel. In Figure 6(b), the outcomes of this error detection procedure are highlighted for the entire simulation time. Besides error detection, the algorithm selects in real time the most reliable measurement (Figure 6(b) and 6(c)) to be used in the WSS control operations. Under ordinary conditions, the algorithm alternately selects the value measured by one of the two level sensors
and
. In case of error detection for
, the measurement transmitted by
is the only one selected. As shown in Figure 6(b), the level value obtained from the balance equation is never identified as erroneous because the flow meters involved in the balance operate correctly. Furthermore, this value is never selected by the algorithm as the reference datum for the control operations. In fact, the direct measurements from level sensors
and
are always preferred over the indirect one. In any case, the level measurement obtained from the balance is fundamental for the operation of the error identification algorithm as it guarantees the triple redundancy on which the algorithm is based.
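The logic described above can be sketched as a pairwise residual check over three redundant values. This is a simplified illustration and not the algorithm of Figure 1: it assumes a single tolerance shared by all residuals (the text refers to a tolerance for each residual), and the source names `sensor_a`, `sensor_b`, and `balance` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical preference order: direct level sensors first, balance last.
PREFERENCE = ("sensor_a", "sensor_b", "balance")


def suspects(values: Dict[str, float], tol: float) -> List[str]:
    """Return the sources that cannot be trusted at this time step.

    []          -> all pairwise residuals are within tolerance
    [name]      -> 'name' disagrees with both other sources, which agree
    all sources -> ambiguous pattern (e.g. multiple simultaneous faults)
    """
    pairs = list(combinations(values, 2))
    exceeded = {p: abs(values[p[0]] - values[p[1]]) > tol for p in pairs}
    if not any(exceeded.values()):
        return []
    for name in values:
        own = [exceeded[p] for p in pairs if name in p]
        rest = [exceeded[p] for p in pairs if name not in p]
        # A single faulty source violates both of its residuals,
        # while the residual between the two healthy sources stays small.
        if all(own) and not any(rest):
            return [name]
    return list(values)


def select(values: Dict[str, float], flagged: List[str]) -> Tuple[str, float]:
    """Pick the first non-flagged source in the preference order."""
    for name in PREFERENCE:
        if name not in flagged:
            return name, values[name]
    raise RuntimeError("no reliable measurement available")


# Hypothetical levels [m]: sensor_a has drifted upwards.
levels = {"sensor_a": 3.40, "sensor_b": 2.95, "balance": 3.00}
flagged = suspects(levels, tol=0.10)
print(flagged, select(levels, flagged))  # ['sensor_a'] ('sensor_b', 2.95)
```

With a fixed preference order the balance-derived value is used only when both direct sensors are flagged, which mirrors the observation that the indirect value serves mainly to guarantee triple redundancy.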
Performances of the WSS with application of the validation algorithm
The simulations of the response of the system to faulty sensors, discussed above and reported in Figure 5, were performed again. This time, the algorithm for the identification of measurement errors was introduced into the control system. As for the previous analysis, for each one of the considered level and flow rate measurements, three simulations were conducted. The main sensor was disturbed by (i) drift, (ii) faulty full-scale and (iii) minimum constant values. Precision errors were simulated for the second redundant sensor and for the sensors involved in the indirect quantification of the third redundant measurement. For each scenario, the WSS dynamics were reproduced for a period of 1 month.
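The error scenarios listed above can be reproduced on a synthetic signal with a few helper functions. The sketch below is only indicative: it assumes a linear drift and Gaussian precision noise, which are modelling choices made here for illustration, and all function names, parameters, and values are hypothetical.

```python
import random
from typing import List, Sequence


def add_drift(signal: Sequence[float], start: int, rate_per_step: float) -> List[float]:
    """Add a linearly growing offset from index 'start' onwards (assumed drift model)."""
    return [x + max(0, i - start) * rate_per_step for i, x in enumerate(signal)]


def stick_at(signal: Sequence[float], start: int, value: float) -> List[float]:
    """Freeze the signal at a constant value (e.g. minimum or full scale) from 'start'."""
    return [value if i >= start else x for i, x in enumerate(signal)]


def add_precision_noise(signal: Sequence[float], sigma: float, seed: int = 0) -> List[float]:
    """Add zero-mean Gaussian noise to mimic instrumental precision (assumed noise model)."""
    rng = random.Random(seed)  # seeded so that scenarios are reproducible
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Hypothetical true level [m] and sensor range [0, 5] m.
true_level = [2.0, 2.0, 2.0, 2.0]
print(add_drift(true_level, start=1, rate_per_step=0.5))  # [2.0, 2.0, 2.5, 3.0]
print(stick_at(true_level, start=2, value=0.0))           # minimum constant value
print(stick_at(true_level, start=2, value=5.0))           # full-scale constant value
```

In a scenario of this kind, the disturbed signal would replace the main sensor reading, while `add_precision_noise` would be applied to the second redundant sensor and to the sensors feeding the balance equation.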
Results show that in all the simulated scenarios, the algorithm accurately identifies errors in sensors and transmits reliable measurements to the remote-control system. In this way, adjustments of the control devices (turbines and valves) are regularly managed and critical situations that compromise the safety of the aqueduct are prevented. The positive response of the system to the introduction of the fault detection algorithm is first assessed by observing that the index is equal to 1 for all the error scenarios. This indicates that the WSS operates without criticalities for the entire duration of the simulation. However, the
index is not enough to evaluate the improvement of the system performance for the scenarios in which the WSS proved to be resilient even in the absence of the error identification algorithm (i.e., scenarios with
in Figure 5(a)). For these cases, we calculated the difference between the
,
, and
performance indexes after and before the introduction of the fault detection procedure (Figure 7).
Increment of the performance indices (a),
(b), and
(c) after the introduction of the fault detection algorithm in the control system of the WSS. Triangle (circle) markers report the results of simulations with drift (constant minimum value) errors in the sensors.
Results reveal a general increase in the energy index (), in particular when errors are detected in the flow rate sensors (
to
). In fact, the correct regulation of the flow rate through the turbines guarantees the optimization of the hydroelectricity production.
Water saving () is enhanced when the error detection algorithm is applied to level sensors. The water level is the control parameter for the flow rate towards the tanks. When levels are properly measured, the excessive supply of water for the inline and municipal tanks is avoided, thus limiting overflow conditions.
The mechanical index () significantly increases when the signal from the flow rate sensors is verified. As mentioned in Equation (3), these measurements are involved in the balance equations that control the flow rate through the turbines. When the flow rate sensors are faulty, these water balances are only met at the monitoring level but not in the hydraulic system. This results in continuous adjustments of the flow through the turbines. Conversely, when flow rate measurements are assessed, the number of turbine operations is minimized. The implementation of the algorithm induces instead a slight decrease in the mechanical index, when level sensors are stuck on the minimum value (circle markers for level sensors in Figure 7(c)). However, this decrease proves that the WSS is operating properly. In fact, as discussed in the ‘Case study’ section, when a minimum constant value is erroneously measured in one of the municipal tanks, the control system automatically supplies the tank with an emergency constant flow rate from the water main. Due to this constant flow rate, the natural variation of water supply given by consumptions in the municipalities is reduced, as well as the adjustments of the flow rate through the turbines. When level sensors in the inline tanks are stuck on the minimum constant value, turbines are either completely open or completely closed. Therefore, in these cases, the control operations of the WSS are less dynamic, the number of turbine maneuvers decreases, but also the energy and water saving performances evidently worsen. In fact, a negative
index is always followed by positive
and
indices.
DISCUSSION AND CONCLUSIONS
In this work, an algorithm is proposed for real-time assessment of data measured in the SCADA system of modern and multipurpose WSSs. An automated remote-control system based on reliable flow rate and level measurements is crucial for the safe and optimal operation of these water infrastructures. The developed algorithm compares redundant data provided by both redundant sensors and analytical models. By means of the analysis of residuals, failures and gross errors in sensors are detected. Moreover, the algorithm performs a real-time selection of the most reliable measurements to be used in the control operations.
The effectiveness of the method was assessed through numerical simulations of a coupled hydraulic and control model of an alpine WSS, taken as a case study. The considered WSS is currently under construction and thus real datasets from sensors are not available. However, the most common measurement errors have been carefully simulated, based on the technical specifications of the sensors most frequently installed in modern WSSs. By means of these simulations, the system vulnerability to different types of errors in sensors was first analyzed. Four performance indices were defined to assess the achievement of safety, energy, mechanical and water supply targets of the water infrastructure in the different scenarios. Results showed that in most cases, errors in sensors critically undermined the operation of the WSS. Moreover, the most vulnerable sensors were clearly identified. Then, the proposed algorithm was introduced in the control model. The simulations revealed that the algorithm ensures error detection with a high degree of accuracy and guarantees continuity in the system operations. As a result, the performance indices showed a noticeable increase.
The developed algorithm is robust and easy to implement. Moreover, the proposed tools for modeling the hydraulic behavior of the WSS, its control system, and sensor errors are useful to carry out preliminary studies on the performance and safety of a WSS in the design phase. Within this approach, hardware redundancy can be optimized.
The algorithm can verify the reliability of a control variable only when a single sensor is faulty among those involved in computing the three redundant estimates of the variable. In the event of multiple, simultaneous errors, the algorithm is not effective. Also, water leaks can disturb the hydraulic balances involved in the analytically redundant measurements. In this case, if the corresponding sensors (hardware redundancy) are not faulty, the algorithm identifies the presence of an error in the hydraulic balance and can thus help in localizing water leaks. Once detected, water leaks must be fixed, or the balance equation implemented in the algorithm must be modified to take these outflows into account. If one of the two redundant sensors is out of order, the unexpected occurrence of water leaks precludes the correct functioning of the algorithm. In its current form, the method therefore requires that, in the event of a sensor malfunction, on-site maintenance is arranged promptly so as to prevent a second failure from being added to the first one. Another aspect to be considered for the application of the proposed FDI in real WSSs is the frequency of sensor data validation. The fault detection algorithm should assess data with a frequency that is in line with the measurement frequency of the sensors and with the speed of the WSS dynamics. All these aspects, i.e., the frequency of sensor maintenance, on-site human intervention, and data validation, are case-dependent and should be examined during the design of the control and fault detection system.
ACKNOWLEDGMENTS
We gratefully acknowledge SMAT Group for the financial support to this research and for providing valuable information.