Due to the uncertainty in output caused by environmental changes, significant discrepancies are expected between surface flow velocities predicted by deep learning methods and the instantaneous flow velocities. In this paper, a two-stage deep learning flow velocity measurement algorithm is proposed. During the external calibration process, consecutive frame pairs of the recorded water flow video are cyclically traversed to acquire predicted flow velocity values from the deep learning velocity measurement algorithm; meanwhile, the pixel displacement is obtained with a sparse optical flow tracking method and post-processed to derive a velocity calibration value and a pixel calibration value. During the detection process, the deep learning-predicted flow velocity is internally calibrated using the velocity calibration value and the pixel calibration value to adapt to changes in the water flow. Compared with the unimproved algorithm, the method achieves the minimum root mean square error on five videos with different flow velocities and maintains high accuracy when the flow velocity changes rapidly. The obtained results are very promising and can help improve the reliability of video-based flow velocity assessment algorithms.

  • New methods are needed to overcome the instability of video-based surface velocity prediction under environmental changes.

  • Direct measurement of high velocities is avoided: the slow-to-fast transition of the flow is used to obtain accurate high velocities indirectly.

  • We provide a river data set for deep learning model training, convenient for interested researchers to build on.

Hydrologic data, such as water levels, precipitation, and discharge, have been collected since ancient times. All of these field data have supported hydrological analyses, as well as the design of efficient infrastructure and management strategies (Gao 2020). With the continued threat of climate change, the need for accurate data to document all hydrological processes, including the most extreme ones, is now more important than ever. The information generated from these data can facilitate rapid mobilization and decision-making processes, supporting related systems (Nohara et al. 2018; Gourbesville 2020; Ma & Gourbesville 2020). Among all the fundamental data that need to be collected, the flow and velocity of natural streams, especially under extreme conditions, have been a major research issue for hydrologists for centuries (Bao et al. 2012). Extreme flood conditions obviously pose significant challenges for direct observation. They require comprehensive monitoring and measurement solutions that are not easily deployable in multiple locations across various catchments. Moreover, they involve a significant financial investment that may not be feasible in many situations.

With the development of sensor and image analysis technologies, non-contact flow velocity detection methods have attracted the attention of hydrologists due to their advantages of low cost, easy maintenance, and safety compared with traditional flow velocity detection methods such as acoustic waves and flow meters (Collins & Emery 1988; Yang et al. 2008). Large-scale particle image velocimetry (LSPIV) technology, which uses bubbles, ripples, and other information floating on the fluid surface as large particles, became widely used to replace the artificial tracer particles of particle image velocimetry. This method uses image processing techniques to obtain the displacement of large particles on the image; with a known exposure interval, the average velocity of the particles on the image can then be obtained (Fujita & Komura 1994; Bechle et al. 2012). The Spatiotemporal Image Velocimetry (STIV) solution then judges the texture features of the water flow surface along a marked velocity line without the need for tracers, and obtains the one-dimensional velocity of the water flow surface. However, due to the interference of image noise, the measurement results deviate considerably (Zhang et al. 2018). With the development of machine learning technology, some progress has been made in studying the relationship between surface flow velocity and texture features through feature extraction and data dimensionality reduction of river surface images (Tauro et al. 2014). Researchers have also used a compressed sensing image analysis method to extract image features and establish a river surface velocity estimation scheme (Wang et al. 2018). Meanwhile, a generative adversarial network flow measurement method based on conditional boundaries was proposed to address the problem that small differences between flow velocity classes lead to classification difficulties (Wang et al. 2019). To exploit the motion information of the water flow, the optical flow estimation method between frame differences was shown to be superior to the traditional block matching method (Chen & Mied 2013). In recent years, several scholars have also proposed optical flow estimation velocimetry methods based on tracing and deep learning (Pizarro et al. 2020; Yao 2022).

The traditional optical flow estimation algorithm (Farneback 2003) generally models the image with mathematical formulas and then solves the formula coefficients based on the similarity between consecutive images to compute an approximate representation of each pixel neighborhood, thereby obtaining the position to which each pixel in the previous frame moves in the next frame. However, it is usually affected by the natural light variations that are common under natural conditions (Liu et al. 2014). In contrast, optical flow estimation based on deep learning uses a large number of images during the training phase, which allows the interference of illumination to be limited. A classification algorithm then combines the optical flow estimation images to predict the flow velocity without the need for distance calibration of the camera (Wang et al. 2019). However, existing deep learning flow measurement algorithms have not provided a good solution to the problem of output uncertainty. To overcome this difficulty, this paper proposes a new calibration method for the predicted velocity, based on the general velocity measurement architecture of an optical flow model combined with a detection model. The main contributions of this paper are as follows:

  • (1) A flow velocity external calibration (EC) algorithm and a pixel EC algorithm are designed to minimize the error between the predicted flow velocity output from the model and the reference flow velocity (Le Coz et al. 2014), and to provide an a priori calibration reference value for the internal flow rate calibration.

  • (2) Internal calibration of deep learning-predicted flow rates based on the reference values provided by the EC to obtain more accurate prediction outputs.

  • (3) Pixel calibration values from the EC are used to calibrate the flow rate calibration values to adapt to the rapid changes in the current flow velocity.

Recurrent All-Pairs Field Transforms (RAFT) algorithm

RAFT (Teed & Deng 2020) is an optical flow estimation algorithm based on deep learning. Unlike previous work (Ilg et al. 2017), RAFT maintains a high-resolution optical flow field, eliminating the structural design of estimating optical flow at low resolution. It is mainly composed of three parts, as shown in Figure 1.
Figure 1. The network structure of RAFT: the input is the river area frame, and the output is the RGB optical flow image.

Feature extractor

The feature extraction module is divided into two parts: a feature encoder and a context encoder. Each is a convolutional neural network (CNN) consisting of six residual layers (He et al. 2016); the output feature map of each layer is halved in width and height compared with its input, while the number of channels increases. The two consecutive images of the water flow video are used as the input of the feature extraction module. The Feature Encoder is composed of two CNNs with shared weights; the features extracted from the two images are combined by inner product and output to the Visual Similarity Calculator. The Context Encoder is a CNN whose extracted features are passed to the Updator module for the iterative update of the optical flow.
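To make the inner product step concrete, the sketch below computes an all-pairs correlation volume between two feature maps in PyTorch. This is a minimal sketch of the published RAFT design rather than the authors' code; the 1/8 resolution and 256 channels follow the RAFT paper (Teed & Deng 2020).

```python
import torch

def all_pairs_correlation(fmap1: torch.Tensor, fmap2: torch.Tensor) -> torch.Tensor:
    """Inner product between every pixel of fmap1 and every pixel of fmap2.

    fmap1, fmap2: (B, C, H, W) feature maps from the shared-weight encoders.
    Returns a (B, H, W, H, W) correlation volume.
    """
    b, c, h, w = fmap1.shape
    f1 = fmap1.view(b, c, h * w)                 # (B, C, HW)
    f2 = fmap2.view(b, c, h * w)                 # (B, C, HW)
    corr = torch.einsum("bci,bcj->bij", f1, f2)  # (B, HW, HW) inner products
    corr = corr / (c ** 0.5)                     # scale by sqrt(dim), as in RAFT
    return corr.view(b, h, w, h, w)

# Two consecutive river frames encoded at 1/8 resolution with 256 channels:
fmap1 = torch.randn(1, 256, 30, 40)
fmap2 = torch.randn(1, 256, 30, 40)
corr = all_pairs_correlation(fmap1, fmap2)
print(corr.shape)  # torch.Size([1, 30, 40, 30, 40])
```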

Visual similarity calculator

The input is the inner product of the two image feature maps output by the Feature Encoder, from which a four-layer correlation pyramid $\{C^1, C^2, C^3, C^4\}$ is constructed by repeated pooling. Given the current estimate $(f^1, f^2)$ of the optical flow with respect to the $xy$ coordinates, each pixel $\mathbf{x} = (u, v)$ in Frame 1 can be mapped to the corresponding point $\mathbf{x}'$ on Frame 2, as shown in Equation (1):

$$\mathbf{x}' = \big(u + f^1(u, v),\; v + f^2(u, v)\big) \qquad (1)$$

In order to obtain rich image semantic information, a neighborhood grid of $\mathbf{x}'$ is defined as shown in Equation (2), where $N(\mathbf{x}')_r$ represents the set of coordinates around $\mathbf{x}'$ with radius less than $r$. Combined with the initial optical flow value in the Updator, the lookup is performed with $N(\mathbf{x}')_r$ on all levels of the pyramid, and the lower pyramid levels correspond to more global semantic features:

$$N(\mathbf{x}')_r = \big\{\mathbf{x}' + \mathbf{dx} \;\big|\; \mathbf{dx} \in \mathbb{Z}^2,\; \|\mathbf{dx}\|_1 \le r\big\} \qquad (2)$$

Updator

The current optical flow value, the feature vector queried from the pyramid of the Visual Similarity Calculator, and the feature vector output by the Context Encoder are fed through a Gated Recurrent Unit (Cho et al. 2014) and two convolution layers to predict the update of the optical flow, as shown in Equation (3). The update operator estimates a sequence of optical flow fields starting from the initial point $f_0 = 0$, outputs the last updated optical flow after the set number of iterations is reached, and after post-processing the RGB optical flow map of the corresponding water flow image is obtained:

$$f_{k+1} = f_k + \Delta f \qquad (3)$$
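The iterative update can be summarized by the loop below. It is a schematic sketch: the toy `lookup` and `gru_update` stand in for the correlation pyramid lookup and the GRU with two convolution layers, and the iteration count is illustrative.

```python
import torch

def refine_flow(flow0, lookup, gru_update, num_iters=12):
    """Iterative refinement of the optical flow field (Equation (3)):
    f_{k+1} = f_k + delta_f, starting from f_0 = 0."""
    flow, hidden = flow0, None
    for _ in range(num_iters):
        corr_feats = lookup(flow)                        # pyramid lookup around x' = x + f(x)
        delta_f, hidden = gru_update(flow, corr_feats, hidden)
        flow = flow + delta_f                            # residual update of the flow
    return flow                                          # last iterate -> RGB flow map

# Toy stand-ins so the sketch runs end to end; in RAFT these are the
# correlation-pyramid lookup and the GRU followed by two convolutions.
lookup = lambda flow: flow + 1.0
gru_update = lambda flow, feats, hidden: (0.1 * (feats - flow), hidden)
flow = refine_flow(torch.zeros(1, 2, 30, 40), lookup, gru_update)
print(flow.mean())  # tensor(1.2000)
```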

EfficientNet algorithm

EfficientNet is a highly efficient convolutional neural network for extracting image features (Tan & Le 2019). The structure, shown in Figure 2, is composed of MBConv blocks, each consisting of an inverted residual structure and a forward residual structure. In Figure 2, the number of convolution kernels of each structural block is shown in red font. The input is the RGB optical flow image output by the optical flow network; after a series of MBConv blocks and serial convolution blocks, the channel dimension of the feature map is reduced, as shown in the green area. Finally, the image features are fused into a single value through the fully connected layer, which serves as the numerical output of the predicted flow velocity and completes the deep learning flow velocity prediction algorithm.
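A minimal way to obtain such a single-value output is to give a stock EfficientNet a one-unit head, as sketched below with torchvision. The library choice and the B0 variant are our assumptions; the paper does not specify its implementation.

```python
import torch
from torchvision.models import efficientnet_b0

# EfficientNet backbone with a single-output fully connected head: the MBConv
# stages extract features from the RGB optical flow image, and the final
# linear layer fuses them into one predicted velocity value.
model = efficientnet_b0(weights=None, num_classes=1)

rgb_flow = torch.randn(1, 3, 640, 640)  # RGB optical flow image (training size used here)
velocity = model(rgb_flow)              # shape (1, 1): predicted flow velocity
print(velocity.shape)
```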
Figure 2. EfficientNet flow detection network: the input is the RGB optical flow image, the output is the predicted flow velocity value.
The proposed approach divides the velocity detection algorithm into two parts, an EC process and a detection process, as shown in Figure 3. In both parts, the initial flow velocity predictions are obtained by the deep learning algorithm described in 'Deep learning flow velocity detection algorithm' above. The EC algorithm is added before the detection process; its purpose is to provide the detection process with a prior that guides the calibration of the output flow velocity value. We then design an internal calibration algorithm within the detection process that uses the flow velocity calibration value and the pixel calibration value from the EC to adjust the predicted value and make the result more accurate. A schematic sketch of this two-stage split is given below.
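As an illustration of the two-stage structure, the driver below strings the stages together. This is a minimal sketch: `predict_velocity` and `mean_pixel_shift` are placeholder stand-ins for the RAFT + EfficientNet predictor and the ORB + sparse optical flow tracker detailed in the following sections, and the median-based post-processing is a simplification of Equations (7)-(10).

```python
import random

# Placeholder stand-ins for the two measurement channels described below;
# names and returned values are illustrative, not the authors' code.
def predict_velocity(prev, curr):
    return 0.5 + random.uniform(-0.1, 0.1)   # RAFT + EfficientNet prediction (m/s)

def mean_pixel_shift(prev, curr):
    return 25.0 + random.uniform(-2.0, 2.0)  # ORB corners + sparse optical flow (pixels)

def external_calibration(frames, n_cal=25):
    """EC process: traverse n_cal consecutive frame pairs, collect predicted
    velocities and pixel displacements, and reduce each list to a calibration
    value (simplified here to the median)."""
    pairs = list(zip(frames, frames[1:]))[:n_cal]
    velocities = sorted(predict_velocity(a, b) for a, b in pairs)
    shifts = sorted(mean_pixel_shift(a, b) for a, b in pairs)
    return velocities[n_cal // 2], shifts[n_cal // 2]

frames = list(range(100))                    # stand-in for decoded video frames
v_cal, d_cal = external_calibration(frames)
print(f"velocity calibration {v_cal:.3f} m/s, pixel calibration {d_cal:.2f} pix")
```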
Figure 3. Flow chart of the internal and EC velocity detection algorithm.

EC process

Inside the EC process, the treatment is divided into a flow velocity calibration part and a pixel calibration part. The flow velocity calibration part, shown in the EC process of Figure 3, cyclically traverses consecutive frame pairs of the input video, performs optical flow estimation and then flow velocity detection, and stores the results in the predicted value list V. The pixel calibration part first performs corner detection on the first frame to obtain a corner set, and then uses the sparse optical flow algorithm (Barron et al. 1992) to track the detected corners, as shown in Equation (4), where the inputs are the previous frame and the current frame and the output is the set of tracked corners. The tracking state quantity is then used to filter the tracked corners, yielding the filtered corner set of the previous frame and that of the current frame, as shown in Equation (5):
(4)
(5)
The pixel distances between the two corner sets are calculated and sorted from largest to smallest; a distance threshold is set and the distances are filtered once more to finally obtain n pixel distances. The average pixel shift of the current water flow image is obtained by averaging these distances. After looping through the input video, the pixel displacement list is obtained, as shown in Equation (6):
(6)
where the distance is the Euclidean distance, n is generally between 5 and 10 and represents the number of retained corner distances, and M denotes the averaging operation. A sketch of this step follows.
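A possible OpenCV realization of this corner detection and tracking step is sketched below. The ORB detector, the pyramidal Lucas-Kanade tracker, the largest-to-smallest sort, the distance threshold, and the retention of n = 5-10 distances follow the text; all numeric defaults are illustrative assumptions.

```python
import cv2
import numpy as np

def mean_pixel_shift(prev_gray, curr_gray, n_keep=5, max_dist=50.0):
    """Track ORB corners from the previous frame into the current frame and
    return the average pixel displacement (cf. Equations (4)-(6))."""
    orb = cv2.ORB_create()
    kps = orb.detect(prev_gray, None)
    if not kps:
        return 0.0
    p0 = np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2)
    # Sparse (pyramidal Lucas-Kanade) optical flow tracking, Equation (4):
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    good0, good1 = p0[status == 1], p1[status == 1]  # Equation (5): keep tracked corners
    dists = np.linalg.norm(good1 - good0, axis=1)    # Euclidean pixel distances
    dists = np.sort(dists)[::-1]                     # sort from largest to smallest
    dists = dists[dists < max_dist][:n_keep]         # threshold, then keep n distances
    return float(dists.mean()) if dists.size else 0.0

prev_gray = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
curr_gray = np.roll(prev_gray, 3, axis=1)            # synthetic 3-pixel horizontal shift
print(mean_pixel_shift(prev_gray, curr_gray))
```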
After the aperiodic velocity sequence V and the pixel displacement sequence are obtained, the post-processing part of the EC process is initiated. The number of points m of the sub-multipoint average and the moving interval are set to perform an m-point moving average; m is generally set to about one-fifth of the number of traversed frames i, and the moving interval to about one-half of m, as shown in Equations (7) and (8), to obtain the corresponding video flow velocity candidate calibration value and pixel candidate calibration value.
(7)
(8)
where the index runs over the moving averages; in Equation (7), the moving interval determines the starting position of each window, and each term is the average of the corresponding window. In Equation (8), the two results are the velocity candidate calibration value and the pixel candidate calibration value of the corresponding flow video, respectively.
The predicted velocity value list V and the pixel displacement list are then sorted from smallest to largest and their medians are taken, as shown in Equation (9):
(9)
where the sort operator orders the sequence from smallest to largest, the median operator takes out the middle element, and the outputs are the second candidate calibration values of the flow velocity and of the pixel displacement.
The minima over the candidate values are used as the final flow velocity calibration value and pixel calibration value of the corresponding water flow video, as shown in Equation (10), and are passed to the internal calibration part of the detection process:
(10)
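Read together, Equations (7)-(10) reduce each raw sequence to a single calibration value. The sketch below encodes our reading: moving averages (Equations (7)-(8)), the median of the sorted list (Equation (9)), and the minimum of the two candidates (Equation (10)). Since the original formulas are not reproduced above, the exact choice of the first candidate (here, the last moving average) is an assumption.

```python
def moving_average_candidates(seq, m, interval):
    """m-point moving averages taken every `interval` elements (Eqs (7)-(8))."""
    return [sum(seq[s:s + m]) / m for s in range(0, len(seq) - m + 1, interval)]

def median_candidate(seq):
    """Median of the sorted sequence (Equation (9))."""
    s = sorted(seq)
    return s[len(s) // 2]

def calibration_value(seq, num_frames):
    m = max(1, round(num_frames / 5))    # sub-average length: about 1/5 of the frame count
    interval = max(1, m // 2)            # moving interval: about 1/2 of m
    cand1 = moving_average_candidates(seq, m, interval)[-1]  # first candidate (assumption)
    cand2 = median_candidate(seq)                            # second candidate
    return min(cand1, cand2)                                 # Equation (10)

velocities = [0.52, 0.49, 0.55, 0.61, 0.50, 0.48, 0.53, 0.57, 0.51, 0.49]
print(calibration_value(velocities, num_frames=10))  # 0.5
```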

Detection process

Inside the detection process, the protocol is divided into two parts: detection and internal calibration. The detection part reads the water flow video as consecutive frame pairs. After the optical flow estimation network of Figure 1 and the prediction network illustrated in Figure 2, the initial predicted flow velocity value is obtained and passed to the internal calibration part, which is in turn divided into flow velocity internal calibration and pixel internal calibration.

Internal calibration of flow velocity

The anomaly count and the list of anomalous elements are initialized to empty values. The internal calibration is then performed as shown in Figure 4. At the first detection, or whenever the anomaly count equals 0, the predicted value of the video frame is assessed to determine whether it is close to the calibration value, as shown in Equation (11):
(11)
where the left-hand side is the abnormal probability value used to determine whether the prediction deviates too far from the calibration value. If this probability is lower than the abnormal probability factor, the fluctuation of the prediction is within the normal range and no abnormal processing is performed in the next step. If the probability is greater than the factor, the predicted value is abnormal and needs to be calibrated, with the calibration formula shown in Equation (12):
(12)
where the first term represents the sum of the abnormal list and j is the number of traversed image frames. After the predicted flow velocity value has been calibrated by the flow velocity calibration, it is further calibrated by the pixel calibration value.
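Because the exact forms of Equations (11) and (12) are not reproduced above, the sketch below is only a schematic reading of the described control flow: the deviation of the prediction from the calibration value is compared against the abnormal probability factor, abnormal predictions are accumulated, and the accumulated list drives the recalibration once the abnormal constant is reached. The deviation measure and the update rule are our assumptions.

```python
def calibrate_velocity(v_pred, v_cal, state, factor=0.3, const=8):
    """Schematic internal flow velocity calibration (cf. Equations (11)-(12)).

    state: dict holding the list of anomalous elements.
    factor: abnormal probability factor; const: abnormal constant.
    """
    deviation = abs(v_pred - v_cal) / v_cal       # assumed deviation measure, Eq. (11)
    if deviation <= factor:
        return v_pred, state                       # normal fluctuation: keep the prediction
    state["anomalies"].append(v_pred)              # record the abnormal prediction
    if len(state["anomalies"]) >= const:
        # Assumed recalibration: pull the output toward the mean of the
        # anomalous list, then reset the list (cf. Equation (12)).
        v_new = sum(state["anomalies"]) / len(state["anomalies"])
        state["anomalies"].clear()
        return v_new, state
    return v_cal, state                            # clamp abnormal output to calibration

state = {"anomalies": []}
for v in [0.52, 0.95, 0.97, 0.99, 1.01, 1.00, 0.98, 1.02, 0.99]:
    out, state = calibrate_velocity(v, 0.5, state)
    print(round(out, 3))
```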
Figure 4. Flow chart of the internal and EC velocity detection algorithm.

Intra-pixel calibration

The corner detection and optical flow tracking process of the 'EC process' above is repeated, the pixel displacement between the current frame and the previous frame is appended to the pixel list, and the length of the abnormal element list is used as the length of this pixel list. The average pixel displacement obtained by combining Equations (7) and (8) is compared with the pixel calibration value, and the updated flow velocity calibration value is obtained by combining it with the flow velocity EC value, as shown in Equation (13):
(13)
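Figure 12 and Table 5 indicate an approximately linear relation between the pixel displacement and the flow velocity, so Equation (13) can plausibly be read as scaling the EC velocity by the ratio of the current mean pixel displacement to the pixel calibration value. The proportional form below is our assumption, not a reproduction of the original equation.

```python
def update_velocity_calibration(v_ec, d_ec, pixel_list):
    """Assumed proportional reading of Equation (13): scale the EC velocity
    v_ec by the ratio of the current mean pixel displacement to the pixel
    calibration value d_ec."""
    d_now = sum(pixel_list) / len(pixel_list)
    return v_ec * d_now / d_ec

# Values from Table 5: the EC at frame 10 gives 5.218 pix and 0.108 m/s.
print(round(update_velocity_calibration(0.108, 5.218, [16.144]), 3))
# 0.334; Table 5 reports 0.351 at frame 50, so the exact relation likely
# differs slightly from this simple proportionality.
```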

Datasets

Several video data sets were collected from different rivers for model training. They represent various flow conditions such as turbulence, reflection, calm water, and dark light. The five video clips were recorded at a resolution of 1,920 × 1,080 pixels, and two velocity measurement regions were selected for each clip, giving a total of 10 velocity measurement areas, as shown in Figure 5. The average velocity between two adjacent frames on the river surface of each measurement region, obtained from the open-source software Fudaa-LSPIV, is used as the reference flow velocity value of the video frames.
Figure 5. Display maps of the 10 velocity measurement areas for the RAFT model training.
In order to simulate river scenes in complex environments, the velocity measurement areas are augmented by affine transformation and color transformation. The affine transformation comprises rotation, translation, and scaling operations, as follows:
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = A\begin{pmatrix} x \\ y \end{pmatrix} + B \qquad (14)$$

$$A = s\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (15)$$

$$B = \begin{pmatrix} t_x \\ t_y \end{pmatrix} \qquad (16)$$

where $(x, y)$ is a pixel coordinate of the velocity measurement area image, $A$ encodes the rotation (angle $\theta$) and scaling (factor $s$) of the image, and $B$ encodes the translation $(t_x, t_y)$. The affine-transformed image of the velocity measurement area then undergoes a color transformation to simulate the real shooting environment. The color transformations used in the approach are Gaussian blur and contrast change:
  • (1) Gaussian blur: a Gaussian kernel function is used to convolve the image, and the degree of blur can be controlled by changing the size of the Gaussian kernel. For water flow images taken by a camera, different Gaussian kernels can simulate the camera at different focal lengths, as follows:
    $$G(x, y) = A\exp\!\left(-\left(\frac{(x - x_0)^2}{2\sigma_x^2} + \frac{(y - y_0)^2}{2\sigma_y^2}\right)\right) \qquad (17)$$
    where $A$ is the magnitude, $(x_0, y_0)$ is the center point coordinate, and $(\sigma_x, \sigma_y)$ is the variance.
  • (2) Contrast change: under the condition of keeping the average brightness unchanged, the difference between the brightest white and the darkest black in the light and shade areas of the image is changed to simulate the light and shade variations of the real world (a combined augmentation sketch is given after this list), as follows:
    $$I' = \bar{I} + (1 + k)\,(I - \bar{I}) \qquad (18)$$
    where $k \in [-1, 1]$, $\bar{I}$ is the average brightness of the image, and $I$ is the original value of the pixel.
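The augmentation chain can be sketched with OpenCV and NumPy as below. The three steps follow Equations (14)-(18); all parameter values, and the contrast formula itself (reconstructed above), are illustrative assumptions.

```python
import cv2
import numpy as np

def augment(img, angle=5.0, scale=1.05, tx=4, ty=2, ksize=5, k=0.2):
    """Affine transform (Eqs (14)-(16)), Gaussian blur (Eq (17)), and
    contrast change (Eq (18)); parameter values are illustrative."""
    h, w = img.shape[:2]
    # Rotation + scaling (matrix A) followed by translation (vector B):
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    M[:, 2] += (tx, ty)
    out = cv2.warpAffine(img, M, (w, h))
    # Gaussian blur: larger kernels simulate different camera focal lengths.
    out = cv2.GaussianBlur(out, (ksize, ksize), 0)
    # Contrast change with the average brightness kept fixed, k in [-1, 1]:
    mean = out.mean()
    out = np.clip(mean + (1.0 + k) * (out.astype(np.float32) - mean), 0, 255)
    return out.astype(np.uint8)

img = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
print(augment(img).shape)  # (1080, 1920, 3)
```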
After data augmentation, two consecutive frames of each of the 10 test areas, at the same resolution, are passed through the RAFT optical flow network. The two-dimensional optical flow map represents the motion information of each pixel and is converted into an RGB map for convolutional neural network training and testing. In total, 10,024 water flow images with motion information generated by dense optical flow are combined with the corresponding velocity labels to form the experimental data set. The RGB optical flow images corresponding to Figure 5 are shown in Figure 6: because the texture features differ, the generated optical flow images differ as well, as do their corresponding velocity labels.
Figure 6. Display maps of the optical flow of the 10 velocity measurement areas.

Hardware configuration

The experimental hardware configuration is shown in Table 1.

Table 1. Hardware configuration

GPU: NVIDIA Tesla M60 (8 GB)
CPU: Intel(R) Xeon(R) Gold 6132 @ 2.60 GHz (64 GB RAM)
Operating system: Linux 3.10.0-1160.49.1

Experiment setting

The regression model (Tan et al. 2020) is selected as the experimental benchmark model. The training epoch is set to 300, the training batch to 32, and the first three epochs are used to warm up the model; the initial learning rate is 0.01, the input size of the training and test images is 640 × 640 pixels, and the optimizer is AdamW (Loshchilov & Hutter 2017). The additional experimental hyperparameter settings are shown in Table 2.

Table 2. Hyperparameter settings

Hyperparameter    Default
Batch division    320
Momentum          0.937
Weight decay      0.0005
Saturation        1.5
Exposure          1.5
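For reference, these settings can be wired together as below. Mapping the momentum value of Table 2 onto AdamW's first beta and warming up with a linear schedule are our assumptions; the paper does not state how these hyperparameters enter the optimizer.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR

model = torch.nn.Conv2d(3, 1, 3)               # stand-in for the regression model
optimizer = AdamW(model.parameters(), lr=0.01,
                  betas=(0.937, 0.999),        # momentum 0.937 read as beta1 (assumption)
                  weight_decay=0.0005)
steps_per_epoch = 10024 // 32                  # data set size / batch size
warmup = LinearLR(optimizer, start_factor=0.1, # linear warm-up over the first 3 epochs
                  total_iters=3 * steps_per_epoch)
```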

Evaluation metrics

In the approach, the root mean square error (RMSE) is used to measure the error between the predicted flow velocity values and the reference flow velocity labels, as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\big(\hat{v}_i - v_i\big)^2} \qquad (19)$$

where $m$ represents the number of test RGB optical flow images, $\hat{v}_i$ represents the predicted flow velocity value of the $i$-th RGB optical flow image, and $v_i$ represents its actual flow velocity label.
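Equation (19) is the standard root mean square error; a direct NumPy transcription follows.

```python
import numpy as np

def rmse(predicted, reference):
    """Root mean square error between predicted velocities and labels (Eq. (19))."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

# Example using the Table 5 values: calibrated predictions vs real velocities (m/s).
pred = [0.114, 0.351, 0.526, 0.631, 0.560, 0.399, 0.505, 0.880]
real = [0.108, 0.374, 0.553, 0.656, 0.571, 0.387, 0.486, 0.865]
print(round(rmse(pred, real), 4))
```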

Experiment

Experimental comparison of external calibration algorithms

The water flow video recorded at the Beichuan mountain torrent (Sichuan, China) is used as the test set to assess the performance of the algorithm. The maximum width of the river is 3.2 m, the average water depth is about 0.46 m, and the slope is 3%. The pixel EC performs corner detection on the first frame to obtain the corners of each velocity measurement region. To ensure that the number of corners is not too small, the approach compares a variety of corner detection algorithms (Shi & Tomasi 2002; Rosten et al. 2009) on calmer water, as shown in Figure 7.

With the same corner threshold, the Oriented FAST and Rotated BRIEF (ORB) algorithm detects more corners than the other corner detection algorithms, and its corners reflect more texture features. The corner at pixel (80, 236) in Figure 7(c) reflects the texture features of the image, but it is not detected in (a) and (b). After this comparison, the method proposed in this paper uses the ORB algorithm to detect the corner points used for pixel displacement.
Figure 7. Comparison of corner detection algorithms.
Figure 8. Sparse optical flow tracking in two velocity measurement regions: (a) corners of the initial detection and (b) corner trajectories tracked after 10 frames.

Each time a pair of consecutive frames of the water flow video is traversed, the predicted flow velocity output by the EC part is stored in the flow velocity list. The pixel EC part performs sparse optical flow tracking based on the corner points detected in the first frame and stores the corner pixel displacement obtained each time in the displacement list. As shown in Figure 8(a), two velocity measurement areas in the flood-time flow video are selected for pixel EC; the numbers of detected corner points are 4 and 5, respectively. The trajectories obtained by sparse optical flow tracking of the corner points are shown in Figure 8(b). The sparse optical flow tracking algorithm can judge the direction of the water flow and assess its pixel displacement.

In the actual scene, the displacement of each position on the river surface is different. The above detection and tracking algorithm obtains the displacement of key feature points on the river surface, which is used to reflect the displacement of each position on the actual river surface, that is, the overall displacement of the river surface. After multiple video frames are traversed, the calibration pixel value of the EC is obtained from the resulting displacement list through the calculation formulas in 'EC process' above.

It is important to test the effect of the number of traversals of the EC algorithm on its output calibrated flow velocity values and pixel calibration values. Figure 9 shows the experimental results for a total of five flow measurement areas in two rivers with low and high flow velocities; the video duration is 10 min. In these graphs, the solid lines represent the actual flow velocity values measured by the velocimeter, different colors show the different flow measurement regions, and the dashed line indicates the output results when traversing up to 25 frames. When the traversal number is around 25 frames, the calibrated flow velocity value reaches the inflection point nearest to the actual flow velocity value in all five velocity measurement regions, and after the inflection point it fluctuates around it. In particular, as shown in Figure 9(b) for the high-flow velocity region, the calibrated flow velocity fluctuates more in the interval of 10 to 20 calibration frames, but the amplitude of fluctuation decreases as the number of calibrations increases. The experimental results of the pixel EC are similar to those of the velocity EC, with the pixel calibration also stabilizing around 25 frames, so the traversal number of the EC algorithm is set to 25 frames. The calibration flow velocity values and pixel calibration values of the five velocity measurement regions were obtained by the EC algorithm and then used in the internal calibration algorithm.
Figure 9. The influence of image frame traversal times on EC parameters: (a) changes of the calibration parameters at low flow velocities and (b) changes of the calibration parameters at high flow velocities.
Table 3 shows the flow velocity calibration values and pixel calibration values of the five flow measurement areas when the EC algorithm traverses 10, 25, and 50 frames, together with the RMSE between the predicted flow velocity value and the corresponding actual value when 50 frames are traversed without the EC algorithm. In Table 3, velocity measurement areas 1-5 correspond sequentially to the five line segments in Figure 9, where the reference flow velocity values range from 0.1 to 0.8 m/s from low to high. According to Table 3, when 10 frames are traversed, the RMSE of the flow velocity calibration value and the pixel calibration value in the five measurement areas is mostly greater than 0.1, and the errors between the calibration values and the reference values are large. When the traversal reaches 25 frames, the historical values are gradually averaged by the post-processing, which brings them closer to the true measured value, and the RMSE falls below 0.07. The RMSE of the flow velocity calibration value of velocity measurement area 2 and of the pixel calibration value of velocity measurement area 1 reaches its minimum at 50 frames, but the calibration time is then too long. Without the EC, in contrast, the RMSE corresponding to the traversal of 50 frames reaches its maximum in all five velocity measurement regions. In terms of overall performance, an EC traversal of 25 frames provides the best trade-off.
Table 3. The influence of the traversal times of the EC algorithm on the calibration parameter RMSE

The velocity measurement area   RMSE (10 frames, m/s)   RMSE (10 frames, pix)   RMSE (25 frames, m/s)   RMSE (25 frames, pix)   RMSE (50 frames, m/s)   RMSE (50 frames, pix)   RMSE without EC (m/s)
Measurement area 1              0.1351                  0.1118                  0.0535                  0.0621                  0.0602                  0.0593                  0.1652
Measurement area 2              0.1203                  0.0753                  0.0619                  0.0475                  0.0576                  0.1033                  0.1364
Measurement area 3              0.1146                  0.0935                  0.0514                  0.0693                  0.0741                  0.0724                  0.1272
Measurement area 4              0.1609                  0.1590                  0.0504                  0.0468                  0.0536                  0.0492                  0.2114
Measurement area 5              0.1832                  0.1647                  0.0621                  0.0515                  0.0679                  0.0524                  0.2335
Figure 10. The influence of the abnormal probability factor and the abnormal constant on the change of flow velocity: (a) influence of different values of the abnormal constant and (b) influence of different values of the abnormal probability factor.

Experimental comparison of internal calibration algorithms

The internal calibration algorithm calibrates the predicted output flow velocity value according to the calibration parameters output by the EC algorithm. Since the internal calibration algorithm influences the prediction results, it should be analyzed. When the flow velocity changes drastically, the internal calibration algorithm needs to compare the predicted flow velocity value of the current frame with the calibration flow velocity value. Figure 10 shows the influence of the values of the abnormal probability factor and the abnormal constant on the internal calibration results.

Taking velocity measurement area 1 as an example, the change of flow velocity is simulated by frame extraction, and the red line segment in Figure 10(a) represents the actual flow velocity. During the test, no frame extraction is performed within the first 10 frames, and the internal calibration algorithm builds up the abnormal element list to facilitate subsequent calibration. The flow velocity reaches its maximum at 140 frames. The weakly calibrated black line segment (large abnormal constant) cannot track the change of the measured flow velocity in time and generates a large error with respect to the reference measurement, whereas the strongly calibrated green line segment tracks the rapid change of the flow velocity in time by continuously calibrating toward the value closest to the instantaneous one. As the flow velocity decreases, the influence of the abnormal constant is obviously weaker, and the error between strong and weak calibration is relatively small. When the flow velocity increases rapidly again (after 300 frames), the error with respect to the reference measurement increases significantly and the influence of the abnormal constant is enhanced. Appropriately reducing the abnormal constant to increase the calibration frequency within the algorithm makes the prediction result more accurate; the constant is generally set to 8.
Figure 11. Test results of intra-pixel calibration when the flow velocity changes.

The test environment for the sensitivity of the abnormal probability factor is the same as that of the abnormal constant, and the error analysis is performed against reference measurements. As shown in Figure 10(b), the flow velocity reaches its maximum at 140 frames. When the threshold is adjusted to 0.6, the error of the flow velocity curve reaches its maximum relative to the curves of the other thresholds, but it remains smaller than the maximum error induced by the abnormal constant. This is because the abnormal probability factor is only responsible for judging abnormal flow velocities within the algorithm; the calibration itself is driven by the abnormal constant, so the factor has a weaker effect on the flow velocity. Similarly, the relative error is small when the flow velocity decreases, and the error increases when it rises rapidly. The abnormal probability factor can be appropriately reduced to lower the threshold for screening abnormal flow velocities, so that the prediction results become more accurate through more frequent calibration. In summary, setting the factor to 0.3 and the constant to 8 makes the prediction results most accurate.

The RMSE of the test flow velocity for different values of the abnormal probability factor and the abnormal constant is shown in Table 4. Test data of 450 frames after frame extraction are used for verification. The setting of 0.3 for the factor and 8 for the constant minimizes the RMSE, while 0.3 and 20 maximize the error at 0.2134. It can therefore be concluded that as the two parameters increase, the error of the predicted output velocity also increases. This can be explained by the unsteady flow conditions in rivers; consequently, a high-frequency calibration process is needed.

Table 4. RMSE of the test flow velocity for the abnormal probability factor and the abnormal constant

Abnormal probability factor   0.3      0.3      0.3      0.3      0.4      0.5      0.6
Abnormal constant             8        12       16       20       8        8        8
RMSE                          0.0684   0.0875   0.1222   0.2134   0.0831   0.1024   0.1545

Based on the predicted output flow velocity results with the above parameter settings (factor 0.3, constant 8), the number of interval frames for pixel calibration needs to correspond to the abnormal constant; here, intra-pixel calibration is performed every eight frames. The test results of the intra-pixel calibration are shown in Figure 11. The initial calibration value after EC is shown as the blue point in the lower left corner, and the pixel calibration performed continuously at eight-frame intervals yields the updated calibration values, shown as orange points. The flow velocity calibration value after pixel calibration is closer to the real measured value, so this accurate calibration value is used to calibrate the predicted output, as shown by the green line. When the flow velocity changes rapidly within the river, the pixel calibration process ensures that the flow velocity calibration value stays synchronized with the instantaneous value, laying a foundation for the accuracy of the predicted output.

In the internal calibration, the pixel calibration values and the corresponding real flow velocity values of seven different frames are selected, as shown in the second and third rows of Table 5. Among them, the 10th frame carries the pixel calibration value of 5.218 output by the EC. The last row is the internal calibration flow velocity calculated by Equation (13) from the internal calibration pixel value. The error is visibly small and represents the relationship between the internal calibration pixel value and the internal calibration flow velocity value obtained in the internal calibration algorithm.

Table 5. Calibration experiment of intra-pixel calibration

Frame                      10 (EC)   50       90       140      210      300      370      450
Pixel value (pix)          5.218     16.144   24.319   31.877   22.653   18.678   28.934   41.153
Real flow velocity (m/s)   0.108     0.374    0.553    0.656    0.571    0.387    0.486    0.865
Flow velocity (m/s)        0.114     0.351    0.526    0.631    0.560    0.399    0.505    0.880

The relationship between the internal calibration pixel values and the corresponding true flow velocity values is shown by the blue dots in Figure 12, and the line represented by Equation (13) is shown in orange. The line fitted between the internal calibration pixel values and the true flow velocity values is close to the orange line, indicating that Equation (13) can be used to update the internal calibration value and make the prediction more accurate.
Figure 12. Relationship between the internal calibration pixel values and the corresponding true values.
After the first 50 frames of the five velocity measurement areas are traversed to obtain the corresponding calibrated flow velocity values by the EC algorithm, flow velocity prediction is performed in real time on the subsequent images of the corresponding velocity measurement areas in the detection process, with the experimental parameters set as above (abnormal probability factor 0.3, abnormal constant 8). Figure 13 compares the EC flow velocity, the calibrated flow velocity, and the deep learning predicted flow velocity for the five velocity measurement areas over a duration of 200 s, with a flow velocity sampling interval of 4 s and a total of 50 sampling points. The original deep learning predictions fluctuate widely, have poor accuracy, and deviate from the externally calibrated flow velocity. The calibrated flow velocity after internal calibration fluctuates within the range of the EC flow velocity, with small amplitude and high accuracy. In disaster warning, it is necessary to maintain stable flow velocity values to prevent transient flow disturbances (Yang 2021).
Figure 13. Comparison of the predicted flow velocity values in different flow measurement regions: (a) low-flow velocity regions and (b) high-flow velocity regions.

Comparison of different video velocimetry techniques

Table 6 compares the RMSE of this method with other commonly used video velocimetry techniques, including optical tracking velocimetry (OTV) (Tauro et al. 2018) and the conditional boundary equilibrium generative adversarial network (CBEGAN) (Wang et al. 2019). OTV uses ORB combined with sparse optical flow to calculate the pixel displacement in the same region, and the predicted flow velocity value is obtained from the mapping relationship between the actual distance and the image distance (Hu et al. 2021). The classification model in CBEGAN adopts the EfficientNet used in this paper, and the training data are the same.

Table 6. Comparison of different video velocimetry techniques (RMSE)

The velocity measurement area   OTV     CBEGAN   Ours
Measurement area 1              0.117   0.085    0.0547
Measurement area 2              0.103   0.076    0.0413
Measurement area 3              0.095   0.092    0.0651
Measurement area 4              0.214   0.152    0.091
Measurement area 5              0.385   0.197    0.099

Compared with the other two algorithms, the proposed method achieves the smallest RMSE in all five velocity measurement regions. The highest RMSE, that of OTV, may be due to the interference of video pixels and the environment: some corner points are missed by the OTV detector, as shown in Figure 7, which leads to inaccurate pixel displacements and affects the stability of the predicted flow velocity value. In the high-flow velocity region, its RMSE reaches a maximum of 0.385, which may be due to the breakpoint problem during sparse optical flow tracking (Liuxia et al. 2019). CBEGAN directly uses a classification model to label and classify the flow velocity value; because the training data set can hardly cover all environmental influences, its output is also inaccurate. Within the internal and EC framework of this paper, the optical flow estimation method improves the prediction accuracy of the classification model, and the pixel displacement is then used to correct the model's predicted flow velocity value. This is equivalent to a fusion of the two algorithms above, so that their respective advantages complement each other and the RMSE is minimized. At the same time, the EC stage of this method uses velocity values from adjacent time periods as a form of velocity pre-processing, which could also be applied to reduce the RMSE of the other two methods. The combined analysis of Table 6 and Figure 13 shows that our method also achieves high accuracy and stability in high-flow river areas.

In the proposed approach, a two-stage velocity measurement algorithm based on deep learning is implemented. In the EC process, the flow velocity calibration value and the pixel calibration value are obtained by post-processing the output of the two algorithms over a period of time. In the flow velocity detection stage, the predicted flow velocity output by the deep learning algorithm is then internally calibrated to obtain the calibrated predicted flow velocity. When the flow velocity changes (rain, confluence), the pixel calibration value can be used to update the flow velocity calibration. The experimental results show that the flow velocity calibration values provided by the EC reflect the instantaneous flow velocity in the corresponding velocity measurement area, that the subsequently detected flow velocities can rely on accurate predictions around the flow velocity calibration values, and that the internal calibration algorithm outputs correspondingly more accurate values when the flow velocity changes drastically. These preliminary results are very promising. While maintaining accuracy, our method has the advantage over traditional methods that it does not require knowledge of the internal and external parameters of the camera, thus reducing additional computational costs. Furthermore, any deep learning optical flow algorithm can be integrated into our two-stage framework.

Even though the recorded videos cover a wide range of hydraulic situations, additional tests should be performed with further examples collected in the field, especially for the fast transient conditions observed during extreme hydrological events. The possibility of producing efficient and reliable flow velocity monitoring solutions based on video recording is a major target and could contribute significantly to assessing extreme conditions and providing warning signals to exposed populations.

All relevant data are available from an online repository or repositories: https://zenodo.org/records/10685712.

This work was funded by the National Key R&D Program of China (Project No. 2023YFC3006700; Topic 5 No. 2023YFC3006705).

The authors declare there is no conflict.

Bao X., He Z. G., Wang Z. Y., Wu G. F., Liu G. H. & Qian J. L. 2012 Evaluation of flooding damage caused by dam flood in typhoon storm-affected area. Journal of Zhejiang University: Engineering Science 46 (9), 1638-1646.

Barron J. L., Fleet D. J., Beauchemin S. S. & Burkitt T. A. 1992 Performance of optical flow techniques. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June, Champaign, IL, USA, pp. 236-242.

Bechle A. J., Wu C. H., Liu W.-C. & Kimura N. 2012 Development and application of an automated river-estuary discharge imaging system. Journal of Hydraulic Engineering 138 (4), 327-339.

Chen W. & Mied R. P. 2013 Optical flow estimation for motion-compensated compression. Image and Vision Computing 31 (3), 275-289.

Cho K., Van Merriënboer B., Gulcehre C., Bahdanau D., Bougares F., Schwenk H. & Bengio Y. 2014 Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), October, Doha, Qatar, pp. 1724-1734.

Farneback G. 2003 Two-frame motion estimation based on polynomial expansion. Lecture Notes in Computer Science 2749, 363-370.

Fujita I. & Komura S. 1994 Application of video image analysis for measurements of river-surface flows. Proceedings of Hydraulic Engineering 38, 733-738.

Gao C. X. 2020 Application of hydrological information technology in flood control. Hydropower 10 (12), 39-40.

Gourbesville P. 2020 Which models for decision support systems? Proposal for a methodology. In Advances in Hydroinformatics: SimHydro 2019 - Models for Extreme Situations and Crisis Management. Springer, Singapore, pp. 3-17.

He K., Zhang X., Ren S. & Sun J. 2016 Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June, Las Vegas, USA, pp. 770-778.

Hu X. X., Wang H. N. & Zheng X. 2021 An Image Distance Information Extraction Method Based on Feature Objects and Perspective Transformation. Patent no. CN201911388812.7.

Ilg E., Mayer N., Saikia T., Keuper M., Dosovitskiy A. & Brox T. 2017 FlowNet 2.0: Evolution of optical flow estimation with deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, July 2017, Hawaii, USA, pp. 2462-2470.

Le Coz J., Jodeau M., Marchand B. & Boursicaud R. L. 2014 Image-based velocity and discharge measurements in field and laboratory river engineering studies using the free FUDAA-LSPIV software. In Proceedings of the International Conference on Fluvial Hydraulics, pp. 1961-1967.

Liu J., Zu J., Zhang Y. & Zhang H. Y. 2014 Optical flow estimation method under the condition of illumination change. Journal of Image and Graphics 19 (10), 1475-1480.

Liuxia X. D., Shen D. F., Zhang X. X. & Zhang G. Y. 2019 Improved LK method tracks mobile ball in complex background. Computer Systems and Applications 28 (7), 221-227.

Loshchilov I. & Hutter F. 2017 Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101.

Nohara D., Gourbesville P. & Ma Q. 2018 Towards development of effective decision support systems for integrated water resources planning and management. Disaster Prevention Research Institute Annuals 61 (B), 702-710.

Pizarro A., Dal Sasso S. F., Perks M. T. & Manfreda S. 2020 Identifying the optimal spatial distribution of tracers for optical sensing of stream surface flow. Hydrology and Earth System Sciences 24 (11), 5173-5185.

Rosten E., Porter R. & Drummond T. 2009 Faster and better: A machine learning approach to corner detection. IEEE Transactions on Pattern Analysis & Machine Intelligence 32 (1), 105-119.

Shi J. & Tomasi C. 2002 Good features to track. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 593-600.

Tan M. X. & Le Q. 2019 EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, June, Nanchang, China, pp. 6105-6114.

Tan M. X., Pang R. M. & Le Q. 2020 EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June, Seattle, USA, pp. 10778-10787.

Tauro F., Grimaldi S. & Porfiri M. 2014 Unraveling flow patterns through nonlinear manifold learning. PLoS One 9 (3), e91131.

Tauro F., Tosi F., Mattoccia S., Toth E., Piscopia R. & Grimaldi S. 2018 Optical tracking velocimetry (OTV): Leveraging optical flow and trajectory-based filtering for surface streamflow observations. Remote Sensing 10 (12), 2010.

Teed Z. & Deng J. 2020 RAFT: Recurrent all-pairs field transforms for optical flow. In Proceedings of the European Conference on Computer Vision. Springer, Cham, pp. 402-419.

Wang W. L., Qiu H. & Zheng J. W. 2018 Estimation method of river surface velocity based on compressed sensing image analysis. Journal of Hydroelectric Engineering 37 (5), 69-79.

Wang W. L., Yang S. L., Zhao Y. W. & Li Z. R. 2019 River surface velocity estimation based on conditional boundary equilibrium generative adversarial network. Journal of Zhejiang University (Engineering Edition) 53 (11), 2118-2128.

Yang G. 2021 Study on River Flow Measurement Based on CNN and Image Processing. Shandong University, Shandong.

Yang H. L., Ma S. S., Zhang J. Y., Fan M. J. & Zhang F. 2008 Comparative study on ultrasonic and velocity instrument in the flow measurement of open channel. Journal of Shandong Agricultural University (Natural Science Edition) 39 (2), 301-304.

Yao Q. 2022 Research on River Surface Velocimetry Based on Optical Flow Estimation. Zhejiang University, Zhejiang.

Zhang Z., Zhou Y., Li X. R., Chen H. & Liu L. H. 2018 Development and application of image flow measurement system. Water Conservancy Informatization 2018 (03), 7-13.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).
