Vertical slot fishways are hydraulic structures which allow the upstream migration of fish through obstructions in rivers. The appropriate design of these devices should take into account the behavior and biological requirements of the target fish species. However, little is known at present about fish behavior in these artificial conditions, which hinders the development of more effective fishway design criteria. In this work, an efficient technique to study fish trajectories and behavior in vertical slot fishways is proposed. It uses computer vision techniques to analyze images collected from a camera system and effectively track fish inside the fishway. Edge and region analysis algorithms are employed to detect fish in extreme image conditions, and Kalman filtering is used to track fish over time. The proposed solution has been extensively validated through several experiments, obtaining promising results which may help to improve the design of fish passage devices.
INTRODUCTION
The construction of water resources management works, such as dams, weirs, water diversions, and other barriers, leads to significant changes in the river ecosystem. These structures constitute a physical barrier to the natural movements of fish, which negatively impacts their populations. In fact, this interruption of free passage has been identified as the main reason for the extinction or depletion of numerous species in many rivers (Jackson et al. 2001).
One of the solutions to restore longitudinal connectivity of rivers is the construction of fishways, vertical slot fishways being a common and widely used type (Figure 1). These devices basically consist of a channel with a sloping floor that is divided by baffles into a series of pools. Water runs downstream in this channel through a vertical slot from one pool to the next one below. The hydraulic characteristics of vertical slot fishways vary according to the geometric dimensions and configuration of the pools and baffles and have been studied by both numerical and physical modeling (Rajaratnam et al. 1986; Puertas et al. 2004; Tarrade et al. 2008; Bermúdez et al. 2010; Chorda et al. 2010).
At the same time, several authors have examined fish swimming capabilities (Dewar & Graham 1994; Blake 2004) and found evidence that fish are confronted by a challenging hydrodynamic environment when they swim upstream through vertical slot fishways. However, these works are generally carried out in experimental flumes, in which flow features differ significantly from those found in vertical slot fishways, and little is known about fish behavior in these particular artificial conditions. Besides, very few works have studied the interaction between the biological and physical processes that are involved in ascending a vertical slot fishway (Puertas et al. 2012). Consequently, it is necessary to develop new methodologies to analyze fish behavior in these devices, which in turn will contribute to the establishment of more effective fishway design criteria.
One of the main difficulties in studies of fish passage is that the existing mechanisms to measure fish behavior, such as direct observation or placement of sensors on the specimens, are impractical or seriously affect the fish behavior (Castro-Santos et al. 1996). In this context, techniques based on the use of imaging and acoustic information can constitute an important alternative. Acoustic transmitters or scanners and video cameras have been used to observe fish for some time (Armstrong et al. 1992; Steig & Iverson 1998). In recent works, computer vision techniques have been used in applications such as fish tracking in a tank using color contrast and fluorescent marks (Duarte et al. 2004) or fish recognition in clear water by color properties and background subtraction (Chambah et al. 2004).
Considering a wider perspective, different authors have used image processing techniques to detect, count, or track fish. These works use techniques such as stereo vision (Petrell et al. 1997), background models (Morais et al. 2005), shape priors (Clausen et al. 2007), local thresholding (Chuang et al. 2011), moving average algorithms (Spampinato et al. 2008), particle image velocimetry techniques (Deng et al. 2004), pattern classifiers applied to the changes measured in the background (Lines et al. 2001), or artificial neural networks (ANN) (Rodriguez et al. 2011). Finally, some techniques based on infrared imaging (Baumgartner et al. 2010) or laser imaging detection and ranging technologies (Mitra et al. 2006) can also be found in the literature. Recently, these technologies have been integrated with Kalman filtering techniques to improve fish tracking. In Shortis et al. (2013) they were applied to monitor fish recorded with stereo imaging devices, whereas in Jensen & YangQuan (2013) they were used to locate fish tagged with radio transmitters.
However, it should be noted that most of the published techniques were developed for calm water and controlled lighting conditions and are therefore not suitable for use inside a fishway. In addition, some of these techniques use marks or light sources, which may influence fish behavior, while others employ special and expensive sensors, which may only be used at certain points of the structure.
This work proposes a technique to analyze fish behavior in vertical slot fishways, which is intended to complete the fishway design methodology with experimental results. In each test, living fish are introduced into a fishway model equipped with an overhead camera system, and fish trajectory and behavior are studied by means of computer vision techniques. The method can provide researchers with valuable information about fish–structure interaction, such as fish resting areas and times, fish velocities and accelerations, or passage success rates. The results can also be used to formulate and calibrate models that predict fish movement behavior (Weber et al. 2006). This may ultimately contribute to improving the efficiency of these types of devices.
This work continues the research started in Rodriguez et al. (2011), in which the camera acquisition system and the methodology to conduct the experiments are described. In that previous paper, a new technique combining image processing and ANN was proposed to automatically detect fish in the images. However, only preliminary results were obtained and the experimental tests were carried out with a single fish species. Also, the resulting accuracy was not compared with that of other techniques.
This paper focuses on improving the image analysis techniques to detect and track fish in a fishway. To this end, a new and more efficient technique is proposed, based on the combination of region and edge analysis. This technique replaces the ANN described in Rodriguez et al. (2011) which, although providing satisfactory results, was found to be slower and less accurate. Also, a new procedure based on the extended Kalman filter was integrated in the technique to effectively track fish in the presence of image occlusions or in a multiple fish scenario. It uses probabilistic analysis to compare detected and predicted fish positions and allows the separation of different trajectories and removal of anomalous detections.
Finally, several extensive experiments have been conducted to support the conclusions of this work, using 259 individuals of four fish species and different light and flow conditions. In total, the technique is applied to about 500 million images recorded in 15 assays performed from 2010 to 2012.
PROPOSED TECHNIQUE
The proposed technique detects and tracks fish in a vertical slot fishway, analyzing the images acquired with a multicamera system. It uses a combination of computer vision techniques and processes the information to obtain different parameters regarding the interaction between fish and fishway.
The technique has been tested in a full-scale fishway model, located at the CEDEX (Center for Studies and Experimentation of Public Works in Madrid, Spain), which consists of a 20 m long, 1.5 m wide, and 1 m deep flume (Figure 1). It contains 11 pools, with the geometric dimensions detailed in Figure 2. The slope of the fishway during the tests was 7.5% and the total discharge was set to 250 L/s.
The camera system consists of 28 cameras with fisheye lenses, placed in the fishway in an overhead perspective and partially submerged. The seven upper pools are covered (four cameras are installed in each pool, as shown in Figure 3), and the partial submersion of the cameras avoids the effects of turbulence and surface reflections. The cameras are integrated into a monitoring and data acquisition system, which is described in detail in Rodriguez et al. (2011).
The main steps of the proposed technique can be summarized as follows:
(1) camera calibration: the image distortion is eliminated and a projective model is designed to integrate measurements from the different cameras into a common coordinate space;
(2) segmentation: a dynamic background model of the scene is created and subtracted from the image to highlight regions where the fish can be found. A technique based on the combination of region and edge analysis is then applied to detect fish;
(3) representation and interpretation: the detected objects are translated into a descriptive representation which can be operated on in subsequent steps. A first filtering step is performed at this point;
(4) tracking: the detected objects are used as input to an extended Kalman filter, an algorithm that uses a motion model and works in a similar way to a Bayesian prediction–correction filter. It allows the deletion of isolated noise detections, the separation of different fish, the estimation of hidden positions, and the prediction of subsequent positions;
(5) filtering: a filtering process is performed on the obtained fish trajectory in order to remove artifacts in the detections and to ensure a smooth trajectory;
(6) data processing: the data are integrated with hydraulic information and processed to calculate the output information, such as fish velocities or accelerations.
Camera calibration
The first step of the algorithm is to calculate the transformation from coordinates in a particular camera to real coordinates in the fishway. This is performed in two stages: first, the parameters to correct and scale the image are obtained for each camera; second, the transformation of each camera into a common coordinate system is calculated.
To obtain the matrix M and the distortion coefficients k_i and p_i, the real geometry of a calibration pattern is compared with the geometry observable in the images of the pattern. The parameters of the model are solved using the technique proposed in Zhang (1999), minimizing the reprojection error with the Levenberg–Marquardt optimization algorithm. In this way, the coordinates of the point in real space (X, Y, Z) are calculated from the coordinates of the point in the image (x_i, y_i), taking into account x_c = X/Z and y_c = Y/Z for a given Z. In this work, the Z value was calculated assuming that the average distance between the fish and the bottom is 5 cm. The underlying assumption is that the fish swim preferentially near the bottom of the flume, which was verified in preliminary experimental tests and is consistent with observations made in similar studies (Mateus 2007; Silva et al. 2012; Branco et al. 2013).
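For reference, the projection model referred to above can be written in its usual form; the following is a sketch based on the standard Zhang (1999) pinhole formulation with radial (k_i) and tangential (p_i) distortion, and the exact parameterization used in the original calibration may differ:

```latex
% Sketch of a pinhole model with radial (k_i) and tangential (p_i) distortion,
% following the usual Zhang (1999) formulation; notation here is illustrative.
\begin{aligned}
 s\begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix} &= M \begin{pmatrix} x_d \\ y_d \\ 1 \end{pmatrix},
 \qquad M = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix},\\[4pt]
 x_d &= x_c\,(1 + k_1 r^2 + k_2 r^4) + 2 p_1 x_c y_c + p_2 (r^2 + 2 x_c^2),\\
 y_d &= y_c\,(1 + k_1 r^2 + k_2 r^4) + p_1 (r^2 + 2 y_c^2) + 2 p_2 x_c y_c,\\
 r^2 &= x_c^2 + y_c^2, \qquad x_c = X/Z, \quad y_c = Y/Z.
\end{aligned}
```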
In addition, the refraction of light in the water must be considered. Thus, cameras should be calibrated underwater or refraction should be modeled with an additional transformation. In this case, an affine model is employed to perform this task.
To estimate the transformation into the common coordinate system, a number of visual marks have been placed in the areas where the vision fields of the cameras overlap (Figure 3).
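As an illustration, the mapping from each undistorted camera view to the common fishway coordinate system could be estimated from the overlapping visual marks with a planar homography; the following sketch uses OpenCV, and the mark coordinates and helper function are hypothetical, not the exact procedure of the original system.

```python
import cv2
import numpy as np

# Hypothetical example: pixel coordinates of four visual marks seen by one
# camera (after undistortion) and their known positions in the common
# fishway coordinate system (metres).
marks_px = np.array([[102, 88], [516, 95], [508, 412], [110, 405]], dtype=np.float32)
marks_xy = np.array([[0.00, 0.00], [0.97, 0.00], [0.97, 0.75], [0.00, 0.75]], dtype=np.float32)

# Planar homography from the image plane to the fishway plane (assumes the
# fish swim close to a known plane, here 5 cm above the bottom, as above).
H, _ = cv2.findHomography(marks_px, marks_xy)

def to_fishway_coords(point_px):
    """Map an undistorted pixel position to common fishway coordinates."""
    p = np.array([point_px[0], point_px[1], 1.0])
    q = H @ p
    return q[:2] / q[2]

print(to_fishway_coords((300, 250)))
```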
Segmentation
Image segmentation is the process of dividing an image into multiple parts, in this case, with the aim of separating the fish from the background. Every segmentation algorithm addresses two problems: the criteria for a good partition and the method for achieving an efficient one (Yilmaz et al. 2006). In statistics, this problem is known as cluster analysis and is a widely studied area with hundreds of different algorithms (Szeliski 2011).
Therefore, it is first necessary to find a variable or group of variables (features) which allow a robust separation of the fish from the background, as well as to choose a classification technique according to the selected variables. This problem requires analysis of the distinctive visual properties of the fish and the background (Cheng et al. 2001; Zhang et al. 2008).
At present, the most common criteria to detect fish in images are based on color features and a priori knowledge of the background. However, these techniques do not perform well in underwater images, even for calm water and high-quality images, due to the low levels of contrast (Lines et al. 2001). Moreover, the images acquired in this study are characterized by extreme luminosity changes and very high noise levels, which makes texture descriptors and a priori color features useless.
Taking this into account, different techniques are considered in this work. They are detailed in the Results section and provide a comparative framework for evaluating the system performance. These techniques involve a two-step process. First, the variability of the images is reduced using knowledge of the background. Second, an adaptive analysis is performed either on the discontinuities of the image (edge-based classification) or on the local similarity of pixels (region-based classification). Since the system must be operated by non-experts, only non-supervised techniques have been considered and, given the huge amount of images to be analyzed, computational complexity was considered a critical factor.
One of these techniques, previously developed in Rodriguez et al. (2011), consists of a self-organizing map (SOM) neural network (Kohonen 1982). The SOM model is aimed at establishing a correlation between the numerical patterns supplied as input and a two-dimensional output space (topological map). This characteristic can be applied to image segmentation, and SOM networks have been widely used in the image analysis field (Ahmed & Farag 1997; Verikas et al. 1997; Waldemark 1997; Ngan & Hu 1999; Dong & Xie 2005). Although promising results were obtained in this early work (Rodriguez et al. 2011), the SOM approach requires more computational time than more straightforward techniques, such as the ones proposed in this paper. Moreover, its performance depends on the training patterns selected.
In this work, a combination of two simple techniques is selected, together with an image preprocessing procedure and a dynamic background modeling, to overcome the limitations of the SOM approach. The first selected technique is a modern implementation of the Canny edge detector, which proved to be less noisy than other edge-finding methods such as the Sobel or the Prewitt operators. Edges corresponding to the frontiers between fish and background are obtained by means of four directional filters which detect horizontal, vertical, and diagonal discontinuities in the derivatives of the images (Canny 1986).
To reduce the false positives obtained, the objects detected by the edge analysis are filtered using a second segmentation technique. This second technique is the Otsu method, which performs a region classification by automatically thresholding the image histogram. It is a fast and efficient technique for separating dark objects on a white background (Sezgin & Sankur 2004). The final outcome includes only the objects detected by the edge algorithm that overlap by at least 95% with those found with the region technique.
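A minimal sketch of this edge–region combination is given below, using OpenCV; the Canny thresholds, the morphological closing, and the overlap test are illustrative assumptions rather than the exact settings of the original implementation.

```python
import cv2
import numpy as np

def detect_fish_candidates(gray, min_overlap=0.95):
    """Combine Canny edges with an Otsu region mask, keeping only edge-based
    objects that overlap the region mask by at least `min_overlap`."""
    # Edge analysis: Canny detector with illustrative thresholds.
    edges = cv2.Canny(gray, 50, 150)
    # Close the edges so that they form candidate blobs.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    edge_blobs = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

    # Region analysis: automatic Otsu thresholding of the histogram
    # (dark fish on a lighter background, hence THRESH_BINARY_INV).
    _, region_mask = cv2.threshold(gray, 0, 255,
                                   cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Keep only edge-based components sufficiently covered by the region mask.
    n, labels = cv2.connectedComponents(edge_blobs)
    accepted = np.zeros_like(gray)
    for lbl in range(1, n):
        component = labels == lbl
        overlap = np.count_nonzero(region_mask[component]) / component.sum()
        if overlap >= min_overlap:
            accepted[component] = 255
    return accepted
```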
To include knowledge about the background in the method, a dynamic background is calculated forming a synthetic image. Before the application of the segmentation technique, the image is normalized and the foreground is extracted using this background model. This procedure is known in computer vision as background subtraction.
To update the background image over time, the background image BI_{i-1} and the current image I_i are divided into four regions, which are considered separately. Each region of the background image BI_{i-1} is updated with the corresponding region of the current image I_i if no objects were detected in it and if the time elapsed since the previous update exceeds a certain value (empirically set to 30 frames by default).
To enhance the image quality, the images are preprocessed using a standard contrast-limited adaptive histogram equalization technique. Also, the borders where the waterproof cases of the cameras produced a black region (without information) were masked.
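The preprocessing and background handling described above could be sketched as follows; the quadrant-based regions, the 30-frame update interval, and the use of absolute differencing for the foreground are simplified assumptions based on this description, not the exact original implementation.

```python
import cv2
import numpy as np

# Contrast-limited adaptive histogram equalization used to enhance the frames.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

class DynamicBackground:
    """Per-region background model, updated only when no fish is detected in
    a region and at least `update_interval` frames have elapsed (default 30)."""
    def __init__(self, first_frame, update_interval=30):
        self.background = clahe.apply(first_frame).astype(np.float32)
        self.update_interval = update_interval
        self.last_update = np.zeros(4, dtype=int)  # one counter per quadrant

    def regions(self, img):
        h, w = img.shape[:2]
        return [(slice(0, h // 2), slice(0, w // 2)),
                (slice(0, h // 2), slice(w // 2, w)),
                (slice(h // 2, h), slice(0, w // 2)),
                (slice(h // 2, h), slice(w // 2, w))]

    def foreground(self, frame, frame_idx, objects_per_region):
        frame = clahe.apply(frame)  # preprocessing step
        for i, (rs, cs) in enumerate(self.regions(frame)):
            no_objects = objects_per_region[i] == 0
            if no_objects and frame_idx - self.last_update[i] >= self.update_interval:
                self.background[rs, cs] = frame[rs, cs]
                self.last_update[i] = frame_idx
        # Background subtraction: highlight pixels that differ from the model.
        return cv2.absdiff(frame, self.background.astype(np.uint8))
```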
Representation and interpretation
As a result of the segmentation process, the image is divided into different regions representing background and possible fish. At this point, it is possible to apply higher-level processing, adding knowledge extracted from the characteristics of real fish, to interpret the segmented image. To this end, the objects detected in the previous step are translated into convenient descriptors, which can be used to perform different operations: their area, their centroid (calculated as the average position of the body pixels), and the minimum ellipse enclosing the body.
Subsequently, an algorithm classifies each detected body into fish or non-fish categories. The operation of this algorithm is divided into three stages, as shown in Figure 5. In the first stage, the detected bodies are discarded or classified as either fish or small bodies. To this end, a shape criterion based on value ranges of the above descriptors is defined for each fish species. In the second stage, close fish bodies are joined if the resulting body verifies the shape criterion. Finally, small bodies are either joined with detected fish or discarded. Figure 6 shows the results obtained when applying the image interpretation step.
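As an illustration, the descriptors used in this step (area, centroid, and minimum enclosing ellipse) can be computed with standard contour analysis; the shape thresholds in the sketch below are hypothetical placeholders, not the species-specific ranges used in the experiments.

```python
import cv2

def describe_bodies(binary_mask):
    """Return (area, centroid, ellipse) descriptors for each segmented body."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    bodies = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] == 0 or len(c) < 5:   # fitEllipse needs at least 5 points
            continue
        centroid = (m["m10"] / m["m00"], m["m01"] / m["m00"])
        ellipse = cv2.fitEllipse(c)       # ((cx, cy), (axis1, axis2), angle)
        bodies.append({"area": cv2.contourArea(c),
                       "centroid": centroid,
                       "ellipse": ellipse})
    return bodies

def is_fish(body, min_area=400, max_area=6000, min_elongation=1.8):
    """Hypothetical shape criterion: area range plus ellipse elongation."""
    (_, _), (axis1, axis2), _ = body["ellipse"]
    elongation = max(axis1, axis2) / max(min(axis1, axis2), 1e-6)
    return min_area <= body["area"] <= max_area and elongation >= min_elongation
```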
Tracking
Tracking is the problem of generating an inference about the motion of one or more objects from a sequence of images. It can be solved using several approaches, including motion estimation and feature matching techniques. Some of the most important approaches consider a statistical point of view and formulate the problem as a prediction correction process based on Bayesian theory.
The Kalman filter is designed to track multiple objects, which are referred to as fish or tracks. The essential problem solved at this point is the assignment of detections to fish. To associate detections to tracks, a cost is assigned to every possible fish–detection pair. This cost represents the likelihood that the detection corresponds to the current position of that fish. It is calculated from the distance of the detected position to the predicted position and to the last confirmed position of the fish, and the minimum of these Euclidean distances is selected as the cost metric.
Therefore, every detection is assigned to the track with the lowest cost, provided that this cost is lower than a selected value, and each track can only be assigned to one detection. When a new detection is assigned to a fish, the predicted position for that instant is confirmed and corrected. Detections which remain unassigned to any existing fish are assumed to belong to new tracks. In addition, if a fish remains unassigned for too long, its track is closed, so no new assignments can be made to that fish. Fish without enough detections are assumed to be noise and are deleted. The operation of the assignment algorithm is described in the schematic of Figure 7, and the results obtained in a situation with two fish are shown in Figure 8. In conclusion, this technique not only obtains trajectories from the detections, but also filters out some of the false positives of the system and estimates the fish position when it is not detected in the images.
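The tracking loop can be illustrated with the simplified sketch below, which uses a constant-velocity Kalman filter per track and a greedy lowest-cost assignment. The actual system relies on an extended Kalman filter with its own motion model (Equation (6)), so this is only an illustration of the prediction, assignment, and correction steps; the gating distance and track-management thresholds are assumed values.

```python
import numpy as np

class Track:
    """One tracked fish with a constant-velocity Kalman filter (sketch)."""
    def __init__(self, xy):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])   # position and velocity
        self.P = np.eye(4) * 10.0
        self.missed = 0
        self.F = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q, self.R = np.eye(4) * 0.01, np.eye(2) * 1.0

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def correct(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        self.missed = 0

def update_tracks(tracks, detections, gate=40.0, max_missed=25):
    """Assign each track to the unassigned detection with the lowest cost."""
    unassigned = list(range(len(detections)))
    for t in tracks:
        pred = t.predict()
        if unassigned:
            costs = [np.linalg.norm(np.asarray(detections[j], float) - pred)
                     for j in unassigned]
            best = int(np.argmin(costs))
            if costs[best] < gate:             # cost must be below a gate value
                t.correct(detections[unassigned.pop(best)])
                continue
        t.missed += 1
    # Unassigned detections start new tracks; stale tracks are closed.
    tracks = tracks + [Track(detections[j]) for j in unassigned]
    return [t for t in tracks if t.missed <= max_missed]
```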
Filtering
The result of the process so far is the position vector of every detected fish over time, representing its full trajectory in the fishway. However, these results are still expected to show certain undesirable artifacts, caused by small fluctuations in the calculated centroid position, by parts of the fish being hidden by bubbles, and by errors in perspective and plane alignment when the fish moves from the field of view of one camera to another.
To solve these problems and to remove some of the noise still present in the results, a complex filtering process is required. First, the relative position of the cameras is taken into account to solve differences between simultaneous observations. Thus, when the fish is detected simultaneously by two or more cameras, its position is the average of all the observed positions. To determine whether two observations from different cameras belong to the same target, the trajectories resulting from the Kalman filter are compared. If they start and end in adjacent cameras at similar times, they are then merged. When more than one fish crosses simultaneously from one camera field of view to another, the distance from the last predicted position in the old camera to the first position in the new camera is used as a cost function.
In the next step, a moving average filtering process is applied (SIGMUR 2003) and outliers are detected by thresholding the distance between the filtered and original positions. While normal detections are simply replaced by their filtered values, outliers are substituted by the average of the previous and next confirmed detections. This implies that predicted positions near outliers are no longer valid, and they are hence replaced using interpolation techniques.
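A minimal sketch of this trajectory filtering is given below, assuming a simple centered moving average and a fixed outlier threshold; the window size and threshold are illustrative, not the values used in the original system.

```python
import numpy as np

def smooth_trajectory(positions, window=5, outlier_thresh=0.10):
    """Moving-average filtering of a trajectory (N x 2 array of positions).
    Points that deviate too much from the filtered curve are treated as
    outliers and replaced by the average of their neighbouring points."""
    pos = np.asarray(positions, dtype=float)
    kernel = np.ones(window) / window
    filtered = np.column_stack([
        np.convolve(pos[:, k], kernel, mode="same") for k in range(2)])

    # Outliers: original points too far from the moving-average curve.
    deviation = np.linalg.norm(pos - filtered, axis=1)
    outliers = deviation > outlier_thresh

    result = filtered.copy()
    for i in np.flatnonzero(outliers):
        prev_i = i - 1 if i > 0 else i + 1
        next_i = i + 1 if i < len(pos) - 1 else i - 1
        result[i] = 0.5 * (filtered[prev_i] + filtered[next_i])
    return result
```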
Data processing
As noted above, the water velocity in the fishway can be evaluated by means of experimental studies or numerical models. In this case, the velocity field in the pools was computed with a numerical model based on the two-dimensional depth averaged shallow water equations. The experimental validation of this model in 16 different fishway designs, as well as a detailed description of the model equations, can be found in Bermúdez et al. (2010).
In addition, further information regarding fish behavior can be obtained from the analysis of the trajectories and the times spent in the fishway. On the one hand, fish response to physical factors such as current velocity or turbulence levels can be studied, and preferential fish paths and resting areas can be determined. On the other hand, total ascending times, which are an important component of passage delay, can be calculated. Along the same lines, resting times, passage success, and total distances covered can also be examined. Although further research is needed, the analysis of these parameters can contribute to the definition of key factors in fish passage through these devices. For instance, the need to provide specific resting zones or to keep velocity and turbulence levels below a certain threshold can strongly affect the fishway design.
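As an illustration, observed velocities and accelerations can be derived from the filtered trajectory by finite differences, and a swimming speed relative to the water can be estimated using the velocity field from the numerical model; the helper `water_velocity_at` and the differencing scheme below are assumptions for this sketch (the 25 Hz frame rate is the recording frequency of the camera system).

```python
import numpy as np

FRAME_RATE = 25.0  # Hz, recording frequency of the camera system

def kinematics(trajectory, water_velocity_at):
    """Observed speed, swimming speed, and acceleration along a trajectory.
    `trajectory` is an N x 2 array of positions (m); `water_velocity_at`
    returns the modelled water velocity vector (m/s) at a given position."""
    dt = 1.0 / FRAME_RATE
    pos = np.asarray(trajectory, dtype=float)

    # Observed (ground) velocity and acceleration by finite differences.
    vel = np.gradient(pos, dt, axis=0)
    acc = np.gradient(vel, dt, axis=0)
    observed_speed = np.linalg.norm(vel, axis=1)

    # Swimming velocity relative to the water: ground velocity minus the
    # local water velocity taken from the hydraulic model.
    water = np.array([water_velocity_at(p) for p in pos])
    swim_speed = np.linalg.norm(vel - water, axis=1)

    return observed_speed, swim_speed, np.linalg.norm(acc, axis=1)
```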
RESULTS
Accuracy of the system
As mentioned above, the techniques published so far were developed for calm water or controlled light conditions. They are generally based on visual marks or specific sensors and cannot be applied in the context of this work. Thus, to compare the proposed technique with other algorithms, some widely used non-supervised segmentation approaches have been implemented and tested by means of the previously described metrics.
Specifically, the following techniques have been used:
Region: a region segmentation algorithm based on the Otsu method;
Edge: a modern implementation of the Canny edge detector;
Edge–region: a combination of the two previous techniques. It is the one proposed in this paper;
SOM pix: a SOM neural network based on the RGB intensity values of the image from the neighborhood of each pixel;
SOM avg: a SOM neural network whose input is the local average of the RGB values in a window centered on each pixel;
SOM feat: a SOM neural network that uses two image features: the local average of the RGB values in a window centered on each pixel and the standard deviation of the RGB values in the row and column of the pixel.
Three different versions of each technique were implemented: one without background information and another two in which a background is used to normalize the image. In the latter case, either a static frame without fish or the proposed dynamic background technique was used to model the background. In every case, images were enhanced with a standard contrast-limited adaptive histogram equalization technique.
When using the SOM techniques, a three-layer topology with three processing elements (neurons) in each layer was selected and a 3 × 3 window was used for each input. To mitigate the dependence of the results on the selected training patterns, all networks were trained using three different data sets. In addition, the SOM network proposed in Rodriguez et al. (2011) was included in the comparison.
The results obtained after the representation and interpretation step for the different techniques are shown in Table 1. It can be observed that the results are strongly dependent on the background model. Only the proposed technique performed well without background modeling and, in general, dynamic background achieved better results than the static one.
| Average results | Precision | Recall | False pos. rate | False neg. rate | Time (s/frame) |
|---|---|---|---|---|---|
| No background | | | | | |
| SOM pixel | 0.43 | 0.29 | 0.57 | 0.71 | 1.38 |
| SOM avg. | 0.55 | 0.35 | 0.45 | 0.65 | 3.60 |
| SOM feat. | 0.63 | 0.26 | 0.37 | 0.74 | 11.63 |
| Edge | 0.86 | 0.73 | 0.14 | 0.27 | 0.34 |
| Region | 0.40 | 0.39 | 0.60 | 0.61 | 0.40 |
| Edge–region | 0.93 | 0.71 | 0.07 | 0.29 | 0.31 |
| Static background | | | | | |
| SOM pixel | 0.91 | 0.49 | 0.09 | 0.51 | 1.55 |
| SOM avg. | 0.96 | 0.72 | 0.04 | 0.28 | 3.41 |
| SOM feat. | 0.96 | 0.62 | 0.04 | 0.38 | 11.22 |
| SOM (Rodriguez et al. 2011) | 0.85 | 0.69 | 0.15 | 0.31 | 3.37 |
| Edge | 0.65 | 0.75 | 0.35 | 0.25 | 0.35 |
| Region | 0.94 | 0.81 | 0.06 | 0.19 | 0.19 |
| Edge–region | 0.94 | 0.82 | 0.06 | 0.18 | 0.32 |
| Dyn. background | | | | | |
| SOM pixel | 0.96 | 0.78 | 0.04 | 0.22 | 1.15 |
| SOM avg. | 0.96 | 0.61 | 0.04 | 0.39 | 11.62 |
| SOM feat. | 0.96 | 0.72 | 0.04 | 0.28 | 3.61 |
| Edge | 0.67 | 0.78 | 0.33 | 0.22 | 0.32 |
| Region | 0.95 | 0.78 | 0.05 | 0.22 | 0.17 |
| Edge–region (proposed) | 0.95 | 0.82 | 0.05 | 0.18 | 0.31 |
It can also be noted that the worst results in terms of accuracy are obtained with the edge technique. On the other hand, SOM models based on features achieve good accuracy, but require a higher computational time. Finally, although the region technique obtained quite good results, the proposed technique yielded the same level of precision with the best recall, without significantly increasing the execution time.
Once the representation and interpretation step is completed, fish detections are processed with the tracking algorithm. As explained above, this algorithm can operate as a filter, using the confirmed positions of the tracked fish. The results obtained using this configuration are shown in Table 2. Techniques without background have been discarded, as they generally achieved low accuracy.
| Average results | Precision | Recall | False pos. rate | False neg. rate | Time (s/frame) |
|---|---|---|---|---|---|
| Static background | | | | | |
| SOM pixel | 0.97 | 0.81 | 0.03 | 0.19 | 1.14 |
| SOM avg. | 0.98 | 0.74 | 0.02 | 0.26 | 3.71 |
| SOM feat. | 0.97 | 0.64 | 0.03 | 0.36 | 11.97 |
| SOM (Rodriguez et al. 2011) | 0.88 | 0.75 | 0.12 | 0.25 | 3.27 |
| Edge | 0.70 | 0.83 | 0.30 | 0.17 | 0.33 |
| Region | 0.97 | 0.84 | 0.03 | 0.16 | 0.18 |
| Edge–region | 0.97 | 0.84 | 0.03 | 0.16 | 0.31 |
| Dyn. background | | | | | |
| SOM pixel | 0.97 | 0.80 | 0.03 | 0.20 | 1.15 |
| SOM avg. | 0.98 | 0.74 | 0.02 | 0.26 | 3.67 |
| SOM feat. | 0.97 | 0.63 | 0.03 | 0.37 | 11.87 |
| Edge | 0.75 | 0.85 | 0.25 | 0.15 | 0.33 |
| Region | 0.97 | 0.81 | 0.03 | 0.19 | 0.17 |
| Edge–region (proposed) | 0.98 | 0.85 | 0.02 | 0.15 | 0.31 |
However, the tracking technique can also estimate hidden positions of the fish. This is done by the Kalman algorithm, which predicts fish locations based on the motion model (Equation (6)). Following this procedure, the results include both confirmed and estimated positions (Table 3).
| Average results | Precision | Recall | False pos. rate | False neg. rate | Time (s/frame) |
|---|---|---|---|---|---|
| Static background | | | | | |
| SOM pixel | 0.93 | 0.91 | 0.07 | 0.09 | 1.14 |
| SOM avg. | 0.95 | 0.86 | 0.05 | 0.14 | 3.71 |
| SOM feat. | 0.93 | 0.80 | 0.07 | 0.20 | 11.97 |
| SOM (Rodriguez et al. 2011) | 0.73 | 0.88 | 0.27 | 0.12 | 3.27 |
| Edge | 0.44 | 0.93 | 0.56 | 0.07 | 0.33 |
| Region | 0.94 | 0.94 | 0.06 | 0.06 | 0.18 |
| Edge–region | 0.94 | 0.94 | 0.06 | 0.06 | 0.31 |
| Dyn. background | | | | | |
| SOM pixel | 0.94 | 0.91 | 0.06 | 0.09 | 1.15 |
| SOM avg. | 0.95 | 0.86 | 0.05 | 0.14 | 3.67 |
| SOM feat. | 0.94 | 0.80 | 0.06 | 0.20 | 11.87 |
| Edge | 0.48 | 0.93 | 0.52 | 0.07 | 0.33 |
| Region | 0.94 | 0.93 | 0.06 | 0.07 | 0.17 |
| Edge–region (proposed) | 0.95 | 0.94 | 0.05 | 0.06 | 0.31 |
The precision of the system is increased significantly if the algorithm operates only as a filter. However, the use of predicted positions improves the recall, without losing precision when compared to the results before the tracking step (Table 1).
It must be taken into account that some of the new false positives that appear when using predictions are not actual errors. In fact, they may reflect the position of the fish when it is not observable in the images and its center position has therefore not been manually marked.
Overall, the proposed technique obtained the best accuracy of all tested algorithms. It achieved one of the lowest false positive rates, false negative rates, and execution times. Hence, the technique is considered to obtain reliable results: it detects the fish in most situations and finds their true positions with a high probability.
Tracking errors
As shown in the previous section, the proposed technique performs better than the other implemented methods in terms of precision and recall. However, the ability of the algorithm not only to detect fish in images but also to generate fish trajectories still has to be assessed. In this section, the capability of the system to follow fish over time is studied, which implies measuring the efficiency in assigning detections to fish.
Although there is no standard metric to perform this task, it can be measured by analyzing and counting tracking errors. From a general point of view, these errors can be classified as follows:
Type 1: the output trajectory of a detected fish contains isolated noise detections;
Type 2: a fish is not detected and does not generate a trajectory;
Type 3: a group of noise detections is classified as a new fish and a trajectory is generated for a non-existing fish;
Type 4: two or more trajectories are created for a single fish. This can happen if a group of noise detections interfere with the trajectory of a tracked fish, or if a tracked fish is lost for a long period of time;
Type 5: two or more fish interact during some time, causing occlusions and overlapping in the images. This results in the assignment of some of the detections to the wrong trajectory.
Type 1 errors are reflected as false positives in the results of Tables 1–3. On the other hand, Type 2 errors do not occur in practice, since the video sequences are long enough to ensure that every fish is detected. Besides, most Type 5 errors are not visually observable by human operators, since they usually correspond to changes in the relative position of two or more fish while they are occluded by turbulence or bubbles. Under these conditions, fish are not visually distinguishable from each other. However, interaction among fish was barely observed in high-velocity areas and it does not affect the global analysis of fish behavior in resting zones.
Finally, longer sequences are required to analyze Type 3 and Type 4 errors and two new data sets have been created for this purpose. The first data set consists of a long sequence of 46,000 frames with a single fish moving in only one pool. The second one contains 10 sequences of 1,000 frames each, where two fish interact in the same pool. The obtained results can be seen in Table 4. They show a very low error rate, with one tracking error every 5,000 frames or more. These results confirm that the proposed system is suitable for obtaining fish trajectories from recorded images.
| Data set | Type 3 errors | Type 4 errors | Error rate (per 1,000 frames) |
|---|---|---|---|
| Single fish | 7 | 0 | 0.15 |
| Two fish | 0 | 2 | 0.18 |
Experimental results
The proposed system was applied to 15 assays conducted in the full-scale vertical slot fishway model located at the CEDEX laboratory (Figure 1). During the corresponding migration period, four different species (a total of 259 fish) were tested: Iberian barbel (Luciobarbus bocagei), Mediterranean barbel (Luciobarbus guiraonis), Iberian straight-mouth nase (Pseudochondrostoma polylepis), and brown trout (Salmo trutta), as shown in Figure 9. The recordings of each assay lasted approximately 12 hours, with a recording frequency of 25 Hz.
Passage success, understood as the percentage of fish ascending the entire fishway, was evaluated by analyzing the fish full trajectory. The results were verified by means of passive integrated transponder tag technology and direct observation. Overall, passage success during the experiments was low, regardless of species, and varied considerably with fish size (Table 5). In general, larger individuals presented a higher rate of success in ascending the entire fishway, relative to small specimens of the same species.
| Species | Size (cm) | Number of fish | Passage success (%) |
|---|---|---|---|
| Iberian barbel | 0–15 | 12 | 33.3 |
| | 15–20 | 12 | 75.0 |
| | 20–25 | 5 | 80.0 |
| | >25 | 34 | 41.2 |
| Mediterranean barbel | 0–15 | 6 | 0.0 |
| | 15–20 | 8 | 0.0 |
| | 20–25 | 11 | 54.6 |
| | >25 | 12 | 25.0 |
| Iberian straight-mouth nase | 0–15 | 61 | 9.8 |
| | 20–25 | 34 | 41.2 |
| Brown trout | 0–15 | 5 | 0.0 |
| | 15–20 | 43 | 14.0 |
| | 20–25 | 14 | 42.9 |
| | 25–30 | 2 | 100.0 |
| Total | | 259 | 28.6 |
On the other hand, the path chosen by fish moving from one pool to another and the specific resting zones actually exploited by the fish were identified. In the experiments, the individuals avoided high-velocity areas and used recirculation regions, in which velocity and turbulence levels are lower, to move within the pool and to rest before ascending through the higher-velocity area of the slot. Thus, a preliminary analysis of the fish trajectories revealed that, when ascending the fishway, fish spent the vast majority of time in low-velocity areas. Despite the high transit times observed occasionally in these areas, no signs of disorientation and very few fall-back movements were detected in the recordings. This suggests that the use of resting zones influences fish passage delay, but not fishway passage success.
As noted above, fish rest frequently in low-velocity areas and, in fact, consecutive ascents of more than four pools (without a resting period) were not observed. However, low-velocity areas were not used uniformly by fish, which stayed most frequently in the zone located just downstream from the slot and behind the small side baffle (zone A in Figure 10). The exploitation of low-velocity areas by the four species can be seen in Table 6. The frequency of use of resting zones is expressed as the ratio of the time spent in a specific zone to the total resting time during the ascent. The results suggest that only the recirculation regions located in the upstream part of the pools (zones A and B) play an important role in fish passage. If these results are confirmed in further assays, designs in which zone C becomes a high-velocity area (examples can be found in Bermúdez et al. (2010)) would not a priori be more unfavorable. Therefore, these results encourage the use of designs with low-velocity areas in the upstream part of the pools.
| Species | Frequency of use of zone A (%) | Zone B (%) | Zone C (%) | Avg. resting time (s) |
|---|---|---|---|---|
| Iberian barbel | 68.5 | 28.5 | 2.9 | 161 |
| Mediterranean barbel | 88.4 | 10.5 | 1.1 | 325 |
| Iberian straight-mouth nase | 99.8 | 0.0 | 0.2 | 269 |
| Brown trout | 82.7 | 16.7 | 0.6 | 1,271 |
| Total | 84.1 | 15.1 | 0.8 | 585 |
In addition, the paths taken by fish to swim from one pool to the next through the slot were analyzed separately. In general, two modes of successful ascents were observed, depending on the location of the individual within the pool before traversing the slot and the area used to approach it (Figure 11). The results suggest that all the selected fish species tend to follow similar trajectories and exploit the same flow regions during the ascent.
Finally, the observed speed, swimming speed, and acceleration have been calculated as described in the ‘Data processing’ section. Figure 12 shows a sample of the obtained velocities and accelerations (in modulus), represented as a function of the traveled distance, together with their respective polynomial fitting curves.
A clear trend is observed in fish swimming velocities, with a peak occurring immediately prior to crossing the slot and a significant decrease in velocity once the fish passes into the next pool. This pattern is strongly influenced by the water velocities that fish are confronted by in the different regions of the fishway.
On the other hand, fish acceleration data are more scattered, with maximum accelerations observed when fish approach the slot. This is because fish usually come from a resting area and need to reach a certain speed to traverse the slot. After crossing this high-velocity area, they usually decelerate.
Average maximum swimming speeds and accelerations can be seen in Table 7. Observed fish velocities are low relative to the flow velocity in the slot region, as shown in Figure 12. Hence, maximum swimming velocities are only slightly higher than the water velocity in the slot, regardless of the fish species.
| Species | Swimming speed: avg. maximum (m/s) | Swimming speed: std. deviation (m/s) | Acceleration: avg. maximum (m/s²) | Acceleration: std. deviation (m/s²) |
|---|---|---|---|---|
| Iberian barbel | 1.51 | 0.27 | 1.13 | 0.60 |
| Mediterranean barbel | 1.51 | 0.25 | 0.95 | 0.60 |
| Iberian straight-mouth nase | 1.52 | 0.26 | 1.08 | 0.54 |
| Brown trout | 1.60 | 0.25 | 1.31 | 0.74 |
Swimming velocities are usually defined as a function of fish body length (BL), since it is considered one of the most influential factors affecting speed (Beamish 1978). Figure 13 shows the maximum fish swimming velocities (vmax) obtained in the assays, expressed in BL/s. Differences between species are now observed, suggesting different levels of effort according to the sizes of the specimens. Obtained values do not exceed the maximum burst speed values proposed in the literature, which vary from 9 up to 15 BL/s (Weaver 1963; Webb 1975), 10 BL/s being the most accepted value (Cowx & Welcomme 1998).
CONCLUSIONS
In this work, a new technique is developed to automatically analyze fish behavior in fishways. It uses several computer vision techniques to detect and track fish in video sequences recorded by a camera system integrated in the fishway.
More specifically, it employs a combination of background modeling, edge, and region analysis to detect fish. Also, it takes advantage of the Kalman filter to obtain the trajectory of one or multiple individuals inside the fishway.
The proposed technique has been extensively tested and compared with different standard methods. It achieved the best balance of precision, recall, and execution time.
In addition, the system was applied to 15 assays conducted over 2 years with more than 250 living fish. It provided valuable information regarding fish behavior, including fishway efficiency, swimming trajectories, swimming velocities and accelerations, resting times, and preferential resting areas.
The analysis of these data, together with the results of upcoming experiments, is expected to improve fishway design criteria in the future.
ACKNOWLEDGEMENTS
This work was supported by FEDER funds and Spanish Ministry of Economy and Competitiveness (Ministerio de Economía y Competitividad) (Ref. CGL2012-34688 and PCT-380000-2007-3). The authors would also like to thank the Center for Hydrographic Studies of Center for Studies and Experimentation on Public Works (CEH-CEDEX), the Spanish Ministry of Education (FPU grant Ref. AP2009-2070), and the Spanish Ministry of Public Works.