FIGURE 3 Proportion of the movement angles (up, down, right=east, left=west) for the ground-truth and the three tracking architectures and for the two camera sets (Set 1: facing North and Set 2: facing South). The movement angles are spatial angles of yellowfin bream movement in two dimensions.

4. Discussion

We demonstrate a computer vision-based method for detecting and tracking individual fish in underwater footage. Our study incorporates open-source CV methods into a pipeline that allows scientists to assess animal movement in marine ecosystems. This method quantified animal behaviour and detected the expected tidal movement in our case study. The experimental results show that the proposed method is an effective and non-invasive way to detect and track small-scale movement of many fish in aquatic environments.
Previous ecological work has tracked fish in controlled environments (Papadakis, Glaropoulos & Kentouri, 2014; Qian et al., 2016; Bingshan et al., 2018; Sridhar, Roche & Gingins, 2019), used automated detections and counts as proxies for movement (Marini et al., 2018) and, most recently, used automated movement tracking algorithms to quantify movement (Francisco, Nührenberg & Jordan, 2020). Automated approaches tested in ‘real-world’ scenarios provide the strongest evidence that CV is a robust technique for fish monitoring in aquatic ecosystems. In this paper, we propose an easily replicable and non-invasive method to measure fish movement in aquatic ecosystems by combining OD and OT algorithms. The object detection framework used in our study (Mask R-CNN) was recently shown to be robust and accurate enough to detect fish in a variety of aquatic conditions (Ditria et al., 2020b; Francisco, Nührenberg & Jordan, 2020). Although more recent OD frameworks have been developed since Mask R-CNN was published, our study further demonstrates that Mask R-CNN is capable of detecting fish in underwater footage. Of the OT architectures we evaluated, Seq-NMS performed best and was able to quantify the net movement of multiple individuals. Although Seq-NMS is not an OT algorithm, it uses the OD outputs of every frame to create the detection links and track the movement direction, and it therefore requires a high-performing OD model. Additionally, for both OD and OT, we used frameworks that were not initially designed to detect fish in underwater footage. Our results add to the growing evidence that the learning capabilities and adaptability of CV methods can aid the collection and analysis of fish detection and tracking data in aquatic ecosystems (Xiu et al., 2015; Villon et al., 2016; Marini et al., 2018).
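To illustrate how per-frame detections can be linked and reduced to a movement direction, the following is a minimal sketch and not the pipeline used in this study. It assumes bounding boxes given as (x1, y1, x2, y2) in image coordinates and links boxes in consecutive frames greedily by intersection over union (IoU), which is a deliberate simplification of Seq-NMS. The function names, the 0.3 IoU threshold and the 90-degree direction bins are all hypothetical choices made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def centre(box):
    """Centre point (x, y) of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def link_detections(frames, min_iou=0.3):
    """Greedily link boxes across consecutive frames (simplified, assumed scheme).

    frames: list of per-frame lists of boxes. Returns a list of tracks,
    where each track is the list of boxes linked to one another.
    """
    tracks, active = [], []
    for boxes in frames:
        unmatched = list(boxes)
        next_active = []
        for track in active:
            if not unmatched:
                break  # no detections left; remaining tracks end here
            best = max(unmatched, key=lambda b: iou(track[-1], b))
            if iou(track[-1], best) >= min_iou:
                track.append(best)
                unmatched.remove(best)
                next_active.append(track)
        for box in unmatched:  # every unlinked detection starts a new track
            track = [box]
            tracks.append(track)
            next_active.append(track)
        active = next_active
    return tracks


def movement_direction(track):
    """Bin the net displacement of a track into up, down, left or right."""
    (x0, y0), (x1, y1) = centre(track[0]), centre(track[-1])
    dx, dy = x1 - x0, y0 - y1  # flip y because image y grows downwards
    if dx == 0 and dy == 0:
        return "stationary"
    angle = math.degrees(math.atan2(dy, dx)) % 360
    if angle < 45 or angle >= 315:
        return "right"
    if angle < 135:
        return "up"
    if angle < 225:
        return "left"
    return "down"


# Two hypothetical fish over two frames: one drifts right, one drifts up.
frames = [
    [(0, 0, 10, 10), (100, 100, 110, 110)],
    [(4, 0, 14, 10), (100, 96, 110, 106)],
]
directions = [movement_direction(t) for t in link_detections(frames)]
```

Because the links are built only from detector outputs, a missed or misplaced detection breaks or misdirects a track, which is why linking of this kind depends on a high-performing OD model.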
A key benefit of camera-CV applications to animal movement research, and science more broadly, is that they can complement traditional data collection techniques (Lopez-Marcano et al., 2020). Cameras and CV can be deployed at many sites and cover large spatial extents but are limited by environmental factors and incapable of detecting and classifying complex ecological parameters such as predatory interactions or the identification of morphologically similar, but taxonomically different, species (Christin, Hervet & Lecomte, 2019). Traditional approaches (e.g. netting or in-water diver assessments) are still more capable of collecting the highest variety and complexity of ecological variables and parameters, but by combining cameras, automation and traditional approaches the spatial and temporal scope of monitoring can be increased. Moreover, camera-CV approaches do not require specialised equipment to study animal movement, and the rapid analysis of imagery can provide movement data that is accurate, valid and consistent (Weinstein, 2018; Francisco, Nührenberg & Jordan, 2020).
CV techniques can enhance animal movement ecology through the streamlined collection of several sets of ecological information (Botella et al., 2018; Christin, Hervet & Lecomte, 2019), and this new data may revolutionise ecological studies. Traditional presence/absence data is used to understand the environmental drivers of a species’ geographic distribution, and the collection of presence/absence data from videos can easily be automated (Schneider, Taylor & Kremer, 2018; Schneider et al., 2019; González-Rivero et al., 2020; Kennedy et al., 2020). However, presence/absence data by itself cannot inform us about how multiple ecological processes interact, and it conflates movement of individuals with mortality (Zurell, Pollock & Thuiller, 2018). Future studies could use our combined OD and OT approach to simultaneously quantify species distributions and movement. Integrating movement data into species distribution models could allow the models to predict accurately how the ranges of mobile species respond dynamically to environmental change through individual movement decisions and population-level parameters like mortality (Bruneel et al., 2018).
The capacity to use our CV approach for monitoring fish populations is dependent on the ability to obtain and deploy several underwater cameras across the desired seascape. In this study, we deployed a six-camera array in a fish corridor to maximise the chances of obtaining movement data. However, each set and camera obtained unequal amounts of data, and the array also resulted in repeated tracking of fish. Therefore, a major task when using camera-based technologies is to design and deploy an appropriate camera system to monitor animal interactions (Wearn & Glover-Kapfer, 2019). A recent global survey suggested that methodological improvements in the quality and accessibility of methods and analytical tools for camera-based technologies are still required (Glover-Kapfer, Soto-Navarro & Wearn, 2019). While our study demonstrates that fish can be detected and tracked automatically in aquatic ecosystems, further research into methodological designs (e.g. the optimal number of cameras needed to detect movement) is still required. The development of standardised camera-based methodologies, such as methodological guides for baited remote underwater surveys (Langlois et al., 2020) or for camera traps (Rovero et al., 2013), but specific to ecological camera-CV applications, will help advance the application of CV to movement ecology. Furthermore, the combination of both traditional and emerging techniques can provide data that can increase our understanding of complex movement behaviours in marine ecosystems (Christin, Hervet & Lecomte, 2019; Lopez-Marcano et al., 2020).
Remote camera systems and CV techniques can help provide robust, reliable and automatic tools to monitor and observe fish movement in marine ecosystems (Rowcliffe et al., 2016; Francisco, Nührenberg & Jordan, 2020). Technological advances have allowed us to better understand the complexities of animal movement, and our study shows that these techniques can be successfully applied in complex marine scenarios (Weinstein, 2018). By utilising a combination of CV frameworks, we demonstrated that automated tracking of fish movement between distinct seascapes (e.g. artificial and natural) is possible. We suggest that these methods are transferable to other types of fish corridors and other habitats, such as the mangrove, seagrass and coral reef continuum (Spampinato et al., 2008; Olds et al., 2018; Francisco, Nührenberg & Jordan, 2020). Further development of these models and architectures, such as integrated OD and OT with stereo video (Huo et al., 2018) and pairwise comparisons of detections (Guo et al., 2020), will likely lead to improvements in accuracy. Continual improvements in accuracy will provide a rigorous framework to study and quantify fish connectivity in the wild.