Keywords
Artificial Intelligence, connectivity, deep learning, dispersal, machine
learning, object tracking, underwater video
1. Introduction
Computer vision (CV), the research field that explores the use of
computer algorithms to automate the interpretation of digital images or
videos, is revolutionising data collection in science (Waldchen &
Mader, 2018; Beyan & Browman, 2020). The use of remote camera imagery,
such as underwater stations, camera traps and stereography, has driven
the uptake of CV because CV can process and analyse imagery quickly and
accurately (Bicknell et al., 2016; Schneider et al., 2019). In
ecological studies, advances in CV have increased sampling accuracy and
repeatability (Waldchen & Mader, 2018). For
example, drones are being used to track grassland animals (van Gemert et
al., 2015) and estimate tree defoliation (Kälin et al., 2019),
underwater observatories with CV are monitoring deep sea ecosystems
(Aguzzi et al., 2019), and CV-capable dive scooters are being used to
monitor coral reefs at large spatial and temporal scales
(González-Rivero et al., 2020; Kennedy et al., 2020).
In the past few years, the uptake of CV to study and monitor marine
ecosystems has increased. These applications rely on the two main CV
tasks: object detection (OD) and object tracking (OT).
OD and OT automate the task of gathering information about the type,
location and movement of objects of interest. OD has received the most
attention as OD models can count and identify species of interest in
underwater video footage (Christin, Hervet & Lecomte, 2019). For
example, OD models have been applied to detect seals (Salberg, 2015),
identify whale hotspots (Guirado et al., 2019), monitor fish populations
(Xiu et al., 2015; Salman et al., 2016; Villon et al., 2016; Marini et
al., 2018; Villon et al., 2018; Ditria et al., 2020b; Jalal et al.,
2020; Villon et al., 2020) and quantify floating debris on the ocean
surface (Watanabe, Shao & Miura, 2019). By comparison, the application
of OT is less advanced in marine ecosystems. Previous work has shown
that OT models can successfully track on-surface objects (see
topios.org) and underwater objects such as fish, sea turtles, dolphins,
and whales (Spampinato et al., 2008; Chuang et al., 2017; Xu & Cheng,
2017; Arvind et al., 2019; Kezebou et al., 2019). There is also evidence
that automated monitoring of fish in underwater ecosystems through the
combination of OD and OT is reliable and accurate (Spampinato et al.,
2008; Lantsova et al., 2016; Mohamed et al., 2020). However, no studies
have jointly applied OD and OT to quantify animal movement. OD can help
advance the automatic collection of traditional presence/absence data of
different species (Xiu et al., 2015; Marini et al., 2018) and OT can
subsequently track multiple individuals and provide fine-scale tracking
data to assess behavioural and animal movement patterns across a range
of environments (Francisco, Nührenberg & Jordan, 2020). A single,
non-invasive automated approach can therefore yield two types of
ecological information, providing individual-level information on
different species and enhancing our ability to quantify the
environmental drivers of species abundance and behaviour.
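As a minimal illustration of how per-frame detections can become per-individual tracks, the Python sketch below links bounding boxes across frames by greedy intersection-over-union (IoU) matching. This is a generic tracking-by-detection heuristic and not one of the OT architectures evaluated in this study; the function names, the box format and the 0.3 IoU threshold are assumptions made for the example.

```python
# Hypothetical sketch: greedy IoU linking of per-frame detections into tracks.
# Boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: list[list[Box]], min_iou: float = 0.3) -> list[list[tuple[int, Box]]]:
    """Link boxes frame by frame; each track is a list of (frame_index, box)."""
    tracks: list[list[tuple[int, Box]]] = []
    active: list[int] = []  # tracks that were extended or started in the previous frame
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        next_active: list[int] = []
        for track_id in active:
            if not unmatched:
                break
            last_box = tracks[track_id][-1][1]
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                tracks[track_id].append((t, best))
                unmatched.remove(best)
                next_active.append(track_id)
        # Any detection left over starts a new track.
        for box in unmatched:
            tracks.append([(t, box)])
            next_active.append(len(tracks) - 1)
        active = next_active
    return tracks


# Two made-up individuals; the second leaves the field of view after frame 1.
example_frames = [
    [(0, 0, 10, 10), (100, 100, 110, 110)],
    [(2, 0, 12, 10), (98, 100, 108, 110)],
    [(4, 0, 14, 10)],
]
example_tracks = link_detections(example_frames)
```

In this sketch the number of tracks approximates the number of individuals seen, and the box sequence within each track gives a coarse trajectory for that individual.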
The combination of OD and OT is particularly suited to the subfield of
marine animal movement because these tasks can provide the volume of
data required to quantify movement of numerous individuals
(Lopez-Marcano et al., 2020). In marine environments, animal movement
shapes predator-prey dynamics, nutrient dynamics and trophic functions
(Olds et al., 2018). For example, the movement of herbivorous fish
between seagrass and coral reefs helps maintain resilience by balancing
fish abundances with algal growth rates that vary spatio-temporally
(Pagès et al., 2014). Knowledge of animal movement is fundamental to
many research objectives in marine science, but collecting movement data
is challenging and requires substantial resources. Therefore, the
development and application of emerging technologies (e.g. computer
vision) can help advance our understanding of animal movement across a
broad range of spatio-temporal dimensions and ecological hierarchies
(e.g. individuals, populations, communities).
In this study, we aimed to test the ability of deep learning algorithms
to track small-scale animal movement of many individuals in underwater
videos. We developed a CV pipeline consisting of two steps, OD and OT,
and we used the pipeline to quantify underwater animal movement across
habitats for ecological research. We tested and applied off-the-shelf OT
architectures to determine how effective these emerging techniques are
for underwater ecological applications. To
demonstrate the applications of OD and OT, we deployed a 6-camera
network in a known coastal estuarine fish corridor and recorded the
movement of a common fisheries species (yellowfin bream, Acanthopagrus
australis). The corridor, in the Tweed River estuary, Australia, lies
between a rockwall passage and a seagrass meadow. Multiple estuarine
fish such as sand whiting (Sillago ciliata), river garfish (Hyporhamphus
regularis), luderick (Girella tricuspidata), spotted scat (Scatophagus
argus), three-bar porcupinefish (Dicotylichthys punctulatus) and
yellowfin bream frequently move back and forth with the tides through
this corridor, representing a relatively challenging scenario (i.e. low
visibility and currents that also carry floating debris) in which to
showcase the capacity of CV to detect the target species in a
multi-species assemblage and quantify the direction of movement. Tidal
fish movement is an ideal test of the method because how and where fish
move with the tides is well known. We expected our analysis of the
camera videos to detect and track bream moving through the corridor in
the direction of the tidal flow.
For OD, we used an off-the-shelf model, the Mask Region-based
Convolutional Neural Network (Mask R-CNN) (He et al., 2017), which has
been shown to successfully and accurately detect and quantify fish in
estuarine ecosystems (Ditria et al., 2020b). We also benchmarked three
OT architectures: Minimum Output Sum of Squared Errors (MOSSE) (Bolme et
al., 2010), Sequential Non-Maximum Suppression (Seq-NMS) (Han et al.,
2016), and Siamese Mask (SiamMask) (Wang et al., 2019). Ultimately, we
demonstrate that these technologies can complement the collection and
analysis of animal movement data and potentially contribute to the
data-driven management of ecosystems.
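To make the idea of quantifying movement direction concrete, the Python sketch below classifies each track by its net horizontal displacement and reports the fraction of tracks that move in an expected direction (for instance, the direction of tidal flow in the camera's field of view). It is only an illustrative sketch: the function names, the direction labels, the use of x-coordinates of track centroids and the 20-pixel threshold are all assumptions, not the procedure used in this study.

```python
from typing import Optional


def net_direction(centroid_xs: list[float], min_shift: float = 20.0) -> str:
    """Label a track by the net horizontal shift between its first and last centroid."""
    shift = centroid_xs[-1] - centroid_xs[0]
    if shift >= min_shift:
        return "left_to_right"
    if shift <= -min_shift:
        return "right_to_left"
    return "undetermined"  # net movement too small to call


def direction_agreement(
    tracks: dict[int, list[float]], expected: str, min_shift: float = 20.0
) -> Optional[float]:
    """Fraction of directed tracks that move in the expected direction.

    Returns None when no track has a determinable direction.
    """
    directions = [
        net_direction(xs, min_shift) for xs in tracks.values() if len(xs) >= 2
    ]
    directed = [d for d in directions if d != "undetermined"]
    if not directed:
        return None
    return sum(d == expected for d in directed) / len(directed)


# Made-up centroid x-positions (pixels) for three tracked fish.
example_tracks = {
    1: [10.0, 60.0, 120.0],    # moves left to right
    2: [300.0, 250.0, 180.0],  # moves right to left
    3: [50.0, 55.0, 52.0],     # hovers in place
}
agreement = direction_agreement(example_tracks, expected="left_to_right")
```

Mapping "left_to_right" or "right_to_left" onto flood or ebb flow would depend on how each camera is oriented relative to the corridor, which this sketch leaves to the user.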
2. Methods