Introduction
Lifespan is the approximate maximum age that individuals of a given
species are expected to attain under favourable environmental
conditions. A species’ lifespan can be derived in several ways, including the
maximum recorded age of any single individual, the age to which a
proportion of the population survives, or, in fish, the age at which
95 % of the maximum or asymptotic length is reached. However it is
derived, lifespan is a fundamental life history parameter, allowing for
approximation of mortality and rate of population growth. Lifespan can
also provide an upper limit for an animal’s reproductive life phase,
except in the small number of species that undergo reproductive
senescence. The age at which sexual maturity is attained and either age
at death or age of reproductive senescence vary more extensively than
maximum lifespan, and rates of reproduction and mortality even more so.
Lifespan, in contrast, is a relatively stable trait within a given
species and can therefore be used to obtain generalisable information
about that species.
Lifespan’s utility in approximating life history makes it valuable for
species management. For example, it can be used to model sustainable
harvest levels for wild populations, such as in fisheries, and to inform
assessments of invasion potential and extinction risk. Despite its
simplicity as a population parameter, and great value for a range of
animal population and species management applications, lifespan is often
not considered because there are no reliable estimates available.
Reported vertebrate lifespans range from eight weeks in the coral reef
pygmy goby (Eviota sigillata) to approximately 400 years in the
Greenland shark (Somniosus microcephalus). Identification of the
oldest individuals of a given species is often very difficult because
age information is sparse or absent. Long-lived species present
particular practical difficulties for determining lifespan because, in
the absence of indirect estimation methods, research programmes rarely
last as long as the oldest individuals. Thus, despite its central
importance to species management and conservation, lifespan is unknown
for most animals.
The ageing process is hypothesised to be an unintended consequence of
cell programming, involving molecular changes that leave traceable
genomic signatures. Consistent changes in a well-studied epigenetic
modification, DNA methylation, can be used to predict age in a growing
number of species. This is because, over the lifespan of an individual,
patterns of DNA methylation change, whereby highly methylated regions
become demethylated and sparsely methylated regions become methylated.
Along with other important epigenetic changes, these changes in DNA
methylation result in a loss of cellular functioning that is thought to
contribute to processes of ageing. The term DNA methylation is generally
used to refer to methylation that occurs at cytosine-phosphate-guanine
(CpG) sites, or ‘CG’ sequences in the genome, where its occurrence and
function have been most extensively studied. CpG sites are concentrated
around transcription start sites and in promoter regions of genes, where
their density and DNA methylation levels are associated with changes in
gene activity. The elevated frequency of CpG sites in gene promoters
has been hypothesised to act as a buffer against age-related DNA
methylation changes and therefore to correlate with species’ maximum
lifespan.
The association between promoter CpG density and lifespan was first
revealed in mammals and its predictive value was subsequently
demonstrated among all vertebrates. McLain and Faulk (2018) revealed
significant correlations between promoter CpG density and mammalian
lifespan for 1000 gene promoter regions, 5 % of the total examined.
Mayne et al. (2019) developed a model that used the CpG densities of 42
gene promoters to predict lifespan in vertebrates, accounting for 76 %
of the variation between known and predicted lifespans. The vertebrate
model highlighted unique relationships between CpG density and lifespan
in all major vertebrate groups, including fish, birds, mammals and
reptiles. However, prediction accuracy was lower in non-mammalian
vertebrates, a difference that was attributed to low sample size
(n ≤ 63) and high sequence divergence. Previous lifespan
analyses have used human gene promoters as reference sequences,
resulting in fewer sequence matches, greater bias and lower accuracy in
distant relatives. Previous studies have also obtained lifespan
information from the Animal Aging and Longevity Database (AnAge).
Although AnAge is a highly comprehensive and well-curated database,
incorporation of lifespan data from additional sources (e.g.,
alternative online databases or manual literature search) is likely to
enable increased sample sizes and improve statistical power.
Fish (aquatic vertebrates with fins and gills) are a paraphyletic group
that includes the classes Actinopteri (ray-finned fishes), Chondrichthyes
(cartilaginous fishes), Sarcopterygii (fleshy-finned fishes),
Cephalaspidomorphi (e.g., lampreys) and Myxini (e.g., hagfishes). At
present, approximately 7000 fish species are subject to wild harvest,
each typically requiring species-specific life history information to
enable adequate fisheries management. A lack of data for the majority
of fished species significantly impedes management of sustainable
fisheries, with an estimated 35 % of global fish stocks now overfished.
Lifespan data are of particularly high value for management of fish
populations, as they can be used to approximate natural mortality rates
and fisheries maximum sustainable yield, and to model population growth.
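As a rough illustration of how lifespan can stand in for natural mortality, the sketch below assumes simple exponential survival, N(t) = N0 * exp(-M * t), and assumes that a small fraction of a cohort (1 % by default) is still alive at the maximum age. Both assumptions, and the function name `mortality_from_lifespan`, are illustrative choices made here and are not taken from the cited fisheries literature, which generally relies on empirically fitted estimators.

```python
import math


def mortality_from_lifespan(max_age_years: float,
                            surviving_fraction: float = 0.01) -> float:
    """Instantaneous natural mortality rate M (per year) implied by a lifespan.

    Assumes exponential survival, N(t) = N0 * exp(-M * t), and that
    `surviving_fraction` of a cohort is still alive at `max_age_years`.
    Both are simplifying assumptions for illustration only.
    """
    if max_age_years <= 0:
        raise ValueError("max_age_years must be positive")
    if not 0 < surviving_fraction < 1:
        raise ValueError("surviving_fraction must lie strictly between 0 and 1")
    # Solve surviving_fraction = exp(-M * max_age_years) for M.
    return -math.log(surviving_fraction) / max_age_years


if __name__ == "__main__":
    # Hypothetical lifespans (years), chosen only to show the inverse relationship.
    for lifespan in (5, 20, 80):
        print(lifespan, round(mortality_from_lifespan(lifespan), 3))
```

Under these assumptions, mortality scales inversely with lifespan, so a four-fold longer lifespan implies a four-fold lower mortality rate.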
Here we report the development of a fish-specific genomic lifespan
predictor. The model was constructed using 1804 reported lifespan values
and the CpG density (measured as CpG observed/expected ratio) of
promoter regions from 442 fish genomes extracted using experimentally
defined zebrafish (Danio rerio) promoter sequences. The model
predicts lifespan for any given fish species from the genome sequence of
a single individual, demonstrating the high value of promoter CpG
density alone to predict lifespan in fish.
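To make the CpG observed/expected ratio concrete, the following minimal sketch uses the commonly cited definition: the number of CG dinucleotides multiplied by the sequence length, divided by the product of the C and G counts. The function name `cpg_obs_exp` and the handling of sequences lacking C or G are assumptions made for illustration; the exact calculation used to build the model may differ.

```python
def cpg_obs_exp(sequence: str) -> float:
    """CpG observed/expected ratio of a DNA sequence.

    Uses the common definition
        O/E = (number of CG dinucleotides * length) / (number of C * number of G).
    Returns 0.0 when the sequence contains no C or no G, where the ratio
    is undefined (an illustrative convention, not part of the definition).
    """
    seq = sequence.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    n_cpg = seq.count("CG")
    return (n_cpg * len(seq)) / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences, not real promoters.
    print(cpg_obs_exp("ACGTCGCG"))   # CpG-rich
    print(cpg_obs_exp("ACCTGGCAGT"))  # contains C and G but no CG
```

A ratio near or above 1 indicates that CG dinucleotides occur about as often as expected from base composition, whereas values well below 1 indicate CpG depletion.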