Component 1: The Repository
A dedicated repository is needed to reference and index
information such as specimen metadata, image metadata and annotations
alongside machine learning models with their performance metrics and
outputs (Fig. \ref{205447}). Some infrastructures already exist, or are in
development, to accommodate some of these data types, such as GBIF for
specimen data, but none integrate the full spectrum of specimens,
images, models and model outputs. These existing infrastructures can be
reused, either by integrating or connecting with the repository, or by
extending their own capabilities to become it. The repository should
operate according to the FAIR principles (findable, accessible,
interoperable and reusable), facilitating data discovery and reuse.
This includes support for, or provision of, persistent identifiers
for the different types of content, as well as for different data standards.
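
To make the linking role of the repository concrete, the following is a minimal sketch of an index in which every item (specimen, image, model, model output) is registered under a persistent identifier and can point to related items. The class names `Record` and `Repository`, the field names, and the `pid:` identifier strings are all hypothetical and chosen only for illustration; they do not describe an existing infrastructure or a prescribed schema.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One indexed item. `pid` stands in for a persistent identifier (hypothetical format)."""
    pid: str
    kind: str                      # e.g. "specimen", "image", "model", "output"
    metadata: dict = field(default_factory=dict)
    links: tuple = ()              # PIDs of the records this item refers to


class Repository:
    """In-memory index that references records by PID (illustrative only)."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # A persistent identifier must stay unique, so reject duplicates.
        if record.pid in self._records:
            raise ValueError(f"identifier already registered: {record.pid}")
        self._records[record.pid] = record

    def resolve(self, pid):
        # Look up a record by its identifier.
        return self._records[pid]

    def linked_to(self, pid, kind=None):
        # Find every record that references `pid`, optionally filtered by kind.
        return [
            r for r in self._records.values()
            if pid in r.links and (kind is None or r.kind == kind)
        ]


# Hypothetical example: a specimen, an image of it, a model, and the model's output.
repo = Repository()
repo.register(Record("pid:specimen/1", "specimen", {"source": "external specimen index"}))
repo.register(Record("pid:image/1", "image", links=("pid:specimen/1",)))
repo.register(Record("pid:model/1", "model", {"task": "example task"}))
repo.register(Record("pid:output/1", "output", links=("pid:image/1", "pid:model/1")))

images_of_specimen = repo.linked_to("pid:specimen/1", kind="image")
outputs_of_model = repo.linked_to("pid:model/1", kind="output")
```

In this sketch the specimen record holds only a reference to an external source, which mirrors the option of connecting to an existing infrastructure instead of duplicating its data. Discovery then works by following identifiers in either direction, for example from a specimen to its images and on to the model outputs derived from them.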