Component 1: The Repository

A dedicated repository is needed to reference and index information such as specimen metadata, image metadata and annotations, alongside machine learning models with their performance metrics and outputs (Fig. \ref{205447}). Some infrastructures already exist, or are in development, that accommodate some of these data types, such as GBIF for specimen data, but none integrates the full spectrum of specimens, images, models and model outputs. Existing infrastructures can be reused, either by integrating or connecting them with the repository, or by extending the capabilities of one of them so that it becomes the repository. The repository should operate according to the FAIR principles (Findable, Accessible, Interoperable, Reusable), facilitating data discovery and reuse. This includes supporting, or providing, persistent identifiers for the different types of content, as well as supporting the relevant data standards.
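As a purely illustrative sketch of how the content types described above might be indexed and cross-referenced, the following Python fragment models each item as a record with a persistent identifier and links to related records. The names `Record`, `RepositoryIndex` and `linked_to`, the `pid:` identifier scheme and all metadata values are hypothetical and are not part of any existing infrastructure; a real repository would rely on established identifier systems and data standards.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One indexed item; `pid` stands in for a persistent identifier (hypothetical scheme)."""
    pid: str
    kind: str  # e.g. "specimen", "image", "annotation", "model" or "output"
    metadata: dict = field(default_factory=dict)
    links: tuple = ()  # PIDs of the records this one refers to


class RepositoryIndex:
    """In-memory stand-in for the repository's index (illustrative only)."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        # Each persistent identifier must resolve to exactly one record.
        if record.pid in self._records:
            raise ValueError(f"duplicate identifier: {record.pid}")
        self._records[record.pid] = record

    def get(self, pid):
        return self._records[pid]

    def linked_to(self, pid, kind=None):
        """Return records that link to `pid`, optionally filtered by kind."""
        return [
            r for r in self._records.values()
            if pid in r.links and (kind is None or r.kind == kind)
        ]


# Made-up example records: a specimen, its image, a model, and a model output.
index = RepositoryIndex()
index.add(Record("pid:specimen-1", "specimen", {"collection": "example herbarium"}))
index.add(Record("pid:image-1", "image", {"format": "image/jpeg"},
                 links=("pid:specimen-1",)))
index.add(Record("pid:model-1", "model", {"task": "segmentation"}))
index.add(Record("pid:output-1", "output", {"label": "leaf"},
                 links=("pid:image-1", "pid:model-1")))

# Which model outputs were derived from this image?
outputs = index.linked_to("pid:image-1", kind="output")
```

Keeping the links as identifiers rather than embedded objects means that records held in different infrastructures (for example, specimen data hosted elsewhere) could in principle be referenced without being copied into the repository.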