Genetic diversity within tp0548
While the tp0488 gene in lagomorph-infecting TPe C/L strains showed no site of sequence variability at the chosen minimum variant frequency of 0.25, the tp0548 gene in our analysed samples contained two hypervariable regions (V1 and V2). These regions span positions 589,242-589,287 (V1) and 589,558-589,647 (V2) of the TPeC strain Cuniculi A reference genome (CP002103.1; Figure 3A) and were characterised by an aggregation of polymorphic sites, deletions and repeat patterns. Briefly, V1 is characterised by indels and a composition dominated by arginine-, serine- and glycine-coding codons. The V2 region is longer and includes various types of repeats, which are illustrated in Figure 3. Most strains (n = 203/287) carry type I repeats, which code for a KGGG amino acid motif. The median number of copies of this dominant type I repeat is three, with a range of one to seven (Figure 3C). In addition to the 228 strains that showed only one repeat type, 56 strains presented with a mosaic of two or three different repeat types, and three samples had no repeat sequence at all (Figure S4).
********************
Add Figure 3 about here
********************
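As an illustration of how the number of type I repeats per sample could be tallied, the following minimal Python sketch counts uninterrupted tandem copies of the KGGG motif in a translated V2 segment and takes the median across samples. It is not the pipeline used in this study: the function name `count_tandem_repeats`, the sample names and the toy amino acid sequences are all hypothetical and serve only to show the principle.

```python
import re
from statistics import median


def count_tandem_repeats(protein: str, motif: str = "KGGG") -> int:
    """Return the number of motif copies in the longest uninterrupted tandem run."""
    # Find every maximal run of back-to-back motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", protein)
    # Convert run length in residues to a copy number; 0 if the motif is absent.
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical toy V2 translations (NOT real tp0548 data).
toy_v2 = {
    "sample_a": "SRKGGGKGGGKGGGTS",  # three tandem copies
    "sample_b": "SRKGGGTS",          # one copy
    "sample_c": "SRTS",              # no repeat sequence
}

counts = {name: count_tandem_repeats(seq) for name, seq in toy_v2.items()}

# Median copy number among samples that carry at least one copy of the motif.
repeat_median = median(c for c in counts.values() if c > 0)

print(counts)
print(repeat_median)
```

Counting only the longest uninterrupted run is one possible design choice; samples with a mosaic of several repeat types would need each repeat type to be counted separately.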