Prediction of candidate NLRs
To predict candidate NLR (nucleotide-binding leucine-rich repeat)
resistance genes in our reference transcriptome, we applied the
NLR-Parser pipeline (Steuernagel, Jupe, Witek, Jones, & Wulff, 2015). We picked
the highest scoring domain found per reading frame per transcript and
manually screened the ORFs containing that domain. Transcripts were
excluded if the ORF was too short, or if it lacked the appropriate start
and stop codons and BLAST queries returned no hits in the NCBI nr
database. NLR transcripts were considered expressed in a genotype-level
assembly if the count was greater than one in at least two replicates.
The Venn diagram of expressed and differentially expressed NLR genes
across all genotypes was made using the venn package
(https://CRAN.R-project.org/package=venn) in R.
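The expression criterion above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the transcript identifiers, and the count values are hypothetical, and the counts are assumed to be stored as one list of per-replicate values per transcript.

```python
from typing import Dict, List


def is_expressed(counts: List[float], min_count: float = 1,
                 min_replicates: int = 2) -> bool:
    """True if the count exceeds min_count in at least min_replicates replicates."""
    return sum(c > min_count for c in counts) >= min_replicates


def expressed_transcripts(count_table: Dict[str, List[float]]) -> List[str]:
    """Return the IDs of transcripts that pass the expression criterion."""
    return [tid for tid, counts in count_table.items() if is_expressed(counts)]


# Hypothetical per-replicate counts for three transcripts in one genotype.
example = {
    "NLR_001": [0, 5, 3],  # above one in two replicates: expressed
    "NLR_002": [1, 1, 7],  # above one in a single replicate: not expressed
    "NLR_003": [2, 2, 2],  # above one in all replicates: expressed
}

print(expressed_transcripts(example))  # ['NLR_001', 'NLR_003']
```

Note that the comparison is strict, so a count of exactly one does not qualify a replicate.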
Evolutionary analysis of the NLR transcripts was carried out using
Antirrhinum majus L. (snapdragon) proteins and coding sequences
(Li, Zhang, et al., 2019) as the outgroup, since it is the species most
recently diverged from P. lanceolata for which a full genome assembly is
available. Multiple sequence alignment of all NLRs in the reference
assembly was carried out using MAFFT, and the phylogenetic tree was
estimated using FastTree (Price, Dehal, & Arkin, 2010).
The ClusterPicker script (Ragonnet-Cronin et al., 2013) was used to cut
the phylogenetic tree into clusters, with 90 percent similarity, a
genetic distance of 0.03, and the gap option. The longest sequence in each
cluster was used as a BLAST query against the snapdragon proteins, and the
closest hit was selected to represent the ancestral state and added to
the cluster. The phylogenetic tree was first visualized using the
Newick-format output tree produced by ClusterPicker, and the final
visualization was carried out with the ggtree package (Yu et al., 2017)
in R.
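The step of choosing one BLAST query per cluster can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually run: the function name, the sequence identifiers, and the cluster labels are hypothetical, cluster membership is assumed to be available as a simple ID-to-label mapping (not ClusterPicker's own output format), and length is measured after removing alignment gap characters.

```python
from typing import Dict, Tuple


def longest_per_cluster(sequences: Dict[str, str],
                        cluster_of: Dict[str, str]) -> Dict[str, str]:
    """Map each cluster label to the ID of its longest (ungapped) sequence."""
    best: Dict[str, Tuple[str, int]] = {}
    for seq_id, seq in sequences.items():
        cluster = cluster_of.get(seq_id)
        if cluster is None:
            continue  # sequence was not assigned to any cluster
        length = len(seq.replace("-", ""))  # ignore alignment gaps
        # Keep the first sequence seen when lengths tie.
        if cluster not in best or length > best[cluster][1]:
            best[cluster] = (seq_id, length)
    return {cluster: seq_id for cluster, (seq_id, _) in best.items()}


# Hypothetical aligned sequences and cluster assignments.
sequences = {"t1": "MKV--LLA", "t2": "MKVAALLA", "t3": "MK", "t4": "MKVLL"}
cluster_of = {"t1": "c1", "t2": "c1", "t3": "c2"}

print(longest_per_cluster(sequences, cluster_of))  # {'c1': 't2', 'c2': 't3'}
```

The selected identifiers would then be used to extract the query sequences for the BLAST search against the snapdragon proteins.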