Variant detection
Sequencing reads were first aligned with BWA version 0.7.17 to the human reference genome, Genome Reference Consortium Human Build 38 patch release 13 (GRCh38.p13). Reads not mapped in proper pairs to the human reference genome were extracted with samtools version 1.10 (flag -F 2) and subsequently aligned to the P. vivax PvP01 reference genome from PlasmoDB (version 46) using BWA. Duplicate reads were removed with Picard’s MarkDuplicates (version 2.22.4). Variants were detected with the Genome Analysis Toolkit (GATK) version 4.1.4.1: HaplotypeCaller was first run in GVCF mode on individual chromosomes, the resulting GVCF files were merged with GenomicsDBImport, and genotypes were then called with GenotypeGVCFs, yielding one VCF file per chromosome. The VCF files were filtered according to GATK best practices. Finally, for most downstream analyses, the core genome (the 14 chromosomes, excluding subtelomeric regions, low-complexity domains, and the apicoplast and mitochondrial sequences) was selected using the BCFtools query command.
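The read-extraction and per-chromosome joint-calling steps can be sketched as follows. This is a minimal illustration rather than the commands actually run in the study: file names, sample names, chromosome labels, and the helper function names are hypothetical placeholders, and only the tool names and the options -F 2 and GVCF mode come from the text above. The script only builds and prints command lines; it does not execute them. Alignment, duplicate removal, variant filtering, and core-genome selection are left out.

```python
import shlex
from typing import Dict, List

PROPER_PAIR = 0x2  # SAM FLAG bit set when a read is mapped in a proper pair


def kept_by_F2(flag: int) -> bool:
    """Mimic `samtools view -F 2`: keep a read only if the proper-pair bit is unset."""
    return (flag & PROPER_PAIR) == 0


def unpaired_read_filter(in_bam: str, out_bam: str) -> List[str]:
    """Command that extracts reads NOT mapped in proper pairs (BAM output)."""
    return ["samtools", "view", "-b", "-F", "2", "-o", out_bam, in_bam]


def haplotype_caller(ref: str, bam: str, chrom: str, out_gvcf: str) -> List[str]:
    """HaplotypeCaller in GVCF mode, restricted to a single chromosome."""
    return ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
            "-L", chrom, "-ERC", "GVCF", "-O", out_gvcf]


def genomics_db_import(gvcfs: List[str], chrom: str, workspace: str) -> List[str]:
    """Merge the per-sample GVCFs of one chromosome into a GenomicsDB workspace."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace, "-L", chrom]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd


def genotype_gvcfs(ref: str, workspace: str, out_vcf: str) -> List[str]:
    """Joint genotyping from a GenomicsDB workspace into one VCF."""
    return ["gatk", "GenotypeGVCFs", "-R", ref,
            "-V", f"gendb://{workspace}", "-O", out_vcf]


def joint_calling_plan(ref: str, sample_bams: Dict[str, str],
                       chroms: List[str]) -> List[List[str]]:
    """Ordered commands producing one jointly genotyped VCF per chromosome."""
    plan: List[List[str]] = []
    for chrom in chroms:
        gvcfs = []
        for sample, bam in sample_bams.items():
            gvcf = f"{sample}.{chrom}.g.vcf.gz"  # hypothetical naming scheme
            gvcfs.append(gvcf)
            plan.append(haplotype_caller(ref, bam, chrom, gvcf))
        workspace = f"genomicsdb_{chrom}"  # hypothetical workspace name
        plan.append(genomics_db_import(gvcfs, chrom, workspace))
        plan.append(genotype_gvcfs(ref, workspace, f"joint.{chrom}.vcf.gz"))
    return plan


if __name__ == "__main__":
    # Placeholder inputs; real chromosome names come from the reference FASTA.
    bams = {"sampleA": "sampleA.dedup.bam", "sampleB": "sampleB.dedup.bam"}
    for cmd in joint_calling_plan("PvP01.fasta", bams, ["chromA", "chromB"]):
        print(shlex.join(cmd))
```

In the SAM format, FLAG bit 0x2 marks reads mapped in a proper pair, so -F 2 discards exactly those reads and keeps the remainder for realignment to PvP01. Building one GenomicsDB workspace per chromosome mirrors the one-VCF-per-chromosome output described above.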