HIGHLIGHTED PROJECTS
New reference genomes for Plasmodium ovale spp will assist large-scale genomic studies of these neglected malaria parasites.
Malaria-Profiler
PAST PROJECTS
A molecular barcode to inform the geographical origin and transmission dynamics of Plasmodium vivax
Using deep learning to identify recent positive selection in malaria parasite sequence data
Mass drug administration with dihydroartemisinin-piperaquine impact on drug resistance
Distinctive genetic structure and selection patterns in Plasmodium vivax from South Asia and East Africa
Population dynamics and drug resistance mutations in Guinea-Bissau
Drug resistance profile and clonality of Plasmodium falciparum parasites in Cape Verde
Drug resistance surveillance of Plasmodium falciparum in Ouélessébougou, Mali
Geographical classification of malaria parasites through applying machine learning
New insights into sulfadoxine-pyrimethamine antimalarial drug resistance
Plasmo Maps
Population genetic analysis of Plasmodium vivax
Population dynamics around Lake Victoria, Kenya
Population genetic analysis of Plasmodium knowlesi
Predicting the most likely evolutionary trajectories in the step-wise accumulation of resistance mutations
Selective whole genome amplification of Plasmodium malariae reveals insights into pop structure
Global genetic diversity of var2csa in Plasmodium falciparum, pregnancy & vaccine development
Genetic diversity of Plasmodium vivax isolates from pregnant women
New reference genomes for Plasmodium ovale spp will assist large-scale genomic studies of these neglected malaria parasites.
Despite Plasmodium ovale curtisi (Poc) and wallikeri (Pow) being important human-infecting malaria parasites that are widespread across Africa and Asia, little is known about their genome diversity. Morphologically identical, Poc and Pow are indistinguishable and commonly misidentified. Recent rises in the incidence of Poc/Pow infections have renewed efforts to address fundamental knowledge gaps in their biology, and to develop diagnostic tools to understand their epidemiological dynamics and malaria burden. A major roadblock has been the incompleteness of available reference assemblies (PocGH01, PowCR01; ~ 33.5 Mbp). Here, we applied multiple sequencing platforms and advanced bioinformatics tools to generate new reference genomes, Poc221 (South Sudan; 36.0 Mbp) and Pow222 (Nigeria; 34.3 Mbp), with improved nuclear genome contiguity (> 4.2 Mbp), annotation and completeness (> 99% Plasmodium spp., single copy orthologs). Subsequent sequencing of 6 Poc and 15 Pow isolates from Africa revealed a total of 22,517 and 43,855 high-quality core genome SNPs, respectively. Genome-wide levels of nucleotide diversity were determined to be 2.98 × 10⁻⁴ (Poc) and 3.43 × 10⁻⁴ (Pow), comparable to estimates for other Plasmodium species. Overall, the new reference genomes provide a robust foundation for dissecting the biology of Poc/Pow, their population structure and evolution, and will contribute to uncovering the recombination barrier separating these species.
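As a rough illustration of how a genome-wide nucleotide diversity figure like the ones quoted above can be derived, the sketch below computes the average number of pairwise differences per site from alternate-allele counts. It assumes biallelic SNPs and one haploid call per isolate; the function name, input layout and toy numbers are assumptions for illustration, not the study's actual pipeline.

```python
def nucleotide_diversity(alt_counts, n_samples, genome_length):
    """Average pairwise differences per site (pi) for biallelic haploid SNPs.

    alt_counts    -- alternate-allele count at each variable site
    n_samples     -- number of haploid genomes sampled (must be >= 2)
    genome_length -- number of callable sites, variable or not
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    # Number of distinct sample pairs.
    pairs = n_samples * (n_samples - 1) / 2
    # At a site with k alternate alleles, k * (n - k) sample pairs differ.
    differing = sum(k * (n_samples - k) for k in alt_counts)
    return differing / pairs / genome_length


# Toy input (hypothetical): two SNPs among four isolates over 100 callable sites.
toy_pi = nucleotide_diversity([1, 2], n_samples=4, genome_length=100)
```

Dividing by the number of callable sites rather than the number of SNPs is what makes the value comparable across species with different genome sizes.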
People who worked on this project:
mhiggins
jphelan
dward
tgclark
scampino
emanko
Malaria-Profiler
Malaria-Profiler is a pipeline that allows users to analyse Plasmodium whole genome sequencing data to predict species and potential drug resistance. The pipeline searches for small variants (SNPs and indels) in genes associated with drug resistance. It also reports the species and geographical location. By default it uses Trimmomatic to trim the reads, BWA (or minimap2 for nanopore) to align to the reference genome and freebayes to call variants.
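The default trim, align and call steps described above can be sketched as a list of shell commands. This is a minimal illustration, not Malaria-Profiler's own code: the function name, output file names, Trimmomatic thresholds and the use of samtools for sorting and indexing are all assumptions, and the sketch skips trimming for nanopore reads for simplicity. The reference is assumed to be indexed already.

```python
import shlex


def build_pipeline_commands(ref, reads1, reads2=None, platform="illumina", prefix="sample"):
    """Return shell command strings for a trim -> align -> call workflow.

    Hypothetical helper: it only builds the commands and does not run them.
    """
    q = shlex.quote
    bam = f"{prefix}.bam"
    cmds = []
    if platform == "nanopore":
        # Long reads: align with minimap2 using its Oxford Nanopore preset.
        aligner = f"minimap2 -ax map-ont {q(ref)} {q(reads1)}"
    else:
        if reads2 is None:
            raise ValueError("this sketch expects paired-end Illumina reads")
        p1, u1 = f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz"
        p2, u2 = f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz"
        # Paired-end trimming; the thresholds are illustrative, not the tool's defaults.
        cmds.append(
            f"trimmomatic PE {q(reads1)} {q(reads2)} {q(p1)} {q(u1)} {q(p2)} {q(u2)} "
            "LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36"
        )
        aligner = f"bwa mem {q(ref)} {q(p1)} {q(p2)}"
    # Sort the alignments into a BAM file and index it (samtools is an assumption).
    cmds.append(f"{aligner} | samtools sort -o {q(bam)} -")
    cmds.append(f"samtools index {q(bam)}")
    # Call small variants against the reference.
    cmds.append(f"freebayes -f {q(ref)} {q(bam)} > {q(prefix + '.vcf')}")
    return cmds


illumina_cmds = build_pipeline_commands("ref.fa", "r1.fq.gz", "r2.fq.gz")
nanopore_cmds = build_pipeline_commands("ref.fa", "reads.fq.gz", platform="nanopore")
```

Returning the commands as strings keeps the sketch inspectable without the tools installed; a real workflow would run them and check each exit status.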
People who worked on this project:
jthorpe
emanko
aturkiewicz
jphelan
A molecular barcode to inform the geographical origin and transmission dynamics of Plasmodium vivax
Although Plasmodium vivax parasites are the predominant cause of malaria outside of sub-Saharan Africa, they are not always prioritised by elimination programmes. P. vivax is resilient and poses challenges through its ability to re-emerge from dormancy in the human liver. With observed growing drug-resistance and the increasing reports of life-threatening infections, new tools to inform elimination efforts are needed. To halt transmission and design targeted interventions, we need to better understand the dynamics of transmission, the movement of parasites, and the reservoirs of infection. The use of molecular genetics and epidemiology for tracking and studying malaria parasite populations has been applied successfully in P. falciparum, and here we sought to develop a molecular genetic tool for P. vivax. By assembling the largest set of P. vivax whole genome sequences (n = 433) spanning 17 countries, and applying a machine learning approach, we created a 71 SNP barcode with high predictive ability to identify geographic origin (91.4%). Further, due to the inclusion of markers for within population variability, the barcode may also distinguish local transmission networks. By using P. vivax data from a low-transmission setting in Malaysia, we demonstrate the potential ability to infer outbreak events. By characterising the barcoding SNP genotypes in P. vivax DNA sourced from UK travellers (n = 132) to ten malaria endemic countries predominantly not used in the barcode construction, we correctly predicted the geographic region of infection origin. Overall, the 71 SNP barcode outperforms previously published genotyping methods and, when rolled out within new portable platforms, is likely to be an invaluable tool for informing targeted interventions towards elimination of this resilient human malaria.
People who worked on this project:
jdombrowski
jphelan
tgclark
scampino
Using deep learning to identify recent positive selection in malaria parasite sequence data
Using simulated genomic data, DeepSweep could detect recent sweeps with high predictive accuracy (areas under ROC curve > 0.95). DeepSweep was applied to Plasmodium falciparum (n = 1125; genome size 23 Mbp) and Plasmodium vivax (n = 368; genome size 29 Mbp) WGS data, and the genes identified overlapped with two established extended haplotype homozygosity methods (within-population iHS, across-population Rsb) (~ 60–75% overlap of hits at P < 0.0001). DeepSweep hits included regions proximal to known drug resistance loci for both P. falciparum (e.g. pfcrt, pfdhps and pfmdr1) and P. vivax (e.g. pvmrp1).
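For readers unfamiliar with the accuracy metric quoted above, the sketch below computes the area under the ROC curve from classifier scores using the rank-based (Mann-Whitney) formulation. The function name and the toy scores are assumptions for illustration and are not DeepSweep outputs.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores -- predicted sweep scores, higher meaning more sweep-like
    labels -- 1 for simulated sweep regions, 0 for neutral regions
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): three of the four sweep/neutral pairs are ranked correctly.
toy_auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of sweep and neutral regions.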
People who worked on this project:
emanko
jphelan
tgclark
scampino
Mass drug administration with dihydroartemisinin-piperaquine impact on drug resistance
The World Health Organization (WHO) recommends surveillance of molecular markers of resistance to anti-malarial drugs. This is particularly important in the case of mass drug administration (MDA), which is endorsed by the WHO in some settings to combat malaria. Dihydroartemisinin-piperaquine (DHA-PPQ) is an artemisinin-based combination therapy which has been used in MDA. This review analyses the impact of MDA with DHA-PPQ on the evolution of molecular markers of drug resistance. The review is split into two parts. Section I reviews the current evidence for different molecular markers of resistance to DHA-PPQ. This includes an overview of the prevalence of these molecular markers in Plasmodium falciparum Whole Genome Sequence data from the MalariaGEN Pf3k project. Section II is a systematic literature review of the impact that MDA with DHA-PPQ has had on the evolution of molecular markers of resistance. This systematic review followed PRISMA guidelines. This review found that, despite being recognised as a surveillance tool by the WHO, the surveillance of molecular markers of resistance following MDA with DHA-PPQ was not commonly performed. Of the total 96 papers screened for eligibility in this review, only 20 analysed molecular markers of drug resistance. The molecular markers published were also not standardized. Overall, this warrants greater reporting of molecular marker prevalence following MDA implementation. This should include putative pfcrt mutations which have been found to confer resistance to DHA-PPQ in vitro.
People who worked on this project:
smoss
emanko
tgclark
scampino
Distinctive genetic structure and selection patterns in Plasmodium vivax from South Asia and East Africa
Despite the high burden of Plasmodium vivax malaria in South Asian countries, the genetic diversity of circulating parasite populations is not well described. Determinants of antimalarial drug susceptibility for P. vivax in the region have not been characterised. Our genomic analysis of global P. vivax (n = 558) establishes South Asian isolates (n = 92) as a distinct subpopulation, which shares ancestry with some East African and South East Asian parasites. Signals of positive selection are linked to drug resistance-associated loci including pvkelch10, pvmrp1, pvdhfr and pvdhps, and two loci linked to P. vivax invasion of reticulocytes, pvrbp1a and pvrbp1b. Significant identity-by-descent was found in extended chromosome regions common to P. vivax from India and Ethiopia, including the pvdbp gene associated with Duffy blood group binding. Our investigation provides new understanding of global P. vivax population structure and genomic diversity, and genetic evidence of recent directional selection in this important human pathogen.
People who worked on this project:
jphelan
emanko
tgclark
scampino
Population dynamics and drug resistance mutations in Guinea-Bissau
Following integrated malaria control interventions, malaria burden on the Bijagós Archipelago has significantly decreased. Understanding the genomic diversity of circulating Plasmodium falciparum malaria parasites can assist infection control, through identifying drug resistance mutations and characterising the complexity of population structure. This study presents the first whole genome sequence data for P. falciparum isolates from the Bijagós Archipelago. Amplified DNA from P. falciparum isolates sourced from dried blood spot samples of 15 asymptomatic malaria cases was sequenced. Using 1.3 million SNPs characterised across 795 African P. falciparum isolates, population structure analyses revealed that isolates from the archipelago cluster with samples from mainland West Africa and appear closely related to mainland populations, without forming a separate phylogenetic cluster. This study characterises SNPs associated with antimalarial drug resistance on the archipelago. We observed fixation of the PfDHFR mutations N51I and S108N, associated with resistance to sulphadoxine-pyrimethamine, and the continued presence of PfCRT K76T, associated with chloroquine resistance. These data have relevance for infection control and drug resistance surveillance, particularly considering expected increases in antimalarial drug use following updated WHO recommendations, and the recent implementation of seasonal malaria chemoprevention and mass drug administration in the region.
People who worked on this project:
smoss
emanko
aosborne
tgclark
scampino
jphelan
Drug resistance profile and clonality of Plasmodium falciparum parasites in Cape Verde
Polymorphisms in pfk13 associated with artemisinin-based combination therapy (ACT) tolerance in Southeast Asia were not detected, but the majority of the tested samples carried the pfmdr1 haplotype NFD and anti-malarial-associated mutations in the pfcrt and pfdhfr genes. The first whole genome sequencing (WGS) of Cape Verdean parasites showed that the samples cluster together, have a very high level of similarity and are close to other parasite populations from West Africa.
People who worked on this project:
dward
tgclark
scampino
Drug resistance surveillance of Plasmodium falciparum in Ouélessébougou, Mali
Sequence analysis of Plasmodium falciparum parasites is informative in ensuring sustained success of malaria control programmes. Whole-genome sequencing technologies provide insights into the epidemiology and genome-wide variation of P. falciparum populations and can characterise geographical as well as temporal changes. This is particularly important to monitor the emergence and spread of drug resistant P. falciparum parasites which is threatening malaria control programmes world-wide. Here, we provide a detailed characterisation of genome-wide genetic variation and drug resistance profiles in asymptomatic individuals in South-Western Mali, where malaria transmission is intense and seasonal, and case numbers have recently increased. Samples collected from Ouélessébougou, Mali (2019–2020; n = 87) were sequenced and placed in the context of older Malian (2007–2017; n = 876) and African-wide (n = 711) P. falciparum isolates. Our analysis revealed high multiclonality and low relatedness between isolates, in addition to increased frequencies of molecular markers for sulfadoxine-pyrimethamine and lumefantrine resistance, compared to older Malian isolates. Furthermore, 21 genes under selective pressure were identified, including a transmission-blocking vaccine candidate (pfCelTOS) and an erythrocyte invasion locus (pfdblmsp2). Overall, our work provides the most recent assessment of P. falciparum genetic diversity in Mali, a country with the second highest burden of malaria in West Africa, thereby informing malaria control activities.
People who worked on this project:
lvanheer
emanko
aosborne
tgclark
scampino
jphelan
Geographical classification of malaria parasites through applying machine learning
Malaria, caused by Plasmodium parasites, is a major global health challenge. Whole genome sequencing (WGS) of Plasmodium falciparum and Plasmodium vivax genomes is providing insights into parasite genetic diversity, transmission patterns, and can inform decision making for clinical and surveillance purposes. Advances in sequencing technologies are helping to generate timely and big genomic datasets, with the prospect of applying Artificial Intelligence analytical techniques (e.g., machine learning) to support programmatic malaria control and elimination. Here, we assess the potential of applying deep learning convolutional neural network approaches to predict the geographic origin of infections (continents, countries, GPS locations) using WGS data of P. falciparum (n = 5957; 27 countries) and P. vivax (n = 659; 13 countries) isolates. Using identified high-quality genome-wide single nucleotide polymorphisms (SNPs) (P. falciparum: 750 k, P. vivax: 588 k), an analysis of population structure and ancestry revealed clustering at the country-level. When predicting locations for both species, classification (compared to regression) methods had the lowest distance errors, and > 90% accuracy at a country level. Our work demonstrates the utility of machine learning approaches for geo-classification of malaria parasites. With timelier WGS data generation across more malaria-affected regions, the performance of machine learning approaches for geo-classification will improve, thereby supporting disease control activities.
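The "distance error" used above to compare classification and regression can be illustrated as the great-circle distance between true and predicted GPS coordinates. The sketch below uses the standard haversine formula; the function names, the Earth radius value and the averaging step are assumptions for illustration rather than the study's exact evaluation code.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def mean_distance_error_km(true_coords, predicted_coords):
    """Mean haversine distance between paired (lat, lon) tuples."""
    if len(true_coords) != len(predicted_coords) or not true_coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        haversine_km(t[0], t[1], p[0], p[1])
        for t, p in zip(true_coords, predicted_coords)
    )
    return total / len(true_coords)
```

For a classifier, the predicted coordinate would typically be a representative point for the predicted country or site, which is itself a modelling choice.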
People who worked on this project:
emanko
jphelan
tgclark
scampino
New insights into sulfadoxine-pyrimethamine antimalarial drug resistance
Plasmodium falciparum parasites resistant to antimalarial treatments have hindered malaria disease control. Sulfadoxine-pyrimethamine (SP) was used globally as a first-line treatment for malaria after wide-spread resistance to chloroquine emerged and, although replaced by artemisinin combinations, is currently used as intermittent preventive treatment of malaria in pregnancy and in young children as part of seasonal malaria chemoprophylaxis in sub-Saharan Africa. The emergence of SP-resistant parasites has been predominantly driven by cumulative build-up of mutations in the dihydrofolate reductase (pfdhfr) and dihydropteroate synthetase (pfdhps) genes, but additional amplifications in the folate pathway rate-limiting pfgch1 gene and promoter have recently been described. However, the genetic make-up and prevalence of those amplifications is not fully understood. We analyse the whole genome sequence data of 4,134 P. falciparum isolates across 29 malaria endemic countries, and reveal that the pfgch1 gene and promoter amplifications have at least ten different forms, occurring collectively in 23% and 34% of Southeast Asian and African isolates, respectively. Amplifications are more likely to be present in isolates with a greater accumulation of pfdhfr and pfdhps substitutions (median of 1 additional mutation; P < 0.00001), and there was evidence that the frequency of pfgch1 variants may be increasing in some African populations, presumably under the pressure of SP for chemoprophylaxis and anti-folate containing antibiotics used for the treatment of bacterial infections. The selection of P. falciparum with pfgch1 amplifications may enhance the fitness of parasites with pfdhfr and pfdhps substitutions, potentially threatening the efficacy of this regimen for prevention of malaria in vulnerable groups.
Our work describes new pfgch1 amplifications that can be used to inform the surveillance of SP drug resistance, its prophylactic use, and future experimental work to understand functional mechanisms.
People who worked on this project:
emanko
tgclark
scampino
Plasmo Maps
PlasmoMaps is a web-based tool that can be used to explore genomic sequence data of the non-falciparum malaria parasites, including P. malariae (n>200; XX SNPs; X countries), P. vivax (n>800; XX SNPs), P. ovale (n>50; XX SNPs), and P. knowlesi (n>200; XX SNPs). The tool integrates the presentation and summarisation (e.g., allele frequencies) of genomic variants with geographical location on maps, along with viewing extra variant information provided in the Integrative Genomics Viewer (IGV). The tool comes with two separate maps, one for searching genomic variants by chromosome and one for searching by a selected gene.
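The kind of per-location summarisation mentioned above (allele frequencies shown on a map) can be sketched as a simple grouping step. This is an illustrative sketch only: the function name, the input layout and the country names are assumptions and do not describe PlasmoMaps' internal data model.

```python
from collections import defaultdict


def allele_frequencies_by_country(genotypes):
    """Alternate-allele frequency per country for a single variant.

    genotypes -- iterable of (country, allele) pairs, where allele is
                 0 (reference), 1 (alternate) or None (missing call)
    """
    counts = defaultdict(lambda: [0, 0])  # country -> [alt count, called count]
    for country, allele in genotypes:
        if allele is None:
            continue  # missing calls are excluded from the denominator
        counts[country][0] += allele
        counts[country][1] += 1
    return {c: alt / called for c, (alt, called) in counts.items()}


# Hypothetical calls for one SNP across two countries.
toy_calls = [("Kenya", 1), ("Kenya", 0), ("Kenya", None), ("Peru", 1)]
toy_freqs = allele_frequencies_by_country(toy_calls)
```

Excluding missing calls from the denominator avoids deflating frequencies in locations with lower sequencing coverage.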
People who worked on this project:
jthorpe
Population genetic analysis of Plasmodium vivax
Whilst P. vivax infections pose a serious risk to global health, genomic analyses of this species, particularly in South America where the parasite is predominant, are scarce in comparison to the more pathogenic P. falciparum. Brazil is a unique setting for malaria transmission, with distinct foci relating to the local environments and resultant vector landscapes. To date, all previously published WGS data from Brazil has originated from isolates obtained mainly from Acre and a few from Rondônia, in the north-western region of the country. Previous population genomic analyses have demonstrated that South American isolates (n = 146) are a distinct population with high genetic diversity [28], with three ancestral populations (Mexico, Peru, Colombia/Brazil) [34], in contrast to our study, which reveals four main populations (n = 315; Brazil, Mexico/Colombia, Peru, Panama). Previous work has revealed geographical clustering of isolates from Brazil and Peru [28], but whilst closely related in our analysis, they are distinct. Earlier work focused solely on Mancio Lima, and found high levels of inbreeding [44]. Studies of P. vivax from 4 countries (Brazil, Colombia, PNG, India), using microsatellite markers, have demonstrated high similarity between isolates from Brazil (Manaus) and India (Bikaner), and high genetic diversity irrespective of the transmission situation [98]. Microsatellite data has also shown high diversity within and between Amazon parasite populations (Manaus, Porto Velho), with Amapa and Para infections being the most divergent [99], consistent with our findings that also suggest these two states are a distinct diverged clade.
People who worked on this project:
emanko
jdombrowski
scampino
tgclark
Population dynamics around Lake Victoria, Kenya
Characterising the genomic variation and population dynamics of Plasmodium falciparum parasites in high transmission regions of Sub-Saharan Africa is crucial to the long-term efficacy of regional malaria elimination campaigns and eradication. Whole-genome sequencing (WGS) technologies can contribute towards understanding the epidemiology and structural variation landscape of P. falciparum populations, including those within the Lake Victoria basin, a region of intense transmission. Here we provide a baseline assessment of the genomic diversity of P. falciparum isolates in the Lake region of Kenya, which has sparse genetic data. Lake region isolates are placed within the context of African-wide populations using Illumina WGS data and population genomic analyses. Our analysis revealed that P. falciparum isolates from Lake Victoria form a cluster within the East African parasite population. These isolates also appear to have distinct ancestral origins, containing genome-wide signatures from both Central and East African lineages. Known drug resistance biomarkers were observed at similar frequencies to those of East African parasite populations, including the S160N/T mutation in the pfap2mu gene, which has been associated with delayed clearance by artemisinin-based combination therapy. Overall, our work provides a first assessment of P. falciparum genetic diversity within the Lake Victoria basin, a region targeting malaria elimination.
People who worked on this project:
aosborne
emanko
scampino
tgclark
Population genetic analysis of Plasmodium knowlesi
The zoonotic Plasmodium knowlesi parasite is a growing public health concern in Southeast Asia, especially in Malaysia, where elimination of P. falciparum and P. vivax malaria has been the focus of control efforts. Understanding of the genetic diversity of P. knowlesi parasites can provide insights into its evolution, population structure, diagnostics, transmission dynamics, and the emergence of drug resistance. Previous work has revealed that P. knowlesi fall into three main sub-populations distinguished by a combination of geographical location and macaque host (Macaca fascicularis and M. nemestrina). It has been shown that Malaysian Borneo groups display profound heterogeneity with long regions of high or low divergence resulting in mosaic patterns between sub-populations, with some evidence of chromosomal-segment exchanges. However, the genetic structure of non-Borneo sub-populations is less clear. By gathering one of the largest collections of P. knowlesi whole-genome sequencing data, we studied structural genomic changes across sub-populations, with the analysis revealing differences in Borneo clusters linked to mosquito-related stages of the parasite cycle, in contrast to differences in host-related stages for the Peninsular group. Our work identifies new genetic exchange events, including introgressions between Malaysian Peninsular and M. nemestrina-associated clusters on various chromosomes, including in parasite invasion genes (DBPβ, NBPXα and NBPXβ), and important proteins expressed in the vertebrate parasite stages. Recombination events appear to have occurred between the Peninsular and M. fascicularis-associated groups, including in the DBPβ and DBPγ invasion associated genes. Overall, our work finds that genetic exchange events have occurred among the recognised contemporary groups of P. knowlesi parasites during their evolutionary history, leading to apparent mosaicism between these sub-populations. 
These findings generate new hypotheses relevant to parasite evolutionary biology and P. knowlesi epidemiology, which can inform malaria control approaches to containing the impact of zoonotic malaria on human communities.
People who worked on this project:
emanko
scampino
tgclark
Predicting the most likely evolutionary trajectories in the step-wise accumulation of resistance mutations
Pathogen evolution of drug resistance often occurs in a stepwise manner via the accumulation of multiple mutations that in combination have a non-additive impact on fitness, a phenomenon known as epistasis. The evolution of resistance via the accumulation of point mutations in the DHFR genes of Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) has been studied extensively and multiple studies have shown epistatic interactions between these mutations determine the accessible evolutionary trajectories to highly resistant multiple mutations. Here, we simulated these evolutionary trajectories using a model of molecular evolution, parameterized using Rosetta Flex ddG predictions, where selection acts to reduce the target-drug binding affinity. We observe strong agreement with pathways determined using experimentally measured IC50 values of pyrimethamine binding, which suggests binding affinity is strongly predictive of resistance and epistasis in binding affinity strongly influences the order of fixation of resistance mutations. We also infer pathways directly from the frequency of mutations found in isolate data, and observe remarkable agreement with the most likely pathways predicted by our mechanistic model, as well as those determined experimentally. This suggests mutation frequency data can be used to intuitively infer evolutionary pathways, provided sufficient sampling of the population.
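To make the idea of "accessible evolutionary trajectories" concrete, the sketch below enumerates the orders in which a set of mutations can accumulate such that every step increases a resistance-like score. It is a deliberately simplified stand-in for the simulation described above: the function name, the strict-improvement rule and the toy landscape values are assumptions, and no Rosetta Flex ddG predictions or measured IC50 values are used.

```python
from itertools import permutations


def accessible_pathways(mutations, score):
    """Return mutation orderings in which each added mutation raises the score.

    mutations -- labels of the mutations to accumulate
    score     -- dict mapping a frozenset of mutations to a resistance-like value
    """
    paths = []
    for order in permutations(mutations):
        genotype = frozenset()
        for mutation in order:
            candidate = genotype | {mutation}
            if score[candidate] <= score[genotype]:
                break  # this step is not favoured by selection
            genotype = candidate
        else:
            paths.append(order)  # every step improved the score
    return paths


# Hypothetical two-mutation landscape with sign epistasis:
# B is harmful alone but beneficial once A is present.
toy_landscape = {
    frozenset(): 0.0,
    frozenset({"A"}): 1.0,
    frozenset({"B"}): -1.0,
    frozenset({"A", "B"}): 3.0,
}
toy_paths = accessible_pathways(["A", "B"], toy_landscape)
```

In this toy landscape only the order A then B is accessible, which is the sense in which epistasis constrains the order of fixation.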
People who worked on this project:
emanko
scampino
tgclark
Selective whole genome amplification of Plasmodium malariae reveals insights into pop structure
The genomic diversity of Plasmodium malariae malaria parasites is understudied, partly because infected individuals tend to present with low parasite densities, leading to difficulties in obtaining sufficient parasite DNA for genome analysis. Selective whole genome amplification (SWGA) increases the relative levels of pathogen DNA in a clinical sample, but has not been adapted for P. malariae parasites. Here we design customized SWGA primers which successfully amplify P. malariae DNA extracted directly from unprocessed clinical blood samples obtained from patients with P. malariae-mono-infections from six countries, and further test the efficacy of SWGA on mixed infections with other Plasmodium spp. SWGA enables the successful whole genome sequencing of samples with low parasite density (i.e. one sample with a parasitaemia of 0.0064% resulted in 44% of the genome covered by ≥ 5 reads), leading to an average 14-fold increase in genome coverage when compared to unamplified samples. We identify a total of 868,476 genome-wide SNPs, of which 194,709 are unique across 18 high-quality isolates. After exclusion of the hypervariable subtelomeric regions, a high-quality core subset of 29,899 unique SNPs is defined. Population genetic analysis suggests that P. malariae parasites display clear geographical separation by continent. Further, SWGA successfully amplifies genetic regions of interest such as orthologs of P. falciparum drug resistance-associated loci (Pfdhfr, Pfdhps, Pfcrt, Pfk13 and Pfmdr1), and several non-synonymous SNPs were detected in these genes. In conclusion, we have established a robust SWGA approach that can assist whole genome sequencing of P. malariae, and thereby facilitate the implementation of much-needed large-scale multi-population genomic studies of this neglected malaria parasite. 
As demonstrated in other Plasmodia, such genetic diversity studies can provide insights into the biology underlying the disease and inform malaria surveillance and control measures.
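The coverage statistics quoted above (the fraction of the genome covered by at least 5 reads, and the fold increase relative to unamplified samples) can be computed from per-base depths as in the sketch below. The function names and toy depth values are assumptions for illustration; in practice the depths would come from an alignment tool's per-position depth output.

```python
def fraction_covered(depths, min_depth=5):
    """Fraction of genome positions with read depth of at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


def fold_increase(amplified_fraction, unamplified_fraction):
    """Ratio of coverage after SWGA to coverage without amplification."""
    if unamplified_fraction <= 0:
        raise ValueError("unamplified coverage must be positive")
    return amplified_fraction / unamplified_fraction


# Hypothetical per-base depths for a tiny four-position 'genome'.
toy_fraction = fraction_covered([0, 5, 10, 4])
```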
People who worked on this project:
mhiggins
scampino
tgclark
Global genetic diversity of var2csa in Plasmodium falciparum, pregnancy & vaccine development
Malaria infection during pregnancy, caused by the sequestering of Plasmodium falciparum parasites in the placenta, leads to high infant mortality and maternal morbidity. The parasite-placenta adherence mechanism is mediated by the VAR2CSA protein, a target for naturally occurring immunity. Currently, vaccine development is based on its ID1-DBL2Xb domain; however, little is known about the global genetic diversity of the encoding var2csa gene, which could influence vaccine efficacy. In a comprehensive analysis of the var2csa gene in >2,000 P. falciparum field isolates across 23 countries, we found that var2csa is duplicated at high prevalence (>25%), that African and Oceanian populations harbour a much higher diversity than other regions, and that insertions/deletions are abundant, leading to an underestimation of the diversity of the locus. Further, ID1-DBL2Xb haplotypes associated with adverse birth outcomes are present globally, and African-specific haplotypes exist, which should be incorporated into vaccine design.
People who worked on this project:
jdombrowski
scampino
tgclark
Genetic diversity of Plasmodium vivax isolates from pregnant women
Initially, the pregnant women were stratified into two groups (1 recurrence and 2 or more recurrences); no differences were observed in clinical gestational outcomes or in placental histological changes between the two groups. Then we evaluated the parasites genetically. An average of 18.5 distinct alleles were found at each of the microsatellite (MS) loci, and the expected heterozygosity (HE) calculated for each marker indicates a high genetic diversity occurring within the population. There was a high percentage of polyclonal infections (61.7%, 108/175), and one haplotype (H1) occurred frequently (20%), with only 9 of the haplotypes appearing in more than one patient.
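The per-marker diversity measure referred to above is commonly computed as the expected heterozygosity with a small-sample correction, HE = n/(n - 1) × (1 - Σ pᵢ²), where pᵢ is the frequency of allele i among n typed isolates. The sketch below assumes one allele per isolate at each marker (for polyclonal infections, for example, the predominant allele); the function name and that simplification are assumptions rather than the study's stated procedure.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity for one marker, with small-sample correction.

    alleles -- one allele call per isolate (any hashable label, e.g. fragment size)
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele calls")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical fragment sizes at one microsatellite marker.
toy_he = expected_heterozygosity([120, 120, 124, 124])
```

Values close to 1 indicate that two randomly chosen isolates are very likely to carry different alleles at that marker.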
People who worked on this project:
jdombrowski
hacfordpalmer
scampino
tgclark