Filling the African genome gap

22 February 2021 | Story Nadia Krige. Photo Nappy, Pexels. Read time 10 min.

When Professor Ambroise Wonkam, director of Genetic Medicine of African Populations (GeneMAP) in the University of Cape Town’s (UCT) Division of Human Genetics, first started formulating the idea of sequencing the genomes of three million Africans to build a more representative human reference genome, he anticipated a measure of resistance.

“My biggest reservation was that maybe it was too big, too crazy and too expensive,” Wonkam muses.

After he published a comment piece titled ‘Sequence three million genomes across Africa’ in the journal Nature on 10 February 2021, introducing and unpacking his vision, his anxieties proved unfounded: the piece was met with an enthusiastic global response.

“To be honest, I have never even written a science paper with original data that raised that much interest!” he says.

Filling a major gap

A good place to start talking about Wonkam’s Three Million African Genomes (3MAG) vision is to backtrack about thirty years to the launch of the Human Genome Project (HGP), an international research effort to determine the DNA sequence of the entire human genome.

Spanning more than a decade and costing $3 billion, the project resulted in the publication of the first accurate and complete human genome sequence.

As the HGP website explains, “[the project] gave us the ability, for the first time, to read nature's complete genetic blueprint for building a human being.”

Over the course of the project, the cost of sequencing decreased significantly, turning it into a much more accessible technology.


In many ways, the HGP revolutionised medicine and led to thousands of genome-wide association studies (GWAS) that shed light on the role genes play in a host of diseases, conditions and treatments.

While the HGP strove to be as demographically representative as possible, subsequent studies have focused largely on European and Asian populations, despite the fact that Africa contains more genetic diversity than any other continent.

In his paper, Wonkam reveals that, so far, less than 2% of the human genomes analysed have been those of people of African ancestry.

“The reference genome sequences built from the HGP are missing many variants from African ancestral genomes,” he writes. “A 2019 study estimated that a genome representing the DNA of the African population would have about 10% more DNA than the current reference.”

For the benefit of the world

While this oversight has had obvious drawbacks for the development of appropriate clinical interventions and health equity for African populations, Wonkam argues that it has also been to the detriment of the global population at large.

“The main reason for this is ancestry,” he explains. “Since this is where humans originated, we are all – in fact – African and would benefit from a more representative human reference genome.”


Ultimately, Wonkam contends that African genomes can reveal genes and variants contributing to health and disease that previous Eurocentric studies did not find.

Because African genomes collectively have more genetic variation and less intermixing with non-African populations, finding variants likely to contribute to specific conditions will also be easier.

“By impacting on Africa, we will be addressing the imperative of equity that the Human Genome Project and the 20 years that followed have not necessarily done optimally,” Wonkam says.

Bringing the 3MAG vision to life

Based on findings from previous studies and considering the continent’s vast ethnolinguistic and geographical diversity, Wonkam estimates that capturing the full scope of Africa’s genetic variation would require sequencing the genomes of at least three million African individuals.


Tackling a project of this magnitude is no mean feat and raises the question: where does one begin?

For Wonkam the answer is simple: start with the data that is currently available.

For the past ten years, the Human Heredity and Health in Africa (H3Africa) consortium – a collaboration between the African Society of Human Genetics (AfSHG), the National Institutes of Health (NIH) in the United States (US) and the Wellcome Trust – has boosted the study of genomics and environmental determinants of diseases that are common among African populations by supporting 30 institutes across the continent. Drawing to a close in 2022, H3Africa has laid a firm foundation on which 3MAG can be built.

“H3Africa is probably the strongest base to work from,” says Wonkam. “It has strengthened the training of African-based scientists and built a genetic community on the continent. It has also sequenced the genomes of thousands of Africans already, which would be a good place for us to start.”

Apart from this, Wonkam says that other publicly available datasets include the UK Biobank (which includes 8 000 genomes labelled black or African) and the Trans-Omics for Precision Medicine (TOPMed) programme in the US.

Once relevant data has been collected from these resources, a principal component analysis will be conducted to determine how these individuals are related to one another and identify different clusters.


This will help researchers define ethnolinguistic and geographic groupings, as well as the proportion of participants that will be required from each section of the continent.

“The proportion of individuals we select from each region will be determined by how homogeneous or heterogeneous the populations are,” explains Wonkam. “A country like Sudan is extremely heterogeneous, which means we’d need a large proportion of the population.”
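To make the principal component analysis step more concrete, here is a minimal sketch in Python. It is not the 3MAG team's pipeline: the genotype matrix is invented toy data, the function names are hypothetical, and the first principal component is found by simple power iteration using only the standard library. A real analysis would use dedicated population-genetics software on millions of variants.

```python
import random

# Toy genotype matrix (invented for illustration): one row per individual,
# one column per variant, values are counts of the alternate allele (0, 1, 2).
# The first four rows and the last four rows mimic two distinct populations.
GENOTYPES = [
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 1],
    [2, 2, 0, 1, 1],
    [1, 2, 0, 0, 1],
    [0, 0, 2, 2, 1],
    [0, 1, 2, 2, 1],
    [0, 0, 2, 1, 1],
    [1, 0, 2, 2, 1],
]


def center_columns(matrix):
    """Subtract each column's mean so every variant has mean zero."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def project(rows, direction):
    """Return the score of each row along the given direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in rows]


def first_principal_component(matrix, iterations=200, seed=0):
    """Estimate the first principal component by power iteration.

    Repeatedly applies X^T X (proportional to the covariance matrix of the
    centred data X) to a start vector and renormalises it each time.
    """
    centred = center_columns(matrix)
    n_features = len(centred[0])
    rng = random.Random(seed)
    direction = [rng.random() for _ in range(n_features)]
    for _ in range(iterations):
        scores = project(centred, direction)
        updated = [
            sum(score * row[j] for score, row in zip(scores, centred))
            for j in range(n_features)
        ]
        norm = sum(value * value for value in updated) ** 0.5
        direction = [value / norm for value in updated]
    return direction, project(centred, direction)


direction, scores = first_principal_component(GENOTYPES)

# Individuals on the same side of zero along the first component are
# treated as one cluster in this toy example.
clusters = [0 if score > 0 else 1 for score in scores]
print([round(score, 2) for score in scores])
print(clusters)
```

In this toy data the sign of each individual's score along the first component is enough to separate the two groups. With real cohorts, several components are usually inspected together, and the resulting clusters would then inform how many participants to recruit from each region.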

Costs and collaboration

Achieving the goal of sequencing three million African genomes would probably take about a decade and cost $450 million in core funding per year (about $1 500 per participant in total), Wonkam estimates. The plan would be to start by sequencing 300 000 genomes in the first year.
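The per-participant figure follows directly from the other numbers in the estimate. The short Python check below reproduces it, assuming that "about a decade" means exactly ten years; the variable names are illustrative only.

```python
# Figures quoted in the estimate above.
ANNUAL_CORE_FUNDING_USD = 450_000_000
TARGET_GENOMES = 3_000_000

# Assumption: "about a decade" is treated as exactly 10 years.
PROJECT_YEARS = 10

total_cost_usd = ANNUAL_CORE_FUNDING_USD * PROJECT_YEARS
cost_per_participant_usd = total_cost_usd // TARGET_GENOMES
average_genomes_per_year = TARGET_GENOMES // PROJECT_YEARS

print(f"Total core funding: ${total_cost_usd:,}")
print(f"Cost per participant: ${cost_per_participant_usd:,}")
print(f"Average genomes per year: {average_genomes_per_year:,}")
```

Under that assumption, the first-year target of 300 000 genomes matches the average yearly rate needed to reach three million in ten years.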

“We believe that technology will allow us to accelerate the path of sequencing,” he says.

Following the publication of his paper earlier this month, Wonkam has been overwhelmed with positive feedback and interest from far and wide. This has been massively encouraging, as launching and maintaining a project of this nature will require support from African governments, research institutes and international organisations. Its success will rely heavily on collaborations across the continent and the globe, between both academic and corporate research initiatives.

Wonkam believes that while the best time to have started this project was thirty years ago, the second-best time is now.

The publication of his paper has already set things in motion in ways that he couldn’t have predicted. Over the next few weeks and months, he will be facilitating meetings between major stakeholders such as H3Africa, AfSHG, the African Academy of Sciences and the African Union, as well as the World Health Organization (WHO) and funders like the NIH and the Wellcome Trust.

The first priority will be to establish governance and related rules of engagement for managing the project. Wonkam says that it’s important for “the public, the government and the funders to trust that the project will be done in a way that is sustainable, ethically sound and will generate scientific knowledge, to improve health and disease in all people of the world, as well as an R&D product that can enhance the global and continental bioeconomy.”

While Wonkam may be credited with envisioning 3MAG, he by no means feels possessive of the project.

“My hope is that this paper may give voice to what many of us have been thinking,” he says. “And if it does, then I see many people carrying the project forward.”

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

Please view the republishing articles page for more information.