Advancements in information and digital technologies offer both a challenge and an opportunity to researchers, as they begin to collect and mine data on a scale never previously imagined. As the rate of data collection, the volume of data and the complexity of analysis increase, research enterprises are also becoming more global. Large, data-intensive research groups now tend to be made up of researchers from around the world, all of whom need access to the same data sets and software systems. To stay globally competitive, research institutions must work together to meet the needs of this rapidly changing era.
The investment required to meet these needs is significant for a developing country such as South Africa, and beyond the means of any single entity. UCT is therefore working with other research institutions and with government to build a cloud-based platform that will allow researchers anywhere to work on massive data sets, using any device.
A range of partners have come together, under different consortia, to contribute to the creation of a cloud-based data-intensive research platform that will begin to provide a national solution to South Africa’s big-data science challenge.
To begin with, this platform will meet the needs of three strategic disciplines: astronomy, bioinformatics and geospatial research.
African Research Cloud (ARC)
The ARC, a collaboration between UCT, the Inter-University Institute for Data Intensive Astronomy (IDIA) and North-West University, is the prototype for a cloud-based service for researchers working in data-intensive disciplines. Established in 2016, the ARC is testing different models of data management, storage and transfer through radio astronomy and genomics projects.
“The initiative is a first for Africa, and will be a real benefit to researchers on the continent,” says Sakkie Janse van Rensburg, UCT’s executive director of Information and Communication Technology Services (ICTS).
South African Data-Intensive Research Cloud (SADIRC)
Given the success of the ARC prototype, the next step is to expand the ARC to include a greater number of research institutes, including both universities and organisations such as SKA South Africa and the South African National Space Agency (SANSA). A memorandum of understanding that will formally constitute SADIRC was in development at the time of writing.
In time, it is hoped, SADIRC will expand to offer all South African researchers, including those based at our most under-resourced institutions, access to storage for massive data sets, as well as the tools and software to collaborate on, analyse and visualise the data.
The establishment of the ARC meant that UCT was well placed to lead a consortium of institutions in the Western Cape province of South Africa in submitting a bid to the National Integrated Cyberinfrastructure System (NICIS), supported by the Department of Science and Technology. The goal of this bid was to build a data-intensive research facility in the Western Cape that would cater explicitly to the needs of researchers working in astronomy and bioinformatics. The bid was successful, and today the project is known as ‘Ilifu’ (‘cloud’, in isiXhosa).
Ilifu will receive funding from the Department of Science and Technology for a period of three years. It will bring together the existing infrastructure and expertise of the various partner institutions, and build on that to create a hub for data-intensive research systems, platforms and tools in the Western Cape. A further mandate for Ilifu is the development of a research data management system (see the following chapter of this publication).
Ilifu is set to become self-sustaining within the three-year funding period, so that it can continue as a facility once that funding ends.
Greater than the sum of its parts
While the investment in infrastructure is segmented, the offering itself is greater than the sum of its parts. Working together, Ilifu, the ARC and (in time) the SADIRC will provide researchers access – through an online portal – to the entire tiered infrastructure system, as one entity (see the National Integrated Cyberinfrastructure System box).
A researcher will thus be able to log in to the online portal, from any device in any location, to access the stored data sets and run the necessary programs to analyse and visualise the data.
“Cloud technology has the capacity to democratise big-data analytics,” says Professor Russ Taylor, Ilifu project lead. “This not only empowers individual researchers, giving them real control over their data, but also allows distributed organisations to work together as one.”
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.