Fighting COVID-19 With the Power of Genomics and HPC

Researchers at Cardiff University are using the power of genomic sequencing and high performance computing to unlock the secrets of COVID-19.

In scientific laboratories around the world, efforts are under way to put the power of genomic sequencing and high performance computing (HPC) to work in the fight against COVID-19. At Cardiff University in Wales, a team of scientists is working with the COVID-19 Genomics UK Consortium (COG-UK) to unlock the secrets of the coronavirus that causes COVID-19.

This team is led by Dr. Thomas Connor, a distinguished researcher of the Cardiff sequencing center under the umbrella of COG-UK. The COG-UK organization brings together experts from across the UK National Health Service (NHS), academia and public health agencies for large-scale, rapid genomic sequencing and analysis of the coronavirus. This information can then be quickly shared with hospitals, the NHS and the government to help inform their responses to the pandemic.

“Genomic sequencing will help us to understand coronavirus and its spread,” Dr. Connor says in a Cardiff University news release. “By analyzing samples from people who have had confirmed cases of COVID-19, scientists can monitor changes in the virus at a national scale to understand how the virus is spreading and whether different strains are emerging. Having this information available will help in the clinical care of patients — and ultimately help to save lives.”

COG-UK also benefits from another project where Cardiff University has played a key role: the MRC CLIMB project. Dr. Connor leads the Medical Research Council (MRC) Cloud Infrastructure for Microbial Bioinformatics (CLIMB) project at Cardiff University with support from Supercomputing Wales to supply COG-UK with the computational resources needed to share and analyze the large volumes of COVID-19 genomics data now being generated across the UK.

Working with Dell Technologies

In these efforts, Dr. Connor and his colleagues are building on a longstanding relationship with Dell Technologies. The team is working to ensure that CLIMB has the capacity to share and analyze large volumes of COVID-19 genomics data. With this infrastructure in place, the University has the potential to sequence samples within 24 hours, allowing for real-time responses.

“CLIMB has become an essential national capability for microbiologists in the UK,” Dr. Connor says in a Dell Technologies case study. It serves more than 1,000 users and over 300 research groups from 89 research institutions, including universities, public health agencies and governmental organizations. In addition, CLIMB has provided training in bioinformatics to thousands of academics, students and clinical microbiologists across the UK and as far afield as Palestine, Gambia and Vietnam.

A look under the hood of MRC CLIMB

The core infrastructure for CLIMB is a Dell EMC cloud system running the open source OpenStack cloud platform. To enhance resiliency, CLIMB is spread over four sites, each with 500 TB of local scratch storage.

At the heart of the CLIMB environment is a large shared object storage system that provides about 2.5 petabytes of HPC data storage, which can be replicated between sites. This storage system is based on Red Hat Ceph Storage running on Dell EMC PowerEdge servers with Intel® Xeon® processors. This community system provides a place where researchers can store and share very large microbial datasets.

In addition, the CLIMB cloud environment offers access to a huge amount of memory — more than 78 terabytes of RAM. With all this muscle under the hood, CLIMB can run more than 1,000 virtual machines simultaneously, and each of these VMs can be preloaded with software, customized by end users and saved as snapshots for reuse by others on the infrastructure.

“For seven or eight years, I’ve had a really great relationship with the HPC team at Dell,” Dr. Connor notes. “They answer our questions and help out whenever we need help. They have been really accommodating in terms of helping us to get the solution that we need to do the work that we do. That’s a really positive thing that has come from my interactions with Dell.”

Another positive outcome is the research itself, which is powered by the HPC clusters that drive the high-throughput sequencing and bioinformatics used to fight infectious diseases and enable personalized healthcare.

“In the last 12 months, we have sequenced around 8,000 to 9,000 patient samples across our genomics programs,” Dr. Connor says, “and all of that has been processed through our hardware supplied by Dell.”

To learn more

For a closer look at the work of Dr. Thomas Connor and his colleagues at Cardiff University, see the Dell Technologies case study “Unleashing the power of genomics.” And to learn more about the global effort to fight the deadly COVID-19 disease with the power of HPC, visit the COVID-19 High Performance Computing Consortium.