eMedLab, a partnership of seven leading bioinformatics research and academic institutions, is using a new private cloud, HPC environment and big data system to support the efforts of hundreds of researchers studying cancers, cardio-vascular and rare diseases.
The research focuses on understanding the causes of these diseases and how a person’s genetics may influence their predisposition to the disease and potential treatment responses.
The new HPC cloud environment combines a Red Hat Enterprise Linux OpenStack platform with Lenovo Flex System hardware to create virtual HPC clusters – bespoke to individual researchers’ requirements.
The system has been designed, integrated and configured by OCF, an HPC, big data and predictive analytics provider, working closely with its partners Red Hat, Lenovo, Mellanox Technologies and in collaboration with eMedlab's researchers.
‘Bioinformatics is a very, very data intensive discipline,’ said Jacky Pallas, Director of Research Platforms, University College London. ‘We want to study a lot of de-identified, anonymous human data. It’s not practical – from data transfer and data storage perspectives - to have scientists replicating the same datasets across their own, separate physical HPC resources, so we’re creating a single store for up to 6 Petabytes of data and a shared HPC environment within which researchers can build their own virtual clusters to support their work.’
The High Performance Computing environment is being hosted at a shared data centre for education and research, offered by digital technologies charity Jisc. The data centre has the capacity, technological capability and flexibility to support eMedLab’s HPC needs, with its ability to accommodate multiple and varied research projects concurrently in a highly collaborative environment.
The eMedLab partnership was formed in 2014 with funding from the Medical Research Council. Original members University College London, Queen Mary University of London, London School of Hygiene & Tropical Medicine, the Francis Crick Institute, the Wellcome Trust Sanger Institute and the EMBL European Bioinformatics Institute have been joined recently by King’s College London.
The Red Hat Enterprise Linux OpenStack Platform, a highly scalable Infrastructure-as-a-Service [IaaS] solution, enables scientists to create and use virtual clusters bespoke to their needs, allowing them to select compute memory, processors, networking, storage and archiving policies, all orchestrated by a simple web-based user-Interface. Researchers will be able access up to 6,000 cores of processing power.
‘We generate such large quantities of data that it can take weeks to transfer data from one site to another,’ said Tim Cutts, Head of Scientific Computing, the Wellcome Trust Sanger Institute. ‘Data in eMedLab will stay in one secure place and researchers will be able to dynamically create their own virtual HPC cluster to run their software and algorithms to interrogate the data, choosing the number of cores, operating system and other attributes to create the ideal cluster for their research.’
Cutts added: ‘The Red Hat Enterprise Linux OpenStack Platform enables our researchers to do this rapidly and using open standards which can be shared with the community.’
‘This is a tremendous and important win for Red Hat,’ said Radhesh Balakrishnan, general manager, OpenStack, Red Hat. ‘eMedLab’s deployment of Red Hat Enterprise Linux OpenStack Platform into its HPC environment for this data intensive project further highlights our leadership in this space and ability to deliver a fully supported, stable, and reliable production-ready OpenStack solution.’
Red Hat technology allows consortia such as eMedLab to use cutting edge self-service compute, storage, networking, and other new services as these are adopted as core OpenStack technologies, while still offering the service and support that Red Hat is known for. The use of Red Hat Enterprise Linux OpenStack Platform provides cutting edge technologies along with enterprise-grade support and services; leaving researchers to focus on the research and other medical challenges.