A new bioinformatics processor called Dragen, which cuts genomic pipeline run times from hours to minutes, has been deployed for the first time by a UK institute. A collaboration between developer Edico Genome and The Genome Analysis Centre (TGAC) in the UK has produced the first adaptation of the Dragen technology for the analysis of non-human genomes, as part of the institute’s endeavours to sequence the DNA of plant, animal and microbial species to promote a sustainable bioeconomy.
Initial evaluations of Dragen showed that mapping against the ash tree genome was 177 times faster per processing core than TGAC’s local high-performance computing (HPC) systems, requiring only seven minutes instead of three hours on one of the larger datasets. Alignment runs on the rice genome that take approximately two hours on TGAC’s HPC servers took just three minutes using Dragen.
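The wall-clock figures quoted above can be turned into simple speedup ratios. The short Python sketch below is illustrative only: the function name `speedup` is hypothetical, and the run times are the approximate values given in this article. It does not reproduce the 177-times per-core figure, which presumably also accounts for the number of processing cores used on each system, a detail not given here.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run is in wall-clock terms."""
    if accelerated_minutes <= 0:
        raise ValueError("accelerated run time must be positive")
    return baseline_minutes / accelerated_minutes


# Approximate run times quoted in the article, converted to minutes.
ash = speedup(3 * 60, 7)   # ash tree mapping: three hours vs seven minutes
rice = speedup(2 * 60, 3)  # rice alignment: two hours vs three minutes

print(f"ash tree: about {ash:.1f}x faster")
print(f"rice:     about {rice:.0f}x faster")
```

On these figures the wall-clock gain is roughly 26-fold for ash and 40-fold for rice, so the per-core comparison is a separate and larger measure.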
‘We are really excited to be Edico Genome’s first Dragen customer in the UK, and we hotly anticipate utilising this ground-breaking technology to advance our mission to promote a sustainable bioeconomy and maintain the UK’s food security,’ commented Dr Tim Stitt, project lead and head of Scientific Computing at TGAC. ‘In particular, we are really interested to see how Dragen handles the wheat genome, which is five times bigger than the human genome and much more complex. Wheat is the staple diet for over 35 percent of the world’s population, which is predicted to increase to nine billion people by 2050.
‘By understanding the genomic building blocks of wheat, and its diversity, we can better inform breeders on how to improve their yields, particularly in areas where wheat is prone to disease and drought. Obviously the sooner we do this the better and Dragen can greatly help us in this mission.’
The Dragen Bio-IT Processor is integrated on a PCIe card and available in a pre-configured server, enabling seamless integration into bioinformatics workflows. Dragen is highly reconfigurable, using a field-programmable gate array (FPGA) to provide hardware-accelerated implementations of BCL conversion, compression, mapping, alignment, sorting, duplicate marking, haplotype variant calling and joint genotyping.
The Dragen system is therefore much faster than traditional approaches that run the same algorithms in software. In a recent study published in Genome Medicine, Dragen sped up analysis of a whole genome from 22.5 hours to 41 minutes, while also achieving sensitivity and specificity of 99.5 percent. Similar efficiency gains could have an enormous impact at TGAC, which processes genomic data at high throughput and where sequence alignment is critical to many sequencing projects.