The Laboratory Information Management System (LIMS) developed for the Moorea Biocode Project is to be made publicly available as a free beta version. The Biocode LIMS and data analysis components of the project were developed by Biomatters in collaboration with the Biocode Project researchers as a plugin for Biomatters’ Geneious Pro sequence analysis software.
The Geneious Biocode LIMS will give biologists around the world a tool to use in their own research, as well as access to the Moorea project’s final database. An accompanying ‘Biocode Genbank Submission’ plugin will allow researchers to upload their own sequence data from inside Geneious Pro directly to GenBank, the world’s largest public DNA sequence database.
The Biocode LIMS provides an informatics pipeline for batch processing of samples from DNA extraction through to sequencing, for identifying and re-running failed reactions, and for identifying systematic errors that can be strategically addressed. It integrates with Geneious Pro’s existing sequence assembly tools and with various Field Information Management Systems (FIMS), including those accessed through the TAPIR standard access protocol.
Once a specimen reaches the end of the pipeline, the Biocode Genbank Submission plugin automates the submission of completed contigs to make the DNA sequences publicly available. As a quality control mechanism, the reaction data from the LIMS database is combined with the field metadata from the FIMS database, so that each submission carries the specimen’s completely tracked history.
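For readers who want a concrete picture of the tracking described above, the following is a minimal sketch in Python, not the plugin's actual code or API. It assumes each reaction is recorded with a specimen identifier, plate, well, stage and pass/fail result, and that field metadata can be looked up by specimen identifier. Every name in it (`Reaction`, `failed_reactions`, `failures_per_plate`, `specimen_history`) and every sample value is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """One well on one plate (hypothetical record layout)."""
    specimen_id: str
    plate: str
    well: str
    stage: str      # e.g. "extraction", "pcr", "sequencing"
    passed: bool


def failed_reactions(reactions):
    """Return the reactions that failed and are candidates for a re-run."""
    return [r for r in reactions if not r.passed]


def failures_per_plate(reactions):
    """Count failures by plate; a plate with many failures hints at a systematic error."""
    return Counter(r.plate for r in reactions if not r.passed)


def specimen_history(reactions, field_records):
    """Combine lab reactions with field metadata, keyed by specimen id.

    A specimen with no field record gets field=None, so missing metadata
    is visible before anything is submitted.
    """
    history = {}
    for r in reactions:
        entry = history.setdefault(
            r.specimen_id,
            {"field": field_records.get(r.specimen_id), "reactions": []},
        )
        entry["reactions"].append(r)
    return history


# Made-up sample data, for illustration only.
reactions = [
    Reaction("S1", "P1", "A1", "pcr", True),
    Reaction("S2", "P1", "A2", "pcr", False),
    Reaction("S3", "P1", "A3", "pcr", False),
    Reaction("S1", "P2", "A1", "sequencing", True),
]
field_records = {
    "S1": {"locality": "example site"},
    "S2": {"locality": "example site"},
}

rerun = failed_reactions(reactions)
by_plate = failures_per_plate(reactions)
history = specimen_history(reactions, field_records)
```

In this sketch, `rerun` lists the wells to set up again, `by_plate` shows where failures cluster, and `history` pairs each specimen's lab record with its field record, leaving a visible gap where the field record is missing.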
Neil Davies, director of UC Berkeley’s Gump South Pacific Research Station and principal investigator of the Biocode Project, says: ‘This is the first freely available, broadly applicable software tool to assist tracking materials through the DNA barcoding pipeline. No other freely available program allows the level of tracking and data quality assurance through a lab system. Importantly, it goes beyond DNA barcoding to accommodate multiple genetic markers for use in a broad range of biodiversity and ecogenomic studies.
‘The plate workflow approach taken has greatly simplified the process of identifying reaction failures, and setting them up to be run again. It manages data that used to be spread across multiple individuals and notebooks so we can search it, report success, or look for patterns. It has significantly reduced the human error that has been problematic in large-scale sequencing projects such as this in the past,’ he added.