Fostering Data Standardisation for Collaborative Innovation in the Analytical Lab
R&D-driven industries talk a lot about progressing towards the paperless laboratory. The ultimate aim is to capture and store all experimental, process, inventory and results data, from the earliest stages of discovery, through to manufacturing, QA/QC and even instrument management, in an electronic format.
While instrument and software vendors are developing the tools to facilitate hands-off laboratory workflows, what has been lacking is a unifying language for communication and standardised data formats. ‘Laboratories today are disconnected,’ states Gene Tetreault, senior director, BIOVIA product management at Dassault Systèmes.
‘Typically, you will find multiple method authoring systems and execution platforms, poor integration of existing electronic systems and equipment, reliance on error-prone, manual activities and disparate data formats. This disconnection means that packets of data from the same workflow are commonly locked into different dead data silos and are not transferable.’
Tetreault uses the evolution of printers as a basic analogy. ‘Twenty years ago, when someone bought a new printer they would have to plug it in, connect it to a PC, download and install specific drivers and hope that the computer’s software and the printer software could understand each other. Every printer used different software, so it was often hit-and-miss. Things have progressed hugely since then. Today, you can connect a printer wirelessly with your laptop, the two communicate and you are set up in minutes.’
When it comes to laboratory informatics platforms, analytical software and hardware, we are still at the ‘early printer’ stage of communication, he suggests.
Instrument and software vendors develop their own data formats and languages. For some, these formats represent unique selling points and keep customers loyal. The software may offer a particular feature that isn’t available using other packages, and it’s easier to stay with your current software or instrument vendor when you are already set up to use their ‘language’ than to transition everything to a different vendor.
But using product-specific software can complicate the lab’s operation, Tetreault points out. The range of software commonly used in a laboratory may span laboratory information management systems (LIMS), electronic laboratory notebooks (ELNs), enterprise resource planning (ERP), manufacturing execution systems (MES), product lifecycle management (PLM) and corrective and preventive action (CAPA) software, but these systems are often hard to interface with one another.
‘Having various informatics platforms grounded on the same language and data formats results in substantial efficiency improvements through the ability to search, mine and analyse collated data. Unfortunately, different software and informatics platforms from competing vendors commonly do not interface,’ Tetreault states.
This data disconnect is something that life science companies have long realised, Tetreault continues. ‘They recognised that business processes in the laboratory were basically broken, because everything was “non-standard”. Data were being collected in electronic format, but there was no standard architecture.’
A more seamless operational informatics infrastructure with standardised languages and data formats would enable scientific reproducibility and improve compliance, data integrity and context, while maximising the ability to reuse data. ‘With this capacity follows lower total cost of ownership, greater efficiency and faster time to value for implemented solutions.’
With the understanding that standardisation could help speed product development and time to market, while reducing attrition and repetition, a number of like-minded life science companies set up the Allotrope Foundation in 2012.
The goal was to create a standardised data format for the acquisition, sharing, and management of all experimental and analytical data, process data and structured methodology. ‘It’s about enabling digital continuity,’ Tetreault states. ‘The challenge was to bridge all the existing gaps between software, hardware, data management and reporting tools, to enable more compliant data tracking and better data integrity, do away with dead data silos and simplify the overall laboratory operational landscape.’ Ultimately, data then becomes more transparent and accessible to the scientists, managers and decision makers.
Allotrope’s partners, a growing list of instrument and software vendors that includes BIOVIA, are working together to shape this standardisation. They are driving development of the overarching Allotrope Framework that supports it, as well as the elements of the standards and languages.
On top of the framework sits a single, unifying data format, the Allotrope data format (ADF), for managing all laboratory data and ancillary files. The ADF is underpinned by Allotrope taxonomies and ontologies, which represent a standardised vocabulary and terminology for describing tests, methods and processes.
Allotrope data models then provide a route for using the ontologies and taxonomies to categorise and describe the laboratory’s data holistically and reproducibly. Alongside the vendor partners, Allotrope Foundation members, including some of the world’s largest pharma and biopharma companies, are working to implement Allotrope Framework solutions in real-world settings within their laboratories, while providing feedback to the partners so that the framework continually evolves.
BIOVIA has worked as an Allotrope Foundation vendor partner since 2015, to help develop a standardised data framework that other software vendors can plug into.
‘It made perfect sense for us to join the Allotrope Foundation because we were already working towards an open laboratory system through which users could implement their own standards,’ Tetreault said. ‘Allotrope aligned perfectly with our goal of enabling laboratory networks of software and hardware to communicate with the same language.’
Vendors and developers are now working together on defining all the pieces and fitting them together. ‘Once we can agree on units, parameter names, process steps, measurement names, etc, there will be a dictionary we can all use to make laboratories consistent, and so make equivalent data understandable and comparable. And standardisation will help ensure data quality and integrity, as well as regulatory compliance, which are critical to any R&D or manufacturing environment.’
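The idea of an agreed dictionary can be illustrated with a minimal Python sketch. It assumes a small, hypothetical lookup of parameter names and units; none of the terms or conversion factors are taken from the Allotrope taxonomies or ontologies.

```python
# Hypothetical shared dictionary: every name and factor below is illustrative,
# not taken from the Allotrope taxonomies or ontologies.
STANDARD_TERMS = {
    "rt": "retention time",
    "ret_time": "retention time",
    "retention time": "retention time",
    "conc": "concentration",
    "concentration": "concentration",
}

# One agreed unit per quantity, with factors that convert into it.
TO_STANDARD_UNIT = {
    "retention time": ("s", {"s": 1.0, "min": 60.0}),
    "concentration": ("mg/mL", {"mg/mL": 1.0, "g/L": 1.0}),
}


def standardise(name, value, unit):
    """Map a vendor-specific measurement onto the shared vocabulary."""
    term = STANDARD_TERMS[name.strip().lower()]
    standard_unit, factors = TO_STANDARD_UNIT[term]
    return {"term": term, "value": value * factors[unit], "unit": standard_unit}


# Two instruments report the same quantity under different names and units.
a = standardise("RT", 90.0, "s")
b = standardise("ret_time", 1.5, "min")
print(a == b)  # the two results are now directly comparable
```

Once both results are expressed with the same term and unit, comparing or searching them no longer depends on knowing which instrument produced them.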
In October 2017 BIOVIA released the latest version of the Dassault Systèmes ONE Lab solution. This is the first informatics platform developed to support the unified laboratory concept with ADF implemented throughout the laboratory workflow. The ONE Lab solution includes ONE Research Lab, ONE Development Lab, and ONE Quality Lab configurations, to meet the needs of each type of laboratory operation.
Described as a platform for ‘ideation to commercialisation’, ONE Lab is designed to allow organisations to operate as a unified lab, enabling comprehensive executive insight, faster cycle times and technology transfer, digital compliance, and a simpler IT environment.
Tetreault continues: ‘The goal is to facilitate total laboratory connectivity from research to manufacturing, and support external collaboration, more efficient use of manpower and resources, and better business insight.’ ONE Lab effectively automates and standardises everything – the test procedures, instrument connections, etc – in small molecule or biologics research, development and quality control labs.
From the perspective of supporting an Allotrope Foundation framework for complete laboratory standardisation, the ONE Lab infrastructure allows laboratories to adopt a unified standard and language, while still using different software applications, because resulting data can be converted into the standard format.
Tetreault notes, ‘We have incorporated both reader and writer functionality components in our BIOVIA Pipeline Pilot solution to read and to write ADF files. In addition, we have the ability to synchronise with Allotrope taxonomy and ontology data for material, process, equipment and result data.’
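The reader and writer idea can be sketched in a few lines of Python. This is an illustration only: the vendor record layouts and field names are hypothetical, JSON stands in for the real file format, and nothing here reflects the Pipeline Pilot API or the actual structure of an ADF file.

```python
import json

# Hypothetical exports from two vendors describing the same kind of result.
vendor_a = {"SampleID": "S-001", "Abs@280": 0.42}
vendor_b = {"sample": "S-002", "absorbance": {"wavelength_nm": 280, "au": 0.39}}


def from_vendor_a(rec):
    """Convert vendor A's flat record into the common shape."""
    return {"sample_id": rec["SampleID"], "measurement": "absorbance",
            "wavelength_nm": 280, "value": rec["Abs@280"]}


def from_vendor_b(rec):
    """Convert vendor B's nested record into the common shape."""
    return {"sample_id": rec["sample"], "measurement": "absorbance",
            "wavelength_nm": rec["absorbance"]["wavelength_nm"],
            "value": rec["absorbance"]["au"]}


def write_common(records):
    """Writer: serialise common records (JSON is a stand-in format here)."""
    return json.dumps(records, sort_keys=True)


def read_common(text):
    """Reader: load common records back for searching and comparison."""
    return json.loads(text)


common = [from_vendor_a(vendor_a), from_vendor_b(vendor_b)]
restored = read_common(write_common(common))

# Once everything shares one shape, a single query covers both sources.
high = [r["sample_id"] for r in restored if r["value"] > 0.4]
print(high)
```

The design point is that each application keeps its own native output, and only a thin converter per source is needed to bring the results into one searchable form.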
What this means is that practically every ad hoc or repeated process is carried out according to validated procedures, recorded using the same language and reported under that single framework. ‘Laboratories can adopt Allotrope Foundation standard vocabularies, taxonomies and ontologies, so as you are developing an experiment or procedure, taking a measurement, describing materials or an operation in the laboratory, you are doing it in a standard language. It’s like adopting English as a standard spoken language.’
These are critical capabilities, Tetreault suggests, which enable the industry to generate information-rich data acquisition layers. ‘So, when we acquire data from systems, we can transform it into the ADF format, but on top of that, we have the ability to also standardise every procedure executed in the lab, including the most mundane or common processes such as compound registration, making solutions, or using and calibrating equipment.’
‘All of those procedures can now be based on building blocks of standard nomenclature and process steps, which are all configured around the S88 industry standard for authoring and defining processes, and for defining the materials and equipment associated with those processes. Layered on top of that, again you have the experiments and studies that use these standard procedures, standard data and standard language. All of that can now be stored in and retrieved from a cloud-based data lake that can be indexed and accessed by people from anywhere in the world,’ Tetreault continues.
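The building-block idea can be sketched in Python using the nesting of the ISA-88 (S88) procedural model: procedure, unit procedure, operation and phase. The class layout, step names, equipment and materials below are hypothetical illustrations, not a representation of how any BIOVIA product stores procedures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phase:
    name: str            # smallest reusable step, e.g. "weigh"
    equipment: str       # equipment the step is performed on
    material: str = ""   # material involved, if any


@dataclass
class Operation:
    name: str
    phases: List[Phase] = field(default_factory=list)


@dataclass
class UnitProcedure:
    name: str
    operations: List[Operation] = field(default_factory=list)


@dataclass
class Procedure:
    name: str
    unit_procedures: List[UnitProcedure] = field(default_factory=list)

    def steps(self):
        """Flatten the hierarchy into an ordered list of phase names."""
        return [p.name for up in self.unit_procedures
                for op in up.operations for p in op.phases]


# Reusable building blocks with standard names (all hypothetical).
calibrate = Phase("calibrate", "balance")
weigh = Phase("weigh", "balance", "sodium chloride")
dissolve = Phase("dissolve", "volumetric flask", "water")

make_solution = Procedure("make solution", [
    UnitProcedure("preparation", [
        Operation("check equipment", [calibrate]),
        Operation("prepare", [weigh, dissolve]),
    ]),
])
print(make_solution.steps())
```

Because the same `calibrate` or `weigh` block can appear in many procedures, every record of those steps uses the same name and structure, which is what makes them searchable across experiments.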
Laboratory-wide standardisation creates huge value within an organisation, above and beyond just sticking to standard operating procedures (SOPs) and ensuring data integrity, Tetreault suggests. ‘It reduces inefficiencies and data utility challenges through consistency. The ability to easily access both current and historical data enables scientists and business managers to avoid repeating experiments. The ability to rapidly compare results informs scientific and operational decision making and aids regulatory review and audit. It helps you to understand your processes and data. It aids laboratory integration. Ultimately all your processes become more efficient and effective.’
Standardisation also makes it easier to collaborate with partners or service providers. ‘If everyone is using the same language and data format, then you can easily communicate and exchange information with a partner, contract research organisation (CRO), contract manufacturer or other service provider,’ Tetreault notes. ‘And you can do that without having to translate data or provide summaries in flat files such as PDFs.’
While an increasing number of vendors and life science companies are joining Allotrope as members to help develop the framework and demonstrate its utility in industrial settings, there is still work to be done. BIOVIA is collaborating with instrument vendors to align equipment with its ONE Lab infrastructure, so that laboratories can effectively purchase new equipment and interface directly through BIOVIA’s standard lab services without facing communication issues.
‘We look at a laboratory workflow from end to end. We provide the documentation of experiments, the execution of procedures and the management of equipment, all of those logistic functions in the lab. But we are not an equipment vendor, so we work with equipment vendor partners to make sure their equipment fits into the customer’s workflows using the ADF standard.’
Out of the box, the BIOVIA software is effectively empty of content, Tetreault explains. ‘It’s a bit like opening up a dictionary in which every page is blank, and then being able to press a button and pull in all of the language, or all of the words, the vocabularies that would go into that dictionary. You then start working with a full dictionary of terms for every laboratory process. There is no uncertainty about how to name or describe things, the terminology is clear.’
A growing number of partner software and instrument vendor companies are joining the Allotrope Foundation to continue developing the framework. Similarly, there is a broad spectrum of member companies that represent the end users, and this now includes many of the world’s major pharma and biopharma companies, who are working alongside the Allotrope Foundation on implementations of the ADF data standard, taxonomies and ontologies.
Firms don’t need to be members to take advantage of the ADF standard and language, but there’s a lot of motivation for vendors to join as partners. Members whose implementations are supported by Allotrope Foundation partners can feed directly into how the framework evolves.
‘Some of those member companies are now publishing that they have thousands of data points in their data lakes that are being accessed by thousands of scientists on a daily basis, and they are already starting to realise major savings from not having to search for, or recreate data, because it is all accessible,’ Tetreault comments. ‘Scientists are able to work more efficiently, and are deriving more value from their experiments and analyses. Ultimately, that means that data can not only be captured as part of your experiments and procedures, but it can be reused over and over again, so you won’t have to do as many tests or as many experiments in the future.’
The industry is now working together to define that language and make compromises, so that the language is as relevant as possible to everyone.
‘Now there is tremendous value in coming to a compromise,’ Tetreault notes. ‘When we can all agree on units, on parameter names, on what to call process steps, then that dictionary is useful to everyone and you will be able to find all your data using standard terms. You will be able to exchange data within an organisation and with external partners, and to access and read it. BIOVIA is playing a key role in making this possible by adopting an underlying architecture to capture and store data, and to implement standards.’