Sophia Ktori considers how software integration helps ensure scientists work efficiently in the laboratory
Interconnection is an imperative for the modern laboratory, and seamless integration is mandatory from the perspective of the end user, who is generally a scientist. That’s the view of Leif Pedersen, president of software at Certara.
‘Scientists using lab technologies want to be able to do their work as effectively and efficiently as possible, and part of that requires interconnecting their systems and instrumentation in a seamless way.’
The key aim is to establish lab workflows that make it easier to find answers to scientific questions, to save time and also – in a drug discovery or development setting, for example – to identify likely failures as early as possible.
‘The overall goal is to gain a more insightful, aggregated view of distributed and collective data, enable analysis and decision-making, and collaborate with colleagues in a way that helps achieve common goals faster.
‘In the life science arena, software vendors are trying to provide their customers with an environment that facilitates decision-making based on insight from many data sources,’ Pedersen said. ‘We’re trying to bring it all into a relevant dashboard, not just at the high level, like which molecules to proceed with or not, but also the ability to provide views into what to do next to progress the right candidates forward as fast as possible.’
Bringing data together
The concept of integration needs to be founded on an understanding of what data is generated in labs, and of why and how that data can be used and combined with other data.
Pedersen said: ‘We have to think of both actual experimental data, and also the related metadata, which provides the context. There are some software tools on the market that excel in enriching the overall data platform, so that you can have even smarter intelligence by bringing together all of the different data in a more meaningful way.’
This is an essential part of the integration strategy, Pedersen said. ‘It enables better sharing of data more widely. Look at how fast the scientific communities have been working to develop new understanding of, and vaccines for, Covid-19. That is partly thanks to the ability to integrate systems and contextualise data and results, but also due to sharing data in a collaborative way to make integration comprehensive and meaningful.’
A key factor in being able to quickly derive meaningful insights is the ability to aggregate data and standardise data, taxonomies and ontologies, Pedersen reiterated. ‘There’s a great move towards building in a more semantic data layer that makes it possible to ask better questions and get more intelligent answers from the data.’
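As a purely illustrative sketch (the Python below uses hypothetical field names and vocabulary, not Certara's actual data model), a semantic layer can be as simple as storing every result alongside shared, controlled metadata terms, so that one query can span data from different labs and instruments:

```python
from dataclasses import dataclass, field

@dataclass
class AssayResult:
    """One measurement plus the metadata that gives it context."""
    value: float
    units: str
    metadata: dict = field(default_factory=dict)  # controlled-vocabulary terms

# Hypothetical results from two labs, annotated with the same shared vocabulary
# (in practice these would typically be ontology accessions, not plain strings)
results = [
    AssayResult(12.4, "nM", {"assay": "enzyme_inhibition",
                             "target": "EGFR", "site": "Lab A"}),
    AssayResult(9.8, "nM", {"assay": "enzyme_inhibition",
                            "target": "EGFR", "site": "Lab B"}),
]

def query(records, **terms):
    """Return every result whose metadata matches all of the given terms."""
    return [r for r in records
            if all(r.metadata.get(k) == v for k, v in terms.items())]

# Because both labs used the same terms, one query spans both data sources
for r in query(results, assay="enzyme_inhibition", target="EGFR"):
    print(r.metadata["site"], r.value, r.units)
```

The point of the sketch is the shared vocabulary: once results from different sources carry the same terms, aggregation and smarter questions become a matter of querying, not of manual reconciliation.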
Maintaining intelligence
A key challenge for scientists, Pedersen suggested, is to maintain intelligence without dilution from internal and external data sources. So the age-old issue of toolkit integration may not necessarily be the biggest problem, he believes.
The Certara D360 scientific informatics platform has been developed to allow organisations to do just that: collate and analyse complex chemical, biological, logistical and computational data from a wide variety of sources more cohesively, so that the maximum amount of intelligence can be derived from that data, Pedersen explained. ‘Importantly, D360 is an application that sits atop all these assay, analytical and experimental systems that generate the data, so that scientists across an organisation can view, understand and leverage that data to drive the right outcomes. So, even for a large biopharmaceutical company that may have labs carrying out analytical or research-based workflows at very different levels, including in vitro and in vivo work, D360 ensures that the data from all these sources is accessible, and can be viewed and analysed in context.’
Data complexity and volume
Data complexity and volume are having a huge impact on the complexity of lab integration, suggested Robert D Brown, vice president product marketing at Dotmatics. ‘Think about how science is racing ahead. Ultimately it falls to the Research IT organisation to devise, develop, or acquire the software that can handle that breadth, detail and volume of data, so the scientists can make sense of it.’
From a vendor’s point of view, this complexity means that a typical lab will now have multiple software and hardware platforms for sample and inventory management and registration, along with potentially multiple ELNs for different parts of the overall lab function and infrastructure, Brown noted.
‘The end users are, effectively, creating the integration problem, by pushing the boundaries of science. Research IT departments, in combination with vendors such as ourselves, are working to solve that problem. In the real world these IT groups give us a tremendous amount of input. This helps us to develop the optimum solution, initially to meet the needs of perhaps individual clients, but which we can then take out more broadly to industry.’
Industry needs as a driving force
In fact, most vendors would likely say that their software is driven by the needs of the industry, he continued. ‘But at Dotmatics this is truly the case: each innovation here could probably be traced back to the needs of one or two customers, who then helped to define the end product. The ultimate aim is to develop a solution that solves the initial problem, but that can be easily configured to meet the needs of other users, without the need to create separate versions.’
ELN and, to a certain extent, LIMS systems have historically been bottlenecks to contextual data transfer, Brown noted. ‘A lot of these platforms were effectively places where data went to die. They were fantastic at getting data, collecting it and storing it, but then scientists had problems getting it back out.’
And whereas in the past the real value of the ELN was in recording the experiment, an ELN today is also a valuable resource that is used to inform on prior experiments and help to direct forward studies and decision making.
‘Data in the ELN will let scientists see which experiments have been carried out by someone else in the organisation, and what the results were, or to identify optimal conditions from similar experiments carried out historically, to help in the design of new experiments going forwards.’
However, as the complexity and throughput of data have exceeded what a scientist can handle manually, a pressing problem has emerged: how to get all the data and metadata into these platforms without loss of depth or context. To solve this issue, Dotmatics acquired Boston-based company BioBright last year.
‘We needed to help customers solve just that “data-in” problem, and through the BioBright acquisition we now have both the “in”, and “out”, pieces of the data automation cycle,’ Brown noted.
The BioBright platform has effectively given Dotmatics what it says is a unique combination of lab data capture, data processing, ELN and data analytics capabilities. Brown said: ‘Using the Dotmatics platform, all data coming from instruments, or from external partners and contract research organisations, is channelled into the centralised informatics platform. This means scientists can more easily effect end-to-end workflows and access data seamlessly.’
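As an illustration of that ‘data-in’ step (the endpoint, file layout and field names below are hypothetical, not Dotmatics’ or BioBright’s actual interfaces), the idea is to capture an instrument’s raw output together with its run metadata and push both to the central platform in a single step, so context is never lost in manual transcription:

```python
import csv
from datetime import datetime, timezone

import requests  # assumes the central platform exposes a simple REST endpoint

CENTRAL_API = "https://lims.example.org/api/results"  # hypothetical endpoint

def ingest_plate_read(csv_path: str, instrument_id: str, protocol: str) -> None:
    """Read a plate-reader export and upload readings with their metadata intact."""
    with open(csv_path, newline="") as fh:
        readings = [
            {"well": row["well"], "absorbance": float(row["absorbance"])}
            for row in csv.DictReader(fh)
        ]

    payload = {
        "instrument_id": instrument_id,        # which machine produced the data
        "protocol": protocol,                  # experimental context (metadata)
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "readings": readings,
    }
    # One call moves the data and its context together; no copy-paste step
    requests.post(CENTRAL_API, json=payload, timeout=30).raise_for_status()

# Example: ingest_plate_read("plate42.csv", "reader-07", "kinase_panel_v2")
```

In practice the upload would go through whatever API the informatics platform actually exposes; the design point is simply that readings and metadata travel together, automatically.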
A holistic approach
Reed Molbak, product manager at Benchling, said the ability to integrate and interface the lab holistically will be matched by the need to keep up with massively increased laboratory throughput related to ‘omics’ technologies.
‘The ability to maximise intelligence from this increased throughput in physical experimentation and data generation will hinge on the ability to channel complete, contextual data and metadata through machine learning and AI,’ Molbak noted. ‘Such algorithms are increasingly being developed to interrogate and analyse the results from genomic and proteomic analyses.’
Increased throughput has not only impacted the data coming out of assays but necessitated the development of a new generation of laboratory automation to enable those high-throughput assay and screening workflows, Molbak also noted. ‘Functions such as assay plate preparation, which would traditionally have been carried out by a scientist or technician, are now in the hands of liquid handling and other robotic systems that can keep up with this assay throughput.’
The modern lab environment
In today’s lab environment, ‘success’ may now depend on combining the ability to automate, and so increase physical throughput, with optimised data management, integrity and utilisation for intelligence. And so we come back round to AI and ML, Molbak noted. ‘Organisations may be developing their own algorithms and AI techniques, and so they look for a LIMS/ELN system that will work in that environment.’
The ability to connect hardware and software in the cloud is taking the pressure off companies having to install and integrate multiple pieces of software at the desktop and in-house server level, Molbak noted. ‘And for even relatively small companies, there will almost inevitably be multiple software platforms running, from different vendors, and for different instrumentation, that will need to interconnect.’
The Benchling R&D Cloud has been designed to facilitate laboratory interconnection and communication from discovery through to bioprocessing.
‘We have invested a lot of effort in enabling that imperative for instrument integration and interfacing, and also integrating the Benchling solution with other databases and data warehouses. Our goal is to enable our clients, who may have unique or proprietary techniques supporting their workflows, to be able to carry out specialised analyses, and support the management of data for ML and AI analysis and interrogation.’
Enabling laboratory unification
The differentiator for Benchling, Molbak believes, is enabling laboratory unification, and providing a holistic landscape for experimentation and data management.
‘In my experience those co-ordination problems are some of the hardest when you’re trying to knit together, say, an ELN and a LIMS system, and then possibly additional systems that interface with the lab instrumentation. Everything within the Benchling cloud environment is built on one database and uses one set of data models. For example, the plasmids that one team might design are held in the same records as the results, and all notebook data and all inventory data. It is all automatically centralised, no matter which part of our product you use, including our LIMS and ELN tools. There is never any need to try and interface different systems. What this means is that it then becomes much easier to run contextual data through ML, because nothing gets left behind.’
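As a rough sketch of why a single data model matters (the classes below are hypothetical and far simpler than Benchling’s actual schema), keeping plasmid designs, assay results and inventory as linked records in one store means a downstream analysis can join them directly, rather than reconciling exports from separate systems:

```python
from dataclasses import dataclass

@dataclass
class Plasmid:
    entity_id: str      # one registry ID shared across the whole platform
    name: str
    sequence: str

@dataclass
class AssayResult:
    entity_id: str      # refers to the same registry entry as the plasmid
    expression_level: float

@dataclass
class InventoryItem:
    entity_id: str
    freezer_location: str

# All three record types live in one store, keyed by the same entity ID
plasmids  = {"ENT-001": Plasmid("ENT-001", "pExample-GFP", "ATGGTG...")}
results   = {"ENT-001": AssayResult("ENT-001", expression_level=3.2)}
inventory = {"ENT-001": InventoryItem("ENT-001", freezer_location="F2-B4")}

# A "join" is just a lookup; no cross-system integration is needed, so design,
# result and storage context all reach a downstream ML step together
for eid, plasmid in plasmids.items():
    print(plasmid.name,
          results[eid].expression_level,
          inventory[eid].freezer_location)
```

When every record shares one registry and one schema, nothing has to be re-keyed or re-matched before it can be used for analysis, which is the ‘nothing gets left behind’ point above.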
Benchling is investing time and development effort in its lab automation and developer platform.
‘We can’t necessarily predict where the science is going to go, but we are definitely keeping an eye on it, to make sure the Benchling platform will continue to integrate physical hardware and software,’ Molbak said. ‘We are also focused on ensuring we can match the increasing scale of data generation. I don’t think anyone could have predicted, even two to three years ago, just how fast-paced the growth in experimental throughput would be.’