Drug development is a huge business, as successfully trialled drugs can generate billions of dollars in revenue.
But for every successful compound, thousands of potential drugs fail. Software has been used to complement and accelerate drug design for several years, but this still leaves chemists with complex and risky decisions to make when selecting potential compounds.
Some of this risk comes from knowing which compound or series of compounds to choose for a project but, as Optibrium’s CEO and company director Matthew Segall explains, uncertainty in the data – if not well managed – can lead to wasted resources.
‘There are a lot of different end-points measured or calculated and many different compounds or chemistries a project will explore – but a point we emphasise, that we believe is underused, is uncertainty in data – very significant uncertainty,’ said Segall.
This combination of complex parameters for a drug development project – the uncertainty in data, and the huge list of potential candidate drugs – was a primary factor in Optibrium’s decision to develop ‘decision analysis methods to help people navigate through a very complex landscape of data,’ said Segall. ‘The goal is to prioritise compounds and to understand the structure-activity relationships that are driving activity and other properties within the chemistry.
‘Everyone knows the value of downstream failure. If you pick the wrong chemistry and push it forward, you can end up with these incredibly costly late-stage failures.’
But Segall stresses there is also a hidden cost, ‘which is how many potential drugs have been missed, due particularly to the uncertainty in the data.’ This lack of understanding around uncertainty can lead scientists to make decisions that are not supported by the data that is available.
Multi-parameter optimisation
As drug development projects become increasingly complicated with multiple parameters that need to be optimised, this uncertainty can be an acute stumbling block, or, as Segall explains, it can be used to a chemist’s advantage.
‘What is really unique about the approach that we use is that we explicitly propagate the effect of that uncertainty through to the decision that is being made. We have published numerous papers on this. One of the things that we observed in a paper published in Drug Discovery Today around 2012 was the cognitive biases involved in decision making.
‘This is something that has been well explored by experimental psychologists: everyone, including scientists, finds it very difficult to make decisions on complex data when there is a lot of risk and uncertainty involved. Helping people recognise and use uncertainty appropriately is very important and unique to what we do,’ said Segall.
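To make the idea of propagating uncertainty through to a decision concrete, the following is a minimal sketch and not Optibrium’s published method. It assumes each measured or predicted value carries a Gaussian error with a known standard deviation, that errors on different properties are independent, and that a project has simple threshold criteria. The property names, thresholds and compound data are hypothetical.

```python
from math import erf, sqrt


def prob_above(value, sd, threshold):
    """P(true value > threshold), assuming Gaussian error with standard deviation sd."""
    return 0.5 * (1.0 + erf((value - threshold) / (sd * sqrt(2.0))))


def prob_below(value, sd, threshold):
    """P(true value < threshold) under the same Gaussian assumption."""
    return 1.0 - prob_above(value, sd, threshold)


def success_probability(compound, criteria):
    """Probability of meeting every criterion, assuming independent errors."""
    p = 1.0
    for name, (direction, threshold) in criteria.items():
        value, sd = compound[name]
        if direction == ">":
            p *= prob_above(value, sd, threshold)
        else:
            p *= prob_below(value, sd, threshold)
    return p


# Hypothetical project criteria: potency above 7.0, lipophilicity below 3.5.
criteria = {"pIC50": (">", 7.0), "logP": ("<", 3.5)}

# Hypothetical compounds: each property is (value, standard deviation).
compounds = {
    "A": {"pIC50": (7.1, 0.5), "logP": (3.0, 0.4)},
    "B": {"pIC50": (6.9, 0.5), "logP": (2.0, 0.4)},
}

scores = {name: success_probability(c, criteria) for name, c in compounds.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Under these assumptions the two compounds score roughly 0.52 and 0.42. A hard filter on the potency threshold would discard compound B outright, yet once the uncertainty is taken into account its chance of meeting the criteria is not far behind that of compound A.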
Complementary software
Drug development relies on a scientist’s ability to manage hugely complex streams of data on any number of compounds of interest to a particular project. Keeping up with all of this data and making effective decisions requires sophisticated software that can relieve some of the pressure on drug development projects.
However, some companies recognise that their expertise lies in a particular area, and so work to ensure that complementary software packages can work together, giving users the freedom to choose the software that is right for them.
‘We develop a lot of technology in-house, but as a company we recognise that no one entity can develop all that is cutting edge in every area of computational chemistry and cheminformatics,’ stated Segall. ‘We actively seek partners that are leaders in their space, to bring the technology into their software environment and make the interaction as seamless as possible for the end user.’
It is this acceptance of, and willingness to use, the best tools that creates an environment where several highly specialised software packages can be used together to create an effective platform for drug development.
‘We have partnerships with Collaborative Drug Discovery (CDD), The Edge and Certara. Their platforms are being used for ELN or storage of databases to gather, reduce and store data for drug discovery projects. We work with them to ensure that our software works seamlessly with theirs,’ said Segall.
To this end, Optibrium aims to integrate its software with that of its partner organisations, removing the need to correct data formats or to export and reformat data. ‘That is a big part of our philosophy, as well as being very agnostic to where people will get their data – we want to make that process as easy as possible,’ said Segall.
Visualisation is not enough
Managing all this data requires sophisticated visualisation tools that can display, more intuitively, the complicated data produced or collated throughout drug development projects. While many companies have their own way of visualising data, Optibrium has decided to employ a card system that allows users to quickly group compounds with similar properties. However, Segall stressed that it is not just the visualisation tools themselves, but the combination of visualisation with support for decision making and data analysis, that creates the most benefit for users.
‘If you think about five parameters you might be interested in, you could have a three-dimensional scatter plot: X, Y, Z. You end up with these incredibly complex 3D plots that look really great, but frankly, when you do this with real data it is very hard to make a decision – even before you take the uncertainty into account,’ said Segall.
This is further complicated by the level of expertise of the user, as increasingly these projects include non-computational experts who may have little experience with this kind of data analysis. ‘Often it is a medicinal chemist or biologist making decisions about this data and using the tools,’ said Segall. ‘Having some very complex software windows and buttons, or even asking scientists to work from the command line, is just not good enough these days.’
This reality requires that software developers streamline software for non-domain experts who want to access the data but do not necessarily have the programming skills or expert chemical knowledge that a computational chemist would possess.
‘Very often you may run a complex algorithm that clusters compounds together or analyses “matched pairs” or “activity cliffs” to understand the structure-activity relationships in a chemical series. Most of these methods produce a great big table of numbers. Poring over that table and trying to understand what they are telling you about the structural modifications, or impact on properties and so on, is very challenging.’
‘We pioneered a visualisation tool we call “Card View” – and, as the name suggests, it represents each compound on a card that is arranged on an infinite desktop that provides a very nice way to show the relationships between compounds,’ said Segall.
He explained that, in many cases, a chemist is interested not in a single compound, but a series that they can take forward. ‘This algorithm tries to group compounds that represent a series, and the card view represents each of these series or clusters of compounds as a stack of cards.
‘The output of these complex algorithms can be represented in this environment more visually, so key patterns just jump out at you. This could be a small change in structure that drives a big change in activity – clearly, this is something scientists need to understand in order to be able to take the next step in designing new compounds,’ said Segall.
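As an illustration of the kind of pattern described here, a small change in structure that drives a big change in activity, the sketch below flags “activity cliffs” among a handful of compounds. It is a simplified example, not the algorithm used in any particular product: structures are stood in for by hypothetical sets of fingerprint bits, similarity is the Tanimoto coefficient on those sets, and the similarity and activity thresholds are arbitrary assumptions.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def activity_cliffs(compounds, min_similarity=0.7, min_activity_gap=1.0):
    """Return pairs that are structurally similar but differ sharply in activity.

    `compounds` maps a name to (fingerprint bit set, activity value).
    Both thresholds are illustrative assumptions, not recommended settings.
    """
    cliffs = []
    for (n1, (fp1, act1)), (n2, (fp2, act2)) in combinations(compounds.items(), 2):
        similarity = tanimoto(fp1, fp2)
        gap = abs(act1 - act2)
        if similarity >= min_similarity and gap >= min_activity_gap:
            cliffs.append((n1, n2, similarity, gap))
    # Largest activity difference first, so the sharpest cliffs head the list.
    return sorted(cliffs, key=lambda c: c[3], reverse=True)


# Hypothetical data: fingerprint bits and pIC50-style activities.
compounds = {
    "cpd1": ({1, 2, 3, 4, 5, 6, 7, 8}, 6.0),
    "cpd2": ({1, 2, 3, 4, 5, 6, 7, 9}, 8.1),  # one bit differs from cpd1
    "cpd3": ({20, 21, 22, 23}, 7.9),          # unrelated scaffold
}

cliffs = activity_cliffs(compounds)
for n1, n2, similarity, gap in cliffs:
    print(f"{n1} / {n2}: similarity {similarity:.2f}, activity gap {gap:.1f}")
```

With this toy data only the cpd1 and cpd2 pair is reported. On a real project the output would be exactly the “great big table of numbers” Segall describes, which is why a visual presentation of the pairs matters.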
Optibrium has designed its software to be as easy to use as possible, so that data can be interpreted intuitively. ‘This allows you to apply these methods, and it allows users to understand what those methods are telling you about your data very quickly. This is absolutely key to the effective use of these technologies,’ said Segall.
However, it is not yet time to step back and let the computer take over. ‘The problem with these algorithms is they never completely agree with a chemist’s view of what a chemical series might be,’ Segall added.
A crucial point is that software must work with the expert. It may be beneficial to use a clustering algorithm to help define a particular series, but it is important that an expert can still use their experience to fine-tune software predictions and refine the overall results.
‘There are always artefacts of the algorithm that do not agree with a chemist’s eye,’ stated Segall. ‘In our system you can see the output, but if you then see two clusters that are very similar, in the chemist’s opinion part of the same series, you can simply pick one up and drop it on top of the other and the software will show you the updated analysis.’
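The expert-in-the-loop step described above can be sketched in a few lines. This is a hypothetical illustration rather than a description of how Card View works internally: it assumes a clustering algorithm has already assigned compounds to series, lets the user merge one cluster into another, and then recomputes a simple per-series summary. The cluster names, compound names and activity values are invented for the example.

```python
from statistics import mean


def summarise(clusters, activity):
    """Return the size and mean activity of each cluster."""
    return {
        cluster_id: {
            "size": len(members),
            "mean_activity": mean(activity[m] for m in members),
        }
        for cluster_id, members in clusters.items()
    }


def merge_clusters(clusters, source, target):
    """Return a new clustering with `source` dropped onto `target`.

    The input is left unchanged so the algorithm's original output is preserved.
    """
    if source == target:
        return {cluster_id: set(members) for cluster_id, members in clusters.items()}
    merged = {
        cluster_id: set(members)
        for cluster_id, members in clusters.items()
        if cluster_id != source
    }
    merged[target] |= clusters[source]
    return merged


# Hypothetical algorithm output and activity data.
activity = {"a": 6.0, "b": 6.4, "c": 7.0, "d": 7.2, "e": 5.0}
clusters = {"series_1": {"a", "b"}, "series_2": {"c", "d"}, "series_3": {"e"}}

# The chemist judges series_2 to be part of series_1 and merges them.
updated = merge_clusters(clusters, source="series_2", target="series_1")
summary = summarise(updated, activity)
for cluster_id, stats in sorted(summary.items()):
    print(cluster_id, stats["size"], round(stats["mean_activity"], 2))
```

Keeping the merge as a function that returns a new clustering, rather than editing the original in place, means the expert’s adjustment and the algorithm’s raw output can both be inspected and compared.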
In an increasingly competitive and complex industry, it is this synergy between expert and machine that will be crucial to future drug discovery projects.