Speed, integration and interactivity

Pharmaceuticals play a crucial role in managing health problems of all types and there is a huge impetus to find and improve drugs. But pharmaceutical development is neither cheap nor simple. As Lea Thøgersen, senior bioinformatics scientist at CLC bio in Denmark, observes: ‘The cost of bringing a new drug to market is huge. This is, in part, due to money and time spent taking a potential drug to the clinical trial phase, only to see it fail at this late stage. The challenge for the pharmaceutical industry is to speed up the process of selecting a good candidate drug, and to secure a higher success rate when a drug reaches the clinical trials.’

The traditional image of drug discovery involves chemists, biologists, pharmacologists and other ‘wet’ scientists experimenting in labs. However, there is another increasingly crucial piece to the puzzle: the role that computational scientists play throughout the process in helping to address this challenge.

‘Modelling today is an integrated part of the drug discovery process. It is used for virtual screenings of compound libraries, but also for hypothesis testing and gaining an understanding of why one compound binds strongly to a target and another does not. Docking simulations are a cheap way to explore chemical space without doing wet-lab experiments, and help decide which compounds seem worthwhile to synthesise,’ continues Thøgersen.

Of course, this is not a simple task, she says: ‘The challenges faced by researchers using small-molecule docking as a tool for drug design have been the same for many years now; namely accommodating the flexibility of the protein target in the model and acquiring a binding affinity prediction that allows not just a proper ranking of small-molecule binders, but also a reasonable estimate of the binding strength.’

CLC bio (including the Molegro business, which was bought in September 2012) offers Molegro Virtual Docker, promising high-speed and high-accuracy docking for virtual screening of small-molecule libraries. It also has a graphical user interface where the interactions between the target protein and the potential drug candidates can be explored. To describe target protein flexibility, induced-fit docking is offered, allowing for rearrangements of side-chains in the binding pocket to best fit the docked molecule. Visualisation tools for analysing protein-ligand interactions and identifying hot-spots for favourable types of interactions are also provided. Many challenges relating to integration with other tools and processes are being tackled by CLC bio as part of the integration of Molegro with the company’s other services and products.

Thøgersen says a common request from customers is support for an interactive workflow that runs from sketching small molecules, through docking them against a protein target, to manipulating the molecules inside the protein binding pocket. They also want a real-time response from the software showing how each change affects binding quality. She observes that customers are always looking for a better prediction of binding affinity too. ‘The dream would be to find the perfect drug from a virtual screening of all synthesisable molecules. However, there is a limitation as to the level of modelling precision, and a modelling approach such as docking will have its built-in limitations with respect to accuracy,’ she says.

An ongoing process

Shi-Yi Liu, SVP of marketing at Schrödinger, notes that refining and improving models and their application to drug discovery needs to be a continual process. ‘We are devoted to ongoing research and are constantly engaged in understanding it from a physics perspective, trying to improve the accuracy of the process,’ she explains. ‘It’s ongoing – it’s still theoretical and we are always making approximations.’

Such models and approximations can be assisted by experimental data. Liu says: ‘We use any and all data we can get our hands on, especially new data. We systematically improve our tools, using experimental data to feed in to models, and we have a large group of scientists working on methods development.’ Schrödinger is fortunate in this regard because, as well as developing software, the company provides services to pharmaceutical groups. ‘Not everyone is willing to share data, but the services side of our business enables us to work on proprietary data sets. These are useful for testing and refining the simulation tools,’ adds Liu. ‘If there are differences between the experimental results and the models, this is like an alert to re-evaluate the models. These datasets are also good training sets when we are refining models.’

Modelling and simulation tools are used in a number of ways depending on the stage of the process and the needs of the client. ‘It is most highly used in virtual screening, but lead optimisation is also of key importance,’ says Liu. ‘We’ve also heard that, once customers have found a candidate compound, they find it useful to use the software to find a back-up.’

Such a range of applications requires different ways of using the tools, particularly with regard to the trade-off between accuracy and speed of response. ‘As a company, we offer a range of tools across the speed-versus-accuracy spectrum. Customers need to exercise their judgement,’ says Liu. ‘At the start there are many millions of ideas. If you are constrained by the number of licences, or number of computer cores, then you will err on the side of speed.’

However, the issues are different as researchers progress a drug-discovery project. ‘As you go through the process, there are fewer compounds you are looking at, so you can throw more computing power at each,’ says Liu.
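The trade-off Liu describes can be pictured as a funnel: a cheap scorer ranks the whole library, and a slower, more accurate scorer is spent only on the survivors. The Python sketch below is purely illustrative. The function `screening_funnel`, the toy scorers and the five-entry library are hypothetical stand-ins, not part of any Schrödinger product, and the convention that a lower score means better predicted binding is an assumption.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Assumed convention: a lower score means stronger predicted binding.
Score = Callable[[str], float]


def screening_funnel(
    compounds: Iterable[str],
    fast_score: Score,
    slow_score: Score,
    keep: int,
) -> List[Tuple[float, str]]:
    """Rank everything with the cheap scorer, then rescore only the best `keep`."""
    # Stage 1: fast, approximate pass over the whole library.
    shortlist = heapq.nsmallest(keep, compounds, key=fast_score)
    # Stage 2: slow, more accurate pass over the shortlist only.
    return sorted((slow_score(c), c) for c in shortlist)


# Hypothetical toy scorers, used only to make the sketch runnable.
def toy_fast(compound: str) -> float:
    return float(len(compound))


def toy_slow(compound: str) -> float:
    return len(compound) + 0.1 * compound.count("N")


# Made-up library of SMILES-like strings for illustration.
library = ["CCO", "CCN", "CCCCCC", "C", "CCCN"]
top = screening_funnel(library, toy_fast, toy_slow, keep=3)
for score, compound in top:
    print(f"{compound}: {score:.1f}")
```

Raising `keep` spends more compute on the accurate stage, while lowering it errs on the side of speed, which mirrors the judgement call that Liu leaves to the customer.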

Working in the cloud is something Liu recommends: ‘We provide our licences to be run on the cloud, working with Cycle Computing, but the option only has a minority take-up at the moment, which is surprising.’

She attributes the lack of cloud take-up to security concerns, but notes that even banks use the cloud. ‘With pharmaceutical companies it is probably more of a cultural trend and we expect that to change. My personal view is that the cloud is just as secure as any cluster,’ she argues.

Aside from the speed-versus-accuracy trade-off, Liu says that the company hears more and more that researchers are becoming happier with the predictive power of modelling software. However, they are overwhelmed by the sheer volume of data held in different silos.

‘The problem is the data avalanche – not just from computational models, but also from assays, crystallographic data and other sources. How do you make informed decisions, rather than just looking at the latest bit of data to land on your desk?’ she says. ‘People making decisions need real-time access to all the data – and that’s something that, as a field, we’ve not done well.’

One of the company’s goals now is to bring all this data together. ‘At Schrödinger we have been working on this over the past few years and are in the prototyping and designing phase. It is not solved yet, but we are hiring really bright people and in a year’s time we will probably have a very exciting update.’

Outside of major, established companies there is also plenty of innovative research that aims to help drug discovery in the future. Victor Guallar and colleagues at the Barcelona Supercomputing Center in Spain have developed a new technique for modelling protein dynamics and interactions with drugs. Their approach is based on Monte Carlo techniques rather than traditional Newtonian methods. This, says Guallar, makes the approach significantly faster – an attribute that is a key part of commercial plans that the group has for its tool: ‘We envision that, in the future, we will combine it with multimedia functionality and enable on-the-fly interacting with results – on a device like an iPad, perhaps. Seeing real-time changes and mutations could be really powerful.’
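To give a feel for the difference, a Newtonian simulation integrates equations of motion in many small time steps, whereas a Monte Carlo simulation proposes a random move and accepts or rejects it according to the change in energy. The Python sketch below shows the generic Metropolis acceptance rule on a one-dimensional toy landscape. It is not the group's algorithm: the article gives no details beyond ‘Monte Carlo’, and the double-well `energy` function, the step size and the temperature `kT` are all assumptions chosen for illustration.

```python
import math
import random
from typing import List


def energy(x: float) -> float:
    """Toy double-well landscape with minima at x = -1 and x = +1 (an assumption)."""
    return (x * x - 1.0) ** 2


def metropolis(x0: float, steps: int, step_size: float, kT: float, seed: int = 0) -> List[float]:
    """Generic Metropolis Monte Carlo walk; returns the visited positions."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    trajectory = [x]
    for _ in range(steps):
        # Propose a random move instead of integrating forces over time.
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-dE / kT).
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
        trajectory.append(x)
    return trajectory


traj = metropolis(x0=-1.0, steps=5000, step_size=0.5, kT=0.3)
fraction_right = sum(1 for x in traj if x > 0) / len(traj)
print(f"fraction of samples in the right-hand well: {fraction_right:.2f}")
```

Because each move is a jump rather than a tiny time increment, a single Monte Carlo step can cover more of the landscape than a single integration step, at the cost of losing the physical time axis.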

Interactive simulations

Being able to watch simulations and interact with them would mean that researchers could adapt conditions as a simulation progresses rather than needing to wait for the output at the end. ‘I’d like to make it really interactive,’ Guallar says. ‘You could start a simulation and see the movement of the ligand and protein in real time and then try things like moving ligands or changing the temperature.’

Today, he says, people send requests to the server and leave it for a few hours or days. ‘I’d say about 50 per cent of simulations are wrong and need to be tweaked and done again.’

The tool that the group is working on currently takes around two to three minutes for every step, although these steps are bigger than those with Newtonian approaches, he adds. He predicts that a new CPU would reduce this to perhaps less than a minute but says that the group’s goal is to enable, maybe, 2,000 steps per minute and make it really interactive.
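The size of that gap is easy to underestimate. The short Python calculation below uses only the figures quoted above (two to three minutes per step today, perhaps one minute on a new CPU, and a goal of 2,000 steps per minute); the helper name `required_speedup` is hypothetical.

```python
def required_speedup(minutes_per_step: float, target_steps_per_minute: float) -> float:
    """Factor by which throughput must grow to reach the target rate."""
    # Current rate is 1 / minutes_per_step steps per minute, so the ratio
    # target / current simplifies to target * minutes_per_step.
    return target_steps_per_minute * minutes_per_step


TARGET = 2000.0  # steps per minute, the goal quoted in the article

for minutes in (3.0, 2.0, 1.0):
    factor = required_speedup(minutes, TARGET)
    print(f"{minutes:.0f} min/step -> needs about {factor:,.0f}x more throughput")
```

On these figures the goal implies a speed-up of roughly 4,000 to 6,000 times over the current tool, and still around 2,000 times even at one minute per step.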

‘A simulation can be done in about three hours, but the next generation will do it in minutes, so you don’t have to leave it and come back,’ he says, adding: ‘I think this will make a big difference but you need parallel computing and fast enough software.’

The tool is named PELE, which stands for Protein Energy Landscape Exploration; Guallar notes that any similarity with a famous footballer is a coincidence and that this won’t be a criterion for naming future products. The goal is to have a version of PELE that is commercially available to the pharmaceutical industry, although Guallar adds that it will be distributed free of charge to universities and not-for-profit organisations.
