Babylon AI achieves equivalent accuracy with human doctors

Babylon Health has announced a world first during a presentation streamed live from London’s Royal College of Physicians: the company’s AI, in a series of robust tests (including the relevant sections of the MRCGP exam), has demonstrated its ability to provide health advice that is on par with practising clinicians^[1].

The MRCGP exam is the final test for trainee General Practitioners (GPs), set by the Royal College of General Practitioners (RCGP). Trainee GPs who pass this assessment have demonstrated competence and clinical skills at a level high enough for them to undertake independent practice.

A key part of this exam tests a doctor’s ability to diagnose.

Babylon took a representative sample-set of questions testing diagnostic skills from publicly available RCGP sources^[2], as well as independently published examination preparation materials, and mapped these to the current RCGP curriculum in order to ensure the questions resembled actual MRCGP questions as closely as possible.

The average pass mark over the past five years for real-life doctors was 72 per cent^[3]. Sitting the exam for the first time, Babylon’s AI scored 81 per cent. As the AI continues to learn and accumulate knowledge, Babylon expects subsequent testing to show significantly better results.

Important though exams are, doctors are presented with a much wider range of illnesses and conditions in their daily practice. Therefore, to further test the AI’s capabilities, Babylon’s team of scientists, clinicians and engineers next collaborated with the Royal College of Physicians, Dr Megan Mahoney (Chief of General Primary Care, Division of Primary Care and Population Health, Stanford University), and Dr Arnold DoRosario (Chief Population Health Officer, Yale New Haven Health) to test Babylon’s AI alongside seven highly experienced primary care doctors using 100 independently devised symptom sets (or ‘vignettes’).

Babylon’s AI scored 80 per cent for accuracy, while the seven doctors achieved an accuracy range of 64-94 per cent.

The accuracy of the AI was 98 per cent when assessed against conditions seen most frequently in primary care medicine. In comparison, when Babylon’s research team assessed experienced clinicians using the same measure, their accuracy ranged from 52-99 per cent.

Crucially, the safety of the AI was 97 per cent. This compares favourably to the doctors, whose average was 93.1 per cent.

Dr Ali Parsa, Babylon’s Founder and CEO, said of tonight’s news: 'The World Health Organisation estimates that there is a shortage of over 5 million doctors globally, leaving more than half the world’s population without access to even the most basic healthcare services. Even in the richest nations, primary care is becoming increasingly unaffordable and inconvenient, often with waiting times that make it not readily accessible. Babylon’s latest artificial intelligence capabilities show that it is possible for anyone, irrespective of their geography, wealth or circumstances, to have free access to health advice that is on par with top-rated practising clinicians.'

'Tonight’s results clearly illustrate how AI-augmented health services can reduce the burden on healthcare systems around the world. Our mission is to put accessible and affordable health services into the hands of every person on Earth. These landmark results take humanity a significant step closer to achieving a world where no one is denied safe and accurate health advice,' added Parsa.

Babylon’s research paper, entitled 'A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis', can be downloaded from the company’s website and will be available over the coming days via arXiv.org.
