Can AI review the scientific literature — and figure out what it all means?

When Sam Rodriques was a neurobiology graduate student, he was struck by a fundamental limitation of science. Even if researchers had already produced all the information needed to understand a human cell or a brain, “I’m not sure we would know it”, he says, “because no human has the ability to understand or read all the literature and get a comprehensive view.”

Five years later, Rodriques says he is closer to solving that problem using artificial intelligence (AI). In September, he and his team at the US start-up FutureHouse announced that an AI-based system they had built could, within minutes, produce syntheses of scientific knowledge that were more accurate than Wikipedia pages¹. The team promptly generated Wikipedia-style entries on around 17,000 human genes, most of which previously lacked a detailed page.

Rodriques is not the only one turning to AI to help synthesize science. For decades, scholars have been trying to accelerate the onerous task of compiling bodies of research into reviews. “They’re too long, they’re incredibly intensive and they’re often out of date by the time they’re written,” says Iain Marshall, who studies research synthesis at King’s College London. The explosion of interest in large language models (LLMs), the generative-AI programs that underlie tools such as ChatGPT, is prompting fresh excitement about automating the task.

Some of the newer AI-powered science search engines can already help people to produce narrative literature reviews — a written tour of studies — by finding, sorting and summarizing publications. But they can’t yet produce a high-quality review by themselves. The toughest challenge of all is the ‘gold-standard’ systematic review, which involves stringent procedures to search and assess papers, and often a meta-analysis to synthesize the results. Most researchers agree that these are a long way from being fully automated. “I’m sure we’ll eventually get there,” says Paul Glasziou, a specialist in evidence and systematic reviews at Bond University in Gold Coast, Australia. “I just can’t tell you whether that’s 10 years away or 100 years away.”

At the same time, however, researchers fear that AI tools could lead to more sloppy, inaccurate or misleading reviews polluting the literature. “The worry is that all the decades of research on how to do good evidence synthesis starts to be undermined,” says James Thomas, who studies evidence synthesis at University College London.
