Machine Tops Humans in Fibrotic Lung Disease Classification

— Nearly instantaneous results also matched prognostic abilities

by Ian Ingram, Deputy Managing Editor, September 18, 2018

PARIS -- Robots scored another win against humans, this time with an artificial intelligence (AI) program outperforming thoracic radiologists for classifying fibrotic lung diseases in a new study reported here.

Examining 150 high-resolution CT images of fibrotic lung disease, a deep learning algorithm was more accurate than 91 self-identified "expert" radiologists (median 73.3% versus 70.7%, respectively), with the AI system outperforming two-thirds of them, reported Simon L.F. Walsh, MD, of King's College Hospital Foundation Trust in London.

The results were presented at the European Respiratory Society (ERS) meeting and published simultaneously in The Lancet Respiratory Medicine.

"Just before someone asks, it's not an attempt to replace radiologists," Walsh said toward the end of his presentation here. "That's always the first question."

"This is a diagnostic tool," he explained, pointing to the fact that while expert centers have top radiologists, idiopathic pulmonary fibrosis (IPF) patients are also seen in rural settings and in the community, and have to travel to referral centers for access to imaging expertise.

"Anything that improves or speeds up diagnostic accuracy in IPF essentially means less biopsies and that appears to me, at least in my discussion with patients, that that is one of the major concerns," said Walsh.

As fibrotic lung disease patients who have usual interstitial pneumonia (UIP) generally have worse outcomes, Walsh's group also tested prognostic performance, and found that the algorithm's separation of UIP cases from non-UIP cases (HR 2.88, 95% CI 1.79-4.61, P<0.0001) was equivalent to that of the expert radiologists' majority opinion (HR 2.74, 95% CI 1.67-4.48, P<0.0001).

On Fleischner Society high-resolution CT criteria for UIP, median interobserver agreement was moderate between radiologists (κw=0.56) but good between the algorithm and radiologists (κw=0.64).
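For readers unfamiliar with the κw statistic quoted above, the following is a minimal sketch of Cohen's weighted kappa for two readers. It is not the study's code. The article does not say which weighting scheme was used, so the linear weights here are an assumption, as are the function name, the four ordinal categories, and the example ratings, which are invented purely for illustration.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories, power=1):
    """Cohen's weighted kappa for two raters.

    Ratings are integers 0..n_categories-1 on an ordinal scale.
    power=1 gives linear weights, power=2 quadratic weights
    (the article does not state which one the study used).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    max_dist = (n_categories - 1) ** power
    observed = Counter(zip(rater_a, rater_b))  # joint rating counts
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with distance between categories.
            w = abs(i - j) ** power / max_dist
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * (marg_a[i] / n) * (marg_b[j] / n)

    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs_disagreement / exp_disagreement


if __name__ == "__main__":
    # Hypothetical ratings on an assumed four-category ordinal scale.
    algorithm = [0, 0, 1, 2, 3, 3, 1, 0]
    radiologist = [0, 1, 1, 2, 3, 2, 1, 0]
    print(round(weighted_kappa(algorithm, radiologist, n_categories=4), 2))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and near misses between adjacent categories are penalized less than distant ones.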

In a commentary that accompanied the study, David Levin, MD, of the Mayo Clinic in Rochester, Minnesota, highlighted that because of its rare presentation and varied forms, radiologic diagnosis of interstitial lung disease can be a challenge, as can the further diagnosis of pulmonary fibrosis, even for subspecialist radiologists.

"Although the results show that deep learning methods can classify fibrotic lung disease with essentially equivalent performance to subspecialist radiologists, there are several limitations," Levin said, noting in part that although deep learning algorithms improve with ever-increasing amounts of data, only 929 scans made up the training set.

He also pointed out that a gold standard for UIP diagnosis does not currently exist, and that the labeling of the training CT set could have introduced bias, as it was performed by a single radiologist.

"Despite these limitations, the overall performance of the algorithm was remarkable," said Levin.

The study from Walsh's group used 1,157 images of diffuse fibrotic lung disease to train (929 scans), validate (89), and initially test (139) the algorithm. To increase the data input for the algorithm, each image in the training set was divided into as many as 500 four-image sets, each of which was analyzed separately.
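The arithmetic behind that data split, and the idea of expanding one scan into many four-image sets, can be sketched roughly as follows. This is an illustration only: the article does not describe how the four images were chosen, so the uniform random sampling of slice indices, the function name `four_image_sets`, and the slice count in the usage line are all assumptions rather than the authors' method.

```python
import math
import random

# Split reported in the article: 929 + 89 + 139 = 1,157 scans.
SPLIT = {"train": 929, "validation": 89, "test": 139}
TOTAL_SCANS = 1157
MAX_SETS_PER_SCAN = 500


def four_image_sets(n_slices, max_sets=MAX_SETS_PER_SCAN, seed=0):
    """Draw up to max_sets distinct four-slice index combinations from one scan.

    Uniform random sampling is an assumption made for illustration;
    the article does not say how the sets were built.
    """
    if n_slices < 4:
        return []
    rng = random.Random(seed)
    # A short scan may not have max_sets distinct combinations.
    target = min(max_sets, math.comb(n_slices, 4))
    seen = set()
    while len(seen) < target:
        combo = tuple(sorted(rng.sample(range(n_slices), 4)))
        seen.add(combo)
    return sorted(seen)


if __name__ == "__main__":
    assert sum(SPLIT.values()) == TOTAL_SCANS
    sets = four_image_sets(n_slices=120)  # hypothetical slice count
    print(len(sets), "sets from one scan")
    print("upper bound on training sets:", SPLIT["train"] * MAX_SETS_PER_SCAN)
```

Under these assumptions, the 929 training scans could yield at most 929 × 500 = 464,500 four-image sets, which is how a modest number of scans becomes a much larger training input.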

The final test used 150 high-resolution CT images that had been identified by the expert thoracic radiologists in a previous study -- about one-third of the cases were of IPF.

"We demonstrated the generalizability of the algorithm by testing it on a data set which essentially had been labeled by a set of radiologists who were not part of the training process, which is important," said Walsh.

Interobserver agreement was similar between the radiologists' majority opinion and the algorithm (κw=0.69), and the majority opinion and each radiologist (κw=0.67), though the algorithm again outperformed over half (62%) of the radiologists.

The CT images were classified using 2011 international guidelines from the American Thoracic Society, ERS, and others for the diagnosis of IPF as well as the Fleischner Society diagnostic criteria for IPF.

Disclosures

Walsh is the founder of Thoracic.AI, a developer of machine learning applications for fibrotic lung disease. He also disclosed relationships with Boehringer Ingelheim, InterMune, Roche, Sanofi-Genzyme, and Bracco.

Co-authors reported relationships with Roche and Boehringer Ingelheim.

Levin reported no conflicts of interest.

Primary Source

The Lancet Respiratory Medicine

Walsh SLF, et al "Deep learning for classifying fibrotic lung disease on high-resolution computed tomography: A case-cohort study" Lancet Respir Med 2018; DOI: 10.1016/S2213-2600(18)30286-8.

Secondary Source

The Lancet Respiratory Medicine

Levin DL "Deep learning and the evaluation of pulmonary fibrosis" Lancet Respir Med 2018; DOI: 10.1016/S2213-2600(18)30371-0.