A deep learning algorithm is comparable or even superior to common clinical measures for predicting infarct size and location up to a week following acute ischemic stroke, new research suggests.
Interestingly, the new approach corresponded well with MRI findings at 3 to 7 days without accounting for subsequent reperfusion, investigators note.
“We are getting better and better at predicting future tissue damage from baseline MRI imaging in acute stroke. Deep learning is better able to estimate the ultimate tissue damage in these settings,” senior investigator Greg Zaharchuk, MD, PhD, professor of radiology at Stanford University, California, told Medscape Medical News.
Zaharchuk added that this technology could guide prognosis across a wider range of patients.
“These deep learning models surpass the current clinical methods of the diffusion-perfusion mismatch, mostly because most patients have partial reperfusion rather than complete or no reperfusion,” he said.
The findings were published online March 12 in JAMA Network Open.
Predicting the likely infarct size at 3 to 7 days “may help clinicians prepare for decompression treatment and aid in patient selection for neuroprotective clinical trials,” the researchers write.
The machine learning algorithm was initially “self-trained,” using datasets, to determine the most relevant neuroimaging features without human input.
The investigators used a specific subtype of machine learning called “convolutional neural networks.” They chose the U-Net algorithm developed at the University of Freiburg in Germany, “owing to its high computational efficiency, sensitivity, and accuracy for image segmentation tasks.”
The study included 182 patients (53% women; mean age, 65 years) who were drawn from two previous trials, the Imaging Collaterals in Acute Stroke (iCAS) study and the Diffusion and Perfusion Imaging Evaluation for Understanding Stroke Evolution Study–2 (DEFUSE-2).
At baseline and before reperfusion therapy, all participants underwent 1.5T or 3T MRI, including diffusion-weighted imaging and perfusion-weighted imaging.
Of all participants, 32 had minimal reperfusion, defined as 20% or less. Another 67 participants had major reperfusion, defined as 80% or more. A further 41 patients had partial reperfusion, with rates between these values, and reperfusion rates were unknown for the remaining 42 patients.
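The grouping described above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs, not the investigators' code; the function name `reperfusion_group` and the choice to express rates as fractions between 0 and 1 are assumptions made for the example.

```python
from typing import Optional


def reperfusion_group(rate: Optional[float]) -> str:
    """Map a reperfusion rate (fraction from 0 to 1) to the article's groups.

    Hypothetical helper: the cutoffs (20% or less, 80% or more) come from
    the article; the name and the fractional encoding are assumptions.
    """
    if rate is None:
        return "unknown"
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be a fraction between 0 and 1")
    if rate <= 0.20:
        return "minimal"
    if rate >= 0.80:
        return "major"
    return "partial"


# Group sizes reported in the article; they should add up to the cohort size.
reported_counts = {"minimal": 32, "major": 67, "partial": 41, "unknown": 42}
cohort_size = sum(reported_counts.values())
print(cohort_size)  # 182
```

The four reported group sizes sum to the 182 patients in the cohort.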
For the deep learning model, the median area under the curve was 0.92 (interquartile range [IQR], 0.87 – 0.96), indicating good correlation with traditional methods, such as time to maximum of the residue function (Tmax) and apparent diffusion coefficient (ADC) measures.
The researchers also gauged performance in comparison with traditional methods using the Dice Score Coefficient (DSC). The DSC ranges from 0 to 1, with greater values reflecting more overlap between algorithm predictions and MRI findings at 3 to 7 days.
An advantage of the DSC is that it incorporates both predicted lesion size and location, the researchers note.
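To make the overlap idea concrete, here is a minimal sketch of the standard Dice formula, twice the number of shared voxels divided by the total number of voxels in both lesions. It assumes each lesion is a flattened list of 0/1 voxel labels; the function name and the toy masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(predicted, actual):
    """Dice overlap of two equal-length binary masks (flattened voxel lists)."""
    if len(predicted) != len(actual):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, a in zip(predicted, actual) if p and a)
    total = sum(1 for p in predicted if p) + sum(1 for a in actual if a)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


# Toy example: both lesions contain 3 voxels, but only 2 voxels coincide.
predicted = [1, 1, 1, 0, 0, 0]
actual = [0, 1, 1, 1, 0, 0]
partial_overlap = dice_coefficient(predicted, actual)  # 2*2 / (3+3)

# Same lesion size in a different location: no shared voxels at all.
shifted = [0, 0, 0, 1, 1, 1]
no_overlap = dice_coefficient([1, 1, 1, 0, 0, 0], shifted)  # 0.0
```

The second case shows why the score reflects location as well as size: a prediction of exactly the right volume in the wrong place still scores 0.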
Using a threshold score of 0.50 on the DSC, reflecting at least moderate correlation, the median DSC overlap of the model was 0.53 (IQR, 0.31 – 0.68), the sensitivity was 0.66 (IQR, 0.38 – 0.86), the specificity was 0.97 (IQR, 0.94 – 0.99), and the positive predictive value was 0.53 (IQR, 0.28 – 0.74).
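Sensitivity, specificity, and positive predictive value can all be computed from the same voxel-by-voxel comparison. The sketch below uses the standard definitions and assumes flattened 0/1 masks; the function name `voxel_metrics` and the toy data are hypothetical, and the study's actual evaluation pipeline may differ.

```python
def voxel_metrics(predicted, actual):
    """Voxel-wise sensitivity, specificity, and PPV for two binary masks."""
    if len(predicted) != len(actual):
        raise ValueError("masks must have the same number of voxels")
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1  # predicted infarct, truly infarct
        elif p and not a:
            fp += 1  # predicted infarct, truly healthy
        elif a:
            fn += 1  # missed infarct voxel
        else:
            tn += 1  # correctly labeled healthy

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy example: 10 voxels, of which 3 are truly infarcted.
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
actual = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
metrics = voxel_metrics(predicted, actual)
```

In this toy case sensitivity and positive predictive value are both 2/3, while specificity is 6/7. Because healthy voxels usually far outnumber infarcted ones, specificity tends to sit close to 1 even when overlap is only moderate.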
Volume predicted from the model demonstrated “excellent correlation with true lesion volume” (correlation coefficient, 0.74; 95% confidence interval [CI], 0.66 – 0.80), the investigators add.
The volume error of the algorithm was 9 mL (IQR, –14 to 29), and the absolute volume error was 24 mL (IQR, 11 – 50).
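The two error figures measure different things, which a small sketch may help illustrate. It assumes that volume error is predicted volume minus true volume (so positive values mean overestimation) and that the reported figures are medians across patients; the function names and the three example patients are hypothetical.

```python
from statistics import median


def lesion_volume_ml(mask, voxel_volume_mm3):
    """Volume of a binary lesion mask in mL (1 mL = 1000 cubic mm)."""
    return sum(1 for v in mask if v) * voxel_volume_mm3 / 1000.0


def volume_error_summary(pairs):
    """Median signed and median absolute error over (predicted, true) mL pairs.

    Assumed sign convention: predicted minus true.
    """
    signed = [predicted - true for predicted, true in pairs]
    absolute = [abs(error) for error in signed]
    return median(signed), median(absolute)


# Hypothetical patients: two overestimates and one underestimate.
pairs = [(40.0, 30.0), (10.0, 30.0), (55.0, 50.0)]
median_signed, median_absolute = volume_error_summary(pairs)
print(median_signed, median_absolute)  # 5.0 10.0
```

Overestimates and underestimates partly cancel in the signed error, so it can be much smaller than the absolute error, as with the 9 mL and 24 mL figures reported here.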
Parsing by Reperfusion Rate
In addition to overall comparisons, the investigators evaluated algorithm performance by levels of subsequent reperfusion.
For example, they found that among patients with minimal reperfusion, the proposed model outperformed traditional methods on positive predictive value and specificity while maintaining comparable DSC and sensitivity.
Similarly, for patients with major reperfusion, it outperformed traditional methods on DSC and sensitivity. In these patients, the model tended to overestimate the final infarct lesion, whereas the ADC segmentation tended to underestimate the lesion.
In patients with partial or unknown reperfusion status, the model demonstrated moderate to excellent agreement between predicted lesion volume (correlation coefficient, 0.69; 95% CI, 0.51 – 0.82) and true lesion volume (correlation coefficient, 0.75; 95% CI, 0.58 – 0.86).
However, in terms of lesion volume, the proposed model did not show a significant difference from the true lesion (volume error, 6 mL; IQR, −11 to 32 mL).
The model was trained without reperfusion information, “yet it had comparable performance in patients with and without major reperfusion compared with a common clinically used ADC and Tmax thresholding software package,” the researchers write.
The prediction time frame of 3 to 7 days is the time when acute vasogenic edema tends to create the largest lesions, so the algorithm “would be helpful to guide treatment decisions and coordinate clinical resources such as early preparation for decompression surgery and osmotherapy,” they add.
Despite its clinical promise, it could take some time before the U-Net algorithm sees widespread use.
“This software is not clinically available and has not undergone FDA review,” Zaharchuk said. “However, it is likely that algorithms similar to this should become available in the near future ― and will hopefully be incorporated into the scanner itself or [be available] from third-party vendors.”
Future studies in larger and more diverse populations are now needed to validate the results.
“Next, we would like to build a deep learning system to make ‘best case’ and ‘worst case’ predictions, much like current all-or-none diffusion-perfusion mismatch software does, but using a data-driven approach,” Zaharchuk said.
“Currently, we are focusing just on tissue damage, which is important but does not always correlate well with outcomes,” he added.
“Longer term, I would like to develop a tool that gives you information on stroke location but also can predict long-term functional outcome scores, since these better capture meaningful differences in stroke outcome.”
An AI Achievement
Commenting on the findings for Medscape Medical News, Cyrus A. Raji, MD, PhD, assistant professor of radiology and director of neuromagnetic resonance imaging at Washington University School of Medicine, St. Louis, Missouri, said the “key value added by brain imaging is in predicting outcomes” when it comes to stroke care.
“By using deep learning to predict the volume of an infarct 3 to 7 days later using only the first brain MRI scan a patient receives in the hospital, the Stanford group has demonstrated the ability of artificial intelligence to augment the actionable information obtained from neuroradiological imaging,” said Raji, who was not involved with the research.
Zaharchuk received grants from the National Institutes of Health during the conduct of the study and has received grants from GE Healthcare and Bayer AG. He has also received nonfinancial support from Nvidia Corporation and is a cofounder of Subtle Medical, Inc. Raji has reported no relevant financial relationships.
JAMA Netw Open. Published online March 12, 2020.