Using Artificial Intelligence For Improving Stroke
Therapeutic Advances in Neurological Disorders. DOI: 10.1177/1756286420938962
a practical framework
© The Author(s), 2020.
Article reuse guidelines: sagepub.com/journals-permissions
Vida Abedi, Ayesha Khan, Durgesh Chaudhary, Debdipto Misra, Venkatesh Avula,
Dhruv Mathrawala, Chadd Kraus, Kyle A. Marshall, Nayan Chaudhary, Xiao Li,
Clemens M. Schirmer, Fabien Scalzo, Jiang Li and Ramin Zand
Correspondence to:
Ramin Zand
Neuroscience Institute, Geisinger Health System, Stroke Program, Geisinger Northeast Region, GRA Stroke Task Force, American Heart Association, Department of Neurosciences, 100 N Academy Ave, Danville, PA 17822-2101, USA
rzand@geisinger.edu; ramin.zand@gmail.com

Vida Abedi
Department of Molecular and Functional Genomics, Geisinger Health System, Danville, PA, USA
Biocomplexity Institute, Virginia Tech, Blacksburg, VA, USA

Ayesha Khan
Durgesh Chaudhary
Clemens M. Schirmer
Neuroscience Institute, Geisinger Health System, Danville, PA, USA

Debdipto Misra
Dhruv Mathrawala
Division of Informatics, Geisinger Health System, Danville, PA, USA

Venkatesh Avula
Jiang Li
Department of Molecular and Functional Genomics, Geisinger Health System, Danville, PA, USA

Chadd Kraus
Kyle A. Marshall
Department of Emergency Medicine, Geisinger Health System, Danville, PA, USA

Nayan Chaudhary
Xiao Li
Genentech/Roche inc., South San Francisco, CA, USA

Fabien Scalzo
Department of Neurology, University of California, Los Angeles, CA, USA
Department of Computer Science, University of California, Los Angeles, CA, USA

Abstract: Stroke is the fifth leading cause of death in the United States and a major cause of severe disability worldwide. Yet, recognizing the signs of stroke in an acute setting is still challenging and leads to loss of opportunity to intervene, given the narrow therapeutic window. A decision support system using artificial intelligence (AI) and clinical data from electronic health records combined with patients’ presenting symptoms can be designed to support emergency department providers in stroke diagnosis and subsequently reduce the treatment delay. In this article, we present a practical framework to develop a decision support system using AI by reflecting on the various stages, which could eventually improve patient care and outcome. We also discuss the technical, operational, and ethical challenges of the process.

Keywords: acute stroke, artificial intelligence, cerebrovascular disease/stroke, computer aided diagnosis, ischemic stroke, machine learning, stroke diagnosis, stroke in emergency department

Received: 21 May 2020; revised manuscript accepted: 2 June 2020.

Introduction
Stroke is the fifth leading cause of death in the United States and a significant cause of severe disability in adults.1 Each year, around 800,000 Americans experience a new or recurrent stroke.2 Rapid diagnosis and treatment of stroke is crucial and leads to improved outcomes and prognosis among patients treated within the ‘Golden Hour’.3,4

However, strokes, especially posterior circulation strokes, are associated with significant (>10%) diagnostic error.5 The latter could be due to (1) some patients with acute stroke present with non-focal symptoms such as dizziness, diplopia, dysarthria, or ataxia,6 which may not trigger a neurology consult or a need for a more detailed neurological examination; (2) stroke is commonly misdiagnosed in younger patients7,8; and (3) the emergency department (ED) is a challenging environment for providers, especially with the multiplicity of care protocols, and the dynamic nature of patient care.8,9 Triage, consultations, admissions, discharge, and other steps in emergency care are time-sensitive, complex, and always changing to further improve efficacy and quality of care. Therefore, identifying potential stroke symptoms can be challenging,10–12 especially when the providers are in training.13,14 Besides, the risk of misdiagnosis can be higher among walk-in patients,15 when the providers do not receive a pre-arrival notification from emergency medical services,16 or when a neurologist is not readily available for an urgent consultation.17–19 Scoring systems for the diagnosis of stroke and recurrent stroke do not have a high sensitivity to diagnose the posterior circulation stroke.20,21 Furthermore, these tools are also not automatic, and require that the physicians suspect stroke as a differential diagnosis to apply the scoring system.
journals.sagepub.com/home/tan 1
Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License
(https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission
provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Therapeutic Advances in Neurological Disorders 13
Figure 1. Key steps for a stroke ML-enabled decision support system for EDs.
ED, emergency department; ML, machine learning.
dictionary,30 allowing for efficient translation and interoperability. The NLP pipeline will generate added insights, such as the polarity of the words and context.

Designing the ML-enabled diagnostic tool
The ML development process will consist of various iterative steps for training, testing, and predicting the probability of patients presenting to ED with stroke.

Exploratory data analysis will help modelers understand feature distribution, multicollinearity among features, missing data, and data quality. Features that are irrelevant or partially relevant (such as procedure codes that are no longer in active use), or highly sparse, can be removed or merged. Feature selection will help reduce overfitting and training time, and improve the model accuracy. Feature engineering, also an important step, can be used to construct robust high-level feature representations from complex and high-dimensional concepts (e.g. diagnoses, medications). During the model development, typically, an 80–20 split is performed to test the model performance; that is, 20% of the data is marked as unseen and is used for model testing, while 80% is used for model development. Furthermore, as the number of cases is likely to be significantly less than the number of controls, it is crucial to address the class imbalance. Standard techniques to address class imbalance include up-sampling the minority class (by use of the SMOTE algorithm, etc.) and/or down-sampling the majority class.31

Training predictive models is done by using the training dataset to identify the best performing candidate model. Models like logistic regression, decision trees, random forest, autoencoders, and neural networks can be used in the training process. Typically, an interpretable framework such as logistic regression is used for benchmarking. Finally, nested K-fold cross-validation applied to 80% of the data should be performed for hyperparameter tuning and making modeling framework choices, to avoid underfitting and overfitting. Various model metrics such as the
area under the receiver operating characteristics curve, sensitivity, specificity, F-score, positive predictive value (PPV), negative predictive value (NPV), and misclassification rates can be used to identify the final candidate model for implementation and prospective testing. These metrics should be used for model selection only after careful evaluation to better understand the clinical needs in actual care settings. In the case of stroke, the cost of misdiagnosis is asymmetric, meaning that underdiagnosis (labeling true stroke patients as non-stroke) might have a higher consequence than overdiagnosis. Therefore, the system should have a high sensitivity and NPV while keeping the specificity in a reasonable range. Finally, depending on the model used, knowing the most influential features driving the stroke prediction for each patient would be helpful and can provide insights into the prediction and ultimately assist the physician in decision making.

Workflow and system implementation
In a typical ED setting, a patient arrives at the hospital, either by ambulance or by a private vehicle. Although our proposed ML-enabled clinical decision support system is designed to work for all patients regardless of the arrival mode, we believe such a system will be more beneficial among patients self-presenting with milder and atypical symptoms. These patients meet a nurse at the point-of-care desk where they are asked preliminary questions while their vitals are recorded. The patients’ symptoms and any pertinent information are entered in the EHR and can be available to the NLP pipeline for modeling. The patient is then sent back to the waiting area. The ED providers rely on the clinical presentation and vitals to prioritize the patients and to perform the physical examination (which may not include neurological examination), order labs, and, in some cases, order imaging procedures.

cannot reveal a hyperacute stroke in the majority of cases, and it has reduced sensitivity for lacunar strokes.33 Despite the rapid increase in the use of advanced neuroimaging, it may be challenging to reduce misdiagnosis of stroke as the use of urgent MRI to diagnose stroke in ED is still limited.34 Nevertheless, in reality, a provider should consider the possible stroke diagnosis to order additional neuroimaging.

Patients’ notes at the point-of-care contain vital information that, if combined with the patients’ profile and medical history, can be ingested by an intelligent system to identify at-risk patients. Implementation of such a system must be seamless without affecting the ED workflow. If a patient has a relatively significant chance of stroke, a ‘stroke alert’ will be generated when the patient’s chart is viewed by the next ED provider. The provider will then have a chance to act upon the alert, and, if needed, call stroke-alert, request an urgent neurological consult, or order confirmatory imaging.

System adoption and evaluation

Promoting adoption
A methodical plan for promoting adoption should be part of a thoughtful implementation. In general, physicians have relatively positive attitudes toward the idea of decision support systems.35,36 However, many challenges including low specificity,37,38 work-flow interruption,39–41 computer literacy and confusing interface,42,43 low confidence in the evidence,44 awareness of the information,45 requirement for a lot of data,41,46 interference with physician autonomy,35,47 or lack of relevance,41 limit the effective use and adoption of such systems in many healthcare systems. “Alert fatigue” can be caused by a poorly designed and implemented clinical decision support system.35,48–50 Too many alerts will discourage adoption.
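
The trade-off described in the text, between a high sensitivity and NPV on one side and a tolerable number of alerts on the other, can be illustrated with a small sketch. It assumes a model that outputs a stroke probability per patient and a held-out set with known labels; the function names (`evaluate`, `pick_threshold`), the `min_sensitivity` parameter, and the toy data are hypothetical and are not taken from the article. The sketch picks the highest alert threshold that still reaches a target sensitivity, and reports the resulting specificity, PPV, NPV, and alert rate.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    threshold: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    alert_rate: float  # fraction of all patients who would trigger an alert


def _ratio(num: int, den: int) -> float:
    """Return num/den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(y_true, y_prob, threshold):
    """Confusion-matrix metrics when an alert fires for probability >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        alert = prob >= threshold
        if alert and label == 1:
            tp += 1
        elif alert:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return Metrics(
        threshold=threshold,
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        alert_rate=_ratio(tp + fp, len(y_true)),
    )


def pick_threshold(y_true, y_prob, min_sensitivity=0.95):
    """Return metrics for the highest threshold that meets the sensitivity target.

    Raising the threshold can only lower sensitivity, so the highest
    qualifying threshold yields the fewest alerts for that target.
    """
    for threshold in sorted(set(y_prob), reverse=True):
        metrics = evaluate(y_true, y_prob, threshold)
        if metrics.sensitivity >= min_sensitivity:
            return metrics
    return evaluate(y_true, y_prob, 0.0)  # fallback: alert on every patient


if __name__ == "__main__":
    # Toy held-out data, made up for illustration: 1 = stroke, 0 = non-stroke.
    y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
    y_prob = [0.90, 0.10, 0.35, 0.60, 0.05, 0.40, 0.20, 0.30, 0.15, 0.70]
    print(pick_threshold(y_true, y_prob, min_sensitivity=1.0))
```

In such a sketch the `alert_rate` field is what links model selection to adoption: a threshold that reaches the sensitivity target but alerts on most patients would likely contribute to alert fatigue, so the target itself is a clinical decision rather than a purely statistical one.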
and (4) next-providers (inpatient/outpatient neurologists/hospitalists). The discussion objectives should focus on understanding the current workflow; how the intervention(s) may impact the people and processes; what the care providers’ perceptions are; and how to achieve their buy-in while communicating the current gaps, clinical goals, and how the proposed system can help.

System evaluation
A mixture of targeted chart-review, systematic evaluation, and assessment of trends and rate of misidentification is important. The process should be designed to be agile and iterative. Prospective evaluation is critical and cannot be a one-time process. As long as a decision support system remains in operation, the model outcome should be re-assessed at regular intervals, and integration of variables with most and least promising relevance can be reassessed. Fine-tuning should be subtle and iterative to ensure smooth and clear improvements in both effectiveness and performance. Presenting the summary findings to the stakeholders to demonstrate transparency, gaps, areas of misclassifications, and the overall added value is important.

Challenges and opportunities

Technical challenges – tool or model-dependencies
Selection of the right mix of tools, techniques, and languages for productionalizing. Specific tools/languages (e.g., Spark) might be conducive to handling large data volumes; however, they might not have mature data science libraries such as the ones offered in Python to develop stable models. Improper selection of tools will cause downstream technical challenges, especially if the designed pipeline is not agnostic to the implementation language.

Model drifting. Models deteriorate in terms of their predictive power and clinical utility if not continuously adjusted. In healthcare, addressing model drift takes on a larger dimension since changing trends in population health manifest themselves as both concept drift and data drift. Continuous domain adaptation is an active field of research to address such challenges.52 Healthcare systems that are in stable regions with low drop-out rates (such as Geisinger Health System) are better equipped to incorporate continual learning in training the model for improved clinical utility.

Model generalizability. Developing generalizable models requires a comprehensive and multi-level view of the patients with an added effort to ensure adequate patient representation to reduce algorithmic bias. The utilization of data from two or more centers will be important to develop generalizable models. Techniques such as transfer learning can be designed to evaluate model generalizability and transferability across health systems.53 These solutions are critical for the development of models that can function well in smaller systems, to drive technological advances for the mainstream, in rural and urban areas alike.

Operational challenges
Implementation of an ML model to provide predictions and recommendations in real-time in EHR requires specialized programming expertise and purchase of specialized products from the vendor, which could be prohibitive for smaller healthcare systems with limited resources, especially since, for a real clinical utility, there will always be a need for maintenance with added financial burden. Operational challenges also entail usability and adoption. A less discussed challenge is the need for continuous learning. Feedback loop process creation can be a tool to facilitate creating an automated process to generate data used for continual learning of the model to improve performance; the latter needs commitments from users to add needed information into the system. Adding additional steps to the already busy schedules of the users is a challenge; however, the development of tools with clinical-experts-in-the-loop from the initial phase could provide opportunities for better adoption.

Ethical challenges
Defining how an “ethical” AI system should perform in this context is somewhat subjective and encompasses our experience in the field and perception about how the AI software operates. Overall, the generic goal would be to ensure fairness, efficiency, efficacy, and patient safety. In an ideal scenario, a system would benefit all identified groups equally; however, in practice, such a goal is often impossible, and it will be necessary to define an acceptable bias. Rigorous regulatory
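
As an illustration of the data drift mentioned under "Model drifting", the following sketch compares the distribution of one numeric input feature at training time with its distribution in a later period. The article does not prescribe a monitoring method; the population stability index (PSI) is used here purely as an assumed example, and the function name `psi`, the bin count, and the toy age data are hypothetical. A frequently quoted rule of thumb treats PSI values above roughly 0.25 as a substantial shift, but any cut-off would need to be validated locally.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population stability index between two samples of one numeric feature.

    Bins are equal-width over the reference range; values in `current`
    that fall outside that range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or a division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


if __name__ == "__main__":
    # Made-up example: patient age at model training time vs. a later period.
    training_ages = [40 + (i % 40) for i in range(400)]
    recent_ages = [55 + (i % 40) for i in range(400)]
    print(f"PSI = {psi(training_ages, recent_ages):.3f}")
```

A check of this kind covers only data drift in the inputs. Concept drift, where the relationship between inputs and stroke diagnosis changes, would additionally require re-assessing model performance against newly confirmed outcomes at regular intervals.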
13. Arch AE, Weisman DC, Coca S, et al. Missed ischemic stroke diagnosis in the emergency department by emergency medicine and neurology services. Stroke 2016; 47: 668–673.
14. Schrock JW, Glasenapp M, Victor A, et al. Variables associated with discordance between emergency physician and neurologist diagnoses of transient ischemic attacks in the emergency department. Ann Emerg Med 2012; 59: 19–26.
15. Mohammad YM. Mode of arrival to the emergency department of stroke patients in the United States. J Vasc Interv Neurol 2008; 1: 83–86.
16. Tennyson JC, Michael SS, Youngren MN, et al. Delayed recognition of acute stroke by emergency department staff following failure to activate stroke by emergency medical services. West J Emerg Med 2019; 20: 342–350.
17. Moulin T, Sablot D, Vidry E, et al. Impact of emergency room neurologists on patient management and outcome. Eur Neurol 2003; 50: 207–214.
18. Falco FA, Sterzi R, Toso V, et al. The neurologist in the emergency department. An Italian nationwide epidemiological survey. Neurol Sci 2008; 29: 67–75.
19. Morrison I, Jamdar R, Shah P, et al. Neurology liaison services in the acute medical receiving unit. Scott Med J 2013; 58: 234–236.
20. Antipova D, Eadie L, Macaden A, et al. Diagnostic accuracy of clinical tools for assessment of acute stroke: a systematic review. BMC Emerg Med 2019; 19: 49.
21. Chaudhary D, Abedi V, Li J, et al. Clinical risk score for predicting recurrence following a cerebral ischemic event. Front Neurol 2019; 10: 1106.
22. Krittanawong C, Zhang HJ, Wang Z, et al. Artificial intelligence in precision cardiovascular medicine. J Am Coll Cardiol 2017; 69: 2657–2664.
23. Noorbakhsh-Sabet N, Zand R, Zhang Y, et al. Artificial intelligence transforms the future of health care. Am J Med 2019; 132: 795–801.
24. Lee EJ, Kim YH, Kim N, et al. Deep into the brain: artificial intelligence in stroke imaging. J Stroke 2017; 19: 277–285.
25. Zuick S, Graustein A, Urbani R, et al. Can a computerized sepsis screening and alert system accurately diagnose sepsis in hospitalized floor patients and potentially provide opportunities for early intervention? A pilot study. J Intensive Crit Care 2016; 2.
26. Patel YR, Robbins JM, Kurgansky KE, et al. Development and validation of a heart failure with preserved ejection fraction cohort using electronic medical records. BMC Cardiovasc Disord 2018; 18: 128.
27. Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ 2015; 350: g7594.
28. White IR, Royston P, Wood AM, et al. Multiple imputation using chained equations: issues and guidance for practice. Stat Med 2011; 30: 377–399.
29. Abedi V, Shivakumar MK, Lu P, et al. Latent-based imputation of laboratory measures from electronic health records: case for complex diseases. BioRxiv 2018; 275743.
30. National Library of Medicine. Unified medical language system (UMLS). https://www.nlm.nih.gov/research/umls/index.html (2019, accessed 10 February 2020).
31. Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321–357.
32. ER Inspector. https://projects.propublica.org/emergency/ (accessed 1 December 2019).
33. Kabra R, Robbie H and Connor SEJ. Diagnostic yield and impact of MRI for acute ischaemic stroke in patients presenting with dizziness and vertigo. Clin Radiol 2015; 70: 736–742.
34. Chaturvedi S, Ofner S, Baye F, et al. Have clinicians adopted the use of brain MRI for patients with TIA and minor stroke? Neurology 2017; 88: 237–244.
35. Varonen H, Kortteisto T and Kaila M. What may help or hinder the implementation of computerized decision support systems (CDSSs): a focus group study with physicians. Fam Pract 2008; 25: 162–167.
36. Bouaud J, Spano JP, Lefranc JP, et al. Physicians’ attitudes towards the advice of a guideline-based decision support system: a case study with OncoDoc2 in the management of breast cancer patients. Stud Health Technol Inform 2015; 216: 264–269.
37. Van Der Sijs H, Mulder A, Van Gelder T, et al. Drug safety alert generation and overriding in a large Dutch university medical centre. Pharmacoepidemiol Drug Saf 2009; 18: 941–947.
38. Van Der Sijs H, Aarts J, Vulto A, et al. Overriding of drug safety alerts in computerized physician order entry. J Am Med Informatics Assoc 2006; 13: 138–147.
39. Bergman LG and Fors UGH. Computer-aided DSM-IV-diagnostics - Acceptance, use and perceived usefulness in relation to users’ learning styles. BMC Med Inform Decis Mak 2005; 5: 1.
40. Curry L and Reed MH. Electronic decision support for diagnostic imaging in a primary care setting. J Am Med Informatics Assoc 2011; 18: 267–270.
41. Zheng K, Padman R, Johnson MP, et al. Understanding technology adoption in clinical care: clinician adoption behavior of a point-of-care reminder system. Int J Med Inform 2005; 74: 535–543.
42. Rousseau N, McColl E, Newton J, et al. Practice based, longitudinal, qualitative interview study of computerised evidence based guidelines in primary care. Br Med J 2003; 326: 314–318.
43. Johnson MP, Zheng K and Padman R. Modeling the longitudinality of user acceptance of technology with an evidence-adaptive clinical decision support system. Decis Support Syst 2014; 57: 444–453.
44. Sousa VEC, Lopez KD, Febretti A, et al. Use of simulation to study nurses’ acceptance and nonacceptance of clinical decision support suggestions. Comput Inform Nurs 2015; 33: 465–472.
45. Terraz O, Wietlisbach V, Jeannot JG, et al. The EPAGE internet guideline as a decision support tool for determining the appropriateness of colonoscopy. Digestion 2005; 71: 72–77.
46. Gadd CS, Baskaran P and Lobach DF. Identification of design features to enhance utilization and acceptance of systems for internet-based decision support at the point of care. Proc AMIA Symp 1998; 91–95.
47. Khalifa M. Clinical decision support: strategies for success. Procedia Comput Sci 2014; 37: 422–427.
48. McCoy AB, Thomas EJ, Krousel-Wood M, et al. Clinical decision support alert appropriateness: a review and proposal for improvement. Ochsner J 2014; 14: 195–202.
49. Aakre CA, Dziadzko MA and Herasevich V. Towards automated calculation of evidence-based clinical scores. World J Methodol 2017; 7: 16–27.
50. Khairat S, Marc D, Crosby W, et al. Reasons for physicians not adopting clinical decision support systems: critical analysis. JMIR Med Inform 2018; 6: e24.
51. Venkatesh V, Morris MG, Davis GB, et al. User acceptance of information technology: toward a unified view. MIS Q 2003; 27: 425–478.
52. Lao Q, Jiang X, Havaei M, et al. Continuous domain adaptation with variational domain-agnostic feature replay, http://arxiv.org/abs/2003.04382. (2020, 9 March 2020)
53. Wiens J, Guttag J and Horvitz E. A study in transfer learning: leveraging data from multiple hospitals to enhance hospital-specific predictions. J Am Med Informatics Assoc 2014; 21: 699–706.