Precision Medicine in Digital Pathology Via Image Analysis and Machine Learning

CHAPTER
Precision medicine in
digital pathology via image
analysis and machine
learning
8
Peter D. Caie, BSc, MRes, PhD 1, Neofytos Dimitriou, B.Sc 2,
Ognjen Arandjelovic, M.Eng. (Oxon), Ph.D. (Cantab) 2
1
School of Medicine, QUAD Pathology, University of St Andrews, St Andrews, United Kingdom;
2
School of Computer Science, University of St Andrews, St Andrews, United Kingdom
Introduction
Precision medicine
The field of medicine is currently striving toward more accurate and effective clin-
ical decision-making for individual patients. This can be through many forms of
analysis and be put into effect at multiple stages of a patient’s disease progression
and treatment journey. However, the overarching goal is for higher treatment success
rates with lower side effects from potentially ineffectual, but toxic, therapies, and of
course better patient well-being and overall survival. For example, understanding if
a specific treatment for an individual patient’s cancer may help their treatment or in
fact be detrimental to their overall survival, as is the case with cetuximab treatment
in colorectal cancer. This process of treating the patient as an individual, and not as a
member of a broader and heterogeneous population, is commonly termed precision
medicine and has traditionally been driven by advances in targeted drug discovery
with accompanying translatable companion molecular tests. These tests report on
biomarkers measured from patient samples and inform if specific drugs will be
effective for patients, normally based on the molecular profile of their diagnostic tis-
sue sample. The tests may derive from our knowledge of biological process, like
designing inhibitors against EGFR receptor pathways, or from machine learninge
based mining of large multiomic datasets to identify novel drug targets or resistance
mechanisms. In the current era of digital medicine, precision medicine is being
applied throughout the clinical workflow from diagnosis, prognosis, and prediction.
Large flows of data can be tapped from multiple sources and no longer solely
through molecular pathology. This is made possible by the digitization, and avail-
ability, of patient history and lifestyle records, clinical reports, and through the adop-
tion of digital pathology and image analysis in both the realm of research and the
clinic. In fact, prior to the interrogation of digitized histopathological datasets by im-
age analysis, in vitro high-content biology-based drug screens were being developed
Artificial Intelligence and Deep Learning in Pathology. https://doi.org/10.1016/B978-0-323-67538-3.00008-7 149
Copyright © 2021 Elsevier Inc. All rights reserved.
150 CHAPTER 8 Precision medicine in digital pathology via image
by the pharmaceutical industry [1]. These screens also applied image analysis, but to
cultured cells exposed to genetic of small molecular manipulation, to segment and
classify cellular structures before capturing large multiparametric datasets that
inform on novel targets or drug efficacy. Similar methodology taken from high-
content biology and image analysis can be applied to digital pathology. This is
the case for classical object thresholdebased image analysis or for artificial intelli-
gence. The overarching aim being to either quantify and report on specific bio-
markers or histological patterns of known importance, or by capturing unbiasedly
collected multiparametric data and pattern recognition across segmented objects
and whole-slide images (WSI). The aim of distilling and reporting on the extracted
data from digital pathology images is to allow for the stratification of patients into
distinct groups that may inform the clinician on their optimal and personalized treat-
ment regimen.
Digital pathology
Digital pathology, the high-resolution digitization of glass-mounted histopatholog-
ical specimens, is becoming more common place in the clinic. This disruptive tech-
nology is on track to replace the reporting of glass slides down a microscope, as has
been the tradition for centuries. There remain certain obstacles to overcome for
wide-scale adoption of digital pathology; such as IT infrastructure, scanning work-
flow costs, and the willingness of the pathology community. However, as more in-
stitutes trend to full digitization, these obstacles diminish. The adoption of digital
pathology holds advantages over traditional microscopy in teaching, remote report-
ing and image sharing, and not least the ability to perform image analysis on the
resultant digitized specimens.
In essence, the technology is currently moving from the glass slide and micro-
scope to the digital image and high-resolution screen, although the manual viewing
and diagnosis remains the same. Clinical applications using digital pathology and
WSIs are currently restricted to the primary diagnosis of H&E stained slides. How-
ever, future applications, such as immunofluorescence, can bring advantages to dig-
ital pathology. Indeed, scanner vendors are frequently combining both brightfield
and multiplexed immunofluorescence visualization into their platforms. Immunoflu-
orescence allows the identification, classification, and quantification of multiple cell
types or biomarkers colocalized at the single-cell resolution and on a single tissue
section. The importance of this capability is becoming increasingly apparent as
we realize that the complex intercellular and molecular interactions within the tumor
microenvironment, on top of cellular morphology, play a vital role to a tumor’s pro-
gression and aggressiveness.
Humans are as adept at reporting from digital pathology samples as they are from
glass mounted ones. They do this by identifying diagnostic and prognostic patterns
in the tissue, while at the same time disregarding artifact and nonessential histology.
However, an inherent human setback in the field of pathology has been the standard-
ization and consistency of inter- and intraobserver reporting. This is specifically true
Applications of image analysis and machine learning 151
for the identification and semiquantification of more discrete and subtle morphol-
ogies as well as, for example, the counting of specific cell types, mitotic figures,
or biomarker expression across WSIs [2e4]. The automated analysis of such fea-
tures by computer algorithms can overcome these flaws and provide objective and
standardized quantification. With ongoing research providing much needed evi-
dence of the capability of image analysis and artificial intelligence to accurately
quantify and report on molecular and morphological features, it is only a matter
of time before these too translate into the clinic.
The use of image analysis and artificial intelligence can be knowledge driven or
data driven. The next section of this chapter will discuss how both methodologies
can be applied to digital pathology before we expand on the theory and concepts
behind the various artificial intelligence models commonly used in the field.
Applications of image analysis and machine learning

Knowledge-driven image analysis
The first applications of image analysis in digital pathology were the quantification of
previously identified histopathological features or immunohistochemical results of
known pathological significance. This image analysis typically relies on user-
defined thresholds of either color or fluorescence intensity coupled with classifications
based on object shape, extent, or texture. Examples of histopathological features quan-
tified in such a manner include the quantification of tumor buds, lymphovascular den-
sity and invasion, or immune infiltrate [5,6]. Similarly, the quantification of protein
expression through immunolabeling, such as pHH3, PDL-1, or EGFR, whether study-
ing predictive or prognostic outcomes, can also be quantified in such a manner [7e9].
This image analysis can be applied to user-defined regions of interest through manual
annotations, or across WSIs. A further advantage of the automatic reporting of such
features, over and above reproducibility, is that they can be quantified across a contin-
uum instead of traditional semiquantified categorical reporting. The analysis of
continuous data is far more amenable to personalized pathology through machine
learning than creating distinct categories to group patients.
Machine learning for image segmentation

Pathologists regularly quantify features from specific locations within the tissue sec-
tion, for example, only within the tumor or stromal cells, or at the cancer’s invasive
margin. However, to manually annotate images to differentiate these regions of in-
terest prior to performing image analysis is a subjective and tedious task, specifically
if needed to be applied to the large patient cohorts required for statistically relevant
studies. One can therefore apply machine learning algorithms, such as random for-
est, directly to the images to overcome the need for manual annotations. This ma-
chine learningebased classification relies on the user teaching the algorithm what
the specific regions are that they want to segment. They do so by selecting examples
across a training subset of images. Once the algorithm has learned the difference be-
tween the regions to be differentiated, it can be applied to a larger sample set in order
to automatically segment the image of the tissue. An example of this would be using
an antibody against cytokeratin to label the tumor and use this marker, on top of the
cancer cell morphology, to differentiate the tumor from the stroma (Fig. 8.1A). Once
this is performed, one can employ the before-mentioned threshold-based image
analysis to quantify, for example, Ki67 expression in only the tumor (Fig. 8.1B), tu-
mor buds only at the invasive margin [10], or CD3 and CD8 positive lymphocytic
infiltration within either the tumor core or invasive margin [6] (Fig. 8.2).
Deep learning for image segmentation

There exists great inter- and intrapatient heterogeneity, specifically in the molecular
and phenotypic makeup of a patient’s tumor. This also equates to great variation
within the images, not only morphologically, but also when applying a specific marker
(A1) (A2) (A3) (A4)
(B1) (B2) (B3) (B4)
FIGURE 8.1 Automated tumor to stroma segmentation and cellular classification.

(A) A machine learning approach for automatic tissue segmentation using Definiens
Tissue Studio software. In this example, segmenting the tumor from the stroma. Ai) A
colorectal TMA core labeled by immunofluorescence for tumor (Pan Cytokeratin, green)
and nuclei (Hoechst, blue). Aii) The software automatically segments the image based on
pixel heterogeneity (blue outline). Aiii) The user tags segmented example objects of the
features to be segmented, here turquoise for stroma and purple for tumor. The machine
learning algorithm learns these features and automatically segments the rest of the
images in the project, as demonstrated in Aiv. (B) The same TMA core was labeled for
ki67, which is coregistered in Bi in red. Bii) shows all nuclei automatically segmented
(blue outline) by the Definiens software, prior to the exclusion of stroma nuclei (gray) and
the retaining of tumor nuclei (blue) in Biii. Biv) shows the tumor nuclei being classified
positive for ki67 expression (yellow).
(A)
Invasive margin
Core
(B) (C)
FIGURE 8.2 Quantification of lymphocytic infiltration with the tumor’s invasive margin or
core.
(A) A whole slide image of a colorectal cancer tissue section labeled for tumor cells
(green) and nuclei (blue). The image analysis in this figure is performed using Indica Labs
HALO software. The red outline is the automatically detected deepest invasion of tumor
cells and the inset shows a zoomed in example of the image segmented into an invasive
margin (green) and the tumor core (blue). The purple square denotes where Figure (B
and C) originate from in the invasive margin. (B) Multiplexed immunofluorescence
visualized CD3 (yellow) and CD8 (red) positive lymphocytes. (C) Automated quantification
of these lymphocytes within just the invasive margin of the tumor.
to differentiate regions of interest for image analysis. Furthermore, the tissue spec-
imen is imperfect, and so, therefore, is the digitized WSI. These imperfections can
originate from multiple stages of tissue preparation. They can result in folds, tears,
or uneven thickness of the tissue to more subtle morphological differences due to
the ischemia or fixation times. If performing immunolabeling, nonspecific staining,
edge-effect, or autofluorescence can further create confounding issues for automated
image analysis. Examples of such artifacts can be seen in Fig. 8.3. All of the above
(A) (B) (C) (D) (E)
(F) (G) (H) (I) (J)
FIGURE 8.3 Examples of image artifact within digital pathology.

(AeE) Examples of artifact from brightfield image capture. (A) Folded over section of a
TMA core. (B) Cutting artifact (dark pink line) and piece of dust (gray) on an H&E-stained
image. (C) Out of focus and incorrectly stitched image of an H&E-stained image. (D) Tear
and fold in H&E-labeled tissue. (E) Out-of-focus section bordered by a foreign object on
the H&E-labeled specimen. (FeJ) Examples of artifact from fluorescence image capture.
(F) A tear in a tissue section resulting in nonspecific fluorescence. (G) Cutting artifact
(bright lines). (H) Out-of-focus nuclei. (I) Illumination artifact resulting in large blue
squares. (J) Autofluorescence from hair in a urine cytology sample (blue lines).
may cause inaccurate reporting, such as false positives, when applying image analysis
and machine learning to segment tissue across large patient cohorts. However, many
of these issues can be overcome by employing deep learning architecture to segment
the digital WSI. To do this, trained experts can annotate the regions of interest selected
for quantification, while further identifying and training the algorithm to ignore the
tissue artifact [11]. The human brain is extraordinary at pattern recognition and the
pathologist routinely and automatically ignores imperfect tissue specimens in order
to hone in on the area containing the information needed to make their diagnosis or
prognosis. The ever developing sophistication of deep learning architectures now al-
lows this level of analysis by automated algorithms.
For accurate tissue segmentation, using either machine learning or deep learning
methodology, a strong and standardized signal to noise is required within the feature
one is using to differentiate regions of interest. However, the inherently heteroge-
neous sample is also reflected in the marker used for segmentation, and this may
vary in brightness or intensity between patient samples; and even within the same
sample. Image color and intensity standardization algorithms can be employed prior
to the analysis of the image by artificial intelligence [12]. This can lead to a more
accurate fully automated tissue segmentation across diverse and large patient co-
horts; however, there is controversy over whether this is the best method to achieve
this goal and whether this methodology may reduce real diagnostic information.
Deep learning is not only being applied to segment tissue prior to biomarker or
cellular quantification but also has the ability to recognize histopathological patterns
in digitized H&E-stained tissue sections (Fig. 8.4). This has been demonstrated in
studies where the expert pathologist has annotated features of significance in order
to train the deep learning algorithms to identify these in unseen test and validation
sets [13,14]. The clinical application and methodology that allow the computer to
visualize and recognize more subtle prognostic associated patterns are covered in
(A) (B)
(C) (D)
FIGURE 8.4 Tissue segmentation performed by deep learning.

When segmenting tumor from stroma without the aid of a tumor visualization marker,
shallow learningebased image segmentation is usually inaccurate. Deep learning can be
applied in this instance for accurate tissue segmentation. These examples utilize Indica
Labs HALO AI software to segment tumor from stroma. (A) A TMA core stained with only
hematoxylin. (B) HALO AI algorithm trained to recognize only tumor cells (red) within the
hematoxylin-stained digital image. (C) The software next classifies each tumor nuclei
(dark blue). (D) Deep learning algorithm’s ability to be trained to recognize tumor (green
overlay) from stroma (red overlay) and from H&E-stained digital images.
more detail elsewhere in this book. Algorithms such as these are being developed to
aid the pathologist in their diagnosis, and it is only a matter of time until they are
applied routinely in the clinical workflow of pathology departments. Currently, these
algorithms are not being designed to automatically report on patient samples,
without any human verification, rather to act as an aid to the pathologist in order
to increase the speed of their reporting, for example, by highlighting the areas of in-
terest, as a method to triage urgent cases or as a second opinion. However, as deep
learning becomes more sophisticated, there is a strong possibility that in the future,
computer vision algorithms may perform an aspect of autonomous clinical report-
ing. Later in this chapter, we discuss what regulatory concerns to address when
designing an algorithm for translation into the clinic.
Deep learning architectures are now able to predict patient molecular subgroups
from H&E-labeled histology. They do this from only the morphology and histolog-
ical pattern of the tissue sample [15,16]. This may be quite remarkable to imagine;
however, the histopathologist would most likely have predicted this to be possible.
They have been fully aware of the complex and important variations of morphology
present in the tissue and how they affect patient outcome; even if they have not been
able to link these to molecular subtypes. This, however, has real implications beyond
academic research. Molecular testing is expensive and requires complex instrumen-
tation. If the same information, relevant to personalized pathology, can be gleaned
from the routine and cheap H&E-labeled section, this could have significant mone-
tary impact when calculating health economics.
Spatial resolution
We have briefly covered the importance of reporting the tissue architecture in the
field of precision medicine when quantifying biomarkers of interest, as opposed to
other molecular technology that destroys the tissue and thus the spatial resolution
of its cellular components. Histopathologists know that context is key to an accurate
diagnosis and prognosis. The tumor microenvironment is complex, with many
cellular and molecular interactions that play a role in inhibiting or driving tumor pro-
gression and responses to therapy. The quantification of a stand-alone biomarker,
even by image analysis, may not be enough to predict accurate prognosis or predic-
tion for an individual patient. This is the case for PDL-1 immunohistochemical
testing, where even patients with PDL-1 positive tumors may not respond to anti-
PDL-1 therapy [17]. Similarly, there is an advantage to quantifying prognostic his-
topathological features such as lymphocytic infiltration or tumor budding in distinct
regions within the tumor microenvironment. Traditionally, image analysis has quan-
tified a single prognostic feature across a single tissue section and with proven suc-
cess at patient stratification. However, by applying multiplexed
immunofluorescence, it is now possible to visualize multiple biomarkers and histo-
logical features within a single tissue section. Image analysis software can further-
more calculate and export the exact x and y spatial coordinates of each feature of
interest across the WSI, or recognize specific patterns within the interacting cellular
milieu of the tumor microenvironment (Fig. 8.5). This not only brings the advantage
of measuring more than one prognostic feature, but also allows new insights into the
understanding of disease progression based on quantifying novel interactions at the
spatial resolution of the tissue. This was demonstrated by Nearchou et al. who
showed that the density of tumor budding and immune infiltrate were significantly
associated with stage II colorectal cancer survival, but furthermore that their specific
interaction added value to a combined prognostic model [6]. Studies such as this
show that complex spatial analysis may be key to the success of accurate prognosis
for the individual patient.
Machine learning on extracted data

The image analysis solutions described above allow accurate image segmentation
and cellular and molecular classifications of known biomarkers or histopathological
features. The quantification of these known features can be utilized and applied to
deliver personalized pathology. However, image analysis can further extract a wealth
of unbiasedly collected data across multiple classified objects and their spatial inter-
actions. Such data may be the specific density of specific objects surrounding others,
or their distance to each other, as well as a wealth of extent, shape, and texture mea-
surements of individual objects. The value of these data is often not known a priori
(A) (B) (E)
(C) (D)
FIGURE 8.5 Quantifying the spatial resolution of cellular subpopulations.

This figure demonstrates how CD3 and CD8 positive lymphocyte densities can be
mapped to tumor cells when multiplexed immunofluorescence is quantified using Indica
Lab HALO software. Immunofluorescence labeled CD3 (yellow) and CD8 (red) positive
lymphocytes (A) segmented and quantified (B). (C) Tumor cells (green) captured from the
same image as (A) and segmented and quantified (D). The exported x and y spatial
coordinates of the tumor cells (green dots) are plotted alongside the coordinates of the
CD3 (yellow dots) and CD8 (red dots) positive lymphocytes(E). This allows for the spatial
statistics of the lymphocytes within a 50-mm diameter of a tumor cell being calculated.
and is difficult, if not impossible, to sort and analyze by eye. Machine learning can
be applied to large datasets from both molecular and digital pathology in order to
understand the optimal features that allow patient stratification, and thus clinical
decision-making, in the field of personalized medicine. However, caution must be
taken when deciding on which machine learning algorithm to apply to your data.
If one model is superior at analyzing one dataset, based on, e.g., the search for
the optimal area under the receiver operator, it does not mean that the same algo-
rithm will be as successful at analyzing a second and distinct dataset under similar
computational restrictions. This forms part of the “no free lunch theorem” [18] of
which we will touch upon again later in the chapter. In plain terms, different machine
learning algorithms are superior to others when applied to specific datasets. For
example, some algorithms excel at analyzing data with low dimensionality (such
as K-Nearest Neighbor) and, however, become intractable or return poor results
when the data dimensions increase. On the other hand, random forests excel at
analyzing high-dimensional data. Similarly, some models are better than others at
separating data in a linear fashion. As we rarely know a priori which model, or
which settings of their hyperparameters, is optimal at separating a specific dataset,
it is prudent to test multiple machine learning methodologies across a single dataset.
Automated workflows can be designed that split data into balanced training and test
sets, apply feature reduction algorithms (if needed), test multiple machine learning
models, and set their hyperparameters before returning the model and features used
to best separate ones data and answer the clinical question being asked [19].
An example that demonstrates the usefulness of combining many of the topics
that we have discussed in this chapter is that of Schmidt et al., who designed an im-
age analysis and deep learning workflow to better predict a patient’s response to ipi-
limumab [20]. They combined basic image analysis thresholding to classify
lymphocytic infiltrating cells and applied deep learning to negate artifact and necro-
sis in the images. Furthermore, they used pathologist annotations to train for specific
regions of interest, where they compartmentalized the tumor and stroma prior to
quantification of the lymphocytic contexture. Finally, they applied multiple methods
of analyzing their resultant data before reporting the final optimal model that pre-
dicts response to treatment.
Beyond augmentation
To make deep learning an effective method for patient diagnosis in the clinic, there
must be a sufficient amount of labeled annotations and a wide variety of samples to
be representative of the larger population. In the case of deep learning, this requires
training and validation on 1000s of patient samples, obtained from multiple interna-
tional institutes and prepared by multiple individuals. This is not an easy or fast task
to perform. A major drawback to the field is a lack of such large well-annotated and
curated datasets that are available to data scientists. However, there is a wealth of
data held in each hospital located in their glass microscope slide archives and that
go back decades. These data can be traced back to each patient treated in that
Practical concepts and theory of machine learning 159
institute along with their clinical reports. A sample tissue section from each patient
in the archive will be stained with H&E, and as each prospective patient’s samples
are also stained with H&E, it makes sense to concentrate deep learning efforts on
digital pathology samples prepared with this stain. The bottleneck is therefore not
access to patient samples but the expert’s digitized annotations of regions of interest
that pertain to diagnosis. To overcome this bottleneck, researchers are forgoing im-
age level annotations and developing weakly supervised deep learning architectures
that rely only on slide level annotations, namely the diagnosis of the patient. Simply
put, the computer is not told where in the image the cancer is, rather that somewhere
in the image there are cancer cells. This methodology has been shown to be effective
and relies on the data-driven analysis of patients, where the computer vision algo-
rithm identifies subtle and complex morphologies that relate to diagnosis or prog-
nosis. Thus, the machine can now inform the human on what is of pathological
significance within the patient’s individual tissue specimen. These patterns may
have gone unnoticed to date, or have been too complex to allow for a standardized
reporting protocol to be produced by human effort. This type of methodology has
been tested in colorectal cancer [21], prostate cancer, and lymph nodal metastasis
of breast cancer [22]. Further information on these methodologies can be found in
the following chapter.
Practical concepts and theory of machine learning

Machine learning and digital pathology
The existing work in the realm of digital pathology, which analyzes various types of
clinical data, could be conceptually organized in a variety of ways, that is, according
to different criteria. Considering the diversity of pathologies of possible interest, one
approach would be to consider solutions developed with specific diseases in mind, or
indeed tissues or populations; another criterion would be based on the type of anno-
tations that is training feedback, available for learning; yet another could distinguish
between different types of input data and the modalities used to acquire them.
For reasons which will become apparent as we develop our narrative, herein we
found it useful to start with a broad categorization of work in the area into two cate-
gories in terms of their technical approach to the problem. The first of these could be
roughly described as involving “conventional” (or “traditional”) computer vision, and
the second one as deep learning based. This distinction is somewhat loose but in that it
reflects the current trends in the field, namely the shift in focus from the former to the
latter, it will be useful for illustrating the driving challenges and opportunities in the
field, as well as for highlighting potential pitfalls and limitations of the current
thinking, and for explaining our thoughts and recommendations for future work.
Considering the variety of challenges that a pathologist has to deal with in the
analysis of real-world cases, it should come as no great surprise to the reader that
algorithmic solutions addressing a wide range of possibly useful tasks have been
explored and proposed. Much, indeed probably most, of research has focused on
diagnostic tasks or prognostic stratification (e.g., high vs. low risk). Others provide a
more assistive role to a human expert by providing automatic or semiautomatic seg-
mentation of images, classification of cells, detection of structures of interest (cells,
tissues types, tumors, etc.) from images, and so on. Although future work may lead
to the fully automatic diagnosis of patients, currently the drive is for clinically trans-
ferable tools to aid the pathologist in their diagnosis.
Common techniques
The common clinical questions highlighted in the previous section can all be broadly
seen as to have the same form as the two most frequently encountered machine
learning paradigms: namely those of classification (e.g., “is a disease present or
not?” and “is the patient in the high risk category or not”?) and regression (e.g.,
“what is the severity of disease?” and “what is the patient’s life expectancy?”).
Therefore it can hardly come as much of a surprise to observe that most of the
work in the area to date involves the adoption and adaptation of well-known and
well-understood existing techniques, and their application on pathology data.
Here we give the reader a flavor of the context, and pros and cons, of some of these
that have been applied to analyzing the multiparametric data extracted from the im-
age analysis of digital pathology specimens.
Supervised learning
Even the mere emergence of digitization to clinical pathology, that is the use of
computerized systems for storing, logging, and linking information, has led to the
availability of vast amounts of labeled data as well as the associated meta-data.
The reduced cost of computing power and storage, increased connectivity, and wide-
spread adoption of technology have all contributed greatly to this trend which has
escalated yet further with the rising recognition of the potential of artificial intelli-
gence. Consequently, supervised machine learning in its various forms has attracted
a great amount of research attention and continues to be one of the key focal points
of ongoing research efforts.
Both shallow and deep learning algorithms have been successfully applied to
clinical problems. Deep learning strategies based on layer depth and architecture
of neural networkebased models have been discussed in detail in Chapters 2e4.
This section will therefore focus on the mathematical underpinnings of shallow
learning algorithms such as Naı̈ve Bayes, logistic regression, support vector ma-
chines, and random forests. In contrast, the mathematics of deep learning models
are highly complex, as they encompass multilayer perceptron models, probabilistic
graphical models, residual and recurrent networks, reinforcement and evolutionary
learning, and so on. Detailed overview of these is presented in Chapters 2e4, but the
mathematics involved are beyond the scope of the book. For a rigorous mathematical
treatment of deep learning, see Deep Learning by Goodfellow, Bengio, and Cour-
ville, or any of the other modern texts in the field.
Naı¨ve Bayes assumptionebased methods

Naı̈ve Bayes classification applies the Bayes theorem by making the “naı̈ve”
assumption of feature independence. Formally, given a set of n features x1,., xn,
the associated pattern is deemed as belonging to the class y which satisfies the
following condition:
Y
n
y ¼ arg max PðCj Þ pðxi jCj Þ 8.1
j
i¼1
where P (Cj) is the prior probability of the class Cj and p(xijCj) is the conditional
probability of the feature xi given class Cj (readily estimated from data using a su-
pervised learning framework) [23].
The key potential weakness of naı̈ve Bayesebased algorithms, be they regres-
sion or classification oriented, is easily spotted, and it lies in the unrealistic assump-
tion of feature independence. Yet, somewhat surprisingly at the first sight, these
simple approaches often work remarkably well in practice and often outperform
more complex and, as regards the fundamental assumptions of feature relatedness,
more expressive and more flexible models [19].
There are a few reasons why this might be the case. One of these includes the
structure of errorsdif the conceptual structure of data relatedness under a given rep-
resentation is in a sense symmetrical, errors in the direction of overestimating con-
ditional probabilities and those in the direction of underestimating them can cancel
out in the aggregate, leading to more accurate overall estimates [24]. Another
equally important factor contributing to often surprisingly good performance of
methods which make the naı̈ve Bayes assumption emerges as a consequence of
the relationship between the amount of available training data (given a problem
of a specific complexity) and the number of free parameters of the adopted model.
It is often the case, especially considering that in digital pathology class imbalance
poses major practical issues, that more complex models cannot be sufficiently well
trained; thus, even if in principle able to learn a more complex functional behavior,
this theoretical superiority cannot be exploited.
While a good starting point and a sensible baseline, naı̈ve Bayesebased methods
are in the right circumstances outperformed by more elaborate models, some of
which we summarize next.
Logistic regressionebased methods

Logistic regression is another widely used, well-understood, and often well-
performing supervised learning technique. In logistic regression, the conditional
probability of the dependent variable (class) y is modeled as a logit-transformed
multiple linear regression of the explanatory variables (input features) x1, ., xn:
1
PLR ðy ¼ 1jx; wÞ ¼ T : 8.2
1 þ eyw x
The model is trained (i.e., the weight parameter w learned) by maximizing the
likelihood of the model on the training dataset, given by:
Y
2 Y
2
1
Prðyi jxi ; wÞ ¼ T ; 8.3
i¼1 i¼1
1 þ eyi w xi
penalized by the complexity of the model:
1
pffiffiffiffiffiffi e2s2 w w ;
1 T
8.4
s 2p
which can be restated as the minimization of the following regularized negative log-
likelihood:
X
2
log 1 þ eyi w xi þ wT w:
T
L¼C 8.5
i¼1
A coordinate descent approach, such as the one described by Yu et al. [25], can
be used to minimize L.
Support vectorebased methods

Support vector machines perform classification by constructing a series of class
separating hyperplanes in a high-dimensional (potentially infinitely dimensional)
space into which the original input data are mapped [26]. For comprehensive detail
of this regression technique, the reader is referred to the original work by Vapnik
[27]; herein we present a summary of the key ideas.
In the context of support vector machines, the seemingly intractable task of map-
ping data into a very high-dimensional space is achieved efficiently by performing
the aforesaid mapping implicitly, rather than explicitly. This is done by employing
the so-called kernel trick which ensures that dot products in the high-dimensional
space are readily computed using the variables in the original space. Given labeled
training data (input vectors and the associated labels) in the form {(x1, y1), ., (xn,
yn)}, a support vector machine aims to find a mapping which minimizes the number
of misclassified training instances, in a regularized fashion. As mentioned earlier, an
implicit mapping of input data x / F(x) is performed by employing a Mercer-
admissible kernel [28] k(xi, xj) which allows for the dot products between mapped
data to be computed in the input space: F(xi)$F(xj) ¼ k(xi, xj). The classification
vector in the transformed, high-dimensional space of the form
X
n
w ¼ ci yi Fðxi Þ 8.6
i¼1
is sought by minimizing
X
n
1X n X n
ci yi ci kðxi ; xj Þyj cj 8.7
i¼1
2 i¼1 j¼1
P
n
subject to the constraints ci yi ¼ 0 and 0 ci 1=ð2nlÞ. The regularizing
i¼1
parameter l penalizes prediction errors. Support vectorebased approaches usually
perform well even with relatively small training datasets and have the advantage
of well-understood mathematical behavior (which is an important consideration in
the context of regularly compliance, among others).
Stepping back for a moment from the technical detail, intuitively what is
happening here is that the algorithm is learning which class exemplars are the
“most problematic” ones, i.e., which exemplars are nearest to the class boundaries
and thus most likely to be misclassified. These are the support vectors that give
the approach its name. Inspection of these is insightful. Firstly, a large number of
support vectors (relative to the total amount of training data) should immediately
raise eyebrows as it suggests overfitting. Secondly, by examining which exemplars
end up as support vectors, an understanding of the nature of learning that took place
can be gain as well as of the structure of the problem and data representation, which
can lead to useful and novel clinical insight.
Nonparametric, k-nearest neighborebased methods

The k-nearest neighbor classifier classifies a novel pattern comprising features x1,
., xn to the class dominant in the set of k-nearest neighbors to the input pattern
(in the feature space) among the training patterns with known class memberships
[29]. The usually employed similarity metric used is the Euclidean distance.
Random forests
Random forest classifiers fall under the broad umbrella of ensemble-based learning
methods [30]. They are simple to implement, fast in operation, and have proven to be
extremely successful in a variety of domains [31,32]. The key principle underlying
the random forest approach comprises the construction of many “simple” decision
trees in the training stage and the majority vote (mode) across them in the classifi-
cation stage. Among other benefits, this voting strategy has the effect of correcting
for the undesirable property of decision trees to overfit training data [33]. In the
training stage, random forests apply the general technique known as bagging to in-
dividual trees in the ensemble. Bagging repeatedly selects a random sample with
replacement from the training set and fits trees to these samples. Each tree is grown
without any pruning. The number of trees in the ensemble is a free parameter which
is readily learned automatically using the so-called out-of-bag error [29].
Much like in the case of naı̈ve Bayese and k-nearest neighborebased algo-
rithms, random forests are popular in part due to their simplicity on the one hand,
and generally good performance on the other. However, unlike the former two ap-
proaches, random forests exhibit a degree of unpredictability as regards the structure
of the final trained model. This is an inherent consequence of the stochastic nature of
tree building. As we will explore in more detail shortly, one of the key reasons why
this characteristic of random forests can be a problem in regulatory reasonsd
clinical adoption often demands a high degree of repeatability not only in terms
of the ultimate performance of an algorithm but also in terms of the mechanics as to

how a specific decision is made.
Unsupervised learning
We have already mentioned the task of patient stratification. Indeed, the need for
stratification emerges frequently in digital pathology, for example, due to the hetero-
geneity of many diseases or differential response of different populations to treat-
ment or the disease itself [19].
Given that the relevant strata are often unknown a priori, often because the lim-
itations imposed by previously exclusively manual interpretation of data and the
scale at which the data would need to be examined to draw reliable conclusions,
it is frequently desirable to stratify automatically. A common way of doing this is
by means of unsupervised learning, by applying a clustering algorithm such as a
Gaussian mixture model or more frequently due to its simplicity and fewer free pa-
rameters, the k-means algorithm [21]. Then any subsequent learning can be per-
formed in a more targeted fashion by learning separate models for each of the
clusters individually.
Let X ¼ {x1, x2, ., xn} be a set of d-dimensional feature vectors. The k-means
algorithm partitions the points into K clusters, X1, ., XK, so that each datum be-
longs to one and only one cluster. In addition, an attempt is made to minimize the
sum of squared distances between each data point and the empirical mean of the cor-
responding cluster. In other words, the k-means algorithm attempts to minimize the
following objective function:
k X
X
JðX1 ; .; Xk Þ ¼ kci xk2 ; 8.8
i¼1 x˛Xi
where the empirical cluster means are calculated as:

X
ci ¼ x=jXi j: 8.9
x˛Xi
The exact minimization of the objective function in Ref. [1] is an NP-hard prob-
lem [34]. Instead, the k-means algorithm only guarantees convergence to a local
minimum. Starting from an initial guess, the algorithm iteratively updates cluster
centers and data-cluster assignments until (1) a local minimum is attained, or (2)
an alternative stopping criterion is met (e.g., the maximal desired number of itera-
tions or a sufficiently small sum of squared distances). The k-means algorithm starts
from an initial guess of cluster centers.
Often, this is achieved simply by choosing k data points at random as the centers
of the initial clusters, although more sophisticated initialization methods have been
proposed [35,36]. Then, at each iteration t ¼ 0, . the new datum-cluster assignment
is computed:
Image-based digital pathology 165

ðtÞ ðtÞ 2
Xi ¼ x: x ˛ Xârg minx cj ¼ i : 8.10
j
In other words, each datum is assigned to the cluster with the nearest (in the
Euclidean sense) empirical mean. Lastly, the locations of cluster centers are recom-
puted from the new assignments by finding the mean of the data assigned to each
cluster:
ðtþ1Þ
X ðtÞ
cj ¼ x=Xi 8.11
ðtÞ
x˛Xi
The algorithm is guaranteed to converge because neither of the updates in

Ref. [3] nor [4] can ever increase the objective function of Ref. [1]. Various exten-
sions of the original k-means algorithm include fuzzy c-means [37], k-medoid [38],
kernel k-means [39], as well as discriminative k-means when partial labeling is
available [40]. The interested reader is referred to the cited publications for further
information.
Image-based digital pathology

Probably one of the most interesting areas of research in digital pathology, both in
terms of practical potential and the nature of intellectual challenges, concerns the
use of images.
Due to the inherent complexity of images of interest in pathology, the aforemen-
tioned distinction between what we termed “conventional” machine learninge and
deep learningebased methods is particularly pronounced in this realm of digital pa-
thology. In an outline, the former usually take on a modular, pipeline form whereby
explicitly engineered features (sometimes referred to as “handcrafted” features) are
extracted from input images first, and then fed into a machine learning algorithm of
the sort described in the previous section. In contrast, in deep learningebased ap-
proaches, the process is not atomized [41] in this mannerdgood features, as com-
plex functions of directly sensed data (e.g., pixel intensities), are learned
automatically from labeled training data. This distinction has often led to deep
learningebased methods being characterized as “data agnostic” [42]; colloquially
put, the idea is that no matter what the input is, the algorithm will learn what
good features are. However, to describe deep learning in this manner would be a
misnomer and highly misleading. For any learning to take place, the model has to
be constrained and by virtue of these constraints contain implicit information about
the problem at hand; succinctly put, a tabular rasa cannot learn. Indeed, one of the
theorems popularly known as “no free lunch theorem” implicitly says the very same:
Any two optimization algorithms are equivalent when their performance is aver-
aged across all possible problems.
Ref. [18]
Thus, deep learning algorithms too are not endowed with a quasi-magical ability
to overcome this fundamental limitation but are also constrained by some prior
knowledge (and are hence not agnostic). In particular, a deep neural network con-
tains constraints, which emerge from the type of its layers, their order, and the num-
ber of neurons in each layer, the connectedness of layers, and other architectural
aspects. This is important to keep in mind and not see deep learning as necessarily
superior to “conventional” approaches or see them as the universal solution to any
problem.
Conventional approaches to image analysis

As mentioned already, the first broad category of algorithms that have been proposed
a means of analyzing images used in clinical pathology is one which uses what we
term conventional, or traditional, computer vision. By this, we primarily refer to the
first stage in the processing, which concerns the extraction of a priori, explicitly
defined features. The key premise is appealing and intuitively sensible to use human
expertize to identify what elementary, low level elements of an image are salient in
that they capture important discriminative information on the one hand, and are yet
compact enough to facilitate computational efficiency and reliable learning from
available data.
Having demonstrated success on a wide variety of images of so-called natural
scenes (to say the least, they have rather revolutionarized the field) [43], local
appearance descriptors originally proposed for more day-to-day applications of
computer vision (e.g., location recognition, synthetic panoramic image generation,
object localization, etc.) have been adopted first and applied on a diverse range of
pathology image types [44,45]. Popular and widely used examples include local bi-
nary pattern (LBP) [46], scale invariant feature transform (SIFT) [47], and histogram
of oriented gradients (HOG) [48] based descriptors.
While also “handcrafted” in contrast to the above general purpose descriptore
based features are features which directly and explicitly exploit domain-specific in-
formation, that is, human expert knowledge of pathology. Thus, body of interest
counts (e.g., tumor and immune cells, etc.), their morphology quantifiers (e.g.,
size, eccentricity), or spatial statistics are some of the widely used ones. The appeal
of these is twofold. First, by their very nature they can be reasonably expected to
exhibit saliency in the context of pathology slide analysis. Moreover, they are usu-
ally rather effortlessly measured (extracted) from images: different staining and im-
aging modalities or biomarkers can be used to highlight the targets of interest, and
simple image processing or computer vision algorithms (such as flood fill algorithms
[49], morphological operators [50], blob detectors [45], etc.) usually suffice as re-
gard to the computational side.
Image-based digital pathology 167
Deep learning on images

One of the reasons why deep learning has become such a popular area of research
lies in its success when applied to many computer vision problems, including a num-
ber of those which have been prohibitively challenging using more traditional ap-
proaches [51]. Its application to image analysis in digital pathology has not
disappointed either, with deep learning quickly establishing the state-of-the-art per-
formance in the context of a number of diverse tasks. Considering that the technical
detail specifically of deep learning is addressed by several other chapters in the pre-
sent volume, herein we focus our attention on the bigger picture with the aim of
highlighting the key trends in the field, the main challenges, and inherent advantages
and disadvantages of deep learning.
Rather than joining the choir singing praises to deep learning, a more useful,
insightful, and instructive question to ask concerns the outstanding limitations. If
the key obstacle were merely the availability of sufficient data (undoubtedly a dif-
ficulty encountered by virtually any single research group, perhaps save for the
few tech giants), the solution would be merely a matter of time and concerted effort.
Unfortunately from the practical point of view, or fortunately from the point of view
of intellectual challenge, the answer is more nuanced. In particular, the major prob-
lem to current methods is posed by the sheer size of individual images which are
often in excess of 10 GB each. Training deep neural network images of this size
(and bear in mind that deep learning methods are very training data hungry) is
computationally impractical.
A survey of existing work in the field shows two main ways of dealing with the
problem posed by the size of WSIs. The first of these involves severe
downsamplingda multiple gigapixel image is scaled down to an image with mere
tens of thousands of pixels. The second approach consists of dividing a WSI into
small patches (usually nonoverlapping), performing learning on these as individual
images, and then aggregating patch level decisions to arrive at a slide level decision.
Both approaches can be seen to lead to information loss, albeit in different ways. As
formally demonstrated by Nyquist in the 1930s, but equally easy understood intui-
tively, the type of information loss incurred by the downsampling approach comes in
the form of loss of fine detail (i.e., high spatial frequencies). Even to a human expert,
this information is important in the interpretation of a tissue slide, and can be ex-
pected to be even more so to computer-based methods which do not suffer from
many of the imperfections of the human visual system. The type of information
loss that takes place in patch-based algorithms is rather different and more nuanced
so it is useful to highlight a few influential works before continuing the discussion.
In the simplest form, patches are treated entirely separately from one another.
This approach has yielded promising results in patch level distinction between
tumorous and normal tissues [52] and the segmentation of precursor lesions [53],
but on prognostic tasks the success has been much more limited [54]. Moving
away from patch level decision as the ultimate aim, Hou et al. [55] proposed a
method whereby a deep neural network is trained on individual patches, and patch
level decisions aggregated by virtue of voting to make slide level predictions.

Recently, a few methods which address the challenges of intraslide heterogeneity
have been introduced, broadly comprising adaptive selection of patches,
phenotype-based patch clustering, automatic clustering selection, and learned aggre-
gation of cluster level predictions. Initial attempts were disappointing [56], but more
subsequent refinements of the methodology have led to much more promising results
[21].
What the above fly-through the field brings out to the fore are some of the key
limitations of patch-based methods, which dominate the existing literature. First,
by decomposing a WSI into patches, it is virtually unavoidable but to resort to a
form of ad hoc labeling of patches, usually attaching the whole slide label to all
of the patches extracted from it. This is clearly a rather unattractive proposition as
not every region in a slide containing a pathology is equally informative; indeed
some may contain perfectly healthy tissue. Hence the learning algorithm is effec-
tively fed incorrect data. Moreover, problems arise in the aggregation stage too.
Lastly, there is a loss of spatial information effected by the separation of patches
and the arbitrariness of the loci of patch boundaries with respect to the imaged phys-
ical tissue.
Regulatory concerns and considerations

Thus far our focus was on the fundamental challenges associated with the use of im-
ages for tasks in digital pathology, the manner in which these have shaped the field to
date, and the trends most likely to yield advances in future. Related to these, equally
important in practice though somewhat different in nature, are requirements and de-
mands associated with the adoption of artificial intelligence in everyday clinical pro-
cesses. Oftentimes these are imposed by various regulatory bodies, such as the Food
and Drug Administration (FDA) in the United States or more loosely the end users
themselves. While a comprehensive examination of this topic is beyond the scope of
this chapter, it is important to draw the reader’s attention to some of the key issues
within this area. In particular, we highlight three issues:
• Explainability,
• Repeatability, and
• Generalizability.
Explainability in the context of relevance here concerns the ability for a decision
made by an algorithmic, artificial intelligenceebased solution to be communicated
to a human on a semantically appropriate level. In other words, the question being
asked is not just what the decision is but rather how or why the algorithm made that
decision. As alluded to before, for many of the algorithms covered under the um-
brella of conventional approaches to image understanding, this is reasonably
straightforward. For example, in k-nearest neighborebased solutions, the nearest
neighbors, that is, previously seen and human labeled examples which exhibit the
Regulatory concerns and considerations 169
greatest similarity to a new example under consideration can be brought up. With
support vectorebased methods, the closest examples (in a kernel sense) or the sup-
port vectors which define class boundaries can be similarly used to gain insight into
what was learned and how a decision was made. With random forests, a well-known
method of substituting features with dummy features can be used to quantify which
features are the most important ones in decision-making and “sanity checked” by an
expert. This process can be not only confirmatory but can also lead to novel clinical
insight. While explainability has for a long time been seen as a potential disadvan-
tage (cf. with human memory and brain: can one localize where the concept of
“cake” is stored?) of deep learningebased approaches, which used to be seen as pro-
verbial “black boxes,” recent years have seen huge strides of progress in this area
[57]. For example, looking at which neurons in a network fire together (cf. the
mantra of biological reinforcement learning “what fires together wires together”)
can be insightful. A higher (semantic) level of insight can be gained by looking at
different layers and visualizing features learneddtypically layers closer to input
tend to learn simple, low-level appearance elements, which are then combined to
compose more complex visual patterns downstream [58]. Another ingenious tech-
nique involves so-called “occlusion” whereby parts of an input image are occluded
by a uniform pattern (thus effecting localized information loss) and quantifying the
impact such occlusion has on the decisiondthe greater the impact, the greater the
importance of a particular image locus [59].
The issues of repeatability and reproducibility have gained much prominence in
academic circles, and wider, in recent years. The difference between the two concepts
has been convincingly highlighted and discussed by Drummond [60], but this nuance
does not appear to have penetrated regulatory processes as of yet. In particular, the
practical impossibility of perfect reproducibility of experimental outcomes for some
machine learning algorithms poses a major obstacle in their adoption in clinical prac-
tice due to the stochastic nature of their operation. We have already alluded to this in
the previous section in our overview of random forests. In particular, even if the same
training data are used and the same manner of training employed, the parameters of a
trained random forest or neural network will differ from instance to instance. This is
an inherent consequence of the stochastic elements of their training processes and is
something that does not sit comfortably with some. While this sentiment is not diffi-
cult to relate to, it does illustrate what can be argued to be an inconsistency in how
human and machine expertize are treated. In particular, the former has been shown
to exhibit major interpersonal variability (two different, competent, and experienced
pathologists arriving at different conclusions from the same data), as well as intraper-
sonal one (e.g., depending on the degree of fatigue, time of day, whether the decision
is made in the anteprandial or postprandial period, etc.). This kind of double standard
is consistent with a broad range of studies involving human-human and human-
machine interaction and is neurologically well understood, with the two types of
engagement differently engaging important brain circuitry such as the ventromedial
prefrontal cortex and the amygdala [61]. Generalizability is a concept that pervades
machine learning. It refers to the ability of an algorithm to learn from training data
some information contained within it which would allow it to make good decisions on
previously unseen input, often seemingly rather different from anything seen in the
training stage. The issue of generalizability underlies the tasks of data representation,
problem abstraction, mathematical modeling of learning, etc. Herein we are referring
to generalizability in a very specific context. Namely, a major challenge in the practice
of pathology concerns different protocols and conditions in which data are acquired.
Put simply, the question being asked is what performance can I expect to see from an
algorithm evaluated using data a technician acquired using particular equipment on
one cohort of patients in a specific lab, when it is applied on data acquired by a
different technician in a different lab from a different cohort. It can be readily seen
that slight changes in the data acquisition (such as the duration of exposure to a
dye), different physical characteristics of lab instruments, or indeed different demo-
graphics of patients all pose reasonable grounds for concern. Indeed, at present,
most of the work in digital pathology is rather limited in this regard, in no small
part due to the variety of obstacles in data sharing: there are issues of ethics and pri-
vacy, as well as financial interests at stake.
Acknowledgments
The authors would like to acknowledge Inés Nearchou for kindly providing images for the
figures within this chapter.
References
[1] Caie PD, Walls RE, Ingleston-Orme A, Daya S, Houslay T, Eagle R, et al. High-content
phenotypic profiling of drug response signatures across distinct cancer cells. Molecular
Cancer Therapeutics 2010;9(6):1913e26.
[2] Deans GT, Heatley M, Anderson N, Patterson CC, Rowlands BJ, Parks TG, et al. Jass’
classification revisited. Journal of the American College of Surgeons 1994;179(1):11e7.
[3] Lim D, Alvarez T, Nucci MR, Gilks B, Longacre T, Soslow RA, et al. Interobserver
variability in the interpretation of tumor cell necrosis in uterine leiomyosarcoma. The
American Journal of Surgical Pathology 2013;37(5):650e8.
[4] Chandler I, Houlston RS. Interobserver agreement in grading of colorectal cancersd
findings from a nationwide web-based survey of histopathologists. Histopathology
2008;52(4):494e9.
[5] Caie PD, Zhou Y, Turnbull AK, Oniscu A, Harrison DJ. Novel histopathologic feature
identified through image analysis augments stage II colorectal cancer clinical reporting.
Oncotarget 2016;7(28):44381e94.
[6] Nearchou IP, Lillard K, Gavriel CG, Ueno H, Harrison DJ, Caie PD. Automated analysis
of lymphocytic infiltration, tumor budding, and their spatial relationship improves prog-
nostic accuracy in colorectal cancer. Cancer Immunology Research 2019;7(4):609e20.
[7] Khameneh FD, Razavi S, Kamasak M. Automated segmentation of cell membranes to
evaluate HER2 status in whole slide images using a modified deep learning network.
Computers in Biology and Medicine 2019;110:164e74.
References 171
[8] Widmaier M, Wiestler T, Walker J, Barker C, Scott ML, Sekhavati F, et al. Comparison
of continuous measures across diagnostic PD-L1 assays in non-small cell lung cancer
using automated image analysis. Modern Pathology 2019;33.
[9] Puri M, Hoover SB, Hewitt SM, Wei BR, Adissu HA, Halsey CHC, et al. Automated
computational detection, quantitation, and mapping of mitosis in whole-slide images
for clinically actionable surgical pathology decision support. Journal of Pathology
Informatics 2019;10:4.
[10] Brieu N, Gavriel CG, Nearchou IP, Harrison DJ, Schmidt G, Caie PD. Automated
tumour budding quantification by machine learning augments TNM staging in
muscle-invasive bladder cancer prognosis. Scientific Reports 2019;9(1):5174.
[11] Brieu N, Gavriel CG, Harrison DJ, Caie PD, Schmidt G. Context-based interpolation of
coarse deep learning prediction maps for the segmentation of fine structures in immu-
nofluorescence images. SPIE; 2018.
[12] Roy S, kumar Jain A, Lal S, Kini J. A study about color normalization methods for his-
topathology images. Micron 2018;114:42e61.
[13] Cruz-Roa A, Gilmore H, Basavanhally A, Feldman M, Ganesan S, Shih NNC, et al. Ac-
curate and reproducible invasive breast cancer detection in whole-slide images: a deep
learning approach for quantifying tumor extent. Scientific Reports 2017;7:46450.
[14] Litjens G, Bandi P, Ehteshami Bejnordi B, Geessink O, Balkenhol M, Bult P, et al. 1399
H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON
dataset. GigaScience 2018;7(6).
[15] Coudray N, Ocampo PS, Sakellaropoulos T, Narula N, Snuderl M, Fenyö D, et al. Clas-
sification and mutation prediction from nonesmall cell lung cancer histopathology im-
ages using deep learning. Nature Medicine 2018;24(10):1559e67.
[16] Sirinukunwattana K, Domingo E, Richman S, Redmond KL, Blake A, Verrill C, et al.
Image-based consensus molecular subtype classification (imCMS) of colorectal cancer
using deep learning. bioRxiv 2019:645143.
[17] Havel JJ, Chowell D, Chan TA. The evolving landscape of biomarkers for checkpoint
inhibitor immunotherapy. Nature Reviews Cancer 2019;19(3):133e50.
[18] Wolpert DH, Macready WG. No free lunch theorems for optimization. Transactions on
Evolutionary Computation 1997;1(1):67e82.
[19] Dimitriou N, Arandjelovic O, Harrison DJ, Caie PD. A principled machine learning
framework improves accuracy of stage II colorectal cancer prognosis. NPJ Digital Med-
icine 2018;1(1):52.
[20] Harder N, Schonmeyer R, Nekolla K, Meier A, Brieu N, Vanegas C, et al. Automatic
discovery of image-based signatures for ipilimumab response prediction in malignant
melanoma. Scientific Reports 2019;9(1):7449.
[21] Yue X, Dimitriou N, Arandjelovic O. Colorectal cancer outcome prediction from
H&E whole slide images using machine learning and automatically inferred pheno-
type profiles. February 01, 2019. Available from: https://ui.adsabs.harvard.edu/abs/
2019arXiv190203582Y.
[22] Campanella G, Hanna MG, Geneslaw L, Miraflor A, Werneck Krauss Silva V,
Busam KJ, et al. Clinical-grade computational pathology using weakly supervised
deep learning on whole slide images. Nature Medicine 2019;25(8):1301e9.
[23] Bishop CM. Pattern recognition and machine learning. New York, USA: Springer- Ver-
lag; 2007.
[24] Vente D, Arandjelovic O, Baron V, Dombay E, Gillespie S. Using machine learning for
automatic counting of lipid-rich tuberculosis cells influorescence microscopy images.
In: Proc. AAAI conference on artificial intelligence workshop on health intelligence;

2019.
[25] Yu H-F, Huang F-L, Lin C-J. Dual coordinate descent methods for logistic regression
and maximum entropy models. Machine Learning 2011;85(1e2):41e75.
[26] Schölkopf B, Smola A, Müller K. Advances in Kernel methods e SV learning, chapter
Kernel principal component analysis. Cambridge, MA: MIT Press; 1999. p. 327e52.
[27] Vapnik V. The nature of statistical learning theory. Springer-Verlag; 1995.
[28] Mercer J. Functions of positive and negative type and their connection with the theory
of integral equations. Philosophical Transactions of the Royal Society A 1909;209:
415e46.
[29] Cunningham P, Delany SJ. k-nearest neighbour classifiers. Multiple Classifier Systems
2007:1e17.
[30] Breiman L. Random forests. Machine Learning 2001;45(1):5e32.
[31] Cutler DR, Edwards TC, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ. Random
forests for classification in ecology. Ecology 2007;88(11):2783e92.
[32] Ghosh P, Manjunath B. Robust simultaneous registration and segmentation with sparse
error reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence
2013;35(2):425e36.
[33] Zadrozny B, Elkan C. Obtaining calibrated probability estimates from decision trees
and naive Bayesian classifiers. Proceedings IMLS International Conference on Machine
Learning 2001;1:609e16.
[34] Dasgupta S. The hardness of k-means clustering. Technical Report CS2007-0890. San
Diego: University of California; 2007.
[35] Khan SS, Ahmadb A. Cluster center initialization algorithm for k- means clustering.
Pattern Recognition Letters 2004;25(11):1293e302.
[36] Peña JM, Lozano JA, Larrañaga P. An empirical comparison of four initialization
methods for the k-means algorithm. Pattern Recognition Letters 1999;20(10):1027e40.
[37] Dunn JC. A fuzzy relative of the ISODATA process and its use in detecting compact
well-separated clusters. Journal of Cybernetics 1973;3(32e57).
[38] Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis.
Wiley; 2005.
[39] Schölkopf B, Smola A, Müller K-R. Nonlinear component analysis as a kernel eigen-
value problem. Neural Computation 1998;10(5):1299e319.
[40] Arandjelovic O. Discriminative k-means clustering. In: Proc. IEEE international joint
conference on neural networks; 2013. p. 2374e80.
[41] Arandjelovic O. A more principled use of the p-value? Not so fast: a critique of Colqu-
houn’s argument. Royal Society Open Science; 2019.
[42] Kooi T, Litjens G, Van Ginneken B, Gubern-Mérida A, I Sánchez C, Mann R, den
Heeten A, Karssemeijer N. Large-scale deep learning for computer aided detection of
mammographic lesions. Medical Image Analysis 2017;35:303e12.
[43] Niemeyer M, Arandjelovic O. Automatic semantic labelling of images by their content
using non-parametric Bayesian machine learning and image search using synthetically
generated image collages. In: Proc. IEEE international conference on data science and
advanced analytics; 2018. p. 160e8.
[44] Alhindi TJ, Kalra S, Ng KH, Afrin A, Tizhoosh HR. Comparing lbp, hog and deep fea-
tures for classification of histopathology images. In: 2018 International joint conference
on neural networks (IJCNN). IEEE; 2018. p. 1e7.
References 173
[45] Xing F, Yang L. Robust nucleus/cell detection and segmentation in digital pathology
and microscopy images: a comprehensive review. IEEE Reviews in Biomedical Engi-
neering 2016;9:234e63.
[46] Fan J, Arandjelovic O. Employing domain specific discriminative information to
address inherent limitations of the LBP descriptor in face recognition. In: Proc. IEEE
international joint conference on neural networks; 2018. p. 3766e72.
[47] Lowe DG. Distinctive image features from scale-invariant keypoints. International Jour-
nal of Computer Vision 2003;60(2):91e110.
[48] Arandjelovic O. Object matching using boundary descriptors. In: Proc. British machine
vision conference; 2012. https://doi.org/10.5244/C.26.85.
[49] Mehta N, Raja’S A, Chaudhary V. Content based sub-image retrieval system for high-
resolution pathology images using salient interest points. In: 2009 Annual international
conference of the IEEE engineering in medicine and biology society. IEEE; 2009.
p. 3719e22.
[50] Karunakar Y, Kuwadekar A. An unparagoned application for red blood cell counting
using marker controlled watershed algorithm for android mobile. In: 2011 Fifth inter-
national conference on next generation mobile applications, services and
technologies. IEEE; 2011. p. 100e4.
[51] Zhu J-Y, Park T, Isola P, Efros AA. Unpaired image-to-image translation using cycle-
consistent adversarial networks. In: Proceedings of the IEEE international conference
on computer vision; 2017. p. 2223e32.
[52] Jamaluddin MF, Fauzi MFA, Abas FS. Tumor detection and whole slide classification of
h&e lymph node images using convolutional neural network. In: Proc. IEEE Interna-
tional conference on signal and image processing applications; 2017. p. 90e5.
[53] Albayrak A, Ünlü A, Çalık N, Bilgin G, Türkmen I,_ Çakır A, Ça- par A, Töreyin BU,
Ata LD. Segmentation of precursor lesions in cervical cancer using convolutional neural
networks. In: Proc. Signal processing and communications applications conference;
2017. p. 1e4.
[54] Mackillop WJ. The importance of prognosis in cancer medicine. TNM Online; 2003.
[55] Hou L, Samaras D, Kurc TM, Gao Y, Davis JE, Saltz JH. Patch-based convolutional
neural network for whole slide tissue image classification. In: Proc. IEEE conference
on computer vision and pattern recognition; 2016. p. 2424e33.
[56] Zhu X, Yao J, Zhu F, Huang J. Wsisa: making survival prediction from whole slide his-
topathological images. In: IEEE conference on computer vision and pattern recognition;
2017. p. 7234e42.
[57] Erhan D, Bengio Y, Courville A, Vincent P. Visualizing higher-layer features of a deep
network. University of Montreal 2009;1341(3):1.
[58] Cooper J, Arandjelovic O. Visually understanding rather than merely matching ancient
coin images. In: Proc. INNS conference on big data and deep learning; 2019.
[59] Schlag I, Arandjelovic O. Ancient Roman coin recognition in the wild using deep
learning based recognition of artistically depicted face profiles. In: Proc. IEEE interna-
tional conference on computer vision; 2017. p. 2898e906.
[60] Drummond C. Replicability is not reproducibility: nor is it good science. 2009.
[61] Kätsyri J, Hari R, Ravaja N, Nummenmaa L. The opponent matters: elevated fmri
reward responses to winning against a human versus a computer opponent during inter-
active video game playing. Cerebral Cortex 2013;23(12):2829e39.

Precision Medicine in Digital Pathology Via Image Analysis and Machine Learning

Uploaded by

Copyright:

Available Formats

Precision Medicine in Digital Pathology Via Image Analysis and Machine Learning

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Precision Medicine in Digital Pathology Via Image Analysis and Machine Learning

Uploaded by

Copyright:

Available Formats

CHAPTER

Applications of image analysis and machine learning

Machine learning for image segmentation

Deep learning for image segmentation

(A1) (A2) (A3) (A4)

(B1) (B2) (B3) (B4)

FIGURE 8.1 Automated tumor to stroma segmentation and cellular classification.

(A) (B) (C) (D) (E)

(F) (G) (H) (I) (J)

FIGURE 8.3 Examples of image artifact within digital pathology.

FIGURE 8.4 Tissue segmentation performed by deep learning.

Machine learning on extracted data

(A) (B) (E)

FIGURE 8.5 Quantifying the spatial resolution of cellular subpopulations.

Practical concepts and theory of machine learning

Naı¨ve Bayes assumptionebased methods

Logistic regressionebased methods

Support vectorebased methods

Nonparametric, k-nearest neighborebased methods

of the ultimate performance of an algorithm but also in terms of the mechanics as to

where the empirical cluster means are calculated as:

The algorithm is guaranteed to converge because neither of the updates in

Image-based digital pathology

Conventional approaches to image analysis

Deep learning on images

level decisions aggregated by virtue of voting to make slide level predictions.

Regulatory concerns and considerations

In: Proc. AAAI conference on artificial intelligence workshop on health intelligence;

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.