Breast Cancer Diagnostic Using Machine Learning
ABSTRACT
Master’s thesis
2023
71 pages, 13 figures, 12 tables
Examiners: Associate Professor Lassi Roininen and Post-Doctoral Researcher Jyrki Savolainen
Keywords: Breast cancer diagnosis, Machine learning, Cancer detection, Predictive Modelling
Breast cancer poses a significant global health concern, with approximately 2.2 million new
cases and 700,000 deaths reported in 2020. Traditional diagnostic approaches, which
predominantly depend on expert judgement, have been associated with substantial variability
in accuracy. Machine learning (ML) models can help bridge this gap. The present research
investigates the potential of four machine learning algorithms (Decision Trees, K-Nearest
Neighbors, Support Vector Machines, and Logistic Regression) with the overarching objective
of improving early detection and enhancing the precision of breast cancer diagnosis. The study
utilizes the Breast Cancer Coimbra Dataset and the Wisconsin
Diagnostic Breast Cancer Dataset for model training and evaluation. A comprehensive
comparative analysis of these models is conducted, with a focus on optimizing hyperparameters
and distance measures to ascertain the most effective configurations. Further, the influence of
feature selection methods and Principal Component Analysis on model performance is
explored.
Logistic Regression and Support Vector Machine models demonstrated remarkable
performance, surpassing the predictive accuracy of models reported in current literature, with
accuracies reaching up to 99.42%. This research could serve as a foundation for future studies
applying machine learning models in breast cancer diagnostics, emphasizing the potential of
machine learning as a robust tool in medical diagnostics.
ACKNOWLEDGEMENTS
I want to express my deepest gratitude to my mom, dad, and brothers who have provided
unwavering support throughout my master's journey. Their love and reassurance during times
of doubt have provided the strength I needed to persevere. I can confidently say that this
achievement would not have been possible without them. I am immensely thankful for their
enduring faith in my abilities and for their ceaseless support.
I would also like to thank my supervisors, Jyrki Savolainen and Lassi Roininen, for their
invaluable feedback, which has greatly enhanced the quality and clarity of this work.
ABBREVIATIONS
BC Breast Cancer
ML Machine Learning
LR Logistic Regression
DT Decision Tree
TP True Positive
TN True Negative
FP False Positive
FN False Negative
Table of contents
Abstract
Acknowledgments
Symbols and abbreviations
1. Introduction
1.1. Background and Motivation
1.2. Aim and research question
1.2.1. Significance of study
2. Theoretical Background
2.1. Breast cancer
2.1.1. Breast Cancer Detection and Diagnosis
2.1.2. Types of breast cancer
2.1.3. Cause of Breast cancer
2.1.4. Breast cancer stages
2.1.5. Breast Cancer: Global Patterns
2.2. Machine learning
2.2.1. Machine learning methods
2.3. Algorithms
2.3.1. Support vector machine
2.3.2. K-nearest neighbour (KNN)
2.3.3. Logistic regression
2.3.4. Decision tree
2.4. Assessing Machine Learning Models: The Key Role of Precision, Recall, F1-score, and Accuracy
3. Literature Review
3.1. Previous research on WDBC and BCCD dataset
3.2. Investigating Machine Learning Approaches for Breast Cancer Diagnosis
4. Data and methodology
4.1. Data
4.1.1. Coimbra Breast Cancer Dataset
Figures
Figure 6: WDBC dataset with PCA performance comparison of different SVM kernel
Tables
Table 7: Decision Tree Model Performance at Various Max Depths on BCCD Dataset
Table 8: Decision Tree Model Accuracy at Various Max Depths on WDBC Dataset
Table 9: Comparison of Machine Learning Model Accuracies on BCCD Dataset Before and
after Feature Elimination
Table 10: Model Performance: Assessing Accuracy Across Machine Learning Models on
WDBC Dataset
Table 11. Comparative Analysis of Machine Learning Model Performance with Previous
Studies
1. Introduction
In the last decade, there has been a surge of interest in the field of Machine learning (ML),
driven by several factors including lower prices for processing time and storage space. This
has facilitated the development of advanced ML models, such as reinforcement and deep
learning, enabling the efficient archiving, processing, and analysis of massive datasets.
Machine learning has played a crucial role in areas like data mining, natural language
processing, image recognition, expert systems, and prediction (Maity & Das, 2017). The
primary objective of this thesis is to develop an accurate and efficient machine learning model
for Breast Cancer (BC) detection.
BC is a significant global health concern, with an estimated 2.2 million new cases and 0.7
million deaths in 2020, making it the second leading cause of cancer-related fatalities among
women worldwide (Sung et al., 2021). Siegel et al. (2022) anticipated 43,600 deaths
among women and 0.3 million newly diagnosed cases of BC in the U.S. in 2021. Tumours
can be either benign or malignant. Benign tumours are not cancerous and are generally
not life-threatening. Malignant tumours, on the other hand, are a more
significant cause for concern because they are cancerous. According to one study,
twenty per cent of women with BC die from aggressive tumours (Subashini et al., 2009).
Tumour diagnosis has been a focus of previous studies. Scientists are using Machine learning
(ML) and Data Mining (DM) to anticipate BC (Abdar et al., 2020). Improved accuracy and
throughput in cancer diagnosis are possible using classifier-based prediction models built on
ML and DM. DM is a wide-ranging amalgamation of methods for mining massive, complex
datasets for previously undiscovered knowledge and insights. It has seen extensive application
in the rollout of disease prediction systems (McWilliam et al., 2016), including those for
thyroid cancer (Rasool et al., 2020), and cardiovascular disease (Park et al., 2021). Fuzzy
genetics (Bicchierai et al., 2021) and computer-assisted systems (Kim et al., 2021) have
included DM and ML approaches for BC diagnosis.
Cancer, a disease marked by rapid, aggressive cell division, spreads to nearby organs and
tissues. DNA aberrations trigger this devastating condition. Changes typically affect larger
DNA segments called genes. The term "cancer" encompasses various types, each characterized
by uncontrolled abnormal cell growth in one or multiple organs. These rogue cells often invade
adjacent body parts and spread to other organs. BC originates from cells in the breast,
particularly those in the inner lining of milk ducts or the milk-producing lobules. These changes
or mutations may occur spontaneously due to increasing entropy or be triggered by external
factors. Various environmental stressors contribute to these alterations, including radiation
(microwaves, gamma rays, X-rays, ultraviolet rays), chemicals found in food, water, and air,
evolution, and the aging of RNA and DNA (Leão et al., 2021).
Cancer research has made significant strides in recent years, with machine learning (ML)
playing a crucial role in advancing diagnostic methods. However, several researchers face
challenges with ML classifier accuracy due to the absence of fundamental methodologies.
Confusion matrices in some studies show mispredicted false negatives and true negatives,
leading to the incorrect classification of cases. Another issue arises when feature training is
combined with nonlinear classification, as the model's execution time increases exponentially
with the addition of more features, ultimately affecting diagnosis accuracy. Both data analysts
and medical professionals are deeply concerned about the model's accuracy and its time
complexity. Given the above challenges, this research is motivated by the necessity to improve
the effectiveness of BC diagnostics using machine learning. The goal is to propose data mining
strategies using various machine learning models and to identify the most effective ML model for
predicting BC diagnosis.
The purpose of the study is to employ ML algorithms to test existing prediction models for
processing a large number of tumour features and extracting relevant data for BC analysis.
The learning goals include using data mining methods to find a solid cancer classification
prediction model. The analysis further examines the selected machine learning
models by varying the values of their respective hyperparameters. The main aim of this thesis
is to evaluate and optimize machine learning models to improve BC diagnostic accuracy.
This research uses four ML models: Decision Tree (DT), K-Nearest Neighbors (KNN),
Support Vector Machine (SVM), and Logistic Regression (LR). It proposes data exploratory
techniques (DET), creates four separate predictive models, and identifies which
machine learning models produce the best performance on the Breast Cancer Coimbra
Dataset (BCCD) and the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
This leads to the following research questions:
1. How and which machine learning models are utilized to detect cancer in patient data
according to literature?
2. Among Support vector machine (SVM), K-Nearest Neighbors (KNN), Decision tree (DT),
and Logistic regression (LR), which Machine learning (ML) model demonstrates the best
prediction performance when applied to the Breast cancer Coimbra dataset (BCCD) and
Wisconsin diagnostic breast cancer dataset (WDBC)?
Traditional diagnostic methods such as mammography and biopsies are useful but time-consuming
and expensive. They are not readily available in underserved areas, where it is
challenging to provide and maintain such high-cost equipment. There is an increasing demand
for alternative approaches that are cost-effective and efficient. Machine learning and computer
vision technologies have the potential to assist with this challenge in the medical field.
Because early diagnosis has been shown to increase survival rates, cancer prognosis and
detection are of the utmost importance.
Because of their ability to detect complicated patterns in data, machine learning-based
technologies have emerged as promising solutions in this domain, potentially outperforming
older methods. The goal of this study is to improve machine learning algorithms for diagnosing
BC using the BCCD and WDBC datasets, with the aim of increasing accuracy. This study
could serve as a benchmark for future studies in this field.
2. Theoretical Background
It is generally accepted that the breast is the most common cancer site in women. Cancer
of the breast occurs when a large number of breast cells mutate (change) and proliferate in an
uncontrolled manner, forming a tumour. Like many other types of cancer, BC can
spread to the lymph nodes as well as other structures of the chest. It can also spread
to other parts of the body, where it can create new tumours. This is referred to as
metastasis. Aside from skin cancer, BC ranks as one of the most frequent cancers among
women. Prevalence increases after age 50 in females. Although men are not immune to
the disease, females are more likely to develop it. About 2,600 men are diagnosed with male
BC every year in the U.S., making up less than 1% of total cancer cases (Prague et al., 2023).
BC is more common in trans women than in cis men, whereas the incidence of BC is lower
among trans men than among cisgender women. Women aged 40 and up are more likely
to develop BC, which arises when cells in the milk-producing glands (called lobules)
become aberrant and divide rapidly (de Blok et al., 2019).
According to the World Health Organization, between 8 and 9 percent of women worldwide
are given a BC diagnosis each year, and its underlying cause has yet to be fully identified.
Nonetheless, there are a number of known risk factors that are thought to increase
the risk of developing BC in women. These include dietary habits, alcohol consumption, being
female, smoking, having dense breasts, not getting enough exercise, having a history of
pregnancy, family history, genetics, breastfeeding, ethnicity, life history, menstrual history,
body mass index, breast changes, and a past history of BC. The most common
symptoms of BC include dry, flaky skin on the breast or nipple, itchy skin, dimpled skin, redness,
a change in breast size or shape, breast thickening in patches, and whole or partial swelling
(Sharma, 2021).
During the processing, features are extracted from the entire image, with emphasis on regions
showing abnormalities or potential lesions (Antropova et al., 2017). These features encapsulate
valuable information about the structure and morphology of breast tissues, thereby offering
insights into the presence or absence of cancerous cells. The extracted features may include
shape-based, texture-based, and edge-based attributes, each of which contributes distinct
information to the diagnostic process and is essential for the precise and accurate diagnosis
of BC (Dhungel et al.,
2017). Once the relevant features have been extracted, they are fed into a classifier (a machine
learning algorithm) that will predict whether the examined tissue is cancerous or not,
completing the process of computer-aided BC detection and diagnosis (Dhungel et al., 2017).
BC comes in a wide variety of forms. The disease is classified according to the affected cell
types in the breast. Carcinomas make up the vast majority of cases of BC (American Cancer Society,
2021). Adenocarcinomas arise in the gland cells that line the milk ducts and the lobules
(milk-producing glands), making them the most prevalent type of BC. Other malignancies, such
as angiosarcoma and sarcoma, can also develop in the breast, but because they begin in
different cell types they are not technically considered BC (Elanany et al., 2023).
It is also possible to categorise breast tumours based on the proteins and genes they express.
Following a biopsy, BC cells are analysed to determine whether or not they have the HER2 protein or
gene, as well as the oestrogen receptor and progesterone receptor proteins (Ross et al., 2009).
The tumour cells are examined in detail in the lab to determine the tumour’s grade.
Treatment options and cancer stage are often determined by the types of proteins detected and
the grade of the tumour.
The multiplication and spread of aberrant breast cells are critical steps in developing BC.
However, specialists are still unsure of the initial trigger for this phenomenon. Research
has indicated a multitude of potential risk factors that could increase a woman's susceptibility
to developing BC (Mahmood, 2023). Age
is a critical risk factor for BC, particularly for women over the age of 55. As a woman grows
older, her breast tissue becomes more vulnerable to damage and mutation, increasing the
likelihood of cancerous growths. One prominent influence is the hormonal changes that occur
during menopause, which can potentially stimulate the development of BC. Moreover, sex is a
critical determinant in BC incidence. The disease affects women significantly more than men,
which is attributed to the greater estrogen exposure in women that stimulates the growth of
breast cells (American Cancer Society, 2021).
Family history and genetics substantially contribute to the risk factors associated with BC. An
elevated risk is observed in women whose close relatives, such as mother, sister, or child, have
been diagnosed with BC. Certain genetic mutations, notably BRCA1 and BRCA2, while
relatively rare and accounting for only 5-10% of all cases, are linked to a significant increase
in BC risk. In addition to genetic factors, several lifestyle habits are associated with increased
BC risk. Smoking and alcohol consumption have been consistently associated with higher BC
incidence rates. Obesity represents another critical risk factor, given the ability of adipose cells
to produce oestrogen, a hormone known to stimulate the growth of breast cells (Mahmood,
2023). Radiation exposure, particularly to the head, neck, or chest regions, can also elevate BC
risk. Such radiation has the potential to damage the DNA in breast cells, triggering mutations
and precipitating abnormal growth. Hormone replacement therapy (HRT), in which synthetic
hormones substitute for naturally occurring ones, has been implicated in elevated BC risk. Data
suggest that women utilizing HRT are at a higher risk of BC compared to those who refrain from
such therapy (Mahmood, 2023). These findings highlight the multifaceted nature of BC risk,
encompassing genetic,
lifestyle, and environmental factors.
BC is classified into stages according to the size of the tumour and the extent to which it has
spread. Cancers that have spread beyond the breast or are particularly large are considered to be
at a more advanced stage than those still localised to the breast. It is essential for doctors to
determine whether or not a BC patient has an invasive form of the disease before deciding how
to treat it (Jia et al., 2023). Staging considers how far the cancer has spread, whether or not
lymph nodes are involved, and the size of the tumour. BC can be divided into five distinct
stages, labelled from 0 to 4 (Edge & Compton, 2010).
Stage     Description
Stage 0   DCIS: cancer cells have not spread beyond the ducts.
Stage 1A  Tumor ≤ 2 cm, no lymph node involvement.
Stage 1B  Lymph nodes test positive, tumor ≤ 2 cm.
Stage 2A  Tumor ≤ 2 cm with 1-3 lymph nodes affected OR tumor > 5 cm without spread.
Stage 3A  Swelling of 4-9 axillary lymph nodes OR lymph nodes inside breast (tumor size doesn't matter), OR tumor > 5 cm with 1-3 lymph nodes affected (including breastbone lymph nodes).
Stage 3B  Cancer spreads to chest walls, possibly affecting up to 9 lymph nodes.
Stage 3C  At least 10 lymph nodes affected in the armpit, under the collarbone, or in the breast.
Stage 4   Metastatic BC: any size tumor, cancer cells spread to nearby and distant lymph nodes and other organs.
According to estimates, cancer is the biggest cause of disability in women all
over the world, accounting for 107.8 million disability-adjusted life years (DALYs), of which
BC is responsible for 19.6 million. The percentage of female
malignancies attributable to BC in the U.S. is projected to rise to 29% by the year 2040 (Alom
et al., 2023). The Human Development Index (HDI) has a positive and statistically significant
association with the age-standardised incidence rate (ASIR)
for BC, as shown by the most recent information from GLOBOCAN. According to the data
from the year 2020, nations with a high HDI had the highest ASIR (75.8 per 100,000), whilst
nations with a medium or low HDI had an ASIR that was less than half of that (27.8 and
36.1 per 100,000, respectively). There were 0.7 million fatalities [95% UI, 0.7 million] from BC
worldwide, at an estimated rate of 13.6 per 100,000 people of reproductive age. Although
mortality rates are highest in industrialised regions, 63% of all fatalities in 2020 occurred in Asia
and Africa. A woman who receives a BC diagnosis in a high-income country has a much better
chance of survival than a woman in a low-income or middle-income country (Ferlay et al., 2020).
The mortality-to-incidence ratio (MIR) for BC in the year 2020 was 0.30, which is indicative
of 5-year survival rates (Łukasiewicz et al., 2021). In nations with modern healthcare systems,
such as Hong Kong, Singapore, and Turkey, the five-year survival rate for BC is 89.6%
for cancer that has been localised, but only 75.4% for cancer that has spread throughout the
body. In high-income countries, the five-year BC survival rate was 76.3%, whereas in low-income
countries (India, Philippines, Thailand, Costa Rica, and Saudi Arabia), it was only 47.4%
(Łukasiewicz et al., 2021).
Machine learning is a component of both AI and computer science that seeks to improve upon
earlier AI systems by imitating the way humans learn, using data and
algorithms. Over the past few decades, storage and processing improvements have made
possible novel machine learning-based products, for instance Netflix's recommendation engine
or self-driving cars (Bell, 2022). There has been a recent uptick in the popularity of data
science, of which machine learning is an integral aspect. In data mining projects, statistical
approaches are used to teach computers to sort data into categories, make predictions, and
unearth previously unknown relationships. Applications and businesses can then make better
decisions based on these findings, which should positively affect critical growth KPIs.
Machine learning encompasses various methods for analyzing and modeling data, with three
key categories being supervised learning, unsupervised learning, and semi-supervised learning.
Supervised learning processes existing input data to generate output and has two main sub-
types: classification and regression (Zhou, 2018). Classification involves organizing data into
pre-defined classes, while regression is concerned with making predictions or inferences about
data characteristics using a subset of these characteristics. In contrast, unsupervised learning
does not require predetermined target outcomes. It aims to uncover the relationships and
connections within the data during the learning process. Unsupervised learning does not rely on
labelled training data and includes clustering and association as its primary forms (Glielmo et al.,
2021). Clustering identifies related groups when data's inherent groupings are unknown, and
association discovers relationships and connections between items within the dataset.
2.3. Algorithms
In this study, four machine learning algorithms are utilized to investigate BC diagnostics using the
BCCD and WDBC datasets. The selected algorithms (KNN, SVM, DT, and LR) were chosen
based on their variety of approaches, interpretability, and ease of use. The following
subsections provide a detailed description of each algorithm, along with their suitability for the
current study.
$$w \cdot x + b = 0 \qquad (1)$$
where w is the weight vector, x represents a data point, and b is the bias term.
In cases where the data is linearly separable, the optimization problem for SVM is:
$$\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{subject to} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \dots, N$$
where y_i denotes the class label of the data point x_i, and N is the total number of data points.
However, real-world data is often non-linearly separable. To accommodate such cases, the soft
margin SVM is introduced. This approach allows for some misclassifications by incorporating
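slack variables (ξ_i) and a regularization parameter (C). The soft margin SVM is expressed as:
$$\min_{w,\,b,\,\xi} \ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N$$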
The regularization parameter, C, manages the trade-off between optimizing the margin and
reducing the classification error. A smaller C value allows for a wider margin with more
misclassifications, whereas a larger C value imposes stricter constraints on misclassifications,
resulting in a narrower margin.
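The role of C can be illustrated with a minimal sketch that evaluates the soft margin objective for a fixed hyperplane. The function name, the toy data, and the values of w and b are illustrative assumptions, not values from this study; each slack variable is taken as ξ_i = max(0, 1 − y_i(w·x_i + b)), which is the smallest slack satisfying the constraints for a given w and b.

```python
def soft_margin_objective(w, b, X, y, C):
    """Evaluate 0.5 * ||w||^2 + C * sum(xi_i) for a fixed hyperplane (w, b).

    Each slack xi_i is the hinge value max(0, 1 - y_i * (w . x_i + b)).
    Labels in y are expected to be +1 or -1.
    """
    margin_term = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, x_i)) + b
        slack_total += max(0.0, 1.0 - y_i * score)
    return margin_term + C * slack_total


# Hypothetical toy data: the third point lies on the wrong side of the hyperplane.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, -1]
w, b = [1.0, 0.0], 0.0

small_c = soft_margin_objective(w, b, X, y, C=0.1)   # misclassification is cheap
large_c = soft_margin_objective(w, b, X, y, C=10.0)  # misclassification is expensive
print(small_c, large_c)
```

With the same hyperplane, the misclassified point contributes little to the objective when C is small and dominates it when C is large, which is why a large C pushes the optimizer towards fewer misclassifications at the cost of a narrower margin.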
Kernel functions are used in SVM to transform data into higher-dimensional spaces, which
makes it possible to find a hyperplane that separates the data in cases where the data is not
linearly separable in the original space (Schölkopf & Smola, 2002).
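As an illustration of the kernel idea, the sketch below implements three commonly used kernel functions in plain Python. The choice of kernels and the default parameter values (degree, coef0, gamma) are assumptions made for the example and are not necessarily the configurations evaluated in this thesis.

```python
import math


def linear_kernel(x, z):
    """Plain inner product: corresponds to no feature-space transformation."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Inner product in the space of polynomial feature combinations up to `degree`."""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z))
```

Each function returns the inner product of the two points in the transformed space without constructing that space explicitly, which is what allows an SVM to fit a non-linear decision boundary at a manageable cost.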
The K-Nearest Neighbor (KNN) algorithm is a type of instance-based learning algorithm that
classifies a given query point based on the majority class of its 'K' nearest data points in the
feature space. This makes KNN a 'lazy' learning algorithm, as it does not build a model from
the training data but instead uses the data points themselves for prediction (Hastie, Tibshirani,
& Friedman, 2009). The KNN algorithm can classify the presence of BC based on the
proximity of other data points with similar features. To do this, KNN uses various distance
metrics such as Euclidean, Manhattan, Chebyshev, and Minkowski to calculate the distance
between data points (James, Witten, Hastie, & Tibshirani, 2013).
$$d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
$$d_{\text{Manhattan}}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$
$$d_{\text{Minkowski}}(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$
These equations denote different distance metrics used in machine learning: the
Euclidean, Manhattan, and Minkowski distances. The symbol 'd(x,y)' signifies the distance
between two data points x and y, with 'x_i' and 'y_i' indicating individual elements of these
data points. The variable 'n' stands for the count of elements in each data point, and 'p' is a
parameter that dictates the specific distance metric applied. When 'p' equals 2, the Euclidean
distance is the result, and when 'p' equals 1, the outcome is the Manhattan distance. The
selection of an appropriate distance metric, along with the choice of the 'K' value, can greatly
influence the performance of the KNN algorithm. Additionally, it is important to note that KNN
can be hindered by the 'curse of dimensionality' in scenarios involving high-dimensional data.
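The classification rule described above can be sketched in a few lines of plain Python. The function names and the two-feature toy data are hypothetical and serve only to illustrate majority voting with the Minkowski distance; they are not the thesis implementation or its data.

```python
from collections import Counter


def minkowski_distance(x, y, p=2):
    """Minkowski distance; p=1 gives Manhattan and p=2 gives Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=2):
    """Return the majority label among the k training points nearest to `query`."""
    neighbours = sorted(
        (minkowski_distance(x, query, p), label)
        for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in neighbours[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature samples, for illustration only.
train_X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [6.0, 6.0], [7.0, 7.5], [6.5, 8.0]]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

print(knn_predict(train_X, train_y, [2.0, 2.0], k=3, p=2))
print(knn_predict(train_X, train_y, [6.0, 7.0], k=3, p=1))
```

Because no model is fitted, all of the work happens at prediction time: every query requires a distance computation against each stored training point, which is the practical cost of the 'lazy' learning approach.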
Logistic regression is a supervised learning algorithm used primarily for binary classification
problems. It predicts the probability of an instance, represented by the input variables,
belonging to the default class (Y=1). This prediction can be represented as a binary variable,
which, in the context of this study, translates into the likelihood of the presence or absence of
BC based on the input features from the datasets. The logistic function, also known as the
sigmoid function, maps any real-valued number into a value between 0 and 1, which can then
be interpreted as the predicted probability (James, Witten, Hastie, & Tibshirani, 2013). The
logistic function is represented as follows:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
The coefficients (β_0, β_1,..., β_n) in the LR model are estimated from the training data using
the method of maximum likelihood (Hosmer Jr et al., 2013). The LR model can thus be
expressed as follows:
$$P(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_n X_n)}}$$
This implies that the probability of Y=1, given the input variables X, can be computed by
applying the sigmoid function to the linear combination of the input features and their
respective coefficients.
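The computation described above can be sketched as follows. The coefficient and feature values are made up for illustration and are not estimates obtained from the BCCD or WDBC datasets; in practice the coefficients would come from maximum likelihood estimation.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, intercept, coefficients):
    """P(Y = 1 | X): sigmoid of the linear combination beta_0 + sum(beta_i * x_i)."""
    z = intercept + sum(beta * x for beta, x in zip(coefficients, features))
    return sigmoid(z)


# Hypothetical coefficients and one hypothetical observation.
intercept = -1.0
coefficients = [0.8, -0.5]
features = [2.0, 1.0]

probability = predict_probability(features, intercept, coefficients)
predicted_class = 1 if probability >= 0.5 else 0
print(probability, predicted_class)
```

The 0.5 threshold used to turn the probability into a class label is a common default rather than part of the model; in a diagnostic setting it could be lowered to reduce false negatives.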
At each node in the DT, the feature providing the highest information gain is chosen as the
splitting criterion. On the other hand, Gini impurity measures how frequently a randomly
chosen instance from the dataset would be mislabelled if it were labelled randomly according
to the distribution of class labels in the dataset. The Gini impurity is computed as follows:
$$\text{Gini}(S) = 1 - \sum_{c} p(c)^2$$
To identify the best feature for splitting, the Gini impurity index is calculated for each feature.
This is done by considering the weighted sum of Gini impurities for each subset produced by
the split:
$$\text{Gini}_A(S) = \sum_{v} \frac{|S_v|}{|S|} \, \text{Gini}(S_v)$$
The feature with the lowest Gini index is chosen as the splitting criterion at each node in the
DT. In these formulas, "S" symbolizes the dataset, while "c" represents each class within the
dataset. The term "p(c)" indicates the proportion of instances belonging to class "c".
Simultaneously, "A" denotes the feature under consideration for splitting the dataset into
subsets 'S_v', each containing instances that share the value 'v' for the feature 'A'. Lastly, "|S_v|"
and "|S|" represent the quantity of instances in subsets "S_v" and "S", respectively (Quinlan,
1986).
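The two Gini quantities can be computed directly from their definitions, as in the following minimal sketch. It assumes categorical feature values, and the labels and feature names are invented for the example.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini(S) = 1 - sum over classes c of p(c)^2."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def gini_index(feature_values, labels):
    """Weighted sum of Gini impurities over the subsets S_v induced by a feature A."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    return sum(len(s) / total * gini_impurity(s) for s in subsets.values())


# Hypothetical labels and two candidate categorical features.
labels = ["M", "M", "B", "B"]
informative_feature = ["high", "high", "low", "low"]    # separates the classes perfectly
uninformative_feature = ["a", "b", "a", "b"]            # leaves both subsets mixed

print(gini_impurity(labels))
print(gini_index(informative_feature, labels))
print(gini_index(uninformative_feature, labels))
```

The informative feature yields a Gini index of zero because each subset contains a single class, so it would be preferred as the splitting criterion over the uninformative feature.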
2.4. Assessing Machine Learning Models: The Key Role of Precision, Recall, F1-
score, and Accuracy
Performance analysis is undertaken to assess a model and determine its effectiveness. The
analysis allows us to pinpoint areas of the model that require improvement,
to evaluate its efficacy, and to ensure its reliability (James et al., 2013; Hastie et al., 2009). The performance
evaluation is conducted through various indicators, including precision, recall, accuracy, and
the F1-score. Accuracy refers to the proportion of correctly classified instances relative to the
overall number of classifications. It is a useful metric to employ when the class distribution
is balanced, i.e., there is an equal or near-equal number of samples in each class (James et al.,
2013). However, with imbalanced datasets, accuracy may not provide a comprehensive
evaluation of the model's performance, as it can be heavily influenced by the majority class.
This is due to the fact that a model could predict the majority class every time and still achieve
high accuracy, leading to misleading results (Hastie et al., 2009). Precision is an effective
measure for evaluating results when the cost of false positives is high.
In many cases, there is a trade-off between precision and recall, where optimizing one may
lead to the reduction of the other. To balance these metrics, the F1-score is often used, which
is the harmonic mean of precision and recall. The F1-score provides a single metric that
considers both precision and recall, making it especially useful when the costs of false positives
and false negatives are very different, not just in the case of imbalanced datasets (Saito &
Rehmsmeier, 2015).
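$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F1-score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$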
True Positive (TP) and True Negative (TN) are instances correctly classified by a model,
denoting actual positives and negatives, respectively. False Positive (FP) and False Negative
(FN) denote incorrect classifications, representing negatives classified as positives and positives
classified as negatives, respectively. Choosing the appropriate performance metrics for
evaluating a machine learning model requires a detailed understanding of these measures. This
choice must also consider the unique attributes of the dataset and the specific objective of the
task.
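As a minimal sketch, the four measures can be computed from the confusion-matrix counts as shown below. The counts are hypothetical and do not correspond to results reported in this thesis; the second example illustrates the earlier point that accuracy can look high on an imbalanced dataset even when no positive case is detected.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, F1-score and accuracy from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for a reasonably balanced test set.
balanced = classification_metrics(tp=8, tn=9, fp=2, fn=1)

# Hypothetical imbalanced case: the model always predicts the negative class.
majority_only = classification_metrics(tp=0, tn=95, fp=0, fn=5)

print(balanced)
print(majority_only)
```

In the second case the accuracy is 0.95 although recall is zero, which is why recall and the F1-score are reported alongside accuracy when missed positive cases are costly.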
3. Literature Review
Various machine learning algorithms have been used for BC diagnosis, with differing results
in terms of accuracy, precision, recall, and F1-scores. The performance of these algorithms in
different demographic settings and with diverse datasets has not been fully explored. Given the
A key finding from the study was the comparatively low accuracy rates demonstrated by these
models. Among them, the Gradient Boosting classifier performed the best, albeit with an
accuracy of just 74.14%. At the lower end of the spectrum, KNN showed the lowest accuracy,
at 58.14%. The researchers attributed these relatively subdued accuracy rates to improper
data standardization, which seemed to have a notable impact despite the thorough
hyperparameter tuning applied across all models. Ultimately, the study by Austria et al.
(2019) serves as a significant benchmark for future research in this domain. The results
emphasize the importance of proper data standardization when utilizing machine learning
models, particularly in the context of the BCCD. The study thus points to a clear path
forward for future work aiming to enhance the accuracy of BC prediction using machine
learning techniques.
Sharma et al. (2018) conducted a study that incorporated the k-fold cross-validation method,
specifically a 10-fold technique, to partition the data into ten distinct segments. From a total of
569 observations, 398 were utilized for the training set and the remaining 171 served as the
testing set. This division established a training to testing ratio of approximately 70:30. The
central aim of this study was to carry out a comparative exploration of three unique ML
algorithms. These were implemented on the WDBC dataset, offering valuable insights into
their utility for diagnostic purposes. Significantly, Sharma et al. (2018) reported that all the
algorithms under study achieved an accuracy of over 94% in differentiating benign from
malignant tumours. Notably, the KNN algorithm displayed superior performance in
comparison to its counterparts. It excelled in terms of accuracy, precision, and F1 score, metrics
essential for evaluating the performance of diagnostic models. However, the study observed
that the maximum accuracy achieved by any of the algorithms did not exceed 96%. This
limitation could be attributed to the use of base machine learning models without the
optimization offered by hyperparameter tuning. In comparison, Al-Azzam and Shatnawi
(2021) reported higher accuracy rates ranging from 97% to 98% for their machine learning
models. Their study highlighted the potential utility and efficiency of optimized machine
learning models in cancer diagnostics.
While LR and KNN demonstrated impressive results in Al-Azzam and Shatnawi's (2021)
study, other machine learning models, such as SVM, have also been extensively utilized in BC
detection. The SVM model, known for its robustness and versatility, has been explored in
different configurations and methodologies. One such methodology was proposed by Osman
(2017) using the WBC dataset. The technique consisted of applying an SVM model with 10 folds
followed by a two-step clustering method: data instances are first assigned to pre-clusters,
and the second step takes these pre-clusters as input and applies a clustering algorithm. The
disadvantage of this strategy is that clusters sometimes do not represent meaningful
groupings, which might lead to bias. The study confirmed the effectiveness of the method with
a t-test, which indicated a significant improvement from the two-step SVM approach: it
reported an accuracy of 99.10%, compared with 96.69% for the traditional SVM.
In the research conducted by de Brito (2018), SVM models consistently surpassed the
performance of the baseline model. This was particularly apparent when 'noise' variables,
namely adipokines related to obesity, were included in the models. The SVM models with noise
delivered better accuracy than those without, with one notable exception: the linear kernel
model without noise demonstrated superior sensitivity, but at the cost of specificity.
Results varied across the Linear, Polynomial, and Radial kernel types. For the Linear model,
accuracy improved from 62.86% without noise to 71.43% with noise, and a similar improvement
was observed for the Polynomial model. The Radial model showed the highest accuracy in both
settings, scoring 68.57% without noise and 77.14% with noise. Although all models performed
better than chance, indicating the value of these variables, the study concluded that the
performance levels did not surpass other methods discussed in the existing literature.
Overall, the accuracy of every model was below 80%.
was further supplemented by the use of the J48 DT model with leaf nodes classified as either
malignant or benign. Pre-processed data was loaded onto Weka software and Information Gain
was applied to the dataset attributes. The J48 model was then implemented for DT generation,
with leaf nodes acting as class labels. The diagnosis for a new patient was determined by
cross-referencing values in the DT, thereby specifying the type of tumour as benign or malignant.
Faud's (2018) study examined seven different machine learning algorithms for detecting BC
using a specified dataset. The algorithms implemented included Random Forest, SVM, KNN, LR,
Gaussian Naïve Bayes, Convolutional Neural Network (CNN), and Artificial Neural Network
(ANN). Without Principal Component Analysis (PCA), the deep learning models (CNN and ANN)
were found to outperform the other algorithms, with ANN achieving a perfect recall score of
100%, a critical metric in cancer detection, where false negatives can have serious
implications. Nevertheless, the precision score of the ANN model was somewhat lower,
indicating that it might produce more false positives. While CNN and ANN achieved superior
results with the given dataset, the performance drop seen when PCA was applied signals
the potential sensitivity of these models to data transformations. Additionally, the variability
in performance across different models emphasizes the importance of thoroughly testing
multiple algorithms in machine learning applications for BC detection.
Table 2. Performance of Machine Learning Models on Breast Cancer Datasets Utilized in Our Study
(Al-Azzam and Shatnawi, 2021) | PCA and t-SNE | Supervised & unsupervised algorithms | WDBC | 97% to 98%
(Austria et al., 2019) | 70:30 split with all possible hyperparameter | 7 ML models | BCCD | Max accuracy: Gradient boosting with 74.14%
The literature review reveals a wide spectrum of machine learning methods applied to BC
detection, spanning from SVM and KNN to LR, Naive Bayes (NB), and Random Forests (RF).
However, the recent trend in the field leans towards deep learning techniques, especially
Convolutional Neural Networks (CNNs), given their superior performance. In a notable study,
Vaka et al. (2020) leveraged deep neural networks to develop a BC diagnostic algorithm,
achieving an impressive precision of 97.21%. Further enriching the landscape of deep learning
applications in BC diagnosis, Vaka et al. (2019) introduced a novel approach using a mix of
machine learning and deep learning techniques, including RCNN and Bidirectional Recurrent
Neural Networks (HA-BiRNN). Their simulation results underscored the enhanced precision,
efficiency, and image quality obtained using the DNN method. In another compelling study,
Vasundhara et al. (2019) proposed an intuitive method for classifying mammography images
as normal, benign, or malignant, utilizing several machine learning techniques. However, it
was their CNN and ANN models that outperformed the traditional machine learning
approaches, achieving accuracies of 97.3% and 99.3%, respectively. This suggests that CNN,
with its sophisticated filtering and morphological operations, is a highly effective classifier for
intuitive classification of digital mammograms.
Guha et al., 2022 | Mortality, risk factors and incidence of atrial fibrillation for BC: an SEER-Medicare analysis | SEER-Medicare analysis | The largest risk of AF occurs within the initial 60 days following a cancer diagnosis, with an annual incidence of 3.9%. The risk of dying from heart | After a diagnosis of BC, the prevalence of AF in women increases dramatically. BC diagnosed at a more advanced stage is substantially correlated with AF. Women with such a new BC diagnosis that develop AF have an increased chance of death from | Tabular
classification performance, it did expedite the execution time, resulting in lower processing expenses.
Melekoodappattu et al., 2022 | Methods for detecting BC in mammograms using a hybrid of modified CNN and textural features. The Journal for Ambient Intelligence with Humanized Computing | CNN and texture feature-based approach | Using the ensemble approach, we found that MIAS had a specificity of 97.8% and an accuracy of 98.6%, while DDSM scored 98.3% and 97.9% | According to experimental evidence, the combined technique improves measurement metrics at each stage in a manner that is independent of the others. | Image
whether a
benign.
Bhise et al., 2021 | BC Detection using ML Techniques | CNN method | CNN was found to be superior to other approaches in terms of accuracy, precision, and data set size. | Accuracy and precision are used as yardsticks for the system's efficiency. The probabilistic outcomes have been predicted with the use of activation functions like ReLu. | Image
Shen et al., 2019 | Using Deep Learning to Screening Mammography to Boost BC Detection | CNN and convolutional network method | More datasets without ROI annotations can be used to fine-tune the method, even if the datasets were produced | These results demonstrate the feasibility of training automatic deep learning algorithms to find accuracy on mammography platforms, They show promise for | Image
This research utilizes the WDBC and BCCD datasets, sourced from the UCI Machine
Learning repository. Following the exploratory analysis, each dataset was partitioned into
testing and training sets. Four distinct predictive methods were implemented: KNN, LR, DT, and
SVM. These methods were employed to scrutinize the datasets, with classifiers utilizing
confusion matrices and other metrics to assess model efficacy. The final step involved a
comparative study of the accuracy of each model against established ones, aiming to identify
the most effective strategy.
4.1. Data
The WDBC and BCCD datasets were chosen for this research due to their broad applicability
in numerous research areas. ML models were trained on these binary datasets, achieving an
acceptable level of accuracy. The rationale for the specific selection of these datasets is
as follows:
The BCCD dataset (Patrício et al., 2018), a smaller and less-analyzed dataset than the WDBC
dataset, presents an excellent opportunity to examine how ML models perform on limited data.
Exploring this dataset gives valuable insight into the best ML model for BC diagnosis,
contributing to the ongoing efforts to improve cancer detection and treatment. It contains
clinical data collected from patients at the University of Coimbra in Portugal and focuses on
BC diagnosis using biomarkers and clinical features, making it suitable for ML analysis. The
columns and their descriptions are as follows:
1 Age Integer
2 BMI Float
3 Glucose Integer
4 Insulin Float
5 HOMA Float
6 Leptin Float
7 Adiponectin Float
8 Resistin Float
9 MCP-1 Float
This dataset comprises nine features, each representing a unique aspect of the patient's health.
Some of these features emphasize body composition, while others concentrate on hormone
levels. The dataset includes both integer and floating-point data types for these features.
The WDBC dataset, which includes data from 569 patients, was used to create the breast tumor
feature set. This dataset, disseminated by Dr. William H. Wolberg from the University of
Wisconsin-Madison's Department of General Surgery, comprises fluid samples from patients
with solid breast tumors. The cytological feature analysis during digital scans was facilitated
by a program called Xcyt (Wolberg & Mangasarian, 1990). This program employs a curve-
fitting method to calculate ten features and returns the mean, worst case, and standard error
(SE) for each. Each sample's record also includes a diagnosis—either malignant (M) or benign
(B). The dataset comprises 569 instances and 32 attributes, including ID, diagnosis, and 30
input features. The WDBC is relatively evenly divided between benign and malignant cases,
with approximately 37% malignant and 63% benign cases.
The Python programming language was used in a Jupyter notebook for the data analysis and
ML tasks. The following critical stages were carried out in Python to support the real-time
prediction of BC:
1) Perform the pre-processing steps to remove the missing values by importing the
3) Assess the effectiveness of the models by creating and utilizing appropriate functions,
including the ROC curve, confusion matrix, cross-validation metrics, learning curve, and the
precision-recall curve.
Prior to model implementation, both the BCCD and WDBC datasets were preprocessed. The data
were standardized to ensure that all features had the same scale, using the StandardScaler()
function from the Scikit-Learn library. To avoid data leakage, the StandardScaler() was
fitted on the training data only and then used to transform both the training and test
datasets. Each dataset was partitioned into training and test sets following an 80:20 ratio.
This ratio was chosen to limit class imbalance between the partitions, which could introduce
bias into our model. Furthermore, given the relatively small size of the BCCD dataset, which
contains only 116 instances, it is of utmost importance to provide sufficient data to train
the model effectively. Ensuring that the model has ample training data contributes to
improved accuracy and predictive performance. Therefore, this partition ratio was deemed
optimal for our particular research scenario.
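The preprocessing described above can be sketched as follows. This is an illustrative outline only, not the code used in the study: it assumes the copy of WDBC bundled with Scikit-Learn (`load_breast_cancer`), and the `stratify` option and `random_state` value are assumptions not stated in the text.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Scikit-Learn's bundled copy of WDBC (569 samples, 30 input features).
X, y = load_breast_cancer(return_X_y=True)

# 80:20 split. Stratifying on y keeps the class ratio similar in both parts
# (an assumption; the text does not say whether stratification was used).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # mean and variance come from training data only
X_test_std = scaler.transform(X_test)        # test data reuses the training statistics
```

Fitting the scaler on the training portion alone means that no information about the test samples influences the transformation, which is the leakage the paragraph above refers to.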
4.4. Methodology
The WDBC and BCCD datasets were collected from the UCI ML repository. Pre-processing
procedures were conducted on the individual datasets so that malignant WDBC cases, benign
WDBC cases, and BCCD cases were each assigned to their own group. Based on the correlation
coefficients, the features were classified as favourable, unfavourable, or random.
Recognizing and eliminating irrelevant features is essential for achieving productive
results. After an exploratory analysis, the data were split into a training set and a test
set. The datasets were analyzed using four different prediction methods: KNN, LR, DT, and
SVM. After model execution, each classifier's predictions were evaluated using confusion
matrices and other metrics. Finally, the accuracy of each model was compared with that of
established models. The overall analysis process is depicted in Figure 1.
The first step in applying GridSearch K-fold CV to the WDBC dataset is to define the
hyperparameters and their possible values for the selected model. For instance, if an SVM is
chosen, it is essential to explore various values for the cost parameter (C) and different
kernel types. Subsequently, the dataset is partitioned into K folds of roughly equal size,
with care taken to maintain an approximate balance of malignant and benign instances within
each fold.
For KNN, the model was optimized by experimenting with different parameters. The parameter
'k' represents the number of neighboring points considered when making a prediction, and
adjusting it can significantly impact the model's performance. If 'k' is too small, the model
may be overly sensitive to noise in the data; if 'k' is too large, the model may be
oversimplified, failing to capture important patterns. Therefore, a range of 'k' values was
explored to identify the number of neighbors that yields the highest prediction accuracy.
Three distance metrics were evaluated: Euclidean, Manhattan, and Chebyshev.
For the LR model, the hyperparameters tuned were 'C', 'penalty', and 'solver'. Different
values of 'C', the inverse regularization strength, were tried during tuning to strike the
right balance. Different solvers perform well with different types of data, and their choice
can significantly affect the efficiency and accuracy of the model. The liblinear solver is an
apt choice for small-scale datasets and binary classification tasks.
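The KNN search over 'k' and the three distance metrics could look like the sketch below. It is an illustration under assumptions rather than the study's actual configuration: the range of 'k' values, the number of folds, the random seeds, and the use of Scikit-Learn's bundled WDBC copy are all chosen here for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

# Scaling lives inside the pipeline so it is re-fitted on each training fold.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())
param_grid = {
    "kneighborsclassifier__n_neighbors": list(range(1, 22, 2)),  # assumed range of k
    "kneighborsclassifier__metric": ["euclidean", "manhattan", "chebyshev"],
}

# Stratified folds keep the malignant/benign ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
```

Odd values of 'k' are used here only to avoid tied votes in a binary problem; the same grid structure applies to the LR hyperparameters 'C', 'penalty', and 'solver'.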
For each combination of hyperparameters, the disease prediction model is trained on K-1 of
the folds and its performance is evaluated on the remaining validation fold, rotating until
every fold has served as the validation fold once. Suitable metrics, such as accuracy or F1
score, should be employed for this evaluation. Afterward, the average performance across all
K folds is calculated for each hyperparameter combination. This average gives a more reliable
estimate of the model's performance with that specific combination of hyperparameters than a
single split would.
Once the set of hyperparameters that leads to the best average result across all K folds is
identified, it is chosen as the ideal configuration for the model. Lastly, the prediction
model is retrained using the entire WDBC dataset with the chosen hyperparameters, creating the
final, optimized model. Utilizing GridSearch K-fold CV with this dataset facilitates the
optimization of hyperparameters for prediction models. This optimization process contributes
to the development of more accurate and reliable tools for distinguishing malignant and benign
breast tumours.
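To make the averaging and retraining steps concrete, the following sketch writes the K-fold grid search out by hand for an SVM. It is a simplified illustration, not the study's code: the candidate values for C, the kernel list, the number of folds, and the bundled WDBC copy are assumptions made for the example.

```python
from itertools import product

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Assumed candidate values, for illustration only.
grid = {"C": [0.1, 1, 10, 100], "kernel": ["linear", "poly", "rbf"]}

mean_scores = {}
for C, kernel in product(grid["C"], grid["kernel"]):
    fold_scores = []
    for train_idx, val_idx in folds.split(X, y):
        # Train on K-1 folds, evaluate on the held-out validation fold.
        model = make_pipeline(StandardScaler(), SVC(C=C, kernel=kernel))
        model.fit(X[train_idx], y[train_idx])
        fold_scores.append(model.score(X[val_idx], y[val_idx]))
    # Average validation accuracy over all K folds for this combination.
    mean_scores[(C, kernel)] = float(np.mean(fold_scores))

# Pick the combination with the best average, then retrain on all the data.
best_C, best_kernel = max(mean_scores, key=mean_scores.get)
final_model = make_pipeline(StandardScaler(), SVC(C=best_C, kernel=best_kernel))
final_model.fit(X, y)
```

Scikit-Learn's `GridSearchCV` performs the same loop internally; spelling it out only makes the per-fold averaging visible.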
4.4.2. Visualizing Feature Importance: Insight into Significant Factors for Breast
Cancer Diagnosis
For this study, the dataset was initially loaded and prepared using the Pandas library, and the
target variable was separated from the input features. Subsequently, a random forest classifier
was instantiated with a fixed random state to guarantee the reproducibility of the results. The
classifier was then trained on the dataset, and feature importances were calculated after
training, based on the impurity decrease within each DT.
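A minimal sketch of this step is given below. Because the BCCD file is not bundled with Scikit-Learn, the example uses the bundled WDBC copy as a stand-in; the number of trees and the random state are likewise assumptions.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: the bundled WDBC copy, loaded as a Pandas DataFrame.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target  # input features and target variable, kept separate

# Fixed random state so repeated runs give the same importances.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, y)

# Impurity-based importances, one value per feature, sorted from most to least important.
importances = pd.Series(forest.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
least_important = importances.index[-1]
```

The importances are normalized to sum to one, so each value can be read as a feature's share of the total impurity decrease across the forest.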
For the ensuing analysis, the derived feature importance values serve as the foundation for
feature selection. The least significant feature, MCP-1, is eliminated, and the impact on
the accuracy of the four ML algorithms (KNN, LR, DT, and SVM) is assessed. This approach
reduces the complexity of the model. Simplifying the model by eliminating less significant
features can improve its interpretability without substantially compromising its predictive
accuracy. Moreover, a feature such as MCP-1 with minimal importance can introduce noise into
the model, reducing its overall predictive capability. By eliminating MCP-1, we aim to reduce
this noise, thereby improving the precision and robustness of the model's predictions. This
analysis helps clarify the impact of feature selection on the effectiveness of the various
classification algorithms, particularly for diagnosing BC.
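The comparison described above, with and without the least important feature, might be organized as in the sketch below. It is illustrative only: the bundled WDBC copy stands in for BCCD, the models use default settings rather than the tuned ones, and the split and seeds are assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

# Rank features on the training part only, then locate the weakest one.
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
drop_idx = int(np.argmin(forest.feature_importances_))
all_cols = list(range(X.shape[1]))
keep = [i for i in all_cols if i != drop_idx]

models = {
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=42),
    "SVM": SVC(),
}

results = {}
for name, estimator in models.items():
    scores = []
    for cols in (all_cols, keep):
        clf = make_pipeline(StandardScaler(), estimator)
        clf.fit(X_train[:, cols], y_train)
        scores.append(clf.score(X_test[:, cols], y_test))
    results[name] = tuple(scores)  # (all features, least important feature removed)
```

Each entry of `results` holds the test accuracy before and after the removal, so the effect of dropping one feature can be read off per model.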
The models chosen for the current research on BC diagnosis, namely DT, KNN, LR, and SVM,
were selected based on their documented success in the literature in similar contexts and
their theoretical underpinnings. Notably, each model exhibits unique advantages in handling
the type of dataset and problem domain at hand.
DT and KNN, as shown in Al-Azzam and Shatnawi's (2021) study, demonstrated remarkable
accuracy levels on the WDBC dataset. Their inherent methodologies cater to pattern detection
in a multidimensional feature space, essential in medical datasets that often exhibit complex
structures and interactions among features. DT models further provide a transparent
decision-making process, aiding clinicians in understanding the model's predictions. SVM, as
used by Osman (2017), exhibits impressive flexibility in accommodating linear and non-linear
relationships due to its kernel trick capability. The choice of kernel and the regularization
parameter 'C' allow us to manage the bias-variance trade-off, a critical aspect of preventing
overfitting and underfitting. Hence, SVM is highly suited for a dataset such as ours, where we
can leverage this flexibility to effectively model complex data patterns. LR, another model we
have adopted, is fundamentally a binary classification algorithm.
The probability outputs provided by the LR model are highly interpretable, providing
meaningful predictions of a tumor being malignant or benign. LR's ability to estimate feature
coefficients also provides us with insights into each feature's influence on the prediction,
enabling the possibility of feature importance analysis. LR also includes a regularization
component that helps to avoid overfitting by penalizing large values of the parameters. The
selected libraries provide an effective execution and optimization of these models, offering an
extensive range of tools for tasks such as pre-processing of features, model training, validation,
and the evaluation of performance.
5. Results
The following section examines the performance of the four selected ML algorithms. To ensure
a fair evaluation, each algorithm is assessed against uniform standards: accuracy, precision,
sensitivity (recall), F1-score, and AUC-ROC.
An evaluation of the KNN model's performance was conducted using precision, recall, and
F1-score metrics, as illustrated in Figure 2. When applying the Euclidean distance measure,
the model achieved a precision of 90.00%, a recall of 87.50%, and an F1-score of 87.30%. When
the Manhattan distance measure was used, a decline in model performance was noted, with
precision, recall, and F1-score values of 85.29%, 79.16%, and 78.22%, respectively. With the
Chebyshev distance measure, performance also remained below the Euclidean results, with
respective precision, recall, and F1-score values of 87.50%, 83.33%, and 82.86%. Given these
results, the Euclidean distance measure yielded the highest levels of overall accuracy and
precision, signifying its superior performance.
The least important feature of the BCCD dataset was identified as MCP-1. Upon the removal
of this feature from the dataset, a significant improvement in the KNN algorithm's performance
was observed. The accuracy of the KNN model, when configured with 5 neighbors and
employing the Manhattan distance metric, reached 95.83%.
Turning to the WDBC dataset, the basic KNN model demonstrated an accuracy of 95.61%.
Seeking to enhance this foundational result, hyperparameter tuning was introduced, resulting
in an increased prediction accuracy of 96%. This further corroborated the impact of optimal
hyperparameter selection on the model's prediction efficacy. The final approach combined
PCA with hyperparameter tuning, producing the most substantial performance improvement
and an achieved accuracy of 96.49%. Notably, the optimal value of 'k' was determined to be
'9'. Additionally, a contrast was observed between the two datasets in terms of the most
effective distance metrics – Euclidean for the BCCD dataset and Manhattan for the WDBC
dataset. The confusion matrix underscored the model's proficient classification ability,
correctly predicting 106 out of 108 benign cases and 59 out of 63 malignant cases. Figures 3
and 4 provide further insights into the KNN model's performance through confusion matrices. The
precision, recall, and F1-score for benign predictions were 0.96, 0.98, and 0.97 respectively,
while those for malignant predictions stood at 0.97, 0.94, and 0.95, indicating a slightly
stronger performance for benign case predictions.
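As a worked check, the per-class figures quoted above can be recomputed from the reported confusion matrix (106 of 108 benign and 59 of 63 malignant cases correct). The sketch below uses plain Python; the variable names are chosen for illustration.

```python
# Rows are actual classes, columns are predicted classes, ordered [benign, malignant].
# Counts follow the reported results: 106 of 108 benign and 59 of 63 malignant correct.
confusion = [[106, 2],
             [4, 59]]
labels = ["benign", "malignant"]

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(2)) / total

report = {}
for i, label in enumerate(labels):
    predicted_as_i = sum(confusion[r][i] for r in range(2))  # column sum
    actual_i = sum(confusion[i])                             # row sum
    precision = confusion[i][i] / predicted_as_i
    recall = confusion[i][i] / actual_i
    f1 = 2 * precision * recall / (precision + recall)
    report[label] = (round(precision, 2), round(recall, 2), round(f1, 2))
```

The recomputed values agree with the text: (0.96, 0.98, 0.97) for benign, (0.97, 0.94, 0.95) for malignant, and an overall accuracy of 165/171, or 96.49%.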
In this study, we investigated SVM models with either a linear or a polynomial kernel. For
the BCCD dataset, the best parameters identified were: C=100, degree=2, gamma='scale', and
kernel='linear'. The model achieved an accuracy of 79.17%, with a confusion matrix indicating
11 true positives, 1 false positive, 4 false negatives, and 8 true negatives. The
classification report revealed a precision of 0.73 and 0.89, recall of 0.92 and 0.67, and
F1-scores of 0.81 and 0.76 for Class 1 and Class 2, respectively. The accuracy of the SVM
algorithm rose to 87.5% after eliminating the MCP-1 feature, which was the least important
feature in the BCCD dataset.
With the WDBC dataset, a basic SVM model was used initially, achieving an accuracy of 96%.
After hyperparameter tuning, the SVM model reached an accuracy of 98.24%. With PCA, the
optimal hyperparameters were C=0.1, degree=2, gamma='scale', and kernel='linear', and the
resulting model exhibited a higher accuracy of 99.42%. The confusion matrix displayed 108
true positives, 0 false positives, 1 false negative, and 62 true negatives. The classification
report demonstrated a precision of 0.99 and 1.00, recall of 1.00 and 0.98, and F1-scores of
1.00 and 0.99 for benign and malignant classes, respectively, as portrayed in Figure 7.
Figures 6 and 7 provide comparative views of SVM performance across different kernels for the
WDBC and BCCD datasets. Finally, Figures 8 and 9 show the confusion matrices for the SVM
models on the WDBC and BCCD datasets respectively, providing a detailed overview of the
prediction capabilities of the models.
Figure 6. WDBC dataset with PCA: performance comparison of different SVM kernels
The performance of the LR model with the chosen hyperparameters was evaluated using
accuracy, the confusion matrix, and the classification report. The model achieved an accuracy
of 91.67% on the test set. The confusion matrix showed 12 TP and 10 TN predictions, with 0 FP
and 2 FN predictions. The classification report revealed that the model achieved a precision,
recall, and F1-score of 0.92 for both class 1 and class 2. The ROC curve was plotted to
visualize the trade-off between TPR and FPR at various decision thresholds and demonstrated a
satisfactory level of discrimination between the two classes. The AUC score was 0.9444,
indicating that the LR model is capable of differentiating between the two classes with a
high degree of accuracy. The model was then retrained after removing the 'MCP.1' feature,
which was identified as the least important. Surprisingly, the performance of the model
deteriorated slightly, yielding an accuracy of 87.5%. Figure 9 displays the confusion matrix
for the logistic regression (LR) model on the BCCD dataset, highlighting the model's
capability to accurately identify true positives and true negatives.
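The ROC curve and AUC score for an LR model can be obtained as sketched below. Since the BCCD file is not bundled with Scikit-Learn, the example uses the bundled WDBC copy as a stand-in, with default regularization and an assumed split; it therefore does not reproduce the AUC reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

model = make_pipeline(StandardScaler(), LogisticRegression(solver="liblinear"))
model.fit(X_train, y_train)

# Predicted probability of the class labelled 1
# (in this bundled copy, 1 encodes benign and 0 encodes malignant).
scores = model.predict_proba(X_test)[:, 1]

# One (FPR, TPR) pair per decision threshold; AUC summarizes the whole curve.
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)
```

Plotting `tpr` against `fpr` gives the ROC curve; an AUC close to 1 indicates that the model ranks positive cases above negative ones at almost every threshold.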
A basic LR model was evaluated on the WDBC dataset, resulting in a test accuracy of 98.25%.
Hyperparameter tuning raised the accuracy to 99.42%, the same accuracy as obtained with PCA.
The confusion matrix revealed 108 true negatives (TN), 62 true positives (TP), 1 false
negative (FN), and no false positives (FP). This result indicates that the model performed
exceptionally well in distinguishing benign from malignant cases in the WDBC dataset. The
classification report for the WDBC dataset showed a precision of 1.00 for class 0
(non-cancerous) and 0.99 for class 1 (cancerous), and a recall of 1.00 for class 0 and 0.98
for class 1. The corresponding F1-scores were 1.00 for class 0 and 0.99 for class 1. The
precision, recall, and F1-score, assessed through both macro and weighted averages, are all
around 0.99. This suggests an exceptional performance and implies a high degree of reliability
in the model's predictive capability.
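Hyperparameter | Accuracy
C=0.001, Penalty=L1, Solver=liblinear | 62.56%
C=0.001, Penalty=L2, Solver=liblinear | 94.22%
C=0.1, Penalty=L2, Solver=liblinear | 97.48%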
The table above presents the validation-set accuracy on the WDBC dataset. The combination
of C=0.001, Penalty=L1, and Solver=liblinear achieved an accuracy of 62.56%, the lowest among
the three, likely due to the strong regularization (represented by the small 'C' value),
which may have led to underfitting, and the L1 penalty, which might have resulted in a sparse
solution. The combination of C=0.001, Penalty=L2, and Solver=liblinear yielded a significantly
higher accuracy of 94.22%. The L2 penalty, unlike the L1 penalty, does not result in a sparse
model, which might explain the improved performance despite the strong regularization. Lastly,
the combination of C=0.1, Penalty=L2, and Solver=liblinear provided the highest accuracy of
97.48%. A higher 'C' value was used, indicating less stringent regularization. This might
have permitted the model to identify more intricate patterns in the data, thus resulting in
improved accuracy. Figure 10 presents the confusion matrix for the LR model on the WDBC
dataset, illustrating the performance of the model in distinguishing between benign and
malignant cancer cases.
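The effect of the penalty type and of 'C' on sparsity can be inspected with a sketch like the one below, which counts the non-zero LR coefficients for the three settings discussed. The cross-validation setup, the scaling step, and the bundled WDBC copy are assumptions for illustration, so the accuracies it produces need not match the table.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The three (C, penalty) combinations discussed in the text, all with liblinear.
settings = [(0.001, "l1"), (0.001, "l2"), (0.1, "l2")]

summary = {}
for C, penalty in settings:
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(C=C, penalty=penalty, solver="liblinear"),
    )
    # Mean validation accuracy over 5 folds (fold count is an assumption).
    cv_accuracy = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    # Refit on all data to inspect how many coefficients remain non-zero.
    model.fit(X, y)
    n_nonzero = int(np.count_nonzero(model[-1].coef_))
    summary[(C, penalty)] = (float(cv_accuracy), n_nonzero)
```

Comparing the non-zero counts for the L1 and L2 settings at the same 'C' shows directly whether the L1 penalty has produced a sparse solution.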
For the BCCD dataset, the optimized DT model achieved an overall accuracy of 75% on the test
data. The confusion matrix revealed that, out of 24 test samples, the model correctly
classified 18 and misclassified 6, yielding a balanced outcome between the two classes.
Precision and recall for both classes were also 75%, in line with the overall accuracy. The
F1-score, the harmonic mean of precision and recall, was likewise 75% for both classes,
demonstrating that the classification performance was well-balanced. Figure 11 shows the
confusion matrix for the Decision Tree (DT) model applied to the BCCD dataset.
Max Depth | Accuracy
1 | 75%
2 | 66%
3 | 71%
These results highlight the importance of hyperparameter tuning in ML models. While it might
seem that increasing the complexity of a model (in this case, increasing the 'Max Depth')
would lead to better performance, this is not always the case. In fact, models with too much
complexity can suffer from overfitting, leading to poorer performance on unseen data. The
accuracy of the DT algorithm rose to 83.33% after eliminating the MCP-1 feature, which was
the least important feature.
In our study utilizing the WDBC dataset, we found that the accuracy of the basic DT is
94.72%. Upon applying hyperparameter tuning, the accuracy remained consistent. The
utilization of PCA on the WDBC dataset effectively reduced its complexity. PCA transformed
the original dataset, comprising 30 features, into a new dataset characterized by a smaller set
55
of features, known as principal components. These principal components accounted for the vast
majority of the variability present in the original dataset. As a consequence, the complexity of
the DT model was notably diminished, thereby enhancing its performance when applied to the
reduced set of features , when PCA was employed in conjunction with hyperparameter tuning,
the model's accuracy significantly improved to 97.37% on the test dataset. This high accuracy
suggests that the model can successfully predict cancer diagnoses in approximately 97.37% of
the test cases, demonstrating its proficiency in discerning whether a breast mass is malignant
or benign. Upon examining the confusion matrix, it was evident that the model correctly
classified 69 benign instances as benign (TP) and 42 malignant instances as malignant (TN).
However, the model incorrectly classified 2 benign instances as malignant (FP) and 1
malignant instance as benign (FN). The model showcased exceptional performance in terms of
precision, recall, and f1-score for the benign (0) class, with scores of 0.99, 0.97, and 0.98
respectively, while the malignant (1) class scored 0.95, 0.98, and 0.97, respectively. These
metrics point towards a high-performing model, especially for benign cases, with a slight
underperformance for malignant cases. Further exploration of the 'max_depth' hyperparameter
revealed that at a max_depth of 1, the model achieved an accuracy of 96.49%. Increasing the
max_depth to 2 caused a slight decrease in accuracy to 95.61%. However, setting the
max_depth to 3 improved the model's performance, pushing the accuracy up to 97.37%. This
indicates that a max_depth of 3 enables the model to better capture the patterns in the data,
without causing overfitting, leading to improved performance on the test data. Figure 12
presents the confusion matrix for the DT model on the WDBC dataset. Despite the model
misclassifying a few instances, its overall performance is commendable, correctly predicting a
significant majority of both benign and malignant cases.
Max Depth    Accuracy
1            96.49%
2            95.61%
3            97.37%
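The max_depth comparison above can be reproduced in outline with a short script. The following is a minimal sketch, not the exact code used in this thesis: it assumes scikit-learn, uses the copy of the WDBC data bundled with that library, and the number of principal components (10), the random seed, and the stratified split are illustrative assumptions, so the printed accuracies will not necessarily match the table.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# scikit-learn ships a copy of the WDBC data (569 samples, 30 features).
X, y = load_breast_cancer(return_X_y=True)

# 80:20 split; random_state and stratify are assumptions, not thesis settings.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def sweep_max_depth(depths, n_components=10):
    """Return {max_depth: test accuracy} for a scaler -> PCA -> DT pipeline."""
    results = {}
    for depth in depths:
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components),  # assumed component count
            DecisionTreeClassifier(max_depth=depth, random_state=42),
        )
        model.fit(X_train, y_train)
        results[depth] = accuracy_score(y_test, model.predict(X_test))
    return results


depth_accuracy = sweep_max_depth([1, 2, 3])
for depth, acc in depth_accuracy.items():
    print(f"max_depth={depth}: accuracy={acc:.4f}")
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, so the test accuracy is not influenced by information from the test set.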
On the BCCD dataset, all models performed moderately well. After the least important feature
was removed, the accuracy of every model except LR increased markedly, while the accuracy of
LR fell. KNN saw a substantial increase from 87.50% to 95.83%. The accuracy of the SVM and
DT models increased from 79.17% to 87.5% and from 75% to 83.33%, respectively. With 91.67%
accuracy, LR was the best model on the full feature set. When the least important feature was
removed, KNN performed best with 95.83%. The results for the BCCD dataset are presented in
Table 9.
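The feature elimination step described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: it assumes scikit-learn and NumPy, uses a synthetic stand-in with the same shape as BCCD (116 samples, 9 predictors) because the real data are not bundled with the library, and ranks features by the impurity-based importance of a decision tree. The feature names are hypothetical placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in shaped like BCCD (116 samples, 9 predictors); not the real data.
X, y = make_classification(
    n_samples=116, n_features=9, n_informative=4, n_redundant=2, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)


def fit_and_score(columns):
    """Fit a depth-limited tree on the given columns; return (tree, test accuracy)."""
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X_train[:, columns], y_train)
    accuracy = accuracy_score(y_test, tree.predict(X_test[:, columns]))
    return tree, accuracy


all_columns = list(range(X.shape[1]))
full_tree, accuracy_before = fit_and_score(all_columns)

# Impurity-based importances are computed from the training data only.
least_important = int(np.argmin(full_tree.feature_importances_))
kept_columns = [c for c in all_columns if c != least_important]
_, accuracy_after = fit_and_score(kept_columns)

print(f"dropped {feature_names[least_important]}: "
      f"{accuracy_before:.4f} -> {accuracy_after:.4f}")
```

On synthetic data the accuracy may rise, fall, or stay the same after the drop; the sketch only shows the mechanics of ranking features on the training split and refitting without the weakest one.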
The results from the analysis of the WDBC dataset suggest that all four ML models (KNN, SVM,
LR, and DT) performed exceptionally well in terms of their predictive accuracies. However,
upon hyperparameter tuning and PCA, it was evident that some models outperformed the
others. Both the SVM and LR algorithms showed notable performance improvements, reaching
an accuracy of 99.42%. The accuracy of the DT algorithm also improved significantly after the
incorporation of PCA, attaining an accuracy of 97.37%. Meanwhile, KNN showed the least
improvement, yet maintained a commendable accuracy rate of 96.49%.
Moreover, the recall, precision, and F1-score also witnessed improvement with the fine-tuning
of these models, indicating their ability to deliver reliable results while minimizing errors. It
was also observed that adjusting hyperparameters such as 'C' and 'max_depth' and
implementing feature reduction techniques like PCA significantly boosted the predictive power
of the models. The detailed performance of each model provides essential insights into the
factors that contribute to an accurate prediction of BC diagnoses. These findings underscore
the potential of ML algorithms to aid in the early detection and diagnosis of BC, thereby
enhancing patient care and outcomes.
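The combination of PCA with tuning of 'C' discussed above can be expressed as a cross-validated grid search. The sketch below is one possible setup, assuming scikit-learn and its bundled WDBC data; the grid values, the five-fold cross-validation, the RBF kernel, and the random seed are illustrative assumptions and are not taken from the thesis.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Assumed search space: number of principal components and regularization strength C.
PARAM_GRID = {
    "pca__n_components": [5, 10, 15],
    "clf__C": [0.01, 0.1, 1, 10, 100],
}


def tune(classifier):
    """Grid-search a scaler -> PCA -> classifier pipeline with 5-fold CV."""
    pipeline = Pipeline(
        [("scale", StandardScaler()), ("pca", PCA()), ("clf", classifier)]
    )
    search = GridSearchCV(pipeline, PARAM_GRID, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    return search


searches = {
    "LR": tune(LogisticRegression(max_iter=1000)),
    "SVM": tune(SVC(kernel="rbf")),
}
for name, search in searches.items():
    test_accuracy = search.score(X_test, y_test)
    print(name, search.best_params_, f"test accuracy={test_accuracy:.4f}")
```

Selecting hyperparameters by cross-validation on the training split, and touching the test split only once at the end, keeps the reported test accuracy from being optimistically biased by the tuning itself.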
Table 9. Comparison of Machine Learning Model Accuracies on BCCD Dataset Before and
After Feature Elimination
Model    Before elimination    After elimination
KNN      87.50%                95.83%
SVM      79.17%                87.5%
DT       75%                   83.33%
LR       91.67%                87.5%
Table 10. Model Performance: Assessing Accuracy Across Machine Learning Models on
WDBC Dataset
In the literature, Austria et al. (2019) applied seven ML models to the BCCD dataset and
obtained a maximum accuracy of 74% with gradient boosting, while KNN reached 58.14% and the
other models fell in a similar range. We attribute these lower accuracies to the data not having
been standardized. The present research used standardized data, and the accuracy of KNN rose
to 87.50%. LR reached 91.67%, compared with 72.14% in the previous study, and SVM and DT
also performed better than in Austria et al. Sharma et al. (2018) achieved an accuracy of 95.90%
for the KNN algorithm on the WDBC dataset, the same as our basic KNN; with PCA and
hyperparameter tuning, our KNN reached 96.49%. However, Sharma et al. applied a split ratio of
70:30, whereas our results were obtained with an 80:20 split. With RF and Naive Bayes, Sharma
et al. reported accuracies of 94%-95%.
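The role of standardization in KNN noted above can be illustrated with a small worked example. The records below are hypothetical and are not drawn from BCCD; they only show how a feature measured on a large scale can dominate the Euclidean distance until each feature is converted to z-scores.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy records: (small-scale feature, large-scale feature).
points = {
    "query": (0.20, 1200.0),
    "A": (0.90, 1210.0),
    "B": (0.22, 1260.0),
    "C": (0.88, 1290.0),
}


def standardize(records):
    """Z-score each column using the population standard deviation."""
    columns = list(zip(*records.values()))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return {
        name: tuple((value - m) / s for value, (m, s) in zip(row, stats))
        for name, row in records.items()
    }


def nearest_neighbor(records, target="query"):
    """Name of the record closest to `target` in Euclidean distance."""
    others = (name for name in records if name != target)
    return min(others, key=lambda name: math.dist(records[name], records[target]))


nearest_raw = nearest_neighbor(points)
nearest_scaled = nearest_neighbor(standardize(points))
print(f"raw: {nearest_raw}, standardized: {nearest_scaled}")
```

On the raw values the nearest neighbor of the query is A, because only the large-scale feature matters; after standardization it is B, which is nearly identical to the query on the first feature. In a real pipeline the mean and standard deviation would be estimated on the training split only.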
Fuad (2018) evaluated techniques with and without PCA on the WDBC dataset, and the greatest
accuracy attained by LR was 97.68%, whereas our LR model was among the best performers at
99.42% and also scored better on measures such as precision, recall, and F1-score. In contrast
to the models tested here, the accuracy of most ML models in Fuad (2018) declined when PCA
was applied; LR accuracy, for example, decreased to 96.54%. By comparison, our models with
PCA outperform those without PCA, which we attribute to the hyperparameter tuning applied to
our models.
Hazra et al. (2020) applied the DT and ANN algorithms to the WDBC dataset and achieved a
maximum accuracy of 98% with ANN. Their DT model reached an accuracy of 86% without
eliminating highly correlated features, which increased to 96% after those features were
eliminated. In comparison to the Hazra et al. study, our research applied PCA to the DT model
and attained a higher accuracy of 97.37%.
Comparing Durgalakshmi et al.'s research with the present study, several important
distinctions and similarities come to the fore. Both studies utilized machine learning algorithms
to predict breast cancer diagnosis, yet the choice and usage of datasets differed significantly.
Durgalakshmi et al., who used the WDBC dataset, reported that the highest accuracy of 94% was
obtained using selected features in the Naïve Bayes (NB) and Random Forest (RF)
models. When selected features were used with the SVM model, the accuracy obtained was
81%.
Table 11. Comparative Analysis of Machine Learning Model Performance with Previous
Studies
Breast cancer represents a leading cause of mortality worldwide. Early detection stands as a
crucial element in mitigating this death rate, accentuating the need for efficient and reliable
diagnostic tools. The advent of machine learning techniques offers immense promise in this
regard, paving the way for significant advancements in the early detection of breast cancer.
The study explored the use of ML techniques for BC diagnosis with two distinct datasets:
BCCD and WDBC. The models utilized included KNN, SVM, LR, and DT classifiers, which
were refined through hyperparameter tuning and feature selection to enhance performance.
In the BCCD dataset, the KNN model showed an accuracy of 87.5% when using the Euclidean
distance measure. Interestingly, the accuracy increased to 95.83% upon removal of the least
impactful feature (MCP-1). Similarly, the performance of both SVM and DT models improved
with the removal of this feature. Although the LR model's performance decreased after this
feature removal, it had initially achieved an accuracy of 91.67%. In the WDBC dataset, the LR
model performed most effectively, achieving an accuracy of 99.42% with optimal
hyperparameters. The DT model's performance improved noticeably when combined with
PCA, with an accuracy increase from 94.72% to 97.37% at a max_depth of 3. The SVM model
initially achieved an accuracy of 96% that increased to 98.24% after hyperparameter tuning.
Notably, combining PCA with hyperparameter tuning further improved the SVM model's
accuracy to 99.42%.
1. How and which machine learning models are utilized to detect cancer in patient data
according to literature?
A significant number of ML models have been deployed for the detection of BC in patient
data, as highlighted in the literature review above. Different algorithms have shown varying
degrees of success and applicability based on the specific datasets used and the nature of the
2. Among SVM, KNN, DT, and LR, which ML model demonstrates the best prediction
performance when applied to the BCCD and WDBC dataset?
Based on the results of this study, different models showed the best performance on the two
datasets. For the BCCD dataset, after removing the least important feature, KNN
demonstrated the best predictive accuracy of 95.83%. For the WDBC dataset, both SVM and LR
showed an outstanding performance with an accuracy of 99.42% after hyperparameter tuning
and the implementation of PCA.
This study, while providing significant insights, is not without its limitations. One such
limitation pertains to the size and nature of the dataset utilized. The scope and generalizability
of the findings may be influenced by these factors, as the models would not necessarily remain
as accurate on a larger and more diverse dataset that contains outliers. This raises the question of how
well our findings might extend to other populations or scenarios, and underscores the need for
future research to replicate and extend these results with more expansive and varied datasets.
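The effect of dataset size on the reliability of a reported accuracy can be made concrete with a confidence interval. The sketch below computes a Wilson score interval for a proportion. The WDBC count of 111 correct out of 114 follows from the DT confusion matrix reported in this thesis; the BCCD test size of 24 is an assumption, chosen because the reported BCCD accuracies are all multiples of 1/24 (for example 23/24 = 95.83%).

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Assumed BCCD test set of 24 samples: 23 correct corresponds to 95.83%.
bccd_low, bccd_high = wilson_interval(23, 24)
# WDBC test set of 114 samples: 111 correct corresponds to 97.37%.
wdbc_low, wdbc_high = wilson_interval(111, 114)

print(f"BCCD 23/24:   {bccd_low:.3f} to {bccd_high:.3f}")
print(f"WDBC 111/114: {wdbc_low:.3f} to {wdbc_high:.3f}")
```

Under these assumptions the interval for the smaller test set spans roughly 0.80 to 0.99, while the interval for the larger one spans roughly 0.93 to 0.99, which illustrates why accuracies measured on a few dozen samples should be interpreted with caution.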
In this study, machine learning models have been successfully employed to diagnose cancer
with commendable accuracy. The study focused on four main machine learning models: KNN,
SVM, LR, and DT. However, there are many more advanced models that could potentially
improve predictive performance.
Moreover, although the study has made great strides in maximizing accuracy, it's important to
acknowledge that in the realm of medical diagnosis, accuracy is not the only performance
metric that matters. False negatives often have more detrimental effects than false positives,
highlighting the need to consider other performance metrics. This broader view on performance
measures would provide a more holistic evaluation of the effectiveness of ML models in
diagnosis. This research has made significant strides in the application of ML models for
diagnosis, while also identifying important areas for future study.
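To illustrate the metrics beyond accuracy mentioned above, the following sketch derives them from the DT confusion matrix reported for the WDBC test set (69 benign and 42 malignant cases classified correctly, 2 benign cases classified as malignant, 1 malignant case classified as benign). Treating malignant as the positive class is an assumption made here so that a false negative corresponds to a missed cancer.

```python
def binary_metrics(tp, tn, fp, fn):
    """Common binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
        "false_negative_rate": fn / (fn + tp),
    }


# Malignant is taken as the positive class (an assumption for this sketch).
metrics = binary_metrics(tp=42, tn=69, fp=2, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the false negative rate is 1/43, about 2.3%, which is the figure most directly tied to missed malignant cases and is not visible from the accuracy alone.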
The objective of employing machine learning models to diagnose cancer with the best possible
level of accuracy was successfully accomplished. However, the study could have been more
successful in determining the specific reasons for the presence of malignant or benign traits,
which requires the assistance of an expert in the relevant field.
While the research has provided valuable insights using four specific machine learning
algorithms, there is an opportunity to expand the scope of future studies to explore other
potential algorithms. These could include, but are not limited to, Random Forests, Neural
Networks, or ensemble methods. Broadening the range of algorithms studied could lead to the
discovery of more effective prediction models, further improving the diagnosis of cancer.
Runtime is a critical consideration for ML models; understanding the minimum and maximum
execution times for each model is crucial. Many machine learning models don't indicate the
level of uncertainty associated with their predictions, an aspect that could be vital in medical
diagnostics. Future research could therefore focus on developing models that quantify this
uncertainty in addition to making predictions. The importance of the feature extraction process
in building effective machine learning models is a well-acknowledged facet of data-driven
research. While the current thesis primarily focuses on employing existing feature sets,
innovative approaches to feature extraction could be a significant area for future research.
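The minimum and maximum execution times mentioned above could be collected with a small helper such as the one sketched below. It uses only the Python standard library; the function placeholder_training is a hypothetical stand-in for a real call such as fitting a model, and the number of repeats is an arbitrary choice.

```python
import timeit


def time_callable(func, repeats=5):
    """Run func() `repeats` times; return (min, max) wall-clock seconds."""
    durations = timeit.repeat(func, number=1, repeat=repeats)
    return min(durations), max(durations)


def placeholder_training():
    # Hypothetical stand-in for an expensive step such as model.fit(X_train, y_train).
    return sum(i * i for i in range(50_000))


fastest, slowest = time_callable(placeholder_training)
print(f"min={fastest:.6f}s max={slowest:.6f}s")
```

Reporting both extremes over several runs gives a rough sense of timing variability; the minimum is usually the more stable figure, since the maximum is easily inflated by unrelated system activity.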
The datasets used in this study come from only two sources, the United States (WDBC) and
Portugal (BCCD), which raises the issue of data representation and the applicability of the
results across different ethnic groups. Future research could rectify this by
incorporating a more diverse range of datasets, including those from Asian populations. This
would enable the development of models that can accurately diagnose cancer in a wider range
of patients, enhancing the generalizability and cross-cultural applicability of the results.
By exploring these aspects, future research can contribute valuable insights into the
practicalities of integrating machine learning models into clinical workflows, highlighting
potential benefits and challenges, and ultimately guiding the way towards more effective,
efficient, and personalized healthcare.
References
Abdullah Al-Dhabi, N., Srigopalram, S., Ilavenil, S., Kim, Y.O., Agastian, P., Baaru, R.,
Balamurugan, K., Choi, K.C. and Valan Arasu, M., 2016. Proteomic analysis of stage-II breast
cancer from formalin-fixed paraffin-embedded tissues. BioMed research international, 2016
Al-Azzam, N., & Shatnawi, I. (2021). Comparing supervised and semi-supervised Machine
Learning Models on Diagnosing Breast Cancer. Annals of medicine and surgery (2012), 62,
53–64
Alom, M.Z., Yakopcic, C., Nasrin, M.S., Taha, T.M. and Asari, V.K., 2019. Breast cancer
classification from histopathological images with inception recurrent residual convolutional
neural network. Journal of digital imaging, 32, pp.605-617
Amethiya, Y., Pipariya, P., Patel, S. and Shah, M., 2022. Comparative analysis of breast
cancer detection using machine learning and biosensors. Intelligent Medicine, 2(2), pp.69-81
Antropova, Huynh, B. Q., & Giger, M. L. (2017). A deep feature fusion methodology for breast
cancer diagnosis demonstrated on three imaging modality datasets. Medical Physics
(Lancaster), 44(10), 5162–5171. https://doi.org/10.1002/mp.12453
Atrey, A., Narayan, N., Vijh, S. and Kumar, S., 2022, January. Analysis of Breast Cancer using
Machine Learning Methods. In 2022 12th International Conference on Cloud Computing, Data
Science & Engineering (Confluence) (pp. 258-261). IEEE
Bell, J., 2022. What is machine learning?. Machine Learning and the City: Applications in
Architecture and Urban Design, pp.207-216
Bhise, S., Gadekar, S., Gaur, A.S., Bepari, S. and Deepmala Kale, D.S.A., 2021. Breast cancer
detection using machine learning techniques. Int. J. Eng. Res. Technol, 10(7)
Bicchierai, G., Di Naro, F., De Benedetto, D., Cozzi, D., Pradella, S., Miele, V. and Nori, J.,
2021. A review of breast imaging for timely diagnosis of disease. International Journal of
Environmental Research and Public Health, 18(11), p.5509
Cover, T. M., & Hart, P. E. (1967). Nearest neighbor pattern classification. IEEE Transactions
on Information Theory, 13(1), 21-27
De Blok, C. J. M., Wiepjes, C. M., Nota, N. M., van Engelen, K., Adank, M. A., Dreijerink, K.
M. A., Barbé, E., Konings, I. R. H. M., & den Heijer, M. (2019). Breast cancer risk in
transgender people receiving hormone treatment: nationwide cohort study in the Netherlands.
BMJ (Clinical research ed.), 365, l1652. https://doi.org/10.1136/bmj.l1652
Dong, W., Bensken, W.P., Kim, U., Rose, J., Berger, N.A. and Koroukian, S.M., 2022.
Phenotype discovery and geographic disparities of late-stage breast cancer diagnosis across US
Counties: a machine learning approach. Cancer Epidemiology, Biomarkers &
Prevention, 31(1), pp.66-76
Durgalakshmi, B., & Vijayakumar, V. (2015). Progonosis and modelling of breast cancer and
its growth novel naïve bayes. Procedia Computer Science, 50, 551-553
Edge, S. B., & Compton, C. C. (2010). The American Joint Committee on Cancer: the 7th
edition of the AJCC cancer staging manual and the future of TNM. Annals of Surgical
Oncology, 17(6), 1471-147
El Chamieh, C., Vielh, P. and Chevret, S., 2022. Statistical methods for evaluating the fine
needle aspiration cytology procedure in breast cancer diagnosis. BMC Medical Research
Methodology, 22(1), p.40
Elanany, M.A., Osman, E.E.A., Gedawy, E.M. and Abou-Seri, S.M., 2023. Design and
synthesis of novel cytotoxic fluoroquinolone analogs through topoisomerase inhibition, cell
cycle arrest, and apoptosis. Scientific Reports, 13(1), p.4144
Esteva, Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017).
Dermatologist-level classification of skin cancer with deep neural networks. Nature (London),
542(7639), 115–118. https://doi.org/10.1038/nature21056
Ferlay, J., Ervik, M., Lam, F., Colombet, M., Mery, L., Piñeros, M., Znaor, A., Soerjomataram,
I. and Bray, F., 2020. Global cancer observatory: cancer today. International Agency for
Research on Cancer, Lyon
Fuad, W. M. (2018). Early detection of breast cancer using machine learning (Doctoral
dissertation, Brac University)
Glielmo, A., Husic, B.E., Rodriguez, A., Clementi, C., Noé, F. and Laio, A., 2021.
Unsupervised learning methods for molecular simulation data. Chemical Reviews, 121(16),
pp.9722-9758
Gonçalves, C.B., Souza, J.R. and Fernandes, H., 2022. CNN architecture optimization using
bio-inspired algorithms for breast cancer detection in infrared images. Computers in Biology
and Medicine, 142, p.105205
Guha, A., Fradley, M.G., Dent, S.F., Weintraub, N.L., Lustberg, M.B., Alonso, A. and
Addison, D., 2022. Incidence, risk factors, and mortality of atrial fibrillation in breast cancer:
a SEER-Medicare analysis. European heart journal, 43(4), pp.300-312
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: data
mining, inference, and prediction. Springer Science & Business Media
Hastie, Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning Data
Mining, Inference, and Prediction (2nd ed.). Springer New York.
https://doi.org/10.1007/978-0-387-84858-7
Hosmer Jr, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (Vol.
398). John Wiley & Sons
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical
learning. Springer
James, Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning:
With Applications in R. Springer.
Jia, K.Y., Menes, T.S., Bernstein-Molho, R., Nissan, A. and Zippel, D., 2023. Characterization
of patients with a diagnosis of breast cancer and melanoma: genetic susceptibility or increased
surveillance?. European Journal of Cancer Prevention: the Official Journal of the European
Cancer Prevention Organisation (ECP).
Kim, J., Kim, J.Y., Lee, H.B., Lee, Y.J., Seong, M.K., Paik, N., Park, W.C., Park, S., Jung,
S.P., Bae, S.Y. and Korean Breast Cancer Society, 2020. Characteristics and prognosis of 17
special histologic subtypes of invasive breast cancers according to World Health Organization
classification: comparative analysis to invasive carcinoma of no special type. Breast Cancer
Research and Treatment, 184, pp.527-542
Krishnan, M. M. R., Banerjee, S., Chakraborty, C., Chakraborty, C., & Ray, A. K. (2010).
Statistical analysis of mammographic features and its classification using support vector
machine. Expert Systems with Applications, 37(1), 470-478
Leão, D.C.M.R., Pereira, E.R., Pérez-Marfil, M.N., Silva, R.M.C.R.A., Mendonça, A.B.,
Rocha, R.C.N.P. and García-Caro, M.P., 2021. The importance of spirituality for women facing
breast cancer diagnosis: a qualitative study. International journal of environmental research and
public health, 18(12), p.6415
Łukasiewicz, S., Czeczelewski, M., Forma, A., Baj, J., Sitarz, R., & Stanisławek, A. (2021).
Breast Cancer-Epidemiology, Risk Factors, Classification, Prognostic Markers, and Current
Treatment Strategies-An Updated Review. Cancers, 13(17), 4287.
https://doi.org/10.3390/cancers13174287
Mahmood, F.M., 2023. A comparison of rural and urbans women's knowledge and attitudes
toward breast cancer. Journal of Population Therapeutics and Clinical Pharmacology, 30(3),
pp.515-521
McWilliam, A., Faivre-Finn, C., Kennedy, J., Kershaw, L. and Van Herk, M.B., 2016. Data
mining identifies the base of the heart as a dose-sensitive region affecting survival in lung
cancer patients. International Journal of Radiation Oncology, Biology, Physics, 96(2), pp. S48-
S49
Mushtaq, Z., Yaqub, A., Sani, S., & Khalid, A. (2020). Effective K-nearest neighbor
classifications for Wisconsin breast cancer data sets. Journal of the Chinese Institute of
Engineers, 43(1), 80-92
Naji, M.A., El Filali, S., Aarika, K., Benlahmar, E.H., Abdelouhahid, R.A. and Debauche, O.,
2021. Machine learning algorithms for breast cancer prediction and diagnosis. Procedia
Computer Science, 191, pp.487-492
Osman, A.H., 2017. An enhanced breast cancer diagnosis scheme based on two-step-SVM
technique. International Journal of Advanced Computer Science and Applications, 8(4)
Park, E.Y., Yi, M., Kim, H.S. and Kim, H., 2021. A decision tree model for breast
reconstruction of women with breast cancer: a mixed method approach. International Journal
of Environmental Research and Public Health, 18(7), p.3579
Park, K.H., Batbaatar, E., Piao, Y., Theera-Umpon, N. and Ryu, K.H., 2021. Deep learning
feature extraction approach for hematopoietic cancer subtype classification. International
Journal of Environmental Research and Public Health, 18(4), p.2197
Patrício, M., Pereira, J., Crisóstomo, J., Matafome, P., Seiça, R., Caramelo, F., & Gomes, M.
(2018). Breast Cancer Coimbra Dataset [Data file]. Faculty of Medicine of the University of
Coimbra and University Hospital Centre of Coimbra. UCI Machine Learning Repository.
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Coimbra
Pise, N.N. and Kulkarni, P., 2008, December. A survey of semi-supervised learning methods.
In 2008 International conference on computational intelligence and security (Vol. 2, pp. 30-
34). IEEE
Rabiei, R., Ayyoubzadeh, S.M., Sohrabei, S., Esmaeili, M. and Atashi, A., 2022. Prediction of
breast cancer using machine learning approaches. Journal of Biomedical Physics &
Engineering, 12(3), p.297
Rasool, A., Tao, R., Kashif, K., Khan, W., Agbedanu, P. and Choudhry, N., 2020, February.
Statistic Solution for Machine Learning to Analyze Heart Disease Data. In Proceedings of the
2020 12th International Conference on Machine Learning and Computing (pp. 134-139)
Ross, Slodkowska, E. A., Symmans, W. F., Pusztai, L., Ravdin, P. M., & Hortobagyi, G. N.
(2009). The HER‐2 Receptor and Breast Cancer: Ten Years of Targeted Anti–HER‐2 Therapy
and Personalized Medicine. The Oncologist (Dayton, Ohio), 14(4), 320–368.
https://doi.org/10.1634/theoncologist.2008-0230
Saito, T., & Rehmsmeier, M. (2015). The precision-recall plot is more informative than the
ROC plot when evaluating binary classifiers on imbalanced datasets. PloS one, 10(3),
e0118432
Schölkopf, & Smola, A. J. (2001). Learning with Kernels: Support Vector Machines,
Regularization, Optimization, and Beyond. MIT Press
Sharma, R., 2021. Global, regional, national burden of breast cancer in 185 countries: evidence
from GLOBOCAN 2018. Breast Cancer Research and Treatment, 187, pp.557-567
Sharma, S., Aggarwal, A., & Choudhury, T. (2018, December). Breast cancer detection using
machine learning algorithms. In 2018 International conference on computational techniques,
electronics and mechanical systems (CTEMS) (pp. 114-118). IEEE
Shen, L., Margolies, L.R., Rothstein, J.H., Fluder, E., McBride, R. and Sieh, W., 2019. Deep
learning to improve breast cancer detection on screening mammography. Scientific
reports, 9(1), p.12495
Siegel, R.L., Miller, K.D., Fuchs, H.E. and Jemal, A., 2022. Cancer statistics, 2022. CA: a
cancer journal for clinicians, 72(1), pp.7-33
Subashini, T.S., Ramalingam, V. and Palanivel, S., 2009. Breast mass classification based on
cytological patterns using RBFNN and SVM. Expert Systems with Applications, 36(3),
pp.5284-5290
Sung, H., Ferlay, J., Siegel, R.L., Laversanne, M., Soerjomataram, I., Jemal, A. and Bray, F.,
2021. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality
worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians, 71(3), pp.209-
249
Sutton, R.S. and Barto, A.G., 2018. Reinforcement learning: An introduction. MIT press.
Vaka, A.R., Soni, B. and Reddy, S., 2020. Breast cancer detection by leveraging Machine
Learning. Ict Express, 6(4), pp.320-324
Vasundhara, S., Kiranmayee, B.V. and Suresh, C., 2019. Machine learning approach for breast
cancer prediction. International Journal of Recent Technology and Engineering (IJRTE), 8(1).
Wang, Z., Sun, H., Li, J., Chen, J., Meng, F., Li, H., Han, L., Zhou, S. and Yu, T., 2022.
Preoperative prediction of axillary lymph node metastasis in breast cancer using CNN based
on multiparametric MRI. Journal of Magnetic Resonance Imaging, 56(3), pp.700-709
Wolberg, W. H., & Mangasarian, O. L. (1990). Multisurface method of pattern separation for
medical diagnosis applied to breast cytology. Proceedings of the National Academy of
Sciences, 87(23), 9193-9196
Zhou, Z.H., 2018. A brief introduction to weakly supervised learning. National science
review, 5(1), pp.44-53