Foml Project Report

JAYPEE INSTITUTE OF INFORMATION TECHNOLOGY, NOIDA

B. TECH 5th SEMESTER

Fundamentals of Machine Learning Project

TITLE OF PROJECT:

Cancer Detection Models

Submitted By:

Enrollment No.    Name
22103124          Khushi Agarwal
22103148          Rishav Sachdeva
22103143          Soham Kukreti
22103151          Daksh Jain

Submitted To: Dr. Sherry Garg

PROJECT REPORT
PROBLEM STATEMENT

Cancer diagnosis is a critical area of healthcare that demands accurate and timely
predictions. Early and precise detection of cancer can significantly improve
treatment outcomes and patient survival rates. This project focuses on building
machine learning models that can classify cancer levels (malignant/benign) using
patient data.

OVERVIEW OF THE PROJECT

This project employs machine learning techniques to detect cancer based on
medical data. By leveraging algorithms like K-Nearest Neighbors (KNN), Logistic
Regression, Naive Bayes, and Support Vector Machines (SVM), the project aims
to compare their effectiveness in predicting cancer levels. This comparative study
helps identify the most accurate and efficient model, enabling advancements in
automated cancer detection.

OBJECTIVE OF THE PROJECT

● To develop machine learning-based cancer detection models capable of
classifying cancer as benign or malignant.
● To compare the accuracy and effectiveness of KNN, Logistic Regression,
Naive Bayes, and SVM algorithms.
● To provide a reliable and efficient tool to assist in the early diagnosis of
cancer.
ALGORITHM INSIGHTS

1. K-Nearest Neighbors (KNN)

○ A non-parametric algorithm that classifies data points based on the
majority class of their k-nearest neighbors.
○ Suitable for small datasets and intuitive to implement, though it can be
computationally expensive for large datasets.
2. Logistic Regression
○ A statistical method for binary classification based on a linear
relationship between the input features and the log-odds of the target
variable.
○ Simple to implement and interpret, making it a baseline model for
classification tasks.
3. Naive Bayes
○ A probabilistic algorithm based on Bayes’ theorem, assuming feature
independence.
○ Fast and efficient for high-dimensional datasets, though its
independence assumption may not hold for all features.
4. Support Vector Machine (SVM)
○ A robust algorithm that finds the hyperplane that best separates
classes in a dataset.
○ Effective for high-dimensional spaces and datasets with a clear
margin of separation.
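
The majority-vote idea behind KNN (item 1 above) can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in the project: the function name knn_predict, the toy two-feature points, and the choice k=3 are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest neighbours and count how often each label occurs.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up points (not the project dataset): 0 = benign, 1 = malignant.
train_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
           [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
train_y = [0, 0, 0, 1, 1, 1]

prediction_low = knn_predict(train_X, train_y, [1.1, 1.0], k=3)
prediction_high = knn_predict(train_X, train_y, [4.0, 4.0], k=3)
```

Every prediction scans the full training set, which is why KNN becomes expensive as the dataset grows.
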
FLOWCHART
MODEL IMPLEMENTATION AND EVALUATION

1. Dataset Overview

The dataset contains cancer-related medical data and labels.

● Total Records: 1000

● Columns:
○ 23 Features
○ Features include various medical measurements such as Obesity,
GeneticRisk, BalancedDiet, WeightLost, ShortnessOfBreath, etc.
○ Target: Level (0 for benign, 1 for malignant).

2. Missing Data Handling

● Identified missing values and imputed them using the median value for
numerical columns.
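
Median imputation can be sketched as follows. This is an illustrative version under stated assumptions, not the project's code: the helper impute_median, the use of None to mark a missing value, and the sample records are hypothetical; only the column names Obesity and GeneticRisk come from the dataset description.

```python
from statistics import median

def impute_median(rows, column):
    """Return copies of `rows` with None in `column` replaced by the column median."""
    # The median is computed from the observed (non-missing) values only.
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]

# Hypothetical records; None marks a missing measurement.
records = [
    {"Obesity": 4, "GeneticRisk": 3},
    {"Obesity": None, "GeneticRisk": 5},
    {"Obesity": 7, "GeneticRisk": 6},
    {"Obesity": 2, "GeneticRisk": None},
]

filled = impute_median(records, "Obesity")
filled = impute_median(filled, "GeneticRisk")
```

When the data is split into training and test sets, the median is normally computed on the training portion alone and then reused for the test portion, so that no test information leaks into preprocessing.
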

3. Feature Engineering

● Encoded categorical variables (if any) using one-hot encoding.

● Standardized numerical features to improve model performance.
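
Both preprocessing steps can be sketched in plain Python. The functions standardize and one_hot, the sample values, and the category labels are hypothetical and serve only to illustrate the idea; the report does not state which columns, if any, were categorical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Encode each category as a 0/1 indicator vector, one slot per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

# Made-up example columns.
scaled = standardize([30.0, 40.0, 50.0])
encoded, categories = one_hot(["Male", "Female", "Male"])
```

Standardization matters most for distance- and margin-based models such as KNN and SVM, where a feature with a large numeric range would otherwise dominate the others.
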

MODEL TRAINING AND EVALUATION

1. Algorithms/Models Implemented

● K-Nearest Neighbors (KNN): Tested for its simplicity and performance on
small datasets.
● Logistic Regression: Used as a baseline model for cancer detection.
● Naive Bayes: Implemented to leverage its probabilistic approach for
classification.
● Support Vector Machine (SVM): Included for its robustness in handling
complex decision boundaries.

2. Evaluation Metrics

● Accuracy: Measures the proportion of correct predictions.

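● Precision, Recall, F1 Score: Precision is the share of samples predicted
malignant that are truly malignant, recall is the share of truly malignant
samples that the model detects, and the F1 score is their harmonic mean.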
● Confusion Matrix: Visualizes true positives, false positives, true negatives,
and false negatives.
● Cross-validation: Ensures reliability and consistency in model evaluation.
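
The metrics listed above all follow from the four confusion-matrix counts. The sketch below shows the arithmetic on made-up labels; the function names and the sample predictions are assumptions for illustration and do not reproduce the project's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up labels: 1 = malignant (positive class), 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```

In a screening setting, recall on the malignant class is usually the number to watch, since a false negative means a missed cancer.
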

Confusion Matrices for all 4 models:

1. Naive Bayes
   Accuracy: 91%

2. Logistic Regression
   Accuracy: 98.4%

3. Support Vector Machine (SVM)
   Accuracy: 99.9%

4. K-Nearest Neighbors (KNN)
   Accuracy: 96.5%

CONCLUSION

This project demonstrates the application of machine learning algorithms in cancer
detection. Among the four algorithms tested, Support Vector Machines (SVM)
emerged as the most effective, achieving the highest accuracy and F1 score. By
leveraging SVM’s robustness, this model can serve as a reliable tool for early
cancer diagnosis. The study also highlights the strengths and weaknesses of KNN,
Logistic Regression, and Naive Bayes, providing a comprehensive understanding
of their applicability in medical data analysis.

