Analysis of the Biomedical Datasets CSV File

The report analyzes a biomedical dataset in CSV format, focusing on insurance data related to smokers. It details the process of reading and analyzing the data using Python's pandas library, including statistical summaries and visualizations to explore relationships between variables such as age, sex, and medical charges. The conclusion emphasizes the importance of managing CSV files and conducting various analyses to uncover insights from biomedical data.

Uploaded by

hassanmukhtiar1r1

(CS-103L- Introduction to Programming for Data Science)

Report # Open Ended Lab

Analysis of the Biomedical Datasets CSV File

(Spring-2024)

Submitted By
Hassan Mukhtiar
(2023-BME-5)

Submitted To

Mr. Farhan Yousaf

Mr. Ali Noman

Department of Biomedical Engineering,

University of Engineering and Technology, Lahore,
New Campus

Objective:
❖ To learn how to store and retrieve diverse biomedical data.
❖ To identify how streamlined data analysis can be used for pattern discovery.
❖ To learn how to examine, analyze, share, and interpret biomedical data.
❖ To understand how a biomedical CSV file stores structured data related to biomedical
research, healthcare, or clinical studies.

Database:
A database is an organized collection of structured data that makes the data easy to access, manage, and
update. In simple words, a database is a place where data is stored. A good analogy is a library: the
library contains a huge collection of books of different genres; here the library is the database, and
the books are the data. [1]
Example:
Examples of where databases are used include grocery stores, banks, e-commerce platforms,
healthcare systems, and social media platforms.

Biomedical Database:
Biomedical databases store and maintain biomedical data such as gene and protein sequences. In
biomedical data, named entity recognition (NER) is used extensively for gene identification, DNA
identification, and the identification of drug names and disease names. These experiments use
conditional random fields (CRFs) with features engineered for their domain data. [2]
Example: [3]
❖ Generic gene expression databases
❖ Nucleosome positioning region database
❖ Protein structure database
Biomedical Datasets:
Healthcare data sets include a vast amount of medical data, various measurements, financial data,
statistical data, demographics of specific populations, and insurance data, to name just a few, gathered
from various healthcare data sources. The examples below illustrate data sets used in the healthcare
industry.
Example: [4]
❖ The Uniform Hospital Discharge Data Set (UHDDS)
❖ The Human Mortality Database (HMD)
❖ HealthData.gov
❖ SEER cancer incidence
❖ BROAD Institute Cancer Program Datasets
❖ Chronic Disease Data

Analyze Biomedical Datasets CSV File

Introduction of CSV file:

The CSV file used in this report contains insurance data that includes smoking status. It was downloaded
from the Kaggle website and saved to the laptop, and different tasks were then performed on it. A
screenshot of the file is shown below, followed by the tasks performed on it, step by step.

Figure 1-Download CSV(Excel) file.

Read CSV file:

The statement f=pd.read_csv('OEL.csv') reads the data from a CSV file named 'OEL.csv' into a pandas
DataFrame and assigns it to the variable 'f'. This allows us to analyze the data using the pandas
library in Python.

Figure 2-Read CSV file.
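
The reading step can be sketched as follows. Because 'OEL.csv' itself is not reproduced in this report, the sketch reads a small in-memory stand-in; the three rows and the column names (age, sex, smoker, region, charges) are assumptions made only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'OEL.csv': made-up rows with assumed column names.
csv_text = """age,sex,smoker,region,charges
19,female,yes,southwest,17000.0
28,male,no,southeast,3000.0
33,male,no,northwest,4000.0
"""

# With the real file on disk this line would be: f = pd.read_csv('OEL.csv')
f = pd.read_csv(io.StringIO(csv_text))

print(type(f).__name__)  # the result is a pandas DataFrame
print(f.shape)           # (number of rows, number of columns)
```

read_csv infers a data type for each column, so numeric columns such as age and charges can be used in calculations directly.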

Info of CSV file:

The expression info = f.info() prints a summary of the DataFrame 'f', including information about its
structure such as the number of entries, the column names, and the data types. Note that info() prints
this summary and returns None, so the variable 'info' holds None rather than the summary itself.

Figure 3-Info of CSV file.
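
A minimal sketch of this behaviour is given below, using a made-up three-row DataFrame in place of the real file (the rows and column names are assumptions). It also shows one way to keep the summary as text, by passing a buffer to info().

```python
import io

import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "age": [19, 28, 33],
    "sex": ["female", "male", "male"],
    "charges": [17000.0, 3000.0, 4000.0],
})

info = f.info()      # prints the summary to standard output
print(info is None)  # info() itself returns None

# To keep the summary as a string, write it into a text buffer instead.
buffer = io.StringIO()
f.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```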

Columns of CSV file:

The line columns = f.columns returns the column labels of the DataFrame 'f' and assigns them to the
variable 'columns'. This allows easy access to the names of the columns within the DataFrame,
facilitating further data manipulation or analysis based on column names.

Figure 4-Print Columns name of CSV file.
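
As a short sketch, with a made-up DataFrame whose column names are assumed rather than taken from the real file, the column labels can be read and checked like this:

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "age": [19, 28],
    "sex": ["female", "male"],
    "smoker": ["yes", "no"],
    "region": ["southwest", "southeast"],
    "charges": [17000.0, 3000.0],
})

columns = f.columns          # a pandas Index holding the column labels
print(list(columns))         # the labels as a plain Python list
print("charges" in columns)  # membership test before using a column
```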

Head of CSV file:

f.head() shows the first few rows (5 by default) of the DataFrame 'f', providing an overview of its
structure and content. This method is often used to inspect the data and understand its format before
performing further analysis or operations.

Figure 5-Head of CSV file

Tail of CSV file:

The tail() method of a pandas DataFrame, when applied as f.tail(), returns the last few rows (5 by
default) of the DataFrame 'f'.

Figure 6-Tail of CSV file
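
The head() and tail() calls described above can be sketched on a small made-up DataFrame (twelve assumed rows with a single 'age' column). Passing a number changes how many rows are returned.

```python
import pandas as pd

# Hypothetical data: twelve made-up ages, 20 through 31.
f = pd.DataFrame({"age": list(range(20, 32))})

first_rows = f.head()    # first 5 rows by default
last_rows = f.tail()     # last 5 rows by default
first_three = f.head(3)  # an explicit row count

print(first_rows)
print(last_rows)
print(first_three)
```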

Describe CSV file:

The describe() method of a pandas DataFrame, when applied as f.describe(), generates summary statistics
for the numerical columns in the DataFrame 'f'. These statistics consist of the count, mean, standard
deviation, minimum, quartiles (25%, 50%, and 75%), and maximum, providing a summary of the distribution
of the numerical data within the DataFrame.

Figure 7-Describe all factors of CSV file.
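
The following sketch applies describe() to a made-up DataFrame (the rows and column names are assumptions, not the real data). It illustrates that only the numerical columns are summarized by default, so the text column 'sex' does not appear in the result.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "age": [19, 28, 33, 46],
    "sex": ["female", "male", "male", "female"],
    "charges": [1000.0, 2000.0, 3000.0, 4000.0],
})

summary = f.describe()

print(summary)
print(list(summary.index))    # names of the statistics that were computed
print(list(summary.columns))  # only the numerical columns are included
```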

f['region'].value_counts():

The expression f['region'].value_counts() counts the occurrences of each unique value in the 'region'
column of the DataFrame 'f'. It returns a Series where the index holds each unique value in the
'region' column, and the corresponding values indicate how many times each value appears in the
column.

Figure 8-Count of each unique region name in CSV file.
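
A minimal sketch of value_counts() on a made-up 'region' column is shown below; the six values are assumptions chosen only to make the counts easy to follow.

```python
import pandas as pd

# Hypothetical sample values (not the real OEL.csv data).
f = pd.DataFrame({
    "region": ["southeast", "southwest", "southeast",
               "northwest", "southeast", "southwest"],
})

counts = f["region"].value_counts()  # most frequent value comes first

print(counts)
print(counts["southeast"])  # look up the count for a single region
```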

Graph analysis CSV file:

As age increases, charges show an increasing trend, suggesting that older individuals tend to have
higher medical expenses. The markers on the line represent individual data points, showing the
specific charges associated with each age.

Figure 9-Plot of the graph between age, charges, and smoking.
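
The plotting code itself is not reproduced in this report, so the following is only one possible sketch of such a graph, assuming matplotlib is used. The six rows are made up, and the 'smoker' values 'yes' and 'no' are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "age": [19, 28, 33, 46, 52, 61],
    "smoker": ["yes", "no", "no", "no", "yes", "yes"],
    "charges": [17000.0, 3000.0, 4000.0, 8000.0, 30000.0, 35000.0],
}).sort_values("age")

fig, ax = plt.subplots()
# One line per smoking status; each marker is one individual data point.
for status, group in f.groupby("smoker"):
    ax.plot(group["age"], group["charges"], marker="o", label="smoker = " + status)

ax.set_xlabel("age")
ax.set_ylabel("charges")
ax.legend()
```

Calling fig.savefig('age_charges.png') (a hypothetical file name) or plt.show() would then render the figure.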

Males in CSV file:

The following program iterates over the 'sex' column; whenever a value i equals 'male', it adds 1 to an
integer variable, and after the last row of the column it prints the total number of males in the CSV
file.

Figure 10-Number of males in CSV file.

Females in CSV file:

The following program iterates over the 'sex' column; whenever a value i equals 'female', it adds 1 to
an integer variable, and after the last row of the column it prints the total number of females in the
CSV file.

Figure 11-Number of females in CSV file.
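
The two counting loops described above can be sketched as follows. The five 'sex' values are made up, and the variable names 'males' and 'females' are hypothetical, since the original programs appear only as screenshots.

```python
import pandas as pd

# Hypothetical sample values (not the real OEL.csv data).
f = pd.DataFrame({"sex": ["female", "male", "male", "female", "male"]})

males = 0
females = 0
for i in f["sex"]:  # i takes each value of the 'sex' column in turn
    if i == "male":
        males += 1
    elif i == "female":
        females += 1

print("Total males:", males)
print("Total females:", females)

# The same count without an explicit loop: compare, then sum the True values.
males_vectorized = int((f["sex"] == "male").sum())
print(males_vectorized == males)
```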

Female smokers in CSV file:

In the given program, we count the females in the file who also smoke. When both conditions are
fulfilled, a counter variable is increased, and this variable is then printed, showing the number of
females who smoke, as given below.

Figure 12-Number of females who are smoking.

Male smokers in CSV file:

In the given program, we count the males in the file who also smoke. When both conditions are
fulfilled, a counter variable is increased, and this variable is then printed, showing the number of
males who smoke, as given below.

Figure 13-Number of males who are smoking.
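
A sketch of counting with two conditions is given below. The rows are made up, the 'smoker' values 'yes' and 'no' are assumed, and the counter names are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "sex": ["female", "male", "male", "female", "male", "female"],
    "smoker": ["yes", "no", "no", "no", "yes", "yes"],
})

female_smokers = 0
male_smokers = 0
for sex, smoker in zip(f["sex"], f["smoker"]):
    # Both conditions must hold before a counter is increased.
    if sex == "female" and smoker == "yes":
        female_smokers += 1
    elif sex == "male" and smoker == "yes":
        male_smokers += 1

print("Female smokers:", female_smokers)
print("Male smokers:", male_smokers)

# Equivalent check using a boolean mask that combines both conditions with &.
mask = (f["sex"] == "female") & (f["smoker"] == "yes")
print(int(mask.sum()) == female_smokers)
```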

Total number of smokers and nonsmokers:

After reading the CSV file, the following program iterates over all rows of the 'smoker' column. When
the value is 'yes' it counts a smoker, and when the value is 'no' it counts a nonsmoker. At the end it
prints the total number of smokers and nonsmokers.

Figure 14-Total number of smoker and non-smoker in CSV file.
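
The loop described above might look like the following sketch. The five values are made up, and the labels 'yes' and 'no' are assumptions about how the 'smoker' column is coded.

```python
import pandas as pd

# Hypothetical sample values (not the real OEL.csv data).
f = pd.DataFrame({"smoker": ["yes", "no", "no", "no", "yes"]})

smokers = 0
nonsmokers = 0
for value in f["smoker"]:
    if value == "yes":
        smokers += 1
    elif value == "no":
        nonsmokers += 1

print("Total smokers:", smokers)
print("Total nonsmokers:", nonsmokers)

# value_counts() gives the same totals in a single call.
totals = f["smoker"].value_counts()
print(totals)
```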

Total number of smokers in different regions:

The following program shows how many smokers are present in each region: southeast, northeast,
northwest, and southwest.

Figure 15-Number of smokers in different region.
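
One way to obtain such per-region counts is sketched below: first keep only the smoker rows, then count the regions among them. The rows are made up and the 'yes' label is an assumption; the original program may have used a loop instead.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "smoker": ["yes", "no", "no", "no", "yes", "yes"],
    "region": ["southwest", "southeast", "northwest",
               "northeast", "southeast", "southwest"],
})

smokers = f[f["smoker"] == "yes"]                    # keep only the smoker rows
smokers_by_region = smokers["region"].value_counts()  # count regions among them

print(smokers_by_region)
```

Regions with no smokers in the sample do not appear in the result at all.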

Sex and charges & smoker and charges relationship:

• The following program shows the relationship between average charges and sex.
• In this CSV file the average charge for females is 12569.578844 and the average charge for males is
13956.751178.
• It also shows the relationship between charges and smoking status, comparing smokers with
nonsmokers.
• The average charges of nonsmokers are much lower, while the average charges of smokers are very
high.

Figure 16-Relationship Charges with Sex and smoker.
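
The averages above could be computed with groupby(), as in the sketch below. The names Average_charges_sex and Average_charges_smoker come from this report, but computing them with groupby() is an assumption, and the four rows are made up, so the printed numbers differ from the real averages quoted above.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "sex": ["female", "male", "female", "male"],
    "smoker": ["no", "no", "yes", "yes"],
    "charges": [2000.0, 4000.0, 20000.0, 30000.0],
})

# Mean of 'charges' within each group.
Average_charges_sex = f.groupby("sex")["charges"].mean()
Average_charges_smoker = f.groupby("smoker")["charges"].mean()

print(Average_charges_sex)
print(Average_charges_smoker)
```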

Average_charges_sex.describe():
• The describe() method, when applied as Average_charges_sex.describe(), generates summary
statistics for the values held in 'Average_charges_sex', that is, the average charges for each sex,
rather than for the whole DataFrame 'f'.
• These statistics consist of measures such as the count, mean, standard deviation, minimum, and
maximum, providing a summary of the distribution of those average values.

Figure 17-Describe detail of Average charges with sex.

Average_charges_smoker.describe():

• The describe() method, when applied as Average_charges_smoker.describe(), generates summary
statistics for the values held in 'Average_charges_smoker', that is, the average charges for smokers
and nonsmokers, rather than for the whole DataFrame 'f'.
• These statistics consist of measures such as the count, mean, standard deviation, minimum, and
maximum, providing a summary of the distribution of those average values.

Figure 18-Describe detail of Average charges with smoker.
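
The sketch below illustrates what describe() reports when it is applied to grouped averages. It assumes Average_charges_sex is a Series produced by groupby(), which the report does not confirm, and it uses made-up rows. With two groups the count is 2, and the statistics summarize the two averages only.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "sex": ["female", "male", "female", "male"],
    "charges": [2000.0, 4000.0, 20000.0, 30000.0],
})

Average_charges_sex = f.groupby("sex")["charges"].mean()  # one value per sex
stats = Average_charges_sex.describe()                    # statistics of those values

print(stats)
```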

Line Graph between charges and age:

• This line graph shows the relationship between age and medical charges in the
OEL/insurance dataset. As age increases, there is an increasing trend in charges,
suggesting that older individuals tend to have higher medical expenses.
• The markers on the line represent individual data points, showing the specific charges
associated with each age.
• The line's upward slope suggests a positive correlation between age and medical charges,
indicating that age is a factor influencing healthcare costs.
• This highlights the importance of age as a factor in rising medical expenses.

Figure 19-Plot line graph between Age and charges
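
The positive correlation read from the graph can also be checked numerically. The sketch below computes the Pearson correlation coefficient with pandas on made-up values; this calculation is an addition for illustration and was not part of the original programs.

```python
import pandas as pd

# Hypothetical sample rows (not the real OEL.csv data).
f = pd.DataFrame({
    "age": [20, 30, 40, 50, 60],
    "charges": [2000.0, 3500.0, 4000.0, 6500.0, 8000.0],
})

# Pearson correlation: +1 is a perfect increasing line, 0 is no linear relation.
r = f["age"].corr(f["charges"])

print("Correlation between age and charges:", round(r, 3))
```

A value close to +1 supports the increasing trend seen in the line graph, while a value near 0 would indicate no linear relationship.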

Conclusion:
We learnt how to manage CSV files and perform different operations on them. Handling the
biomedical dataset CSV file in Python involved reading the data using pandas, exploring its structure
and content using methods like head(), tail(), and describe(), and conducting analyses such as
statistical summaries and visualization; machine learning modeling is a possible further step. To
establish relationships between different columns in the file, such as sex and charges or smoking and
charges, correlation analysis could be used to identify any linear relationships between pairs of
columns.

References:

[1] https://www.edureka.co/blog/what-is-a-database/#Database
[2] https://www.sciencedirect.com/topics/computer-science/biomedical-data
[3] https://en.wikipedia.org/wiki/List_of_biological_databases
[4] https://www.kaggle.com/datasets?search=biomedical+datasets
