Python - Unit1-7
Introduction to Python
Programming Cycle
Humans use various languages to communicate with each other, whereas
computers understand only machine language, which is made up of 1s and 0s.
We write code using programming languages such as C, C++, C#, PHP, Java,
and Python, and that code is then translated into machine language.
What is Python?
Data Science
Machine Learning
Artificial Intelligence
Web Development
Mobile App Development
Game Development
and a lot more…
Variables
Creating Variables
Variable Names
A variable can have a short name (like x and y) or a more descriptive name
(age, carname, total_volume). Rules for Python variables:
Keywords
When the Python interpreter reads your code, it looks for special reserved words
called keywords to understand what your code is trying to say. Keywords cannot
be used as variable names.
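As a quick illustration, the standard-library keyword module can list the reserved words and check whether a name is one of them. This is a minimal sketch; the candidate names are made up for the example.

```python
import keyword

# Every reserved word known to the running interpreter
print(keyword.kwlist)

# Candidate variable names (hypothetical, for illustration only)
candidates = ["total_volume", "class", "for", "carname"]

# A name is usable as a variable if it is a valid identifier
# and is not a reserved keyword
usable = [name for name in candidates
          if name.isidentifier() and not keyword.iskeyword(name)]
print(usable)  # ['total_volume', 'carname']
```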
Math Operators
In Python the following math operators can be used in assignment statements:
+ Add – If the operands are numbers, adds them; if the operands are strings, joins them
- Subtract – Subtracts two numbers
* Multiply – Multiplies two numbers
/ Divide – Divides two numbers and returns a float
// Integer Divide – Divides two numbers and rounds the result down to a whole
number (an int if both operands are ints, otherwise a float)
** Raise to the power of – Left operand raised to the power of right operand
% Modulo – Returns the remainder of the division
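The operators above can be tried out directly. The sketch below uses two arbitrary sample values, a and b, chosen only for illustration.

```python
a = 7
b = 2

print(a + b)          # 9
print("py" + "thon")  # python (+ joins strings)
print(a - b)          # 5
print(a * b)          # 14
print(a / b)          # 3.5 (always a float)
print(a // b)         # 3 (3.5 rounded down)
print(-a // b)        # -4 (rounds toward negative infinity)
print(a ** b)         # 49
print(a % b)          # 1
```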
Built-in Data Types
Variables can store data of different types, and different types can do different
things.
Python has the following data types built-in by default, in these categories:
You can get the data type of any object by using the type() function:
Example
x=5
print(type(x))
Strings
Example
print("Hello")
print('Hello')
Python Numbers
int
float
complex
Variables of numeric types are created when you assign a value to them:
Example
x = 1 # int
y = 2.8 # float
z = 1j # complex
Int
Example
Integers:
x=1
y = 35656222554887711
z = -3255522
Float
Example
Floats:
x = 1.10
y = 1.0
z = -35.59
Complex
Example
Complex:
x = 3+5j
y = 5j
z = -5j
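There are four collection data types in the Python programming language:
List is a collection which is ordered and changeable. Allows duplicate members.
Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
Set is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.
Dictionary is a collection which is ordered** and changeable. No duplicate keys.
*Set items are unchangeable, but you can remove and/or add items whenever
you like.
**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.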
List
Example
Create a List:
List items are indexed, the first item has index [0], the second item has
index [1] etc.
Ordered
When we say that lists are ordered, it means that the items have a defined order,
and that order will not change.
If you add new items to a list, the new items will be placed at the end of the list.
Changeable
The list is changeable, meaning that we can change, add, and remove items in a
list after it has been created.
Allow Duplicates
Since lists are indexed, lists can have items with the same value:
Example
To determine how many items a list has, use the len() function:
Example
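A short sketch of the list properties described above (indexing, duplicates, length, and changeability). The fruit names are sample data chosen only for illustration.

```python
# Sample data, chosen only for illustration
fruits = ["apple", "banana", "cherry", "apple"]  # duplicates are allowed

print(fruits[0])    # apple (the first item has index 0)
print(len(fruits))  # 4

fruits[1] = "mango"      # change an item
fruits.append("orange")  # new items are placed at the end
fruits.remove("apple")   # removes the first matching item
print(fruits)  # ['mango', 'cherry', 'apple', 'orange']
```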
Tuple
Example
Create a Tuple:
Tuple items are indexed, the first item has index [0], the second item has
index [1] etc.
Ordered
When we say that tuples are ordered, it means that the items have a defined
order, and that order will not change.
Unchangeable
Tuples are unchangeable, meaning that we cannot change, add or remove items
after the tuple has been created.
Allow Duplicates
Since tuples are indexed, they can have items with the same value:
Example
Tuple Length
To determine how many items a tuple has, use the len() function:
Example
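A short sketch of the tuple properties described above. The colour names are sample data, and the one-item tuple at the end is an extra detail not covered in the text.

```python
# Sample data, chosen only for illustration
colors = ("red", "green", "blue", "red")  # duplicates are allowed

print(colors[0])    # red
print(len(colors))  # 4

# Tuples cannot be modified in place; item assignment raises TypeError
try:
    colors[0] = "yellow"
except TypeError as err:
    print("Cannot change a tuple:", err)

# A one-item tuple needs a trailing comma
single = ("red",)
print(type(single))  # <class 'tuple'>
```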
Set
* Note: Set items are unchangeable, but you can remove items and add new
items.
Example
Create a Set:
Set items are unordered, unchangeable, and do not allow duplicate values.
Unordered
Unordered means that the items in a set do not have a defined order.
Set items can appear in a different order every time you use them, and cannot be
referred to by index or key.
Unchangeable
Set items are unchangeable, meaning that we cannot change the items after the
set has been created.
Once a set is created, you cannot change its items, but you can remove items
and add new items.
Duplicates Not Allowed
Example
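A short sketch of the set properties described above: duplicates are dropped, there is no index, and items can be added or removed. The values are sample data chosen only for illustration.

```python
# Sample data, chosen only for illustration
fruits = {"apple", "banana", "cherry", "apple"}  # the repeated "apple" is dropped

print(len(fruits))         # 3
print("banana" in fruits)  # True (membership test; sets have no index)

fruits.add("orange")      # add a new item
fruits.discard("banana")  # remove an item
print(sorted(fruits))     # ['apple', 'cherry', 'orange']
```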
Dictionary
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.
Dictionaries are written with curly brackets, and have keys and values:
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict)
Dictionary Items
Example
Ordered or Unordered?
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.
When we say that dictionaries are ordered, it means that the items have a
defined order, and that order will not change.
Unordered means that the items do not have a defined order, you cannot refer to
an item by using an index.
Changeable
Dictionaries are changeable, meaning that we can change, add or remove items
after the dictionary has been created.
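Duplicates Not Allowed
Dictionaries cannot have two items with the same key.
A repeated key overwrites the earlier value:
Example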
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964,
"year": 2020
}
print(thisdict)
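The sketch below shows the dictionary operations described in this section: access by key, changing, adding, and removing items, and what happens to a repeated key. The "color" key is made up for the example.

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

print(thisdict["brand"])  # Ford (items are accessed by key)

thisdict["year"] = 2020    # change an existing value
thisdict["color"] = "red"  # add a new key ("color" is a made-up key)
del thisdict["model"]      # remove an item
print(thisdict)  # {'brand': 'Ford', 'year': 2020, 'color': 'red'}

# A repeated key in a literal keeps only the last value
dup = {"year": 1964, "year": 2020}
print(dup)  # {'year': 2020}
```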
UNIT 2
Control Structures and Functions
Python Conditional Statement
Python supports the usual logical conditions from mathematics:
Equals: a == b
Not Equals: a != b
Less than: a < b
Less than or equal to: a <= b
Greater than: a > b
Greater than or equal to: a >= b
These conditions can be used in several ways, most commonly in "if
statements" and loops.
If Statements
An "if statement" is written by using the if keyword.
Example
If statement:
a = 33
b = 200
if b > a:
    print("b is greater than a")
Indentation
Python relies on indentation (whitespace at the beginning of a line) to define
scope in the code. Other programming languages often use curly-brackets for
this purpose.
Example
If statement, without indentation (will raise an error):
a = 33
b = 200
if b > a:
print("b is greater than a") # you will get an error
Elif
The elif keyword is Python's way of saying "if the previous conditions were not
true, then try this condition".
Example
a = 33
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
Else
The else keyword catches anything which isn't caught by the preceding
conditions.
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")
In this example a is greater than b, so the first condition is not true, also
the elif condition is not true, so we go to the else condition and print to screen
that "a is greater than b".
You can also have an else without the elif:
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")
Logical Operators
And
The and keyword is a logical operator, and is used to combine conditional
statements:
Example
Test if a is greater than b, AND if c is greater than a:
a = 200
b = 33
c = 500
if a > b and c > a:
    print("Both conditions are True")
Or
The or keyword is a logical operator, and is used to combine conditional
statements:
Example
Test if a is greater than b, OR if a is greater than c:
a = 200
b = 33
c = 500
if a > b or a > c:
    print("At least one of the conditions is True")
Not
The not keyword is a logical operator, and is used to reverse the result of the
conditional statement:
Example
Test if a is NOT greater than b:
a = 33
b = 200
if not a > b:
    print("a is NOT greater than b")
TAKING INPUT IN PYTHON
Developers often need to interact with users, either to get data or to
provide some sort of result. Many programs use a dialog box to ask the
user for input. Python provides built-in functions to read input from
the keyboard:
input ( prompt ) in Python 3
raw_input ( prompt ) in Python 2 only
input (): This function takes the input from the user and returns it
as a string. The type of the returned object is always <class
'str'>. It does not evaluate the expression; it just returns the complete
statement as a string. When the input function is called, it stops the
program and waits for the user's input.
When the user presses Enter, the program resumes and returns what the
user typed.
Syntax:
inp = input('STATEMENT')
Whatever you enter as input, the input function returns it as a string.
If you enter an integer value, the input() function still returns it as a
string. You need to explicitly convert it into an integer in your code
using typecasting.
raw_input(): This function works only in older versions (Python 2.x). In
Python 3 it was renamed to input().
It takes exactly what is typed from the keyboard, converts it
to a string, and then returns it to the variable in which we want to store it.
Example:
# Python 2 program showing
# a use of raw_input()
g = raw_input("Enter your name : ")
print g
We can use raw_input() to enter numeric data also. In that case, we use
typecasting.
Note: the input() function returns all input as a string.
To read a value of another type, wrap the call in a conversion function. Two
common ones are:
int(input())
float(input())
num = int(input("Enter a number: "))
print(num, " ", type(num))
floatNum = float(input("Enter a decimal number: "))
print(floatNum, " ", type(floatNum))
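int() and float() raise a ValueError when the typed text is not a number. The sketch below shows one way to guard the conversion; to_number is a hypothetical helper, not a built-in, and the strings stand in for what a user might type.

```python
def to_number(text):
    """Convert text, as returned by input(), to int or float.

    Hypothetical helper for illustration; returns None if the text
    is not a valid number.
    """
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None

# The strings below stand in for what a user might type
print(to_number("42"))     # 42
print(to_number("3.5"))    # 3.5
print(to_number("hello"))  # None
```

In a real program the call would look like to_number(input("Enter a number: ")).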
Python Loops
Python has two primitive loop commands:
while loops
for loops
The while Loop
With the while loop we can execute a set of statements as long as a condition is
true.
Example
i = 1
while i < 6:
    print(i)
    i += 1
With the break statement we can stop the loop even if the while condition is
true:
Example
i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1
The else Statement
With the else statement we can run a block of code once when the condition no
longer is true:
Example
Print a message once the condition is false:
i = 1
while i < 6:
    print(i)
    i += 1
else:
    print("i is no longer less than 6")
FOR LOOP
[Figure: flowchart of the Python for loop]
How to use the for loop in Python
In Python, the for loop is used to iterate over a sequence (such as a list, tuple,
string, or dictionary) or any iterable object. The basic syntax of the for loop is:
Python For Loop Syntax
for var in iterable:
    # statements
The for loop in Python is a loop statement used for sequential traversal,
that is, for iterating over an iterable such as a string, tuple, list, set,
or dictionary.
In Python, there is no C-style for loop, i.e., for (i = 0; i < n; i++). The for
loop in Python is similar to the for-each loop in other languages.
Python For Loop with String
This code uses a for loop to iterate over a string and print each character
on a new line. The loop assigns each character in turn to the variable i.
s = "iips"
for i in s:
    print(i)
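A for loop can iterate over a dictionary in the same way. The sketch below prints each key-value pair using string formatting; the dictionary contents are sample data chosen only for illustration.

```python
# Sample dictionary, chosen only for illustration
d = {"xyz": 123, "abc": 345}

# Iterating over a dictionary yields its keys
for i in d:
    print("%s %d" % (i, d[i]))

# items() yields key-value pairs directly
pairs = [(key, value) for key, value in d.items()]
print(pairs)  # [('xyz', 123), ('abc', 345)]
```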
Python for loop with Range
This code uses a Python for loop with index in conjunction with
the range() function to generate a sequence of numbers starting from 0, up to
(but not including) 10, and with a step size of 2. For each number in the
sequence, the loop prints its value using the print() function. The output will
show the numbers 0, 2, 4, 6, and 8.
for i in range(0, 10, 2):
    print(i)
Nested For Loops in Python
This code uses nested for loops to iterate over two ranges of numbers (1 to 3
inclusive) and prints the value of i and j for each combination of the two loops.
The inner loop is executed for each value of i in the outer loop. The output of
this code will print the numbers from 1 to 3 three times, as each value of i is
combined with each value of j.
for i in range(1, 4):
    for j in range(1, 4):
        print(i, j)
Python For Loop Over List
This code uses a for loop to iterate over a list of strings, printing each item in
the list on a new line. The loop assigns each item to the variable i and continues
until all items in the list have been processed.
# Python program to illustrate
# iterating over a list
l = ["mba", "5yrs", "iips"]
for i in l:
    print(i)
Python Functions
A Python function is a block of statements that performs a specific task. The idea
is to put some commonly or repeatedly done tasks together and make a function
so that, instead of writing the same code again and again for different inputs, we
can call the function and reuse the code it contains.
Some Benefits of Using Functions
Increase Code Readability
Increase Code Reusability
Types of Functions in Python
Below are the different types of functions in Python:
Built-in library function: These are Standard functions in Python that
are available to use.
User-defined function: We can create our own functions based on our
requirements.
Creating a Function in Python
We can define a function in Python using the def keyword, followed by the
function name, parentheses, and a colon. The indented statements below that
line make up the function body, and we can add whatever functionality we
require. The following example shows how to write a function in Python.
# A simple Python function
def fun():
    print("Welcome to GFG")
Calling a Function in Python
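After creating a function in Python, we can call it by using the name of the
function followed by parentheses containing the arguments of that
particular function. Below is an example of calling a function defined with def.
# A simple Python function
def fun():
    print("Welcome to GFG")

# Call the function by name, followed by parentheses
fun()

# A function that takes one argument
# (this definition is assumed; only the calls appear in the notes)
def my_function(name):
    print("Welcome", name)

my_function("GURLEEN")
my_function("SIMARPREET")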
# A simple Python function to check
# whether x is even or odd
def evenOdd(x):
    if x % 2 == 0:
        print("even")
    else:
        print("odd")

# Driver code
x = 2
y = 3
evenOdd(x)  # prints "even"
evenOdd(y)  # prints "odd"
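A function can also hand a result back to the caller with the return keyword. The sketch below assumes a hypothetical swap helper (it is not defined anywhere in these notes) that returns its two arguments in reverse order.

```python
def swap(x, y):
    """Return the two arguments in reverse order (hypothetical helper)."""
    return y, x

x = 2
y = 3
x, y = swap(x, y)  # rebind the names to the returned values
print(x)  # 3
print(y)  # 2
```

Calling swap(x, y) without assigning the result would leave x and y unchanged, because the function only returns new values.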
ERROR HANDLING
Errors are problems in a program due to which the program will stop the
execution. On the other hand, exceptions are raised when some internal events
occur which change the normal flow of the program.
Error in Python can be of two types
Syntax errors
Exceptions.
Syntax Error:
As the name suggests this error is caused by the wrong syntax in the code. It
leads to the termination of the program.
Example:
There is a syntax error in the code below. The 'if' statement should be followed by a
colon (:), and the 'print' statement should be indented to be inside
the 'if' block.
amount = 10000
if(amount > 2999)
print("You are eligible to purchase new software")
Output:
SyntaxError (reported before any line of the program runs; the exact message
depends on the Python version)
Exceptions:
Exceptions are raised when the program is syntactically correct, but the code
results in an error while it runs. An exception that is not handled stops the
program; handling it lets the program change its normal flow and carry on.
marks = 10000
a = marks / 0
print(a)
Output:
ZeroDivisionError: division by zero
Example:
The code below attempts to add an integer ('x') and a string ('y') together, which is not
a valid operation, so it raises a 'TypeError'. The code uses
a 'try' and 'except' block to catch this exception and print an error message.
x = 5
y = "hello"
try:
    z = x + y
except TypeError:
    print("Error: cannot add an int and a str")
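The same try and except pattern can guard the division by zero shown earlier. This is a minimal sketch; safe_divide is a hypothetical helper name, not a built-in function.

```python
def safe_divide(a, b):
    """Divide a by b; return None instead of stopping on division by zero.

    Hypothetical helper for illustration.
    """
    try:
        return a / b
    except ZeroDivisionError:
        print("Error: cannot divide by zero")
        return None

print(safe_divide(10000, 4))  # 2500.0
print(safe_divide(10000, 0))  # prints the error message, then None
```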
What is PIP?
PIP is a package manager for Python packages, or modules if you like.
Note: If you have Python version 3.4 or later, PIP is included by default.
What is a Package?
A package contains all the files you need for a module.
Modules are Python code libraries you can include in your project.
Check if PIP is Installed
Navigate your command line to the location of Python's script directory, and
type the following:
Check PIP version:
C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip --version
File Handling
File handling is an important part of any web application.
Python has several functions for creating, reading, updating, and deleting files.
The key function for working with files in Python is the open() function.
The open() function takes two parameters; filename, and mode.
There are four different methods (modes) for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not
exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist
"x" - Create - Creates the specified file, returns an error if the file exists
In addition you can specify if the file should be handled as binary or text mode
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)
Syntax
To open a file for reading it is enough to specify the name of the file:
f = open("Sales Data.csv")
The code above is the same as:
f = open("Sales Data.csv", "rt")
Because "r" for read, and "t" for text are the default values, you do not need to
specify them.
Note: Make sure the file exists, or else you will get an error.
To open the file, use the built-in open() function.
The open() function returns a file object, which has a read() method for reading
the content of the file:
f = open("Sales Data.csv", "r")
print(f.read())
If the file is located in a different location, you will have to specify the file path,
like this:
Example
Open a file on a different location:
f = open("D:\\myfiles\\welcome.txt", "r")
print(f.read())
Read Lines
You can return one line by using the readline() method:
Example
Read one line of the file:
f = open("Sales Data.csv", "r")
print(f.readline())
By calling readline() two times, you can read the two first lines:
Example
Read two lines of the file:
f = open("Sales Data.csv", "r")
print(f.readline())
print(f.readline())
By looping through the lines of the file, you can read the whole file, line by line:
Example
Loop through the file line by line:
f = open("Sales Data.csv", "r")
for x in f:
    print(x)
Close Files
It is a good practice to always close the file when you are done with it.
Example
Close the file when you are finished with it:
f = open("Sales Data.csv", "r")
print(f.readline())
f.close()
Write to an Existing File
To write to an existing file, you must add a parameter to the open() function:
"a" - Append - will append to the end of the file
"w" - Write - will overwrite any existing content
Open the file "demofile2.txt" and append content to the file:
f = open("demofile2.txt", "a")
f.write("Now the file has more content!")
f.close()
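The write, append, and read modes can be tried together without touching real files. The sketch below works in a temporary folder and uses the with statement, which is not covered above but closes the file automatically at the end of the block.

```python
import os
import tempfile

# Work in a temporary folder so no real files are touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demofile2.txt")

    # "w" creates the file (or overwrites any existing content)
    with open(path, "w") as f:
        f.write("First line\n")

    # "a" appends to the end of the file
    with open(path, "a") as f:
        f.write("Now the file has more content!\n")

    # "r" opens the file for reading
    with open(path, "r") as f:
        lines = f.readlines()

print(lines)  # ['First line\n', 'Now the file has more content!\n']
```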
Delete Folder
To delete an entire folder, use the os.rmdir() method:
Example
Remove the folder "myfolder":
import os
os.rmdir("myfolder")
Note: You can only remove empty folders.
Pandas Series
Creating a Series
Pandas Series is created by loading the datasets from existing storage (which
can be a SQL database, a CSV file, or an Excel file).
Pandas Series can be created from lists, dictionaries, scalar values, etc.
Example: Creating a series using the Pandas Library.
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
Create Labels
With the index argument, you can name your own labels.
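Example
Create your own labels:
import pandas as pd
a = [1, 7, 2]
# only the label "y" is named in these notes; "x" and "z" are assumed
myvar = pd.Series(a, index = ["x", "y", "z"])
print(myvar)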
When you have created labels, you can access an item by referring to the label.
Example
Return the value of "y":
print(myvar["y"])
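As noted earlier, a Series can also be created from a dictionary, in which case the keys become the labels. A minimal sketch with made-up sample values:

```python
import pandas as pd

# Sample data, chosen only for illustration
calories = {"day1": 420, "day2": 380, "day3": 390}

# The dictionary keys become the labels of the Series
myvar = pd.Series(calories)
print(myvar)

# Access an item by referring to its label
print(myvar["day2"])  # 380
```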
Pandas DataFrame
A Pandas DataFrame is a 2 dimensional data structure, like a 2
dimensional array, or a table with rows and columns.
Series is like a column, a DataFrame is the whole table.
Create a simple Pandas DataFrame:
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Locate Row
Pandas use the loc attribute to return one or more specified row(s)
Example
Return row 0:
#refer to the row index:
print(df.loc[0])
Example
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])
Tip: use to_string() to print the entire DataFrame:
print(df.to_string())
If you have a large DataFrame with many rows, printing it shows only the first
5 rows and the last 5 rows:
Example
Print the DataFrame without the to_string() method:
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
max_rows
The number of rows returned is defined in Pandas option settings.
You can check your system's maximum rows with
the pd.options.display.max_rows statement.
Example
Check the number of maximum returned rows:
import pandas as pd
print(pd.options.display.max_rows)
By default the number is 60, which means that if the DataFrame contains more
than 60 rows, the print(df) statement will return only the headers and the first
and last 5 rows.
You can change the maximum rows number with the same statement.
Example
Increase the maximum number of rows to display the entire DataFrame:
import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('data.csv')
print(df)
How to Use Pandas with Excel Files?
Pandas provides powerful tools to read from and write to Excel files, making it
easy to integrate Excel data with your Python scripts.
Reading Excel Files
You can read Excel files using the pd.read_excel() function. It requires
the openpyxl library for .xlsx files or the xlrd library for .xls files.
The head() method returns the headers and a specified number of rows, starting
from the top.
In our examples we will be using a CSV file called 'data.csv'.
Example
Print the first 10 rows of the DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head(10))
Note: if the number of rows is not specified, the head() method will return the
top 5 rows.
Example
Print the first 5 rows of the DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
There is also a tail() method for viewing the last rows of the DataFrame.
The tail() method returns the headers and a specified number of rows, starting
from the bottom.
Example
Print the last 5 rows of the DataFrame:
print(df.tail())
Null Values
The info() method also tells us how many Non-Null values there are present in
each column, and in our data set it seems like there are 164 of 169 Non-Null
values in the "Calories" column.
Which means that there are 5 rows with no value at all, in the "Calories"
column, for whatever reason.
Empty values, or Null values, can be bad when analyzing data, and you should
consider removing rows with empty values.
Data Cleaning
Data cleaning means fixing bad data in your data set.
Bad data could be:
Empty cells
Data in wrong format
Wrong data
Duplicates
Empty Cells
Empty cells can potentially give you a wrong result when you analyze data.
Remove Rows
One way to deal with empty cells is to remove rows that contain empty cells.
This is usually OK, since data sets can be very big, and removing a few rows
will not have a big impact on the result.
Example
import pandas as pd
df = pd.read_csv('data.csv')
new_df = df.dropna()
print(new_df.to_string())
Note: By default, the dropna() method returns a new DataFrame, and will not
change the original.
If you want to change the original DataFrame, use the inplace = True argument:
Example
import pandas as pd
df = pd.read_csv('data.csv')
df.dropna(inplace = True)
print(df.to_string())
Note: Now, dropna(inplace = True) will NOT return a new DataFrame, but
it will remove all rows containing NULL values from the original DataFrame.
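The difference between the two forms can be checked on a small in-memory table. The sketch below builds a three-row DataFrame that stands in for 'data.csv'; the column names and values are made up.

```python
import pandas as pd

# Small in-memory table standing in for 'data.csv' (values are made up)
df = pd.DataFrame({
    "Duration": [60, 45, 30],
    "Calories": [409.1, None, 195.1],
})

new_df = df.dropna()  # returns a new DataFrame
print(len(df))        # 3 (the original is unchanged)
print(len(new_df))    # 2

result = df.dropna(inplace=True)  # changes df itself
print(result)         # None
print(len(df))        # 2
```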
Matplotlib
Matplotlib is a powerful plotting library in Python used for creating static,
animated, and interactive visualizations. Matplotlib’s primary purpose is to
provide users with the tools and functionality to represent data graphically,
making it easier to analyze and understand. It was originally developed by
John D. Hunter in 2003 and is now maintained by a large community of
developers.
Key Features of Matplotlib:
1. Versatility: Matplotlib can generate a wide range of plots, including
line plots, scatter plots, bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control
every aspect of the plot, such as line styles, colors, markers, labels,
and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with
NumPy, making it easy to plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable
for publication with fine-grained control over the plot aesthetics.
5. Extensible: Matplotlib is highly extensible, with a large ecosystem
of add-on toolkits and extensions like Seaborn, Pandas plotting
functions, and Basemap for geographical plotting.
6. Cross-Platform: It is platform-independent and can run on various
operating systems, including Windows, macOS, and Linux.
7. Interactive Plots: Matplotlib supports interactive plotting through
the use of widgets and event handling, enabling users to explore data
dynamically.
Matplotlib is popular due to its ease of use, extensive documentation,
and wide range of plotting capabilities. It offers flexibility in
customization, supports various plot types, and integrates well with
other Python libraries like NumPy and Pandas.
Matplotlib is a suitable choice for various data visualization tasks,
including exploratory data analysis, scientific plotting, and creating
publication-quality plots. It excels in scenarios where users require
fine-grained control over plot customization and need to create complex or
specialized visualizations.
Applications of Matplotlib
Matplotlib is widely used in various fields for data visualization, including:
1. Scientific Research: For plotting experimental results and
visualizations that describe the data more effectively.
2. Finance: For creating financial charts to analyze market trends and
movements.
3. Data Analysis: For exploratory data analysis in fields such as data
science and machine learning.
4. Education: For teaching complex concepts in mathematics, physics,
and statistics through visual aids.
5. Engineering: For visualizing engineering simulations and results.
Characteristics of Matplotlib
Key characteristics of Matplotlib include:
Versatility: It can create a wide range of static, animated, and
interactive plots.
Customizability: Almost every element of a plot (like sizes, colors, and
fonts) can be customized.
Extensibility: It can be used with a variety of GUI modules to create
graphical applications.
Integration: Easily integrates with other libraries like NumPy and
Pandas for efficient data manipulation.
Output Formats: Supports many output formats, including PNG, PDF,
SVG, EPS, and interactive backends.
Matplotlib is a low-level graph plotting library in Python that serves as a
visualization utility.
Matplotlib was created by John D. Hunter.
Matplotlib is open source and we can use it freely.
Matplotlib is mostly written in Python; a few segments are written in C,
Objective-C and JavaScript for platform compatibility.
Installation of Matplotlib
If Python and PIP are already installed on a system,
install Matplotlib using this command:
C:\Users\Your Name>pip install matplotlib
Import Matplotlib
Once Matplotlib is installed, import it in your applications by adding
the import module statement:
import matplotlib
print(matplotlib.__version__)
Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, and are
usually imported under the plt alias:
import matplotlib.pyplot as plt
Now the Pyplot package can be referred to as plt.
Parameters:
The plot() function and its companion functions accept parameters that let us
set axes scales and format the graphs. These parameters are mentioned below:
plt.plot(x, y): plots x and y using the default line style and color.
plt.axis([xmin, xmax, ymin, ymax]): scales the x-axis and y-axis from
minimum to maximum values.
plt.plot(x, y, color='green', marker='o', linestyle='dashed', linewidth=2,
markersize=12): x and y coordinates are marked using circular markers of
size 12, joined by a green dashed line of width 2.
plt.xlabel('X-axis'): names the x-axis.
plt.ylabel('Y-axis'): names the y-axis.
plt.plot(x, y, label='Sample line'): the plotted line is listed as
'Sample line' in the legend (shown once plt.legend() is called).
Example
Draw a line in a diagram from position (0,0) to position (6,250):
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
Plotting Without Line
To plot only the markers, you can use shortcut string notation parameter 'o',
which means 'rings'.
Example
Draw two points in the diagram, one at position (1, 3) and one in position (8,
10):
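A minimal sketch of this example. The Agg backend line is an assumption so the code also runs on a machine without a display; for interactive use, remove that line and finish with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

xpoints = [1, 8]
ypoints = [3, 10]

# 'o' draws a ring marker at each point and no connecting line
lines = plt.plot(xpoints, ypoints, 'o')
print(lines[0].get_marker())  # o
```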
Example
Add a title and axis labels to a plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
font1 = {'family':'serif','color':'blue','size':20}
plt.plot(x, y)
plt.title("Sports Watch Data",fontdict = font1, loc = 'left')
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")
plt.show()
Example
Draw two plots side by side with subplot():
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(1, 2, 1)
plt.plot(x,y)
plt.title("SALES")
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.title("INCOME")
plt.show()
Matplotlib Scatter
Matplotlib Bars
Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:
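Example
Draw 4 bars (pyplot is assumed to be imported as plt; the values are sample data):
x = ["A", "B", "C", "D"]
y = [3, 8, 1, 10]
plt.bar(x,y)
plt.show()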
Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use
the barh() function:
import matplotlib.pyplot as plt
import numpy as np
# sample data, for illustration only
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.barh(x,y)
plt.show()
Bar Color
The bar() and barh() take the keyword argument color to set the color of the
bars:
plt.bar(x, y, color = "red")
Bar Width
The bar() takes the keyword argument width to set the width of the bars:
plt.bar(x, y, width = 0.1)
Note: For horizontal bars, use height instead of width.
Bar Height
The barh() takes the keyword argument height to set the height of the bars:
plt.barh(x, y, height = 0.1)
Matplotlib Histograms
A histogram is like a visual summary that shows how often different values
appear in a set of data. Imagine you have a collection of numbers, like ages of
people. A histogram divides these numbers into groups, called "bins," and then
uses bars to represent how many numbers fall into each bin. The taller the bar,
the more numbers are in that group.
Histogram in Matplotlib
We can create a histogram in Matplotlib using the hist() function. This function
allows us to customize various aspects of the histogram, such as the number of
bins, color, and transparency. Histogram in Matplotlib is used to represent the
distribution of numerical data, helping you to identify patterns.
The hist() Function
The hist() function in Matplotlib takes a dataset as input and divides it into
intervals (bins). It then displays the frequency (count) of data points falling
within each bin as a bar graph.
Following is the syntax of hist() function in Matplotlib −
Syntax
plt.hist(x, bins=None, range=None, density=False, cumulative=False,
color=None, edgecolor=None, ...)
Where,
x is the input data for which the histogram is determined.
bins (optional) is the number of bins or the bin edges.
range (optional) is the lower and upper range of the bins. Default is the
minimum and maximum of x
If density (optional) is True, the histogram represents a probability
density function. Default is False.
If cumulative (optional) is True, a cumulative histogram is computed.
Default is False.
These are just a few parameters; there are more optional parameters available
for customization.
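A minimal sketch of hist() in use. The ages are made-up sample data, and the Agg backend line is an assumption so the code also runs without a display; for interactive use, remove that line and finish with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up sample data: ages of 12 people
ages = [22, 25, 27, 31, 35, 36, 38, 41, 45, 52, 58, 63]

# hist() returns the count per bin, the bin edges, and the drawn bars
counts, edges, patches = plt.hist(ages, bins=4, color="skyblue", edgecolor="black")
plt.xlabel("Age")
plt.ylabel("Number of people")

print(counts)  # how many ages fall into each of the 4 bins
print(edges)   # the 5 edges that delimit the 4 bins
```

With bins=4 the range from 22 to 63 is split into four equal intervals, so hist() returns five bin edges.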
Matplotlib Pie Charts
With Pyplot, you can use the pie() function to draw pie charts:
y = [35, 25, 25, 15]
plt.pie(y)
plt.show()
The pie chart draws one piece (called a wedge) for each value in
the array (in this case [35, 25, 25, 15]).
By default the plotting of the first wedge starts from the x-axis and
moves counterclockwise.
Note: The size of each wedge is determined by comparing the value with all the
other values, by using this formula:
The value divided by the sum of all values: x/sum(x)
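The formula can be checked by hand in plain Python. The sketch below uses the values [35, 25, 25, 15] from the text and also converts each share into a wedge angle, which is an extra step not mentioned above.

```python
values = [35, 25, 25, 15]
total = sum(values)

# Share of the pie for each value: x / sum(x)
fractions = [v / total for v in values]

for v, fraction in zip(values, fractions):
    # value, share of the whole pie, wedge angle in degrees (rounded)
    print(v, round(fraction, 2), round(fraction * 360, 1))
```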
Matplotlib vs Seaborn
Syntax
Matplotlib: It uses comparatively complex and lengthy syntax. Example: syntax
for a bar graph is matplotlib.pyplot.bar(x_axis, y_axis).
Seaborn: It uses comparatively simple syntax which is easier to learn and
understand. Example: syntax for a bar graph is
seaborn.barplot(x=x_axis, y=y_axis).
Pliability
Matplotlib: Matplotlib is highly customizable and robust.
Seaborn: Seaborn avoids overlapping plots with the help of its default themes.
Algorithm: Linear Regression
Type (Regression, Classification): Regression
Purpose: Predict continuous output values
Method: Linear equation minimizing sum of squares of residuals
Use Cases: Predicting continuous values

Algorithm: Decision Trees
Type (Regression, Classification): Both
Purpose: Model decisions and outcomes
Method: Tree-like structure with decisions and outcomes
Use Cases: Classification and Regression tasks

Algorithm: Random Forests
Type (Regression, Classification): Both
Purpose: Improve classification and regression accuracy
Method: Combining multiple decision trees
Use Cases: Reducing overfitting, improving prediction accuracy

Algorithm: SVM
Type (Regression, Classification): Both
Purpose: Create hyperplane for classification or predict continuous values
Method: Maximizing margin between classes or predicting continuous values
Use Cases: Classification and Regression tasks

Algorithm: KNN
Type (Regression, Classification): Both
Purpose: Predict class or value based on k closest neighbors
Method: Finding k closest neighbors and predicting based on majority or average
Use Cases: Classification and Regression tasks, sensitive to noisy data

Algorithm: […]
Type (Regression, Classification): […]
Purpose: […] to create strong model
Method: […] errors with new models
Use Cases: […] Regression tasks to improve prediction accuracy

Algorithm: Naive Bayes
Type (Regression, Classification): Classification
Purpose: Predict class based on feature independence assumption
Method: Bayes' theorem with feature independence assumption
Use Cases: Text classification, spam filtering, sentiment analysis, medical […]
Unsupervised Learning
The input to the unsupervised learning models is as follows:
Unstructured data: May contain noisy (meaningless) data, missing
values, or unknown data
Unlabeled data: Data only contains values for the input parameters; there is
no target value (output). It is easier to collect than the labeled
data used in the supervised approach.
Unsupervised Learning Algorithms
There are three main types of algorithms used with unlabeled
datasets:
Clustering
Association Rule Learning
Dimensionality Reduction
Applications of Unsupervised learning
Customer segmentation: Unsupervised learning can be used to segment
customers into groups based on their demographics, behavior, or
preferences. This can help businesses to better understand their customers
and target them with more relevant marketing campaigns.
Fraud detection: Unsupervised learning can be used to detect fraud in
financial data by identifying transactions that deviate from the expected
patterns. This can help to prevent fraud by flagging these transactions for
further investigation.
Recommendation systems: Unsupervised learning can be used to
recommend items to users based on their past behavior or
preferences. For example, a recommendation system might use
unsupervised learning to identify users who have similar taste in
movies, and then recommend movies that those users have enjoyed.
Natural language processing (NLP): Unsupervised learning is used in a
variety of NLP tasks, including topic modeling, document clustering, and
part-of-speech tagging.
Image analysis: Unsupervised learning is used in a variety of image
analysis tasks, including image segmentation, object detection, and image
pattern recognition.
Conclusion
Unsupervised learning is a versatile and powerful tool for exploring and
understanding unlabeled data. It has a wide range of applications, from
customer segmentation to fraud detection to image analysis. As the field of
machine learning continues to develop, unsupervised learning is likely to play
an increasingly important role in various domains.
Applications of unsupervised learning
Unsupervised learning has a wide range of applications, including:
Clustering: Grouping data points into clusters based on their
similarities.
Dimensionality reduction: Reducing the number of features in a dataset
while preserving as much information as possible.
Anomaly detection: Identifying data points that deviate from the
expected patterns, often signaling anomalies or outliers.
Recommendation systems: Recommending items to users based on their
past behavior or preferences.
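To make the idea of clustering concrete, the sketch below groups a few numbers around their nearest centre using a tiny one-dimensional version of k-means. It is an illustration only: kmeans_1d is a hypothetical helper, the ages and starting centroids are made up, and real projects would normally use a library implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group numbers around the nearest centroid (a tiny k-means sketch)."""
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters, centroids

ages = [18, 21, 22, 45, 47, 50]  # unlabeled sample data (made up)
clusters, centroids = kmeans_1d(ages, centroids=[20, 40])
print(clusters)  # [[18, 21, 22], [45, 47, 50]]
```

No labels are supplied; the two groups emerge only from how close the values are to each other.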