Python - Unit1-7

Uploaded by

Priyal Choudhary
Copyright
© All Rights Reserved

UNIT 1

Introduction to Python

Programming Cycle
Humans use various languages to communicate with each other,
whereas computers understand only Machine Language/Computer Language
of 1s and 0s.
We write code using various programming languages like C, C++, C#, PHP,
Java, Python etc.

Higher Level Languages: Programming languages closer to the languages humans
use are Higher Level Languages.

Lower Level Languages: Programming languages closer to the language
computers understand, i.e. 0s and 1s, are Lower Level Languages.

Translators are used to convert code written in a programming language (source
code) into Machine Language/Computer Language.
Translators can be:
Interpreters (the source code is translated and executed line by line)
or
Compilers (the whole source code is translated in one go, before it is executed)
Overview of Python

What is Python?

Python is a very popular, widely-used, and easy to learn programming language.

Python is a general-purpose programming language.
It is used in many areas. Some of the important fields where Python is used are:

• Data Science
• Machine Learning
• Artificial Intelligence
• Web Development
• Mobile App Development
• Game Development
and a lot more…

It supports a variety of paradigms for programming, including structured
programming, object-oriented programming, and functional programming.
Guido van Rossum began developing Python in 1989, and the first version was
released in 1991.
The name Python doesn’t refer to the snake. Apparently, Guido van Rossum
was a big fan of Monty Python’s Flying Circus, a TV series by a well-known
comedy group from Britain, and he named the language after them.

Setting up the environment

IDLE & the Python Shell


Software Development Environments are applications that are used to write
code.
Some of these environments are free; others can be costly.
There are many different software development environments (applications)
that you can use to write code in Python.

You may use the IDLE environment for this course.

IDLE is named after Eric Idle, one of the founding members of Monty Python
(a well-known comedy group from Britain).
IDLE is free. When you download and run the Python installer, it installs IDLE
on your computer.
The IDLE environment is cross-platform, i.e. it looks almost
identical on a Windows computer, Mac, or Linux system.
IDLE is Python’s Integrated Development and Learning Environment.

IDLE has the following features:

• cross-platform: works mostly the same on Windows, Unix, and macOS
• Python shell window (interactive interpreter) with colorizing of code
input, output, and error messages
• multi-window text editor with multiple undo, Python colorizing, smart
indent, call tips, auto completion, and other features
• search within any window, replace within editor windows, and search
through multiple files (grep)
• debugger with persistent breakpoints, stepping, and viewing of global and
local namespaces
• configuration, browsers, and other dialogs

Variables

Variables are containers for storing data values.

Creating Variables

Python has no command for declaring a variable.

A variable is created the moment you first assign a value to it.
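
A minimal sketch of how a variable comes into existence on its first assignment. The names x and y and their values are only illustrative and are not part of the course material:

```python
x = 5          # x is created here and holds an int
y = "Hello"    # y is created here and holds a str
x = "five"     # the same name can later be assigned a value of another type

print(x)
print(y)
print(type(x))
```

No separate declaration step is needed; the assignment itself creates the variable.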

Variable Names
A variable can have a short name (like x and y) or a more descriptive name
(age, carname, total_volume). Rules for Python variables:

• A variable name must start with a letter or the underscore character
• A variable name cannot start with a number
• A variable name can only contain alpha-numeric characters and
underscores (A-Z, a-z, 0-9, and _ )
• Variable names are case-sensitive (age, Age and AGE are three different
variables)
• A variable name cannot be any of the Python keywords.

Keywords
When the Python interpreter reads your code, it looks for special words called
keywords to understand what your code is trying to say.

The following is a list of the Python keywords. Keywords are case-sensitive and
must be written exactly as shown (async and await are keywords from Python 3.7).

and       as        assert    break     class
continue  def       del       elif      else
except    finally   for       from      global
if        import    in        is        lambda
nonlocal  not       or        pass      raise
return    try       while     with      yield
False     None      True      async     await
Remember
You cannot use Python keywords as a variable name.
If you attempt to do so, the Python interpreter will report a syntax error
before it runs your program.
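
The naming rules and the keyword restriction can be checked from code. This is a minimal sketch using the standard keyword module and the str.isidentifier() method; the helper name is_valid_variable_name and the sample names are assumptions made for illustration. Note that isidentifier() also accepts non-English letters, which these notes do not cover.

```python
import keyword

def is_valid_variable_name(name):
    """Return True if name follows the naming rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Try a few candidate names (illustrative values only)
for candidate in ["age", "_total", "2fast", "total-volume", "class"]:
    print(candidate, is_valid_variable_name(candidate))
```

The full keyword list for the installed Python version is available as keyword.kwlist.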

Math Operators
In Python the following math operators can be used in assignment statements:

+ Add – If operands are numbers, adds them; if operands are strings, joins them
- Subtract – Subtracts two numbers
* Multiply – Multiplies two numbers
/ Divide – Divides two numbers and returns a float
// Integer Divide – Divides two numbers and rounds the result down (floor value
if not perfectly divisible); the result is an integer when both operands are integers
** Raise to the power of – Left operand raised to the power of right operand
% Modulo – Returns the remainder of the division
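
A short sketch of these operators in assignment statements. The variable names and the values 7 and 2 are chosen only for illustration:

```python
a = 7
b = 2

total = a + b          # 9
difference = a - b     # 5
product = a * b        # 14
quotient = a / b       # 3.5 (always a float)
floor_quotient = a // b    # 3
negative_floor = -7 // 2   # -4 (rounds down, not towards zero)
power = a ** b         # 49
remainder = a % b      # 1
joined = "py" + "thon" # "python"

print(total, difference, product, quotient)
print(floor_quotient, negative_floor, power, remainder)
print(joined)
```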
Built-in Data Types

In programming, data type is an important concept.

Variables can store data of different types, and different types can do different
things.

Python has the following data types built-in by default, in these categories:

Text Type: str

Numeric Types: int, float, complex

Sequence Types: list, tuple, range

Mapping Type: dict

Set Types: set, frozenset

Boolean Type: bool

Getting the Data Type

You can get the data type of any object by using the type() function:

Example

Print the data type of the variable x:

x=5
print(type(x))
Strings

Strings in Python are surrounded by either single quotation marks or double
quotation marks.

'hello' is the same as "hello".

You can display a string literal with the print() function:

Example
print("Hello")
print('Hello')

Python Numbers

There are three numeric types in Python:

• int
• float
• complex

Variables of numeric types are created when you assign a value to them:

Example
x = 1 # int
y = 2.8 # float
z = 1j # complex
Int

Int, or integer, is a whole number, positive or negative, without decimals, of
unlimited length.

Example

Integers:

x=1
y = 35656222554887711
z = -3255522
Float

Float, or "floating point number", is a number, positive or negative, containing
one or more decimals.

Example

Floats:
x = 1.10
y = 1.0
z = -35.59
Complex

Complex numbers are written with a "j" as the imaginary part:

Example

Complex:

x = 3+5j
y = 5j
z = -5j

Python Collections (Arrays)

There are four collection data types in the Python programming language:

• List is a collection which is ordered and changeable. Allows duplicate
members.
• Tuple is a collection which is ordered and unchangeable. Allows
duplicate members.
• Set is a collection which is unordered, unchangeable*, and unindexed. No
duplicate members.
• Dictionary is a collection which is ordered** and changeable. No
duplicate keys.

*Set items are unchangeable, but you can remove and/or add items whenever
you like.

**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.

When choosing a collection type, it is useful to understand the properties of that
type. Choosing the right type for a particular data set could mean retention of
meaning, and it could mean an increase in efficiency or security.
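
To compare the four types side by side, here is a minimal sketch that stores similar data in each of them. All variable names and values are illustrative assumptions:

```python
fruits_list = ["apple", "banana", "apple"]     # ordered, changeable, duplicates allowed
fruits_tuple = ("apple", "banana", "apple")    # ordered, unchangeable, duplicates allowed
fruits_set = {"apple", "banana", "apple"}      # unordered, duplicates dropped
fruit_prices = {"apple": 30, "banana": 10}     # key:value pairs, keys are unique

print(len(fruits_list))       # 3
print(len(fruits_tuple))      # 3
print(len(fruits_set))        # 2
print(fruit_prices["apple"])  # 30
```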

List

Lists are used to store multiple items in a single variable.


Lists are created using square brackets:

Example

Create a List:

thislist = ["apple", "banana", "cherry"]


print(thislist)
List Items

List items are ordered, changeable, and allow duplicate values.

List items are indexed, the first item has index [0], the second item has
index [1] etc.

Ordered

When we say that lists are ordered, it means that the items have a defined order,
and that order will not change.

If you add new items to a list, the new items will be placed at the end of the list.

Changeable

The list is changeable, meaning that we can change, add, and remove items in a
list after it has been created.
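
A minimal sketch of changing, adding, and removing list items, reusing the thislist name from the earlier example (the new values are illustrative):

```python
thislist = ["apple", "banana", "cherry"]

thislist[1] = "mango"       # change the item at index 1
thislist.append("orange")   # add an item at the end
thislist.remove("apple")    # remove an item by its value

print(thislist)  # ['mango', 'cherry', 'orange']
```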

Allow Duplicates

Since lists are indexed, lists can have items with the same value:

Example

Lists allow duplicate values:

thislist = ["apple", "banana", "cherry", "apple", "cherry"]


print(thislist)
List Length

To determine how many items a list has, use the len() function:

Example

Print the number of items in the list:

thislist = ["apple", "banana", "cherry"]


print(len(thislist))

A list can contain different data types:

Example

A list with strings, integers and boolean values:

list1 = ["abc", 34, True, 40, "male"]


Tuple

Tuples are used to store multiple items in a single variable.

A tuple is a collection which is ordered and unchangeable.

Tuples are written with round brackets.

Example

Create a Tuple:

thistuple = ("apple", "banana", "cherry")


print(thistuple)
Tuple Items

Tuple items are ordered, unchangeable, and allow duplicate values.

Tuple items are indexed, the first item has index [0], the second item has
index [1] etc.
Ordered

When we say that tuples are ordered, it means that the items have a defined
order, and that order will not change.

Unchangeable

Tuples are unchangeable, meaning that we cannot change, add or remove items
after the tuple has been created.
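
A minimal sketch of what happens when code tries to change a tuple. It uses a try/except block, which these notes explain in the error handling section of Unit 2; the name newtuple is an assumption for illustration:

```python
thistuple = ("apple", "banana", "cherry")

try:
    thistuple[1] = "mango"   # tuples do not support item assignment
except TypeError as error:
    print("Cannot change a tuple:", error)

# One workaround is to build a new tuple from the old one
newtuple = thistuple + ("orange",)
print(newtuple)
```

The original tuple stays as it was; only a new tuple is created.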

Allow Duplicates

Since tuples are indexed, they can have items with the same value:

Example

Tuples allow duplicate values:

thistuple = ("apple", "banana", "cherry", "apple", "cherry")


print(thistuple)

Tuple Length

To determine how many items a tuple has, use the len() function:

Example

Print the number of items in the tuple:

thistuple = ("apple", "banana", "cherry")


print(len(thistuple))

A tuple can contain different data types:

Example

A tuple with strings, integers and boolean values:

tuple1 = ("abc", 34, True, 40, "male")


Set

Sets are used to store multiple items in a single variable.

A set is a collection which is unordered, unchangeable*, and unindexed.

* Note: Set items are unchangeable, but you can remove items and add new
items.

Sets are written with curly brackets.

Example

Create a Set:

thisset = {"apple", "banana", "cherry"}


print(thisset)
Set Items

Set items are unordered, unchangeable, and do not allow duplicate values.

Unordered

Unordered means that the items in a set do not have a defined order.

Set items can appear in a different order every time you use them, and cannot be
referred to by index or key.

Unchangeable

Set items are unchangeable, meaning that we cannot change the items after the
set has been created.

Once a set is created, you cannot change its items, but you can remove items
and add new items.
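
A minimal sketch of adding and removing set items with the add() and remove() methods. The values are illustrative, and sorted() is used only so the printed order is predictable:

```python
thisset = {"apple", "banana", "cherry"}

thisset.add("orange")      # add a new item
thisset.remove("banana")   # remove an existing item

print(sorted(thisset))  # ['apple', 'cherry', 'orange']
```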
Duplicates Not Allowed

Sets cannot have two items with the same value.

Example

Duplicate values will be ignored:
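
A minimal sketch of this behaviour, assuming a set literal that lists "apple" twice (the thisset name follows the earlier examples):

```python
thisset = {"apple", "banana", "cherry", "apple"}

print(thisset)        # "apple" appears only once; the order may vary
print(len(thisset))   # 3
```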

Dictionary

Dictionaries are used to store data values in key:value pairs.

A dictionary is a collection which is ordered*, changeable and does not allow
duplicate keys.

*As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.

Dictionaries are written with curly brackets, and have keys and values:

Example

Create and print a dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict)

Dictionary Items

Dictionary items are ordered, changeable, and do not allow duplicates.

Dictionary items are presented in key:value pairs, and can be referred to by
using the key name.

Example

Print the "brand" value of the dictionary:


thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict["brand"])

Ordered or Unordered?

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier,
dictionaries are unordered.

When we say that dictionaries are ordered, it means that the items have a
defined order, and that order will not change.

Unordered means that the items do not have a defined order, and you cannot refer to
an item by using an index.

Changeable

Dictionaries are changeable, meaning that we can change, add or remove items
after the dictionary has been created.
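
A minimal sketch of changing, adding, and removing dictionary items, reusing the thisdict example (the "color" key is an assumption added for illustration):

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

thisdict["year"] = 2020      # change the value of an existing key
thisdict["color"] = "red"    # add a new key:value pair
del thisdict["model"]        # remove a key and its value

print(thisdict)  # {'brand': 'Ford', 'year': 2020, 'color': 'red'}
```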

Duplicates Not Allowed

Dictionaries cannot have two items with the same key:

Example

A duplicate key will overwrite the existing value for that key:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964,
"year": 2020
}
print(thisdict)

UNIT 2
Control Structures and Functions
Python Conditional Statement
Python supports the usual logical conditions from mathematics:
• Equals: a == b
• Not Equals: a != b
• Less than: a < b
• Less than or equal to: a <= b
• Greater than: a > b
• Greater than or equal to: a >= b
These conditions can be used in several ways, most commonly in "if
statements" and loops.
If Statements
An "if statement" is written by using the if keyword.
Example
If statement:
a = 33
b = 200
if b > a:
    print("b is greater than a")

Indentation
Python relies on indentation (whitespace at the beginning of a line) to define
scope in the code. Other programming languages often use curly-brackets for
this purpose.
Example
If statement, without indentation (will raise an error):
a = 33
b = 200
if b > a:
print("b is greater than a") # you will get an error
Elif
The elif keyword is Python's way of saying "if the previous conditions were not
true, then try this condition".
Example
a = 33
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")

In this example a is equal to b, so the first condition is not true, but
the elif condition is true, so we print to screen that "a and b are equal".

Else
The else keyword catches anything which isn't caught by the preceding
conditions.
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")

In this example a is greater than b, so the first condition is not true, also
the elif condition is not true, so we go to the else condition and print to screen
that "a is greater than b".
You can also have an else without the elif:
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")

Logical Operators
And
The and keyword is a logical operator, and is used to combine conditional
statements:
Example
Test if a is greater than b, AND if c is greater than a:
a = 200
b = 33
c = 500
if a > b and c > a:
    print("Both conditions are True")

Or
The or keyword is a logical operator, and is used to combine conditional
statements:
Example
Test if a is greater than b, OR if a is greater than c:
a = 200
b = 33
c = 500
if a > b or a > c:
    print("At least one of the conditions is True")

Not
The not keyword is a logical operator, and is used to reverse the result of the
conditional statement:
Example
Test if a is NOT greater than b:
a = 33
b = 200
if not a > b:
    print("a is NOT greater than b")
TAKING INPUT IN PYTHON
Developers often need to interact with users, either to get data or to
provide some sort of result. Most programs today use a dialog box as a way of
asking the user to provide some type of input. Python provides two built-in
functions to read input from the keyboard, depending on the Python version:
• input ( prompt ) (Python 3)
• raw_input ( prompt ) (Python 2 only)
• input(): This function takes the input from the user and returns it
as a string. The type of the returned object will always be <class
'str'>. It does not evaluate the expression; it just returns the complete
line as a string. When the input
function is called, it stops the program and waits for the user’s input.
When the user presses Enter, the program resumes and returns what the
user typed.
Syntax:
inp = input('STATEMENT')
Whatever you enter as input, the input function returns it as a string.
Even if you enter an integer value, the input() function still returns it as a
string. You need to explicitly convert it into an integer in your code
using typecasting.
• raw_input(): This function works only in older versions (Python 2.x) and
does not exist in Python 3.
This function takes exactly what is typed from the keyboard, converts it
to a string, and then returns it to the variable in which we want to store it.
Example:
# Python 2 program showing
# a use of raw_input()

g = raw_input("Enter your name : ")
print g
We can also use raw_input() to enter numeric data. In that case, we use
typecasting.
Note: the input() function returns all input as a string only.
Various functions can be used to convert the input to the desired type. A few of them
are:
• int(input())
• float(input())
num = int(input("Enter a number: "))
print(num, " ", type(num))
floatNum = float(input("Enter a decimal number: "))
print(floatNum, " ", type(floatNum))

Python Loops
Python has two primitive loop commands:
• while loops
• for loops
The while Loop

With the while loop we can execute a set of statements as long as a condition is
true.

Example

Print i as long as i is less than 6:

i = 1
while i < 6:
    print(i)
    i += 1

Note: remember to increment i, or else the loop will continue forever.


The while loop requires relevant variables to be ready, in this example we need
to define an indexing variable, i, which we set to 1.

The break Statement

With the break statement we can stop the loop even if the while condition is
true:

Example

Exit the loop when i is 3:

i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1
The else Statement
With the else statement we can run a block of code once when the condition no
longer is true:
Example
Print a message once the condition is false:
i = 1
while i < 6:
    print(i)
    i += 1
else:
    print("i is no longer less than 6")

FOR LOOP
[Figure: flowchart of the Python for loop]
How to use the for loop in Python
In Python, the for loop is used to iterate over a sequence (such as a list, tuple,
string, or dictionary) or any iterable object. The basic syntax of the for loop is:
Python For Loop Syntax
for var in iterable:
    # statements

The for loop in Python is a special type of loop statement that is used for
sequential traversal. It is used for iterating over an iterable like a
String, Tuple, List, Set, or Dictionary.
In Python, there is no C-style for loop, i.e., for (i=0; i<n; i++). The for loop
in Python is similar to the for-each loop in other languages, used for sequential
traversals.
Python For Loop with dictionary
This code uses a for loop to iterate over a dictionary and print each key-value
pair on a new line. The loop assigns each key to the variable i and uses d[i]
to look up the corresponding value, then prints both.

d = {'xyz': 123, 'abc': 345}

for i in d:
    print(i, d[i])
Python For Loop with Tuple
This code iterates over a tuple of tuples using a for loop with tuple unpacking.
In each iteration, the values from the inner tuple are assigned to variables a and
b, respectively, and then printed to the console using the print() function. The
output will show each pair of values from the inner tuples.
t = ((1, 2), (3, 4), (5, 6))
for a, b in t:
    print(a, b)

Python For Loop with String


This code uses a for loop to iterate over a string and print each character on a
new line. The loop assigns each character to the variable i and continues until
all characters in the string have been processed.
# Iterating over a String
print("String Iteration")

s = "iips"
for i in s:
    print(i)
Python for loop with Range
This code uses a Python for loop in conjunction with
the range() function to generate a sequence of numbers starting from 0, up to
(but not including) 10, and with a step size of 2. For each number in the
sequence, the loop prints its value using the print() function. The output will
show the numbers 0, 2, 4, 6, and 8.
for i in range(0, 10, 2):
    print(i)
Nested For Loops in Python
This code uses nested for loops to iterate over two ranges of numbers (1 to 3
inclusive) and prints the value of i and j for each combination of the two loops.
The inner loop is executed for each value of i in the outer loop. The output
is nine lines, from 1 1 to 3 3, as each value of i is
combined with each value of j.
for i in range(1, 4):
    for j in range(1, 4):
        print(i, j)
Python For Loop Over List
This code uses a for loop to iterate over a list of strings, printing each item in
the list on a new line. The loop assigns each item to the variable i and continues
until all items in the list have been processed.
# Python program to illustrate
# Iterating over a list
l = ["mba", "5yrs", "iips"]

for i in l:
    print(i)
Python Functions
A Python function is a block of statements that performs a specific task. The idea
is to put some commonly or repeatedly done tasks together and make a function
so that instead of writing the same code again and again for different inputs, we
can call the function to reuse the code contained in it over and over again.
Some Benefits of Using Functions
• Increase Code Readability
• Increase Code Reusability
Types of Functions in Python
Below are the different types of functions in Python:
• Built-in library function: These are standard functions in Python that
are available to use.
• User-defined function: We can create our own functions based on our
requirements.
Creating a Function in Python
We can define a function in Python using the def keyword. We can add any
type of functionality to it as we require. The following
example shows how to write a function definition in Python using the def
keyword.
# A simple Python function
def fun():
    print("Welcome to GFG")
Calling a Function in Python
After creating a function in Python, we can call it by using the name of the
function followed by parentheses containing the arguments for that
particular function. Below is an example of calling a function.
# A simple Python function
def fun():
    print("Welcome to GFG")

# Driver code to call a function
fun()
Arguments
Information can be passed into functions as arguments.
Arguments are specified after the function name, inside the parentheses. You
can add as many arguments as you want, just separate them with a comma.
The following example has a function with one argument (fname). When the
function is called, we pass along a first name, which is used inside the function
to print the full name:
Example
def my_function(fname):
    print(fname + " KAUR")

my_function("GURLEEN")
my_function("SIMARPREET")

# A simple Python function to check
# whether x is even or odd
def evenOdd(x):
    if (x % 2 == 0):
        print("even")
    else:
        print("odd")

A parameter is the variable listed inside the parentheses in the function
definition.
An argument is the value that is sent to the function when it is called.
Return Statement in Python Function
The function return statement is used to exit from a function and go back to the
function caller and return the specified value or data item to the caller. The
syntax for the return statement is:
return [expression_list]
The return statement can consist of a variable, an expression, or a constant
which is returned at the end of the function execution. If none of the above is
present with the return statement a None object is returned.
Example: Python Function Return Statement
def square_value(num):
    """This function returns the square
    value of the entered number"""
    return num**2

print(square_value(2))
print(square_value(-4))
# function to swap two values
def swap(x, y):
    temp = x
    x = y
    y = temp

# Driver code
x = 2
y = 3
swap(x, y)
print(x)  # still 2: swap() only changes its own local x and y
print(y)  # still 3
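
Because assignments inside a function only affect its local names, one way to get swapped values back to the caller is to return them. This is a minimal sketch; the function name swapped is an assumption and is not part of the original notes:

```python
def swapped(x, y):
    """Return the two values in reverse order."""
    return y, x

x = 2
y = 3
x, y = swapped(x, y)   # unpack the returned pair into x and y
print(x)  # 3
print(y)  # 2
```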

ERROR HANDLING
Errors are problems in a program due to which the program will stop the
execution. On the other hand, exceptions are raised when some internal events
occur which change the normal flow of the program.
Errors in Python can be of two types:
• Syntax errors
• Exceptions
Syntax Error:
As the name suggests, this error is caused by wrong syntax in the code. It
leads to the termination of the program.
Example:
There is a syntax error in the code below. The 'if' statement should be followed by a
colon (:), and the 'print' statement should be indented to be inside
the 'if' block.

amount = 10000
if(amount > 2999)
print("You are eligible to purchase new software")

Output:
SyntaxError (the exact message depends on the Python version)

Exceptions:
Exceptions are raised when the program is syntactically correct, but the code
results in an error while it runs. An exception changes the normal flow of the
program and, if it is not handled, stops the execution.

Different types of exceptions in python:


In Python, there are several built-in exceptions that can be raised when
an error occurs during the execution of a program. Here are some of the most
common types of exceptions in Python:
• TypeError: This exception is raised when an operation or function is
applied to an object of the wrong type, such as adding a string to an
integer.
• NameError: This exception is raised when a variable or function name is
not found in the current scope.
• IndexError: This exception is raised when an index is out of range for a
list, tuple, or other sequence types.
• KeyError: This exception is raised when a key is not found in a
dictionary.
• ValueError: This exception is raised when a function or method is called
with an invalid argument or input, such as trying to convert a string to an
integer when the string does not represent a valid integer.
• AttributeError: This exception is raised when an attribute or method is
not found on an object, such as trying to access a non-existent attribute of
a class instance.
• IOError: This exception is raised when an I/O operation, such as reading
or writing a file, fails due to an input/output error. In Python 3, IOError
is another name for OSError.
• ZeroDivisionError: This exception is raised when an attempt is made to
divide a number by zero.
• ImportError: This exception is raised when an import statement fails to
find or load a module.
Example:
In this code we are dividing 'marks' by zero, so an error known as
'ZeroDivisionError' will occur.

marks = 10000
a = marks / 0
print(a)

Output:
ZeroDivisionError: division by zero

The above example raised the ZeroDivisionError because we are trying to divide a
number by 0.
TRY EXCEPT block to resolve exceptions:
The try block lets you test a block of code for errors.
The except block lets you handle the error.
The else block lets you execute code when there is no error.
try:
    print(x)  # raises a NameError if x has not been defined
except:
    print("An exception occurred")

The code attempts to add an integer (‘x') and a string (‘y') together, which is not
a valid operation, and it will raise a ‘TypeError'. The code used
a ‘try' and ‘except' block to catch this exception and print an error message.

x = 5
y = "hello"
try:
    z = x + y
except TypeError:
    print("Error: cannot add an int and a str")
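
A try block can be followed by more than one except block, one for each type of exception. The sketch below is only an illustration: the function name safe_divide and its messages are assumptions, not part of the original notes.

```python
def safe_divide(text_a, text_b):
    """Divide two numbers given as strings and describe any problem."""
    try:
        result = int(text_a) / int(text_b)
    except ValueError:
        # int() could not convert one of the strings
        return "Error: both values must be whole numbers"
    except ZeroDivisionError:
        # the second value was 0
        return "Error: cannot divide by zero"
    else:
        # runs only when no exception was raised
        return result

print(safe_divide("10", "4"))    # 2.5
print(safe_divide("10", "0"))    # Error: cannot divide by zero
print(safe_divide("ten", "4"))   # Error: both values must be whole numbers
```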

Try with else clause


The code defines a function AbyB(a, b) that calculates c as ((a+b) / (a-b)) and
handles a potential ZeroDivisionError. It prints the result if there’s no division
by zero error. Calling AbyB(2.0, 3.0) calculates and prints -5.0, while
calling AbyB(3.0, 3.0) attempts to divide by zero, resulting in
a ZeroDivisionError, which is caught and "a/b result in 0" is printed.

def AbyB(a, b):
    try:
        c = ((a+b) / (a-b))
    except ZeroDivisionError:
        print("a/b result in 0")
    else:
        print(c)

AbyB(2.0, 3.0)
AbyB(3.0, 3.0)

What is PIP?
PIP is a package manager for Python packages, or modules if you like.
Note: If you have Python version 3.4 or later, PIP is included by default.
What is a Package?
A package contains all the files you need for a module.
Modules are Python code libraries you can include in your project.
Check if PIP is Installed
Navigate your command line to the location of Python's script directory, and
type the following:
Check PIP version:
C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip --version
File Handling
File handling is an important part of any web application.
Python has several functions for creating, reading, updating, and deleting files.
The key function for working with files in Python is the open() function.
The open() function takes two parameters: filename and mode.
There are four different methods (modes) for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not
exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist,
and overwrites any existing content
"x" - Create - Creates the specified file, returns an error if the file exists
In addition, you can specify whether the file should be handled in binary or text mode:
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)

Syntax
To open a file for reading it is enough to specify the name of the file:
f = open("Sales Data.csv")
The code above is the same as:
f = open("Sales Data.csv", "rt")
Because "r" for read, and "t" for text are the default values, you do not need to
specify them.
Note: Make sure the file exists, or else you will get an error.
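
The four modes can be tried out safely in a temporary folder. This is a minimal sketch: the file name demo.txt and the variable names are assumptions, and the tempfile module is used only so that no real files are touched.

```python
import os
import tempfile

# Work inside a temporary folder so no real files are changed
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")   # hypothetical file name

f = open(path, "x")        # "x": create the file, error if it already exists
f.write("first line\n")
f.close()

f = open(path, "a")        # "a": add to the end of the file
f.write("second line\n")
f.close()

f = open(path, "r")        # "r": read the file
content_after_append = f.read()
f.close()

f = open(path, "w")        # "w": replace whatever the file contained
f.write("only line\n")
f.close()

f = open(path, "r")
content_after_write = f.read()
f.close()

print(content_after_append)
print(content_after_write)
```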
To open the file, use the built-in open() function.
The open() function returns a file object, which has a read() method for reading
the content of the file:
f = open("Sales Data.csv", "r")
print(f.read())
If the file is located in a different location, you will have to specify the file path,
like this:
Example
Open a file on a different location:
f = open("D:\\myfiles\\welcome.txt", "r")
print(f.read())

Read Only Parts of the File


By default the read() method returns the whole text, but you can also specify
how many characters you want to return:
Example
Return the first 5 characters of the file:
f = open("Sales Data.csv", "r")
print(f.read(5))

Read Lines
You can return one line by using the readline() method:
Example
Read one line of the file:
f = open("Sales Data.csv", "r")
print(f.readline())
By calling readline() two times, you can read the first two lines:
Example
Read two lines of the file:
f = open("Sales Data.csv", "r")
print(f.readline())
print(f.readline())
By looping through the lines of the file, you can read the whole file, line by line:
Example
Loop through the file line by line:
f = open("Sales Data.csv", "r")
for x in f:
    print(x)
Close Files
It is a good practice to always close the file when you are done with it.
Example
Close the file when you are finished with it:
f = open("Sales Data.csv", "r")
print(f.readline())
f.close()
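
Python also offers the with statement, which closes the file automatically when its block ends. This is not used elsewhere in these notes, so treat the sketch below as an optional alternative; the temporary folder and the sample CSV content are assumptions made so the example can run on its own.

```python
import os
import tempfile

# Create a small sample file in a temporary folder
folder = tempfile.mkdtemp()
path = os.path.join(folder, "Sales Data.csv")

with open(path, "w") as f:
    f.write("item,amount\n")
    f.write("pen,10\n")

# The file is closed automatically when the with block ends
with open(path, "r") as f:
    first_line = f.readline()

print(first_line)
print(f.closed)   # True
```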
Write to an Existing File
To write to an existing file, you must add a parameter to the open() function:
"a" - Append - will append to the end of the file
"w" - Write - will overwrite any existing content
Open the file "demofile2.txt" and append content to the file:
f = open("demofile2.txt", "a")
f.write("Now the file has more content!")
f.close()

#open and read the file after the appending:


f = open("demofile2.txt", "r")
print(f.read())
Example
Open the file "demofile3.txt" and overwrite the content:
f = open("demofile3.txt", "w")
f.write("Woops! I have deleted the content!")
f.close()

#open and read the file after the overwriting:


f = open("demofile3.txt", "r")
print(f.read())
Note: the "w" mode will overwrite the entire file.

Create a New File


To create a new file in Python, use the open() function, with one of the following
parameters:
"x" - Create - will create a file, returns an error if the file exists
"a" - Append - will create a file if the specified file does not exist
"w" - Write - will create a file if the specified file does not exist
Example
Create a file called "myfile.txt":
f = open("myfile.txt", "x")
Result: a new empty file is created!
Example
Create a new file if it does not exist:
f = open("myfile.txt", "w")
Delete a File
To delete a file, you must import the os module, and run
its os.remove() function:
Remove the file "demofile.txt":
import os
os.remove("demofile.txt")

Check if File exist:


To avoid getting an error, you might want to check if the file exists before you
try to delete it:
Example
Check if file exists, then delete it:
import os
if os.path.exists("demofile.txt"):
    os.remove("demofile.txt")
else:
    print("The file does not exist")

Delete Folder
To delete an entire folder, use the os.rmdir() method:
Example
Remove the folder "myfolder":
import os
os.rmdir("myfolder")
Note: You can only remove empty folders.
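For a folder that still contains files, os.rmdir() raises an error. One option from the standard library is shutil.rmtree(), which deletes the folder together with everything in it. The sketch below builds a scratch folder first; the file name data.txt is made up.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch area for the demo
folder = os.path.join(base, "myfolder")
os.mkdir(folder)
with open(os.path.join(folder, "data.txt"), "w") as f:
    f.write("some content\n")

# os.rmdir() refuses to delete a folder that still has files in it.
try:
    os.rmdir(folder)
    rmdir_result = "removed"
except OSError:
    rmdir_result = "not empty"

# shutil.rmtree() deletes the folder together with its contents.
shutil.rmtree(folder)
folder_still_exists = os.path.exists(folder)

shutil.rmtree(base)                        # clean up the scratch area
print(rmdir_result, folder_still_exists)
```

Use rmtree() with care: the deleted files do not go to a recycle bin.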

What is the Pandas Library in Python?


Pandas is a Python library used for working with data sets.
It has functions for analyzing, cleaning, exploring, and manipulating data.
The name "Pandas" has a reference to both "Panel Data", and "Python Data
Analysis" and was created by Wes McKinney in 2008.
Pandas is a powerful and versatile library that simplifies the tasks of data
manipulation in Python. Pandas is well-suited for working with tabular data,
such as spreadsheets or SQL tables.
The Pandas library is an essential tool for data analysts, scientists, and engineers
working with structured data in Python.
The data produced by Pandas is often used as input for plotting functions
in Matplotlib, statistical analysis in SciPy, and machine learning
algorithms in Scikit-learn.
Python’s Pandas library is one of the most widely used tools to analyze, clean, and manipulate data.
Here is a list of things that we can do using Pandas.
 Data set cleaning, merging, and joining.
 Easy handling of missing data (represented as NaN) in floating point as
well as non-floating point data.
 Columns can be inserted and deleted from DataFrame and higher-
dimensional objects.
 Data Visualization.
Getting Started with Pandas
Let’s see how to start working with the Python Pandas library:
Installing Pandas
The first step in working with Pandas is to ensure whether it is installed in the
system or not. If not, then we need to install it on our system using the pip
command.
Follow these steps to install Pandas:
Step 1: Type ‘cmd’ in the search box and open it.
Step 2: Locate the folder using the cd command where the python-pip file has
been installed.
Step 3: After locating it, type the command:
pip install pandas
Importing Pandas
After Pandas has been installed in the system, you need to import the
library. This module is generally imported as follows:
import pandas as pd
Note: Here, pd is an alias for Pandas. It is not necessary to import the
library using the alias; it just means less code to write every time a
method or property is called.
Data Structures in Pandas Library
Pandas generally provides two data structures for manipulating data. They are:
 Series
 DataFrame
Pandas Series
A Pandas Series is a one-dimensional labeled array capable of holding data of
any type (integer, string, float, Python objects, etc.). The axis labels are
collectively called indexes.
A Pandas Series can be thought of as a single column in an Excel sheet.
Creating a Series
In practice, a Pandas Series is often created by loading data from existing
storage (which can be a SQL database, a CSV file, or an Excel file).
A Pandas Series can also be created directly from lists, dictionaries, scalar values, etc.
Example: Creating a series using the Pandas Library.
import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a)

print(myvar)
Create Labels
With the index argument, you can name your own labels.
Example
Create your own labels:
import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)
When you have created labels, you can access an item by referring to the label.
Example
Return the value of "y":
print(myvar["y"])
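A Series can also be built from a dictionary, in which case the keys become the labels. The sketch below assumes made-up calorie values for three days.

```python
import pandas as pd

# Hypothetical data: calories burned on three days.
calories = {"day1": 420, "day2": 380, "day3": 390}

# The dictionary keys become the labels of the Series.
myvar = pd.Series(calories)

print(myvar)
print(myvar["day2"])
```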

Pandas DataFrame
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional
array, or a table with rows and columns.
A Series is like a column; a DataFrame is the whole table.
Create a simple Pandas DataFrame:
import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

#load data into a DataFrame object:


df = pd.DataFrame(data)

print(df)

Locate Row
Pandas uses the loc attribute to return one or more specified rows.
Example
Return row 0:
#refer to the row index:
print(df.loc[0])
Example
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])
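loc works with labels, so it is not limited to the default numbers 0, 1, 2. The sketch below gives the rows made-up names through the index argument (the names day1 to day3 are an assumption for illustration) and then looks them up by name.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}

# Hypothetical row labels supplied through the index argument.
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A single label returns that row as a Series.
row = df.loc["day2"]
print(row)

# A list of labels returns a smaller DataFrame.
subset = df.loc[["day1", "day3"]]
print(subset)
```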

Pandas Read CSV


Read CSV Files
A simple way to store big data sets is to use CSV files (comma separated values).
CSV files contain plain text, a well-known format that can be read by
almost any program, including Pandas.
In our examples we will be using a CSV file called 'data.csv'.
Load the CSV into a DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')

print(df.to_string())
Tip: use to_string() to print the entire DataFrame.
If you have a large DataFrame with many rows, Pandas will only return the first
5 rows, and the last 5 rows:
Example
Print the DataFrame without the to_string() method:
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
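To try read_csv() without having data.csv on disk, the CSV text can be supplied from memory. This is a sketch under assumptions: the three rows and the column names Duration, Pulse and Calories are made up to stand in for the real file.

```python
import io
import pandas as pd

# Stand-in for data.csv: a few made-up rows of CSV text.
csv_text = """Duration,Pulse,Calories
60,110,409.1
60,117,479.0
45,109,282.4
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.to_string())
print(df.shape)   # (rows, columns)
```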
max_rows
The number of rows returned is defined in Pandas option settings.
You can check your system's maximum rows with
the pd.options.display.max_rows statement.
Example
Check the number of maximum returned rows:
import pandas as pd

print(pd.options.display.max_rows)
By default the number is 60, which means that if the DataFrame contains more
than 60 rows, the print(df) statement will return only the headers and the first
and last 5 rows.
You can change the maximum rows number with the same statement.
Example
Increase the maximum number of rows to display the entire DataFrame:
import pandas as pd

pd.options.display.max_rows = 9999

df = pd.read_csv('data.csv')
print(df)
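Assigning to pd.options changes the setting for the rest of the program. If the larger limit is only needed in one place, pd.option_context() can apply it inside a with block and restore the old value afterwards. The sketch below uses a made-up 100-row DataFrame and assumes the default limit of 60 has not been changed.

```python
import pandas as pd

df = pd.DataFrame({"n": range(100)})  # hypothetical 100-row table

# With the default setting the text form is shortened.
default_text = str(df)

# Inside the with block the limit is raised, so every row is included.
with pd.option_context("display.max_rows", 9999):
    full_text = str(df)

print(len(default_text.splitlines()))
print(len(full_text.splitlines()))
```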
How to Use Pandas with Excel Files?
Pandas provides powerful tools to read from and write to Excel files, making it
easy to integrate Excel data with your Python scripts.
Reading Excel Files
You can read Excel files using the pd.read_excel() function. It requires
the openpyxl library for .xlsx files or the xlrd library for .xls files.
import pandas as pd

# Load an Excel file into a DataFrame


df = pd.read_excel('filename.xlsx', sheet_name='Sheet1')
print(df)
Writing to Excel Files
To write a DataFrame to an Excel file, you can use the to_excel() method of the
DataFrame class. It requires the openpyxl library to write to .xlsx files.
# Write the DataFrame to an Excel file
df.to_excel('output.xlsx', sheet_name='Sheet1', index=False)
How to Extract Data from Excel Using Pandas?
Extracting data involves loading an Excel file into a DataFrame
using read_excel() and then manipulating or analyzing the data as needed.
# Extract data from the second sheet of an Excel file
df = pd.read_excel('filename.xlsx', sheet_name='Sheet2')
print(df.head()) # Display the first few rows of the DataFrame
Can We Read XLSX File in Pandas?
Yes, Pandas can read .xlsx files using the read_excel() function, which
handles modern Excel file formats. Ensure that you have openpyxl installed
for handling .xlsx files (xlrd is used for the older .xls format).
Pandas itself can be installed without openpyxl, but by default it uses
openpyxl to read .xlsx files, so the library is needed as soon as you work
with that format:
 Reading: Without openpyxl, Pandas won’t be able to read .xlsx files.
 Writing: To write to an .xlsx file, you will need
either openpyxl or xlsxwriter. However, openpyxl is needed if you want
to work with more advanced Excel functionalities like modifying existing
files or adding formulas.
To install openpyxl, you can use pip:
pip install openpyxl
This installation ensures that you can fully utilize Pandas’ capabilities for
handling Excel files, especially the modern .xlsx format.

Pandas - Analyzing DataFrames


Viewing the Data
One of the most used methods for getting a quick overview of the DataFrame is
the head() method.
The head() method returns the headers and a specified number of rows,
starting from the top.
Example
Get a quick overview by printing the first 10 rows of the DataFrame:
import pandas as pd

df = pd.read_csv('data.csv')

print(df.head(10))
Note: if the number of rows is not specified, the head() method will return the
top 5 rows.
Example
Print the first 5 rows of the DataFrame:
import pandas as pd

df = pd.read_csv('data.csv')

print(df.head())
There is also a tail() method for viewing the last rows of the DataFrame.
The tail() method returns the headers and a specified number of rows, starting
from the bottom.
Example
Print the last 5 rows of the DataFrame:
print(df.tail())

Info About the Data


The DataFrame object has a method called info(), that gives you more
information about the data set.
Example
Print information about the data:
#info() prints its report itself, so print() is not needed:
df.info()
Result

Null Values
The info() method also tells us how many Non-Null values are present in
each column. In our data set there are 164 Non-Null values out of 169 in
the "Calories" column.
This means that there are 5 rows with no value at all in the "Calories"
column, for whatever reason.
Empty values, or Null values, can be bad when analyzing data, and you should
consider removing rows with empty values.
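The number of empty cells per column can also be counted directly by combining isna() and sum(). The sketch below uses a made-up three-row sample in place of data.csv, with one Calories value left empty.

```python
import io
import pandas as pd

# Made-up sample in which one Calories value is missing.
csv_text = """Duration,Calories
60,409.1
45,
30,195.1
"""
df = pd.read_csv(io.StringIO(csv_text))

# isna() marks empty cells as True; sum() counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```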
Data Cleaning
Data cleaning means fixing bad data in your data set.
Bad data could be:
 Empty cells
 Data in wrong format
 Wrong data
 Duplicates
Empty Cells

Empty cells can potentially give you a wrong result when you analyze data.

Remove Rows

One way to deal with empty cells is to remove rows that contain empty cells.

This is usually OK, since data sets can be very big, and removing a few rows
will not have a big impact on the result.

Example

Return a new DataFrame with no empty cells:

import pandas as pd

df = pd.read_csv('data.csv')

new_df = df.dropna()

print(new_df.to_string())

Note: By default, the dropna() method returns a new DataFrame, and will not
change the original.

If you want to change the original DataFrame, use the inplace = True argument:

Example

Remove all rows with NULL values:

import pandas as pd

df = pd.read_csv('data.csv')

df.dropna(inplace = True)

print(df.to_string())
Note: Now, the dropna(inplace = True) will NOT return a new DataFrame, but
it will remove all rows containing NULL values from the original DataFrame.

Replace Empty Values


Another way of dealing with empty cells is to insert a new value instead.
This way you do not have to delete entire rows just because of some empty cells.
The fillna() method allows us to replace empty cells with a value:
Example
Replace NULL values with the number 130:
import pandas as pd

df = pd.read_csv('data.csv')

df.fillna(130, inplace = True)


Replace Only For Specified Columns
The example above replaces all empty cells in the whole DataFrame.
To only replace empty values for one column, specify the column name for the
DataFrame:
Example
Replace NULL values in the "Calories" column with the number 130:
import pandas as pd

df = pd.read_csv('data.csv')

#pass a dictionary of column name and value, so the original DataFrame is updated:
df.fillna({"Calories": 130}, inplace = True)
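Instead of a fixed number, the replacement value is often taken from the data itself, for example the column average. The following sketch assumes a made-up three-row sample with one empty Calories cell and assigns the filled column back to the DataFrame.

```python
import io
import pandas as pd

csv_text = """Duration,Calories
60,400.0
45,
30,200.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# mean() skips empty cells, so here it is (400.0 + 200.0) / 2.
average = df["Calories"].mean()

# Assign the filled column back to the DataFrame.
df["Calories"] = df["Calories"].fillna(average)
print(df)
```

The median() or mode() of the column can be used in the same way when the average is not a good fit for the data.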


Replacing Values
One way to fix wrong values is to replace them with something else.
In our example, the "Duration" value of 450 in row 7 is most likely a typo, and
the value should be 45 instead. We could just insert 45 in row 7:
Example:
Set "Duration" = 45 in row 7:
df.loc[7, 'Duration'] = 45
Removing Duplicates
To discover duplicates, we can use the duplicated() method.
The duplicated() method returns a Boolean value for each row:
Example
Returns True for every row that is a duplicate, otherwise False:
print(df.duplicated())
To remove duplicates, use the drop_duplicates() method.
Example
df.drop_duplicates(inplace = True)
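Both methods can be tried on a small table. The sketch below assumes made-up data in which the last row repeats the first one.

```python
import pandas as pd

# Hypothetical data in which the last row repeats the first one.
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Calories": [409.1, 282.4, 409.1],
})

# The first occurrence is False; the later repeat is True.
flags = df.duplicated()
print(flags.tolist())

# drop_duplicates() keeps the first occurrence of each row.
unique_df = df.drop_duplicates()
print(len(unique_df))
```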

Matplotlib
Matplotlib is a powerful plotting library in Python used for creating static,
animated, and interactive visualizations. Matplotlib’s primary purpose is to
provide users with the tools and functionality to represent data graphically,
making it easier to analyze and understand. It was originally developed by
John D. Hunter in 2003 and is now maintained by a large community of
developers.
Key Features of Matplotlib:
1. Versatility: Matplotlib can generate a wide range of plots, including
line plots, scatter plots, bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control
every aspect of the plot, such as line styles, colors, markers, labels,
and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with
NumPy, making it easy to plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable
for publication with fine-grained control over the plot aesthetics.
5. Extensible: Matplotlib is highly extensible, with a large ecosystem
of add-on toolkits and extensions like Seaborn, Pandas plotting
functions, and Basemap for geographical plotting.
6. Cross-Platform: It is platform-independent and can run on various
operating systems, including Windows, macOS, and Linux.
7. Interactive Plots: Matplotlib supports interactive plotting through
the use of widgets and event handling, enabling users to explore data
dynamically.
Matplotlib is popular due to its ease of use, extensive documentation,
and wide range of plotting capabilities. It offers flexibility in
customization, supports various plot types, and integrates well with
other Python libraries like NumPy and Pandas.
Matplotlib is a suitable choice for various data visualization tasks,
including exploratory data analysis, scientific plotting, and creating
publication-quality plots. It excels in scenarios where users require
fine-grained control over plot customization and need to create complex or
specialized visualizations.
Applications of Matplotlib
Matplotlib is widely used in various fields for data visualization, including:
1. Scientific Research: For plotting experimental results and
visualizations that describe the data more effectively.
2. Finance: For creating financial charts to analyze market trends and
movements.
3. Data Analysis: For exploratory data analysis in fields such as data
science and machine learning.
4. Education: For teaching complex concepts in mathematics, physics,
and statistics through visual aids.
5. Engineering: For visualizing engineering simulations and results.
Characteristics of Matplotlib
Key characteristics of Matplotlib include:
 Versatility: It can create a wide range of static, animated, and
interactive plots.
 Customizability: Almost every element of a plot (like sizes, colors, and
fonts) can be customized.
 Extensibility: It can be used with a variety of GUI modules to create
graphical applications.
 Integration: Easily integrates with other libraries like NumPy and
Pandas for efficient data manipulation.
 Output Formats: Supports many output formats, including PNG, PDF,
SVG, EPS, and interactive backends.
Matplotlib is a low-level graph plotting library in Python that serves as a
visualization utility.
Matplotlib is open source and we can use it freely.
Matplotlib is mostly written in Python; a few segments are written in C,
Objective-C and JavaScript for platform compatibility.
Installation of Matplotlib
If Python and pip are already installed on a system, install Matplotlib
using this command:
C:\Users\Your Name>pip install matplotlib
Import Matplotlib
Once Matplotlib is installed, import it in your applications by adding
the import module statement:
import matplotlib

Checking Matplotlib Version


The version string is stored under the __version__ attribute.
Example
import matplotlib

print(matplotlib.__version__)
Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, and are
usually imported under the plt alias:
import matplotlib.pyplot as plt
Now the Pyplot package can be referred to as plt.
Parameters:
The pyplot functions accept parameters that enable us to set axes scales and
format the graphs. Some commonly used calls are listed below:
 plt.plot(x, y): plots x and y using the default line style and color.
 plt.axis([xmin, xmax, ymin, ymax]): scales the x-axis and y-axis from
minimum to maximum values.
 plt.plot(x, y, color='green', marker='o', linestyle='dashed', linewidth=2,
markersize=12): the x and y coordinates are marked using circular markers of
size 12 and joined by a green dashed line of width 2.
 plt.xlabel('X-axis'): names the x-axis.
 plt.ylabel('Y-axis'): names the y-axis.
 plt.plot(x, y, label='Sample line'): the label "Sample line" will be
displayed in the legend once plt.legend() is called.
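The calls listed above can be combined in one short script. This is a sketch under assumptions: the data points are made up, the Agg backend is selected so the script also runs on a machine without a display, and the picture is saved under the hypothetical name sample_plot.png instead of being shown in a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no window is needed
import matplotlib.pyplot as plt

# Made-up sample points.
x = [1, 2, 3, 4]
y = [3, 8, 1, 10]

# Green dashed line of width 2 with circular markers of size 12.
lines = plt.plot(x, y, color="green", marker="o", linestyle="dashed",
                 linewidth=2, markersize=12, label="Sample line")
plt.axis([0, 5, 0, 12])      # [xmin, xmax, ymin, ymax]
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()                 # shows the label given to plot()

plt.savefig("sample_plot.png")   # hypothetical output file name
```

On a desktop, the matplotlib.use("Agg") line can be left out and plt.show() used in place of plt.savefig().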

Example
Draw a line in a diagram from position (0,0) to position (6,250):
import matplotlib.pyplot as plt
import pandas as pd

xpoints = pd.array([0, 6])


ypoints = pd.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
Plotting x and y points
The plot() function is used to draw points (markers) in a diagram.
By default, the plot() function draws a line from point to point.
The function takes parameters for specifying points in the diagram.
Parameter 1 is an array containing the points on the x-axis.
Parameter 2 is an array containing the points on the y-axis.
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays [1, 8]
and [3, 10] to the plot function.
Example
Draw a line in a diagram from position (1, 3) to position (8, 10):
import matplotlib.pyplot as plt
import pandas as pd

xpoints = pd.array([1, 8])


ypoints = pd.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()
Plotting Without Line
To plot only the markers, you can use shortcut string notation parameter 'o',
which means 'rings'.
Example
Draw two points in the diagram, one at position (1, 3) and one in position (8,
10):
import matplotlib.pyplot as plt
import pandas as pd

xpoints = pd.array([1, 8])


ypoints = pd.array([3, 10])

plt.plot(xpoints, ypoints, 'o')


plt.show()
Multiple Points
You can plot as many points as you like, just make sure you have the same
number of points on both axes.
Example
Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and finally
to position (8, 10):
import matplotlib.pyplot as plt
import pandas as pd

xpoints = pd.array([1, 2, 6, 8])


ypoints = pd.array([3, 8, 1, 10])
plt.plot(xpoints, ypoints)
plt.show()
Marker Size
You can use the keyword argument markersize or the shorter version, ms to set
the size of the markers:
Example
Set the size of the markers to 20:
import matplotlib.pyplot as plt
import pandas as pd

ypoints = pd.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20)


plt.show()
Create Labels for a Plot
With Pyplot, you can use the xlabel() and ylabel() functions to set a label for
the x- and y-axis.
Example
Add labels to the x- and y-axis:
import pandas as pd
import matplotlib.pyplot as plt

x = pd.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = pd.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
font1 = {'family':'serif','color':'blue','size':20}
plt.plot(x, y)
plt.title("Sports Watch Data",fontdict = font1, loc = 'left')
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()
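Display Multiple Plots
With the subplot() function you can draw multiple plots in one figure. Its
three arguments are the number of rows, the number of columns, and the index
of the current plot.
Example
Draw two plots side by side:
import matplotlib.pyplot as plt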
import pandas as pd
#plot 1:
x = pd.array([0, 1, 2, 3])
y = pd.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x,y)
plt.title("SALES")

#plot 2:
x = pd.array([0, 1, 2, 3])
y = pd.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.title("INCOME")

plt.show()
Matplotlib Scatter
Matplotlib Bars
Creating Bars

With Pyplot, you can use the bar() function to draw bar graphs:

Example

Draw 4 bars:

import matplotlib.pyplot as plt


import pandas as pd

x = pd.array(["A", "B", "C", "D"])


y = pd.array([3, 8, 1, 10])

plt.bar(x,y)
plt.show()
Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use
the barh() function:
import matplotlib.pyplot as plt
import pandas as pd

x = pd.array(["A", "B", "C", "D"])


y = pd.array([3, 8, 1, 10])

plt.barh(x,y)
plt.show()
Bar Color
The bar() and barh() take the keyword argument color to set the color of the
bars:
plt.bar(x, y, color = "red")
Bar Width
The bar() takes the keyword argument width to set the width of the bars:
plt.bar(x, y, width = 0.1)
Note: For horizontal bars, use height instead of width.

Bar Height
The barh() takes the keyword argument height to set the height of the bars:
plt.barh(x, y, height = 0.1)

Matplotlib Histograms
A histogram is like a visual summary that shows how often different values
appear in a set of data. Imagine you have a collection of numbers, like ages of
people. A histogram divides these numbers into groups, called "bins," and then
uses bars to represent how many numbers fall into each bin. The taller the bar,
the more numbers are in that group.
Histogram in Matplotlib
We can create a histogram in Matplotlib using the hist() function. This function
allows us to customize various aspects of the histogram, such as the number of
bins, color, and transparency. Histogram in Matplotlib is used to represent the
distribution of numerical data, helping you to identify patterns.
The hist() Function
The hist() function in Matplotlib takes a dataset as input and divides it into
intervals (bins). It then displays the frequency (count) of data points falling
within each bin as a bar graph.
Following is the syntax of hist() function in Matplotlib −
Syntax
plt.hist(x, bins=None, range=None, density=False, cumulative=False,
color=None, edgecolor=None, ...)
Where,
 x is the input data for which the histogram is determined.
 bins (optional) is the number of bins or the bin edges.
 range (optional) is the lower and upper range of the bins. Default is the
minimum and maximum of x
 If density (optional) is True, the histogram represents a probability
density function. Default is False.
 If cumulative (optional) is True, a cumulative histogram is computed.
Default is False.
These are just a few parameters; there are more optional parameters available
for customization.
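To see what binning does to the numbers before anything is drawn, the counts can be computed with NumPy's histogram() function, which returns the count per bin and the bin edges. The sketch below uses a made-up list of values and asks for 5 bins.

```python
import numpy as np

# Made-up sample values.
x = [1, 2, 3, 1, 2, 3, 4, 1, 3, 4, 5]

# np.histogram returns the per-bin counts and the bin edges
# without drawing anything.
counts, edges = np.histogram(x, bins=5)

print(counts)   # how many values fall into each bin
print(edges)    # 6 edges delimit 5 bins
```

The counts always add up to the number of values, since each value lands in exactly one bin when no range is given.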

Creating a Vertical Histogram


In Matplotlib, creating a vertical histogram involves plotting a graphical
representation of the frequency distribution of a dataset, with the bars oriented
vertically along the y-axis. Each bar represents the frequency or count of data
points falling within a particular interval or bin along the x-axis.
Example
In the following example, we are creating a vertical histogram by setting the
"orientation" parameter to "vertical" within the hist() function −
import matplotlib.pyplot as plt
x = [1, 2, 3, 1, 2, 3, 4, 1, 3, 4, 5]
plt.hist(x, orientation="vertical")
plt.show()
Output
We get the output as shown below −
Customized Histogram with Density
When we create a histogram with density, we are providing a visual summary
of how data is distributed. We use this graph to see how likely different numbers
are occurring, and the density option makes sure the total area under the
histogram is normalized to one.
Example
In the following example, we are visualizing random data as a histogram with
30 bins, displaying it in green with a black edge. We are using the density=True
parameter to represent the probability density −
import matplotlib.pyplot as plt
import numpy as np

# Generate random data


data = np.random.randn(1000)

# Create a histogram with density and custom color


plt.hist(data, bins=30, density=True, color='green', edgecolor='black')
plt.xlabel('Values')
plt.ylabel('Probability Density')
plt.title('Customized Histogram with Density')
plt.show()
Output
After executing the above code, we get the following output −
A Pie Chart is a circular statistical plot that can display only one series of
data. The whole area of the chart represents 100% of the given data. Pie charts
in Python are widely used in business presentations, reports, and dashboards
due to their simplicity and effectiveness in displaying data distributions.
A pie chart consists of slices that represent different categories. The size of
each slice is proportional to the quantity it represents. The following
components are essential when creating a pie chart in Matplotlib:
 Data: The values or counts for each category.
 Labels: The names of each category, which will be displayed alongside
the slices.
 Colors: Optional, but colors can be used to differentiate between slices
effectively.

Creating Pie Charts


With Pyplot, you can use the pie() function to draw pie charts:
A simple pie chart:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show()
Result:

As you can see the pie chart draws one piece (called a wedge) for each value in
the array (in this case [35, 25, 25, 15]).
By default the plotting of the first wedge starts from the x-axis and
moves counterclockwise:
Note: The size of each wedge is determined by comparing the value with all the
other values, by using this formula:
The value divided by the sum of all values: x/sum(x)
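The formula can be checked by hand for the array used in the example. The sketch below needs no plotting library; the conversion of each share to an angle out of 360 degrees is added only for illustration.

```python
# The values from the pie chart example above.
y = [35, 25, 25, 15]

total = sum(y)

# Each wedge takes up value / total of the full circle.
fractions = [value / total for value in y]

# The same share expressed as an angle out of 360 degrees.
angles = [value * 360 / total for value in y]

print(fractions)
print(angles)
```

Because only the ratios matter, the values do not have to add up to 100.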

Labels and colors


Add labels to the pie chart with the labels parameter.
Add colors to the pie chart with the colors parameter.
The labels parameter must be an array with one label for each wedge:
Example
A pie chart with labels and custom colors:
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])


colors = ("orange", "cyan", "brown",
"grey", "indigo", "beige")
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels,colors=colors)


plt.show()
Result:

Data Visualization is the graphic representation of data. It converts a huge
dataset into small graphs, thus aiding in data analysis and predictions. It is an
indispensable element of data science that makes complex data more
understandable and accessible. Matplotlib and Seaborn act as the backbone of
data visualization through Python.
Matplotlib: It is a Python library used for plotting graphs with the help of other
libraries like NumPy and Pandas. It is a powerful tool for visualizing data in
Python. It is used for creating statistical inferences and plotting 2D graphs of
arrays. It was first introduced by John D. Hunter in 2003. It is free and
open-source, and its Pyplot module provides a MATLAB-like interface. It works
across various operating systems and their graphical backends.
Seaborn: It is also a Python library used for plotting graphs with the help of
Matplotlib, Pandas, and NumPy. It is built on top of Matplotlib and is
often described as a superset of the Matplotlib library. It helps in visualizing
univariate and bivariate data. It uses beautiful themes for decorating Matplotlib
graphics. It acts as an important tool in picturing Linear Regression Models. It
serves in making graphs of statistical time-series data. It eliminates the
overlapping of graphs and also aids in their beautification.
Table of differences between Matplotlib and Seaborn

Features | Matplotlib | Seaborn
--- | --- | ---
Functionality | It is utilized for making basic graphs. Datasets are visualized with the help of bar graphs, histograms, pie charts, scatter plots, lines, and so on. | Seaborn contains several patterns and plots for data visualization. It uses fascinating themes. It helps in compiling whole data into a single plot. It also provides the distribution of data.
Syntax | It uses comparatively complex and lengthy syntax. Example: syntax for a bar graph: matplotlib.pyplot.bar(x_axis, y_axis). | It uses comparatively simple syntax which is easier to learn and understand. Example: syntax for a bar graph: seaborn.barplot(x_axis, y_axis).
Dealing with Multiple Figures | We can open and use multiple figures simultaneously. However, they are closed distinctly. Syntax to close one figure at a time: matplotlib.pyplot.close(). Syntax to close all the figures: matplotlib.pyplot.close("all"). | Seaborn sets the time for the creation of each figure. However, it may lead to out-of-memory (OOM) issues.
Visualization | Matplotlib is well connected with NumPy and Pandas and acts as a graphics package for data visualization in Python. Pyplot provides similar features and syntax as in MATLAB. Therefore, MATLAB users can easily study it. | Seaborn is more comfortable in handling Pandas data frames. It uses basic sets of methods to provide beautiful graphics in Python.
Pliability | Matplotlib is highly customizable and robust. | Seaborn avoids overlapping plots with the help of its default themes.
Data Frames and Arrays | Matplotlib works efficiently with data frames and arrays. It treats figures and axes as objects. It contains various stateful APIs for plotting. Therefore plot() like methods can work without parameters. | Seaborn is much more functional and organized than Matplotlib and treats the whole dataset as a single unit. Seaborn is not so stateful and therefore, parameters are required while calling methods like plot().
Use Cases | Matplotlib plots various graphs using Pandas and NumPy. | Seaborn is the extended version of Matplotlib which uses Matplotlib along with NumPy and Pandas for plotting graphs.

Machine learning (ML)


Machine learning (ML) is a subdomain of artificial intelligence (AI) that
focuses on developing systems that learn, or improve their performance, based on
the data they ingest. Artificial intelligence is a broad term that refers to
systems or machines that mimic human intelligence. Machine learning and
AI are frequently discussed together, and the terms are occasionally used
interchangeably, although they do not signify the same thing. A crucial
distinction is that, while all machine learning is AI, not all AI is machine
learning.
What is Machine Learning?
Machine Learning is the field of study that gives computers the capability to
learn without being explicitly programmed. ML is one of the most exciting
technologies that one would have ever come across. As is evident from the
name, it gives the computer the quality that makes it more similar to humans:
the ability to learn. Machine learning is actively being used today, perhaps
in many more places than one would expect.
Features of Machine Learning
 Machine learning is a data-driven technology. A large amount of data is
generated by organizations daily, enabling them to identify notable
relationships and make better decisions.
 Machines can learn from past data and automatically improve their
performance.
 Given a dataset, ML can detect various patterns in the data.
 For large organizations, branding is crucial, and targeting a relatable
customer base becomes easier.
 It is similar to data mining, as both deal with substantial amounts of
data.
Data and Its Processing:
Data is the foundation of machine learning. The quality and quantity of data
you have directly impact the performance of your machine learning models.
Supervised machine learning is a fundamental approach for machine learning
and artificial intelligence. It involves training a model using labeled data,
where each input comes with a corresponding correct output. The process is
like a teacher guiding a student, hence the term “supervised” learning. This
section covers the different types of supervised machine learning algorithms
and some practical examples of how it works.
Supervised Machine Learning

Types of Supervised Learning in Machine Learning


Supervised learning can be applied to two main types of problems:
 Classification: Where the output is a categorical variable (e.g., spam vs.
non-spam emails, yes vs. no).
 Regression: Where the output is a continuous variable (e.g., predicting
house prices, stock prices).
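The two problem types can be illustrated with a tiny labeled data set and plain Python. This is only a sketch under assumptions: the house sizes, prices and size labels are made up, the regression uses a straight-line least-squares fit, and the classification simply copies the label of the closest training example. Real projects would normally use a library such as Scikit-learn.

```python
# Labeled training data (made up): each input comes with a known output.
sizes = [50.0, 80.0, 100.0, 120.0]              # input: house size
prices = [150.0, 240.0, 300.0, 360.0]           # continuous output
labels = ["small", "small", "large", "large"]   # categorical output


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def nearest_label(xs, ys, new_x):
    """Return the label of the training input closest to new_x."""
    distance, label = min((abs(x - new_x), y) for x, y in zip(xs, ys))
    return label


# Regression: the prediction is a number.
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 95.0 + intercept

# Classification: the prediction is one of the known categories.
predicted_label = nearest_label(sizes, labels, 95.0)

print(predicted_price)
print(predicted_label)
```

In both cases the model learns from inputs paired with correct outputs; only the kind of output differs.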

Practical Examples of Supervised learning


A few practical examples of supervised machine learning across various
industries:
 Fraud Detection in Banking: Utilizes supervised learning algorithms on
historical transaction data, training models with labeled datasets of
legitimate and fraudulent transactions to accurately predict fraud patterns.
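 Parkinson’s Disease Prediction: Parkinson’s disease is a progressive
disorder that affects the nervous system and the parts of the body
controlled by the nerves. Supervised models trained on labeled patient
data can be used to predict whether a person has the disease.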
 Customer Churn Prediction: Uses supervised learning techniques to
analyze historical customer data, identifying features associated with
churn rates to predict customer retention effectively.
 Cancer Cell Classification: Applies supervised learning to classify cancer
cells, based on their features, as ‘malignant’ or ‘benign’.
 Stock Price Prediction: Applies supervised learning to predict a signal
that indicates whether buying a particular stock will be helpful or not.
Supervised Machine Learning Algorithms
Supervised learning algorithms come in several types, each with its own
characteristics and applications. The table below summarizes the most
common ones:

Algorithm | Regression / Classification | Purpose | Method | Use Cases
Linear Regression | Regression | Predict continuous output values | Linear equation minimizing sum of squares of residuals | Predicting continuous values
Logistic Regression | Classification | Predict binary output variable | Logistic function transforming linear relationship | Binary classification tasks
Decision Trees | Both | Model decisions and outcomes | Tree-like structure with decisions and outcomes | Classification and Regression tasks
Random Forests | Both | Improve classification and regression accuracy | Combining multiple decision trees | Reducing overfitting, improving prediction accuracy
SVM | Both | Create hyperplane for classification or predict continuous values | Maximizing margin between classes or predicting continuous values | Classification and Regression tasks
KNN | Both | Predict class or value based on k closest neighbors | Finding k closest neighbors and predicting based on majority or average | Classification and Regression tasks, sensitive to noisy data
Gradient Boosting | Both | Combine weak learners to create strong model | Iteratively correcting errors with new models | Classification and Regression tasks to improve prediction accuracy
Naive Bayes | Classification | Predict class based on feature independence assumption | Bayes’ theorem with feature independence assumption | Text classification, spam filtering, sentiment analysis, medical

These types of supervised learning in machine learning vary based on
the problem you’re trying to solve and the dataset you’re working with. In
classification problems, the task is to assign inputs to predefined classes, while
regression problems involve predicting numerical outcomes.
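As an illustration of the linear regression method named in the table (a linear equation that minimizes the sum of squares of residuals), here is a minimal sketch in plain Python for a single input feature. The data (hours studied against exam score) and the function names are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimizes the sum of squared residuals.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Total squared gap between the actual and the predicted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Made-up training data: hours studied -> exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 61, 64, 69]

slope, intercept = fit_line(hours, scores)
best = sum_squared_residuals(hours, scores, slope, intercept)
# A line with a different slope fits this data worse.
worse = sum_squared_residuals(hours, scores, slope + 0.5, intercept)

print(round(slope, 2), round(intercept, 2))
print(best < worse)
```

The fitted line can then predict a continuous value for an unseen input, for example slope * 6 + intercept for six hours of study.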
Conclusion
Supervised learning is a powerful branch of machine learning that
revolves around learning from labeled examples provided during training. By
using supervised learning algorithms, models can be trained to make predictions
based on labeled data. The effectiveness of supervised machine learning lies in
its ability to generalize from the training data to new, unseen data, making it
invaluable for a variety of applications, from image recognition to financial
forecasting.
Understanding the types of supervised learning algorithms and the kinds
of problems they address is essential for choosing the appropriate algorithm
to solve a specific problem. As these techniques continue to be refined,
supervised learning will play an increasingly important role in AI-driven
solutions.
Unsupervised learning is a branch of machine learning that deals with
unlabeled data. Unlike supervised learning, where the data is labeled with a
specific category or outcome, unsupervised learning algorithms are tasked with
finding patterns and relationships within the data without any prior knowledge
of the data’s meaning. This makes unsupervised learning a powerful tool
for exploratory data analysis, where the goal is to understand the underlying
structure of the data.
Unsupervised Learning
In artificial intelligence, machine learning that takes place in the absence
of human supervision is known as unsupervised machine learning.
Unsupervised machine learning models, in contrast to supervised ones, are
given unlabeled data and allowed to discover patterns and insights on their
own, without explicit direction or instruction.
Unsupervised machine learning analyzes and clusters unlabeled datasets
using machine learning algorithms. These algorithms find hidden patterns in
the data without any human intervention, i.e., we don’t give the expected
output to the model. The model is trained on input values only and discovers
the groups or patterns on its own.

Unsupervised Learning
The input to the unsupervised learning models is as follows:
 Unstructured data: May contain noisy (meaningless) data, missing
values, or unknown data.
 Unlabeled data: Contains only values for the input parameters, with
no target value (output). It is easier to collect than the labeled data
used in the supervised approach.
Unsupervised Learning Algorithms
There are three main types of algorithms used on unlabeled datasets:
 Clustering
 Association Rule Learning
 Dimensionality Reduction
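Clustering, the first type listed above, can be sketched with a small k-means routine in plain Python. This is a simplified illustration, not a production implementation: the customer data is made up, the function name k_means is hypothetical, and the starting centroids are picked by hand (real code usually chooses them randomly). Note that no labels are passed in; the groups come from the data alone.

```python
import math


def k_means(points, centroids, iterations=10):
    """Group points into len(centroids) clusters without using any labels."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(cluster)
                                     for c in zip(*cluster))
    return centroids, clusters


# Made-up customers: (visits per month, average spend)
customers = [(1, 2), (2, 1), (1, 1), (9, 8), (8, 9), (9, 9)]

centroids, clusters = k_means(customers, centroids=[(0, 0), (10, 10)])
print(clusters[0])  # the low-activity group
print(clusters[1])  # the high-activity group
```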
Applications of Unsupervised learning
 Customer segmentation: Unsupervised learning can be used to segment
customers into groups based on their demographics, behavior, or
preferences. This can help businesses to better understand their customers
and target them with more relevant marketing campaigns.
 Fraud detection: Unsupervised learning can be used to detect fraud in
financial data by identifying transactions that deviate from the expected
patterns. This can help to prevent fraud by flagging these transactions for
further investigation.
 Recommendation systems: Unsupervised learning can be used to
recommend items to users based on their past behavior or
preferences. For example, a recommendation system might use
unsupervised learning to identify users who have similar taste in
movies, and then recommend movies that those users have enjoyed.
 Natural language processing (NLP): Unsupervised learning is used in a
variety of NLP tasks, including topic modeling, document clustering, and
part-of-speech tagging.
 Image analysis: Unsupervised learning is used in a variety of image
analysis tasks, including image segmentation, object detection, and image
pattern recognition.
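The recommendation idea described above (find users with similar taste, then suggest what they enjoyed) can be sketched as follows. The ratings, user names, and the recommend function are assumptions for illustration; a rating of 0 is taken to mean "not yet watched", and treating it as a plain number inside the similarity measure is a simplification.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two rating vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, movies):
    """Suggest unwatched movies that the most similar user rated 4 or higher."""
    others = [name for name in ratings if name != user]
    closest = max(others,
                  key=lambda name: cosine_similarity(ratings[user],
                                                     ratings[name]))
    return [movie
            for movie, mine, theirs in zip(movies, ratings[user],
                                           ratings[closest])
            if mine == 0 and theirs >= 4]


# Made-up ratings on a 1-5 scale; 0 means the user has not watched the movie.
movies = ["Action A", "Action B", "Drama C", "Drama D"]
ratings = {
    "asha": [5, 0, 1, 0],
    "ravi": [5, 5, 1, 1],
    "meena": [1, 0, 5, 4],
}

suggestions = recommend(ratings, "asha", movies)
print(suggestions)
```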
Conclusion
Unsupervised learning is a versatile and powerful tool for exploring and
understanding unlabeled data. It has a wide range of applications, from
customer segmentation to fraud detection to image analysis. As the field of
machine learning continues to develop, unsupervised learning is likely to play
an increasingly important role in various domains.
Applications of unsupervised learning
Unsupervised learning has a wide range of applications, including:
 Clustering: Grouping data points into clusters based on their
similarities.
 Dimensionality reduction: Reducing the number of features in a dataset
while preserving as much information as possible.
 Anomaly detection: Identifying data points that deviate from the
expected patterns, often signaling anomalies or outliers.
 Recommendation systems: Recommending items to users based on their
past behavior or preferences.
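Anomaly detection from the list above can be illustrated with a simple z-score check in plain Python. This is one basic technique among many and is only a sketch: the transaction amounts, the threshold of 2.0, and the function name find_anomalies are assumptions for this example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:  # all values identical, so nothing stands out
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up transaction amounts with one unusual payment
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]

flagged = find_anomalies(amounts)
print(flagged)
```

No labels such as "fraud" or "legitimate" are needed here; a value is flagged only because it deviates from the pattern of the rest of the data.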
