Python - Module 3
syllabus
Module - 3 : Lists
List methods
1) append( )
Adds a new element to the end of a list:
>>> t = ['a', 'b', 'c']
>>> t.append('d')
>>> print(t)
['a', 'b', 'c', 'd']
2) extend( )
Takes a list as an argument and appends all of the elements:
>>> t1 = ['a', 'b', 'c']
>>> t2 = ['d', 'e']
>>> t1.extend(t2)
>>> print(t1)
['a', 'b', 'c', 'd', 'e']
This example leaves t2 unmodified.
3) sort( )
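Arranges the elements of the list from low to high.
4) pop( )
pop modifies the list and returns the element that was removed:
>>> t = ['a', 'b', 'c']
>>> x = t.pop(1)
>>> print(t)
['a', 'c']
>>> print(x)
b
If no index is given, pop removes and returns the last element:
>>> t = ['a', 'b', 'c']
>>> x = t.pop()
>>> print(t)
['a', 'b']
>>> print(x)
c
5) del
>>> t = ['a', 'b', 'c']
>>> del t[1]
>>> print(t)
['a', 'c']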
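As a minimal sketch of the sort method described above (the list contents are illustrative and not taken from the slides):

```python
# sort() rearranges the list in place, from low to high.
t = ['d', 'c', 'e', 'b', 'a']
result = t.sort()

print(t)        # ['a', 'b', 'c', 'd', 'e']
print(result)   # None: sort() modifies the list and returns nothing useful
```

Because sort returns None, writing t = t.sort() would lose the list.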
A number of built-in functions can be used on lists to quickly look through a list without writing our own loops. For comparison, a loop that accumulates a total and a count by hand looks like this:
total = 0
count = 0
while (True):
    inp = input('Enter a number: ')
    if inp == 'done': break
    value = float(inp)
    total = total + value
    count = count + 1
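The running total and count kept by the loop above can also be obtained with the built-in functions len, max, min and sum. A minimal sketch, assuming the numbers are already stored in a list (the values are illustrative):

```python
nums = [3, 41, 12, 9, 74, 15]

print(len(nums))   # 6    number of elements
print(max(nums))   # 74   largest element
print(min(nums))   # 3    smallest element
print(sum(nums))   # 154  sum of the elements (numbers only)

# The average needs no explicit loop.
average = sum(nums) / len(nums)
print(average)
```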
>>> s = 'HKBK'
>>> t = list(s)
>>> print(t)
['H', 'K', 'B', 'K']
The list function breaks a string into individual letters. If we want to break a string into words, we use the split method:
>>> s = 'Welcome to HKBKCE'
>>> t = s.split()
>>> print(t)
['Welcome', 'to', 'HKBKCE']
>>> print(t[2])
HKBKCE
Lists and Strings
We can call split with an optional argument called a delimiter that specifies which characters to use as word
boundaries.
The following example uses a hyphen as a delimiter:
>>> s = 'spam-spam-spam'
>>> delimiter = '-'
>>> s.split(delimiter)
['spam', 'spam', 'spam']
join( ) is the inverse of split. It takes a list of strings and concatenates the elements.
join is a string method, so we have to invoke it on the delimiter and pass the list as a parameter:
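A minimal sketch of join, reusing the word list from the split example (the single-space delimiter is an assumption chosen for illustration):

```python
t = ['Welcome', 'to', 'HKBKCE']
delimiter = ' '

# join is called on the delimiter string, with the list as its argument.
s = delimiter.join(t)
print(s)   # Welcome to HKBKCE
```

Using an empty string as the delimiter concatenates the elements without spaces.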
a = 'banana'
b = 'banana'
We know that a and b both refer to a string, but we don’t know whether they refer to the same string.
In one case, a and b refer to two different objects that have the same value.
In the second case, they refer to the same object.
Objects and values
To check whether two variables refer to the same object, use the is operator.
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refer to it.
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case, the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object.
If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
Aliasing
If a refers to an object and we assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
If the aliased object is mutable, changes made with one alias affect the other:
>>> a = [1, 2, 3]
>>> b = a
>>> b[0] = 17
>>> print(a)
[17, 2, 3]
When we pass a list to a function, the function gets a reference to the list.
If the function modifies a list parameter, the caller sees the change.
For example,
def delete_head(t):
    del t[0]
The parameter t and the variable letters are aliases for the same object.
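A minimal sketch of how delete_head might be called; the list named letters and its contents are assumed here for illustration:

```python
def delete_head(t):
    # Removes the first element of the list that was passed in.
    del t[0]

letters = ['a', 'b', 'c']
delete_head(letters)

# The caller sees the change because t and letters refer to the same list.
print(letters)   # ['b', 'c']
```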
It is important to distinguish between operations that modify lists and operations that create new lists.
For example, the append method modifies a list, but the + operator creates a new list:
>>> t1 = [1, 2]
>>> t2 = t1.append(3)
>>> print(t1)
[1, 2, 3]
>>> print(t2)
None
>>> t3 = t1 + [3]
>>> print(t3)
[1, 2, 3, 3]
>>> t2 is t3
False
List arguments
This difference is important when we write functions that are supposed to modify lists.
For example, this function does not delete the head of a list:
def bad_delete_head(t):
    t = t[1:] # WRONG!
The slice operator creates a new list and the assignment makes t refer to it, but none of that has any
effect on the list that was passed as an argument.
An alternative is to write a function that creates and returns a new list.
For example, tail returns all but the first element of a list:
def tail(t):
    return t[1:]
This function leaves the original list unmodified. Here’s how it is used:
>>> letters = ['a', 'b', 'c']
>>> rest = tail(letters)
>>> print(rest)
['b', 'c']
Module - 3 : Dictionaries
Dictionary
• In a list, the index positions have to be integers; in a dictionary, the indices can be (almost) any type.
• A dictionary is a mapping between a set of indices (which are called keys) and a set of values.
>>> print(city_capital)
{'KAR': 'Bangalore'}
>>> city_capital={'KAR':'Bangalore','TN':'Chennai','AP':'Hyderabad'}
>>> print(city_capital)
{'KAR': 'Bangalore', 'TN': 'Chennai', 'AP': 'Hyderabad'}
• For lists, the in operator uses a linear search algorithm. As the list gets longer, the search time gets longer in direct proportion to the length of the list.
• For dictionaries, Python uses a data structure called a hash table that has a remarkable property: the in operator takes about the same amount of time no matter how many items there are in a dictionary.
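A minimal sketch of the in operator on the city_capital dictionary shown earlier. On a dictionary it tests the keys; to test the values, one option is to search the list of values instead:

```python
city_capital = {'KAR': 'Bangalore', 'TN': 'Chennai', 'AP': 'Hyderabad'}

# On a dictionary, in looks the key up in the hash table.
key_found = 'KAR' in city_capital
print(key_found)        # True

# A value is not a key, so this test fails.
value_as_key = 'Chennai' in city_capital
print(value_as_key)     # False

# Searching the values is a linear search over a list.
vals = list(city_capital.values())
value_found = 'Chennai' in vals
print(value_found)      # True
```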
Dictionary as a set of counters
We are effectively computing a histogram, which is a statistical term for a set of counters (or
frequencies).
• Dictionaries have a method called get that takes a key and a default value.
• If the key appears in the dictionary, get returns the corresponding value;
otherwise it returns the default value.
For example:
{'G': 2, 'o': 5, 'd': 2, ' ': 3, 'M': 2, 'r': 2, 'n': 2, 'i': 1, 'g': 1, 't': 1, 'h': 1, 'e': 1}
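A minimal sketch of a character histogram built with get. The input string is an assumption for illustration and is not the string that produced the output above:

```python
word = 'Good Morning'
d = dict()
for c in word:
    # get returns the current count, or 0 if c has not been seen yet.
    d[c] = d.get(c, 0) + 1

print(d)
# {'G': 1, 'o': 3, 'd': 1, ' ': 1, 'M': 1, 'r': 1, 'n': 2, 'i': 1, 'g': 1}

missing = d.get('z', 0)
print(missing)   # 0, because 'z' is not a key
```

Using get removes the need for an explicit "if c not in d" test.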
One of the common uses of a dictionary is to count the occurrences of words in a file of written text.
Write a Python program to read through the lines of the file, break each line into a list of words, and then loop through each of the words in the line and count each word using a dictionary.
The outer loop reads the lines of the file and the inner loop iterates through each of the words on that particular line. This is an example of a pattern called nested loops because one of the loops is the outer loop and the other loop is the inner loop.
Dictionaries and files
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()

counts = dict()
for line in fhand:
    words = line.split()
    for word in words:
        if word not in counts:
            counts[word] = 1
        else:
            counts[word] += 1
print(counts)

HKBK.txt
HKBK College of Engineering was established in 1997.
Teaching is more than a profession, for the faculty
members of HKBKCE
HKBKCE is situated in Bangalore
We are proud to be a HKBK Students.

Output:
Enter the file name: hkbk.txt
{'HKBK': 2, 'College': 1, 'of': 2, 'Engineering': 1, 'was': 1, 'established': 1,
'in': 2, '1997.': 1, 'Teaching': 1, 'is': 2, 'more': 1, 'than': 1, 'a': 2,
'profession,': 1, 'for': 1, 'the': 1, 'faculty': 1, 'members': 1, 'HKBKCE': 2,
'situated': 1, 'Bangalore': 1, 'We': 1, 'are': 1, 'proud': 1, 'to': 1, 'be': 1,
'Students.': 1}
Looping and dictionaries
If we use a dictionary as the sequence in a for statement, it traverses the keys of the dictionary.
Output:
BNG 1
MYS 42
SMG 100
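A minimal sketch of a loop that would print output of this shape. The dictionary name counts and its contents are assumptions chosen to match the output above:

```python
counts = {'BNG': 1, 'MYS': 42, 'SMG': 100}

# Iterating over a dictionary yields its keys.
seen = []
for key in counts:
    print(key, counts[key])
    seen.append((key, counts[key]))
```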
The Python split function looks for spaces and treats words as tokens separated by spaces.
We can solve both the punctuation and case-sensitivity problems by using the string methods lower and translate, together with the string.punctuation constant.
translate is the most subtle of the methods.
It replaces the characters in fromstr with the character in the same position in tostr and deletes all characters that are in deletestr.
The fromstr and tostr can be empty strings and the deletestr parameter can be omitted.
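A minimal sketch of removing punctuation and normalising case. In Python 3 the fromstr, tostr and deletestr arguments are passed to str.maketrans, and the resulting table is passed to translate. The sample line is an assumption for illustration:

```python
import string

line = 'Teaching is more than a profession, for the faculty!'

# Empty fromstr and tostr: nothing is replaced.
# deletestr = string.punctuation: every punctuation character is deleted.
table = str.maketrans('', '', string.punctuation)

cleaned = line.translate(table).lower()
words = cleaned.split()
print(words)
# ['teaching', 'is', 'more', 'than', 'a', 'profession', 'for', 'the', 'faculty']
```

With this step applied before counting, 'profession,' and 'profession' are counted as the same word.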
Advanced text parsing
HKBKpunch.txt
The values stored in a tuple can be any type, and they are indexed by integers.
Tuples are also comparable and hashable so we can sort lists of them and use tuples as key
values in Python dictionaries.
>>> t = tuple()
>>> print(t)
()
If the argument is a sequence (string, list, or tuple), the result of the call to tuple is a tuple with
the elements of the sequence:
>>> t = tuple('lupins')
>>> print(t)
('l', 'u', 'p', 'i', 'n', 's')
Because tuple is the name of a constructor, avoid using it as a variable name.
Module - 3 : Tuples
Tuples are immutable
But if we try to modify one of the elements of the tuple, we get an error:
>>> t[0] = 'A'
TypeError: 'tuple' object does not support item assignment
We can’t modify the elements of a tuple, but we can replace one tuple with another:
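A minimal sketch of replacing one tuple with another (the tuple contents are illustrative):

```python
t = ('a', 'b', 'c', 'd', 'e')

# Build a new tuple from a one-element tuple and a slice of the old one,
# then make t refer to the new tuple. The original tuple is not modified.
t = ('A',) + t[1:]
print(t)   # ('A', 'b', 'c', 'd', 'e')
```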
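This pattern is known as DSU (decorate, sort, undecorate):
Decorate a sequence by building a list of tuples with one or more sort keys preceding the elements from the sequence,
Sort the list of tuples using the Python built-in sort, and
Undecorate by extracting the sorted elements of the sequence.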
Comparing tuples
t.sort(reverse=True)
res = list()
for length, word in t:
    res.append(word)
print(res)
Output:
['established', 'Engineering', 'College', 'HKBK', '1997', 'was', 'of', 'in']
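A complete sketch of sorting words from longest to shortest with tuples. The input string txt is an assumption inferred from the words in the output; the decorate step builds (length, word) tuples:

```python
txt = 'HKBK College of Engineering was established in 1997'
words = txt.split()

# Decorate: pair each word with its length, length first.
t = list()
for word in words:
    t.append((len(word), word))

# Sort: tuples compare by length first, then by the word itself.
t.sort(reverse=True)

# Undecorate: keep only the words, now in sorted order.
res = list()
for length, word in t:
    res.append(word)
print(res)
# ['established', 'Engineering', 'College', 'HKBK', '1997', 'was', 'of', 'in']
```

Words of equal length are ordered by the second tuple element in reverse, which is why 'established' comes before 'Engineering'.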
Tuple assignment
One of the unique syntactic features of the Python language is the ability to have a tuple on the left side of an assignment statement.
This allows us to assign more than one variable at a time when the left side is a sequence.
In this example we have a two-element list (which is a sequence) and assign the first and second elements of the sequence to the variables x and y in a single statement.
>>> m = [ 'have', 'fun' ]
>>> x, y = m
>>> x
'have'
>>> y
'fun'

>>> m = [ 'have', 'fun' ]
>>> x = m[0]
>>> y = m[1]
>>> x
'have'
>>> y
'fun'

>>> m = [ 'have', 'fun' ]
>>> (x, y) = m
>>> x
'have'
>>> y
'fun'
Tuple assignment allows us to swap the values of two variables in a single statement:
>>> a, b = b, a
The number of variables on the left and the number of values on the right must be
the same:
>>> a, b = 1, 2, 3
ValueError: too many values to unpack (expected 2)
>>> addr = 'monty@python.org'
>>> uname, domain = addr.split('@')
The return value from split is a list with two elements;
the first element is assigned to uname, the second to domain.
>>> print(uname)
monty
>>> print(domain)
python.org
Dictionaries and tuples
Converting a dictionary to a list of tuples is a way for us to output the contents of a dictionary
sorted by key:
The new list is sorted in ascending alphabetical order by the key value.
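A minimal sketch of producing a key-sorted list of tuples from a dictionary (the dictionary contents are illustrative):

```python
d = {'b': 1, 'a': 10, 'c': 22}

# items() yields (key, value) pairs; list() turns them into a list of tuples.
t = list(d.items())

# Sorting compares the first tuple element, the key, first.
t.sort()
print(t)   # [('a', 10), ('b', 1), ('c', 22)]
```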
Multiple assignment with dictionaries
Combining items, tuple assignment, and for gives a code pattern for traversing the keys and values of a dictionary in a single loop.
Such a loop has two iteration variables because items returns a sequence of tuples and key, val is a tuple assignment that successively iterates through each of the key-value pairs in the dictionary.
For each iteration through the loop, both key and val are advanced to the next key-value pair in the dictionary (in insertion order as of Python 3.7).
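A minimal sketch of the two-variable loop described above (the dictionary contents are illustrative):

```python
d = {'a': 10, 'b': 1, 'c': 22}

pairs = []
for key, val in d.items():
    # key and val are unpacked from each (key, value) tuple.
    print(val, key)
    pairs.append((val, key))
```

Collecting (val, key) tuples in this way is also a convenient first step toward sorting a dictionary by value.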
directory[last,first] = number
The expression in brackets is a tuple. We could use tuple assignment in a for loop to traverse this
dictionary.
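A minimal sketch of a dictionary keyed by (last, first) tuples. The names and numbers are hypothetical placeholders:

```python
directory = dict()

# Hypothetical entries; the key in brackets is a tuple (last, first).
directory['Doe', 'John'] = '080-0000001'
directory['Roe', 'Jane'] = '080-0000002'

entries = []
for last, first in directory:
    # Each key is unpacked into last and first by tuple assignment.
    print(first, last, directory[last, first])
    entries.append((first, last, directory[last, first]))
```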
The task of searching and extracting from a string is done by Python with a very
powerful library called regular expressions.
Regular expressions are almost their own little programming language for
searching and parsing strings
The regular expression library 're' must be imported into the program before we can use it.
The simplest use of the regular expression library is the search( ) function.
Module - 3 : Regular expressions
Output:
HKBKCE is situated in Bangalore
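A minimal sketch of re.search that prints a line like the one above. The search string 'Bangalore' is an assumption, and the lines of HKBK.txt are held in a list here so that the example runs without the file:

```python
import re

# Contents of HKBK.txt as listed in the slides, one string per line.
lines = [
    'HKBK College of Engineering was established in 1997.',
    'Teaching is more than a profession, for the faculty',
    'members of HKBKCE',
    'HKBKCE is situated in Bangalore',
    'We are proud to be a HKBK Students.',
]

matches = []
for line in lines:
    line = line.rstrip()
    # search returns a match object if the pattern occurs anywhere in the line.
    if re.search('Bangalore', line):
        print(line)
        matches.append(line)
```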
The power of the regular expressions comes when we add special characters to
the search string that allow us to more precisely control which lines match the
string.
For example, the caret (^) character is used in regular expressions to match “the
beginning” of a line.
import re
fhand = open('HKBK.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^HKBK', line):
        print(line)

HKBK.txt
HKBK College of Engineering was established in 1997.
Teaching is more than a profession, for the faculty
members of HKBKCE
HKBKCE is situated in Bangalore
We are proud to be a HKBK Students.

Output:
HKBK College of Engineering was established in 1997.
HKBKCE is situated in Bangalore
Character matching in regular expressions
The most commonly used special character is the period (full stop), which matches any single character.
This lets us build even more powerful regular expressions.
Example:
The regular expression “F..m:” would match any of the strings:
“From:”, “Fxxm:”, “F12m:”, or “F!@m:”
# Search for lines that start with 'H', followed by 2 characters, followed by 'K'
import re
fhand = open('search.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^H..K', line):
        print(line)

search.txt
HKBK College of Engineering was established in 1997.
Teaching is more than a profession, for the faculty
members of HKBKCE
HISK is sample sentence
We are proud to be a HKBK Students.

Output:
HKBK College of Engineering was established in 1997.
HISK is sample sentence
This is particularly powerful when combined with the ability to indicate that a
character can be repeated any number of times using the “*” or “+” characters in
regular expression.
# Search for lines that start with From and have an at (@) sign
import re
fhand = open('from.txt')
for line in fhand:
    line = line.rstrip()
    if re.search('^From:.+@', line):
        print(line)

from.txt
From: syedmustafa@gmail.com
Professor and HOD
From: nikilravi@yahoo.co.in
Email for sample
From: madhu@hotmail.com

Output:
From: syedmustafa@gmail.com
From: nikilravi@yahoo.co.in
From: madhu@hotmail.com
Extracting data using regular expressions
If we want to extract data from a string in Python, we can use the findall( )
method to extract all of the substrings which match a regular expression.
The following example program extracts anything that looks like an email address from any line, regardless of format.
For example, to pull the email addresses from each of the following lines:
Output:
['syed@gmail.com', 'mustafa@yahoo.com']
The findall() method searches the string in the second argument and returns a list of
all of the strings that look like email addresses.
We can use the two-character sequence \S, which matches a non-whitespace character.
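A minimal sketch of findall with \S. The sample string is an assumption chosen so that the result matches the output shown above:

```python
import re

s = 'Hello from syed@gmail.com to mustafa@yahoo.com about the meeting @ 5 PM'

# \S+@\S+ : one or more non-whitespace characters, an at sign,
# then one or more non-whitespace characters.
lst = re.findall(r'\S+@\S+', s)
print(lst)   # ['syed@gmail.com', 'mustafa@yahoo.com']
```

The isolated "@" near the end is not matched because it has whitespace on both sides.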
# Search for lines that have an at sign between characters
import re
fhand = open('mbox.txt')
for line in fhand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    if len(x) > 0:
        print(x)

mbox.txt
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
for <source@collab.sakaiproject.org>;
Received: (from apache@localhost)
Author: stephen.marquard@uct.ac.za

Output:
['stephen.marquard@uct.ac.za']
['<postmaster@collab.sakaiproject.org>']
['<source@collab.sakaiproject.org>;']
['apache@localhost)']
['stephen.marquard@uct.ac.za']
Square brackets are used to indicate a set of multiple acceptable characters we are
willing to consider matching.
[a-z] – matches any one character from the range a to z
[A-Z] – matches any one character from the range A to Z
[0-9] – matches any one digit from the range 0 to 9
New regular expression: [a-zA-Z0-9]\S*@\S*[a-zA-Z]
This looks for substrings that start with a single lowercase letter, uppercase letter, or number “[a-zA-Z0-9]”, followed by zero or more non-blank characters (“\S*”), followed by an at-sign, followed by zero or more non-blank characters (“\S*”), followed by an uppercase or lowercase letter.
Remember that the “*” or “+” applies to the single character immediately to the left
of the plus or asterisk.
# Search for lines that have an at sign between characters
# The characters must be a letter or number
import re
fhand = open('mbox.txt')
for line in fhand:
    line = line.rstrip()
    x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
    if len(x) > 0:
        print(x)

Output:
['stephen.marquard@uct.ac.za']
['postmaster@collab.sakaiproject.org']
['source@collab.sakaiproject.org']
['apache@localhost']
['stephen.marquard@uct.ac.za']
Combining searching and extracting
If we want to find numbers on lines that start with the string “X-” such as:
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
We don’t just want any floating-point numbers from any lines.
We only want to extract numbers from lines that have the above syntax.
Regular expression: ^X-.*: [0-9.]+
This matches lines that start with “X-”, followed by zero or more characters (“.*”), followed by a colon (“:”) and then a space.
After the space, we look for one or more characters that are either a digit (0-9) or a period “[0-9.]+”.
Note that inside the square brackets, the period matches an actual period
(i.e., it is not a wildcard between the square brackets).
# Search for lines that start with 'X' followed by any non whitespace characters
# and ':' followed by a space and any number. The number can include a decimal.
import re
fhand = open('mboxno.txt')
for line in fhand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)

mboxno.txt
X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to stephen.marquard@uct.ac.za using -f
To: source@collab.sakaiproject.org
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0.8000

Output:
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0.8000
''' Search for lines that start with 'X' followed by any non whitespace characters and
':' followed by a space and any number. The number can include a decimal.
Then print the number if one was found. '''
import re
hand = open('mboxno.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)

mboxno.txt
X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to stephen.marquard@uct.ac.za using -f
To: source@collab.sakaiproject.org
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0.8000

Output:
['0.8475']
['0.0.8000']
# Search for lines that start with 'Details:' and contain 'rev=' followed by numbers and '.'
# Then print the number if one was found
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^Details:.*rev=([0-9.]+)', line)
    if len(x) > 0:
        print(x)

Output:
['39772']
['39771']
['39770']
['39769']
['39766']
['39765']
''' Search for lines that start with 'From ' followed by any characters, a space,
and a two-digit number between 00 and 99 followed by ':'.
Then print the number if one was found [extract hours]. '''
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^From .* ([0-9][0-9]):', line)
    if len(x) > 0: print(x)

Output:
['09']
['18']
['16']
['15']
['15']
['14']
Escape character to match the actual character such as a dollar sign or caret
# escape char example
import re
x = 'We just received $10.00 for cookies.'
y = re.findall(r'\$[0-9.]+', x)
print(y)
Output:
['$10.00']
Special characters and character sequences:
^ Matches the beginning of the line.
$ Matches the end of the line.
. Matches any character (a wildcard).
\s Matches a whitespace character.
\S Matches a non-whitespace character (opposite of \s).
* Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s).
*? Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s) in “non-greedy mode”.
+ Applies to the immediately preceding character and indicates to match one or more of the preceding character(s).
[^A-Za-z] When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
( ) When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall().
\b Matches the empty string, but only at the start or end of a word.
\B Matches the empty string, but not at the start or end of a word.
\d Matches any decimal digit; equivalent to the set [0-9].
\D Matches any non-digit character; equivalent to the set [^0-9].
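A minimal sketch exercising a few of the special sequences listed above: greedy versus non-greedy matching, \d and \b. The sample string is an assumption for illustration:

```python
import re

s = 'From: syed@gmail.com: sent 3 mails on 15 Jan'

# Greedy: .+ expands as far as it can, up to the last colon.
greedy = re.findall(r'^F.+:', s)
print(greedy)      # ['From: syed@gmail.com:']

# Non-greedy: .+? stops at the first colon.
lazy = re.findall(r'^F.+?:', s)
print(lazy)        # ['From:']

# \d+ matches runs of decimal digits.
digits = re.findall(r'\d+', s)
print(digits)      # ['3', '15']

# \b anchors the match to a word boundary: 'mail' inside 'gmail' is skipped.
anywhere = re.findall(r'mail', s)
at_word_start = re.findall(r'\bmail', s)
print(anywhere)        # ['mail', 'mail']
print(at_word_start)   # ['mail']
```

Raw strings (the r prefix) keep Python from interpreting the backslashes before the pattern reaches the re module.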
Module - 4 : Classes and Objects
CLASS
deep copy: To copy the contents of an object as well as any embedded objects,
and any objects embedded in them, and so on; implemented by the deepcopy
function in the copy module.
object diagram: A diagram that shows objects, their attributes, and the values of
the attributes.
>>> print(p.x)
10
# class and object example
>>> class point:
...     x=0
...     y=0
...
>>> p=point()
>>> p.x=10
>>> p.y=20
>>> print(p.x,p.y)
10 20
>>> p.x #read the value of an attribute
10
>>> p.y
20
# class and object example
>>> x=p.x
>>> x
10
>>> '(%g, %g)' % (p.x, p.y)
'(10, 20)'
>>>
class Point:
    """ Represents a point in 2-D space. attributes: x, y """
    x=0
    y=0
p=Point()
p.x=10
p.y=20
print(p.x,p.y)
class Point:
    """ Represents a point in 2-D space. attributes: x, y """
    pass
p=Point()
p.x=10
p.y=20
print(p.x,p.y)
# class and object example
# An object that is an attribute of another object is embedded.
class Point:
    """ Represents a point in 2-D space. attributes: x, y """
    pass
class Rectangle:
    pass
p=Point()
p.x=10
p.y=20
print(p.x,p.y)
box = Rectangle()
box.width = 100.0
box.height = 200.0
box.corner = Point()
box.corner.x = 0.0
box.corner.y = 0.0
print(box.width,box.height,box.corner.x,box.corner.y)
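A minimal sketch of the deep copy defined in the glossary above, using the Rectangle with an embedded Point. The variable names shallow and deep are chosen for illustration:

```python
import copy

class Point:
    """Represents a point in 2-D space. attributes: x, y"""

class Rectangle:
    """Represents a rectangle. attributes: width, height, corner"""

box = Rectangle()
box.width = 100.0
box.height = 200.0
box.corner = Point()
box.corner.x = 0.0
box.corner.y = 0.0

# copy.copy duplicates the Rectangle but shares the embedded Point.
shallow = copy.copy(box)
# copy.deepcopy duplicates the Rectangle and the embedded Point.
deep = copy.deepcopy(box)

print(shallow.corner is box.corner)   # True
print(deep.corner is box.corner)      # False

# Changing the original corner affects only the shallow copy.
box.corner.x = 5.0
print(shallow.corner.x, deep.corner.x)   # 5.0 0.0
```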