01 - Introduction + Sorting
Winter 2023
Week 1
Prof. Diego Garbervetsky
About me
Professor at University of Buenos Aires
http://lafhis.dc.uba.ar/~diegog
diego.garbervetsy@gtiit.edu.cn
• To pass the course, the students need to complete all the assignments (either
A or B) collecting at least 55 points in the EG above. This is the exam grade.
• No midterm/final exams!
Goals of this course
• Cover the most common data structures, their properties,
implementations and applications.
• Basic concepts of complexity.
• Learn how to analyze algorithm complexity and how to choose the
right data structure for the right problem.
• Build a good “general culture” of the algorithms and data
structures commonly used in programming.
Topics covered in the course
• Reasoning about loops/recursive programs
• Abstract Data Types (definition, implementation)
• Analysis of Algorithms
• Correctness
• Complexity
• Data Structures
• Stacks and Queues
• Linked Lists. Trees.
• Binary Search Trees
• AVL Trees
• Hash Tables
• Heaps
• Tries
• Graphs
Book: Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein (CLRS)
Book: Algorithms on Strings, Trees and Sequences by Gusfield
Important concepts
• Function contracts
• Preconditions and postconditions for functions
• Loop invariants
• Abstract Data types
• Interfaces
• Data Structures
• Algorithms
Contracts
An agreement between the programmer of a function and its user
• A function name and parameters:
• E.g.: sqr_root(real n): real
• Precondition: what the programmer assumes about the input data
• Conditions over the function arguments
• Facilitates the task of the programmer
• The programmer only cares about the cases where the precondition is true
• E.g.: n > 0
• As a programmer I only care about n > 0 (rights)
• As a user I need to provide n > 0 (obligation)
• Postcondition: what properties the output has
• Specifies what (not how) the function will compute if the precondition holds
• E.g.: ret_value * ret_value = n
• As a programmer I have to provide the sqr_root (obligation)
• As a user I get the sqr_root (rights)
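A minimal Java sketch of the sqr_root contract, assuming we check the precondition and the postcondition at run time. The class name SqrRootContract, the method name sqrRoot and the tolerance are illustrative choices, not part of the slides.

```java
// Hypothetical names, chosen only for illustration.
class SqrRootContract {
    // Precondition (user's obligation): n > 0.
    // Postcondition (programmer's obligation): result * result = n, up to rounding.
    static double sqrRoot(double n) {
        if (!(n > 0)) {
            throw new IllegalArgumentException("precondition violated: n > 0 required, got " + n);
        }
        double result = Math.sqrt(n);
        // Floating-point arithmetic is inexact, so the postcondition is
        // checked with a small relative tolerance (an assumption of this sketch).
        if (Math.abs(result * result - n) > 1e-9 * n) {
            throw new IllegalStateException("postcondition violated");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sqrRoot(16.0));
        try {
            sqrRoot(-1.0); // the user breaks the contract
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

When the user respects the precondition, the programmer is responsible for the result. When the user violates it, the function owes nothing and here simply rejects the call.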
Taking advantage of contracts
Without a precondition:
Problem: given a sequence s and an element x, check if x belongs to s.
Input: a sequence s = [e1, …, en] of elements and an element x
Output: if x belongs to s, index i such that ei is x; -1 if x does not belong to s
With a precondition:
Problem: given a sequence s and an element x, check if x belongs to s.
Input: a sequence s = [e1, …, en] of elements and an element x
Precondition: s is sorted
Output: if x belongs to s, index i such that ei is x; -1 if x does not belong to s
Linear search still works!
But can we exploit the fact that s is
sorted to speed up the search?
Linear Search vs Binary Search
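The two searches can be sketched in Java as follows. This is an illustrative sketch, not the course's reference code: the class and method names are assumptions, and indices are 0-origin as in Java (the slides write e1..en).

```java
// Hypothetical names, chosen only for illustration.
class SearchSketch {
    // No precondition: checks every position from left to right.
    static int linearSearch(int[] s, int x) {
        for (int i = 0; i < s.length; i++) {
            if (s[i] == x) {
                return i;
            }
        }
        return -1;
    }

    // Precondition: s is sorted in non-decreasing order.
    // Each comparison discards half of the remaining range.
    static int binarySearch(int[] s, int x) {
        int lo = 0;
        int hi = s.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
            if (s[mid] == x) {
                return mid;
            } else if (s[mid] < x) {
                lo = mid + 1; // x can only be to the right of mid
            } else {
                hi = mid - 1; // x can only be to the left of mid
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] s = {1, 3, 5, 7, 9, 11};
        System.out.println(linearSearch(s, 7)); // prints 3
        System.out.println(binarySearch(s, 7)); // prints 3
        System.out.println(binarySearch(s, 4)); // prints -1
    }
}
```

Linear search inspects up to n elements, while binary search needs about log2(n) comparisons, but only because the precondition guarantees that s is sorted.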
The goal is to reduce the space and time complexities of different tasks.
• Lists
• Sets
• Dictionaries
List (ADT)
ADT: List (a linearly organized collection of elements)
Data structures:
• arrays, linked lists, binary search trees,
hashes
Dictionaries (ADT)
ADT: DICT (handle key/value pairs)
Data structures:
• Arrays, linked lists, binary search trees,
hashes, etc.
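To illustrate the difference between an ADT and a data structure, here is a small Java sketch: the interface states what a dictionary does, and one class implements it with a linked list. The names Dict and LinkedListDict, and the restriction to String keys and int values, are assumptions made only to keep the example short.

```java
// The ADT: what operations a dictionary offers (hypothetical interface).
interface Dict {
    void put(String key, int value); // add or overwrite the value for key
    Integer get(String key);         // value for key, or null if absent
}

// One possible data structure behind the ADT: a singly linked list.
class LinkedListDict implements Dict {
    private static class Node {
        final String key;
        int value;
        final Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node head; // null when the dictionary is empty

    @Override
    public void put(String key, int value) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value; // key already present: overwrite
                return;
            }
        }
        head = new Node(key, value, head); // new key: insert at the front
    }

    @Override
    public Integer get(String key) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

Code written against Dict keeps working if the linked list is later replaced by a binary search tree or a hash table. Only the cost of put and get changes.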
First example of Algorithm: insertion sort
• A good algorithm for sorting a small number of
elements.
• It works the way you might sort a hand of playing cards.
First example: insertion sort
• We start with an empty left hand and the cards face
down on the table.
• We then remove one card at a time from the table and
insert it into the correct position in the left hand.
• To find the correct position for a card, we compare it with each of
the cards already in the hand, from right to left.
• At all times, the cards held in the left hand are sorted, and these
cards were originally the top cards of the pile on the table.
Executing insertion sort algorithm
(Figure: step-by-step execution of insertion sort; the insertion step is labeled insert(A, j))
Insertion sort pseudocode
(Figure: insertion sort pseudocode; the insertion step is labeled insert(A, j))
• Try it on <5,2,4,6,1,3>!
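A possible Java rendering of insertion sort, following the usual CLRS formulation. It is a sketch under two assumptions: the class name is made up, and indices are 0-origin as in Java, so the outer loop starts at j = 1 instead of j = 2.

```java
import java.util.Arrays;

// Hypothetical class name, chosen only for illustration.
class InsertionSortSketch {
    // Inserts a[j] into the already sorted prefix a[0..j-1].
    static void insert(int[] a, int j) {
        int key = a[j];
        int i = j - 1;
        // Shift elements larger than key one position to the right.
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;
    }

    // Sorts a in place, in non-decreasing order.
    static void insertionSort(int[] a) {
        for (int j = 1; j < a.length; j++) {
            insert(a, j);
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```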
Some Observations
• Following Cormen, the pseudocode uses 1-origin indexing.
• Careful: C (like Java) uses 0-origin indexing!
• We use “..” to denote a range within an array in pseudo code
• The array A is sorted in place: the numbers are rearranged within the
array, with at most a constant number outside the array at any time.
• The inner loop relies on the property that A[1..j-1] is already sorted
and the rest of the array is unchanged
• We call this property a loop invariant
Loop invariant
A (logical) condition that must hold at each iteration
• Encodes the assumptions we made for building the loop
• Somehow explains the progress made during the loop
• A sort of inductive hypothesis we resort to when thinking about the loop
• Must be established at the first iteration (initialization)
• Must be maintained at each iteration
Loop precondition (Pc)
Inv
While(B) {
    Inv && B
    loop body
    Inv
}
Inv && not B
Loop postcondition (Qc)
Loop invariant theorem
Let Pc be the condition on the program state that holds before the loop and Qc
the condition expected after the loop.
If the following conditions hold:
1. Initialization: Pc => Inv
2. Maintenance: Assuming Inv && B holds, the execution of the loop body makes
Inv true again
3. Termination: Inv && not B => Qc
Then, if the loop finishes, we can be sure that the loop transforms a state
that satisfies Pc into one that satisfies Qc
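The three conditions can be made concrete on a very small loop. The sketch below sums an array and checks the invariant at run time. The example, the class name and the helpers prefixSum and check are assumptions introduced for illustration, not taken from the slides.

```java
// Hypothetical names, chosen only for illustration.
class LoopInvariantSketch {
    // Sum of a[0..k-1]; used only to state the invariant.
    static int prefixSum(int[] a, int k) {
        int s = 0;
        for (int t = 0; t < k; t++) {
            s += a[t];
        }
        return s;
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new IllegalStateException("invariant violated");
        }
    }

    // Qc: the returned value is the sum of all elements of a.
    static int sum(int[] a) {
        int i = 0;
        int total = 0;
        // Initialization: Inv holds, since total == 0 == prefixSum(a, 0).
        while (i < a.length) { // B: i < a.length
            check(0 <= i && i <= a.length && total == prefixSum(a, i)); // Inv && B
            total += a[i];
            i++;
            check(0 <= i && i <= a.length && total == prefixSum(a, i)); // Inv again
        }
        // Termination: Inv && not B gives i == a.length,
        // so total == prefixSum(a, a.length), which is Qc.
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{3, 1, 4, 1, 5})); // prints 14
    }
}
```

The run-time checks are only a teaching aid: recomputing prefixSum at every iteration makes the loop quadratic, so in practice the invariant is usually kept as a comment or proved on paper.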
Using loop invariants is like mathematical induction
To prove that a property holds, you prove a base case and an inductive step.
• Showing that the invariant holds before the first iteration is like the base case.
• Showing that the invariant holds from iteration to iteration is like the inductive step.
• The termination part differs from the usual use of mathematical induction, in which
the inductive step is used infinitely. We stop the “induction” when the loop
terminates.
Recipe to reason about (and show the correctness of) an algorithm
Initialization/Maintenance/Termination pattern
• For programs with loops
• Find the invariant of the loop
• Use the invariant as an inductive hypothesis for creating the loop body
• Use the Invariant Theorem to prove the loop is correct
• For recursive programs
• Establish the base case
• Use an inductive hypothesis for building the recursive case
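For the recursive case, the recipe can be sketched on a small function that returns the largest element of a prefix of an array. The example and its names are assumptions chosen for illustration.

```java
// Hypothetical names, chosen only for illustration.
class RecursionSketch {
    // Precondition: 1 <= n <= a.length.
    // Postcondition: returns the largest element of a[0..n-1].
    static int max(int[] a, int n) {
        if (n == 1) {
            return a[0]; // base case: a single element is its own maximum
        }
        // Inductive hypothesis: the recursive call is correct for n - 1,
        // so rest is the largest element of a[0..n-2].
        int rest = max(a, n - 1);
        // Recursive case: extend the result with the one remaining element.
        return Math.max(rest, a[n - 1]);
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        System.out.println(max(a, a.length)); // prints 6
    }
}
```

Each call works on a strictly smaller n, so the recursion reaches the base case and terminates.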
Correctness of insertion sort (outer loop)
Outer Loop Invariant I: perm(A, A0) and 0<=j<=|A|+1 and A[j..|A|] = A0[j..|A|]
and isSorted(A[1..j-1])
(Pseudocode annotations: j=2; insert(A, j); j=j+1)
• Initialization: j=2. A[1..j-1] is the single
element A[1] (trivially sorted). √
• Maintenance: Assume j=j0. A[1..j0-1] is
sorted and A[j0]=A0[j0]. If the inner loop
works well then A[1..j0] is sorted. Later,
j:=j0+1, then A[1..j-1] is sorted. √
• Termination: The outer for loop ends
when j > |A|, which occurs when j=|A|+1.
Therefore, assuming I and j=|A|+1,
isSorted(A[1..|A|]) and perm(A,A0). √
Correctness of insertion sort (inner loop)
Outer Loop Invariant I: perm(A, A0) and 0<=j<=|A|+1 and A[j..|A|] = A0[j..|A|]
and isSorted(A[1..j-1])
Inner Loop Invariant I2: I and 0<=i<j and A[1..i] = A0[1..i] and isSorted(A[1..i])
and A[i+2..j] = A0[i+1..j-1] and key <= x forall x in A0[i+1..j-1]
Termination:
• i=0: A[2..j] = A0[1..j-1] and key <= x forall x in A0[1..j-1]
➔ we can safely do A[1] = key
• key>=A[i]: we know isSorted(A[1..i]) and A[i+2..j] =
A0[i+1..j-1] and key <= x forall x in A0[i+1..j-1] and A[1..i]
= A0[1..i], then key can be placed in A[i+1]
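One way to gain confidence in the outer loop invariant is to check it while the algorithm runs. The Java sketch below does this under several assumptions: 0-origin indices (so the invariant talks about a[0..j-1] and 1 <= j <= a.length), made-up helper names, and a permutation check done by sorting copies. It relies on the range version of Arrays.equals, available since Java 9.

```java
import java.util.Arrays;

// Hypothetical names, chosen only for illustration.
class CheckedInsertionSort {
    // True if a[from..to-1] is in non-decreasing order.
    static boolean isSorted(int[] a, int from, int to) {
        for (int k = from + 1; k < to; k++) {
            if (a[k - 1] > a[k]) {
                return false;
            }
        }
        return true;
    }

    // True if a and b contain the same elements with the same multiplicities.
    static boolean perm(int[] a, int[] b) {
        int[] x = a.clone();
        int[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    // 0-origin version of invariant I: a is a permutation of a0,
    // a[0..j-1] is sorted, and a[j..] is still untouched.
    static boolean outerInvariant(int[] a, int[] a0, int j) {
        return 1 <= j && j <= a.length
                && perm(a, a0)
                && isSorted(a, 0, j)
                && Arrays.equals(a, j, a.length, a0, j, a.length);
    }

    static void require(boolean condition) {
        if (!condition) {
            throw new IllegalStateException("invariant violated");
        }
    }

    static void sort(int[] a) {
        int[] a0 = a.clone(); // A0: snapshot of the input
        for (int j = 1; j < a.length; j++) {
            require(outerInvariant(a, a0, j)); // holds before the body
            int key = a[j];
            int i = j - 1;
            while (i >= 0 && a[i] > key) {
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
            require(outerInvariant(a, a0, j + 1)); // maintenance
        }
        require(isSorted(a, 0, a.length) && perm(a, a0)); // postcondition
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        sort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Passing these checks on a few inputs is evidence, not a proof. The proof is the initialization/maintenance/termination argument itself.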
Main pseudocode conventions
• Indentation indicates block structure (you can also use {} ).
• Looping constructs are like in C, C++. We assume that the loop variable in
a for loop is still defined when the loop exits.
• // indicates that the remainder of the line is a comment.
• Variables are local, unless otherwise specified.
• We often use objects, which have attributes. For an attribute attr of object
x, we write x.attr. (Equivalent to x->attr in C).
• Objects are treated as pointers/references. If x and y denote objects, then
the assignment y=x makes x and y reference the same object. It does not
cause attributes of one object to be copied to another.
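Java follows the same convention for objects, which the short sketch below illustrates. The class Box and its field attr are made up for the example.

```java
// Hypothetical class with a single attribute.
class Box {
    int attr;

    Box(int attr) {
        this.attr = attr;
    }
}

class ReferenceSketch {
    // Returns x.attr after the object has been modified through the alias y.
    static int aliasDemo() {
        Box x = new Box(1);
        Box y = x;     // y and x now reference the same object; nothing is copied
        y.attr = 42;   // the change is visible through x as well
        return x.attr;
    }

    public static void main(String[] args) {
        System.out.println(aliasDemo()); // prints 42
    }
}
```

To get an independent object, the attributes have to be copied explicitly, for example with new Box(x.attr).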
Working with Java
• Similar to C syntax but some differences
• Classes to organize, create types (object oriented)
• Interfaces to specify abstract data types
• Automatic memory management
• Try to install Java 17 (and use Visual Studio Code or another IDE)
End of part 1