
01 - Introduction + Sorting


Data Structures 1

Winter 2023
Week 1
Prof. Diego Garbervetsky

About me
Professor at University of Buenos Aires

Head of the UBA/CONICET Computer Science Institute

LaFHIS research group

http://lafhis.dc.uba.ar/~diegog

diego.garbervetsky@gtiit.edu.cn

Automated Program Analysis Tech Transfer


- Program Understanding - Microsoft
- Validation and Verification - Medallia
- Automated Testing Generation - OpenZeppelin
- Security - Coinfabrik
Organization
Classes:
• Tuesday: Lecture at 4.15pm-6.15pm
• Thursday: Tutorial + Workshop at 10am-12pm.

Please take advantage of the tutorials to work, ask questions, solve exercises, etc.

Office hours: by email, individually (via Moodle or to: diego.garbervetsky@gtiit.edu.cn)
Credits for (part of) the slides: Profs. Guillaume Hoffmann and Raul Fervari (DS1 - GTIIT, Winter 2021/2022)
Evaluation and grading
• Evaluation:
• 4 mandatory assignments during the course (called A1-A4). 25 points each.
• 4 extra assignments (B1-B4, for grade improvement), each of them corresponding to one of the A assignments.
• To be published around 10 days after the deadline of the corresponding A assignment.

Exam grade EG = Max{A1,B1} + Max{A2,B2} + Max{A3,B3} + Max{A4,B4}.

• To pass the course, the students need to complete all the assignments (either
A or B) collecting at least 55 points in the EG above. This is the exam grade.
• No midterm/final exams!
Goals of this course
• Cover the most common data structures, their properties,
implementations and applications.
• Basic concepts of complexity.
• Learn how to analyze algorithm complexity and how to choose the
right data structure for the right problem.
• Give you a good “general culture” on algorithms and data structures commonly used in programming.
Topics covered in the course
• Reasoning about loops/recursive programs
• Abstract Data Types (definition, implementation)
• Analysis of Algorithms
• Correctness
• Complexity
• Data Structures
• Stacks and Queues
• Linked Lists. Trees.
• Binary Search Trees
• AVL Trees
• Hash Tables
• Heaps
• Tries
• Graphs
Book: Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein (CLRS)

Book: Algorithms on Strings, Trees and Sequences by Gusfield
Important concepts
• Function contracts
• Preconditions and postconditions for functions
• Loop invariants
• Abstract data types
• Interfaces
• Data Structures
• Algorithms
Contracts
An agreement between the programmer of a function and its user
• A function name and parameters:
• E.g.: sqr_root(real n): real
• Precondition: what the programmer assumes about the input data
• Conditions over the function arguments
• Facilitates the task of the programmer
• The programmer only cares about the cases where the precondition is true
• E.g.: n > 0
• Postcondition: what properties the output must satisfy
• Specifies what (not how) the function will compute if the precondition holds
• E.g.: ret_value * ret_value = n

Rights and obligations:
• Precondition: as a programmer I only care about n > 0 (right); as a user I need to provide n > 0 (obligation)
• Postcondition: as a programmer I have to provide the sqr_root (obligation); as a user I get the sqr_root (right)
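A contract like the one above can be checked at run time. The following is a minimal Java sketch, not part of the original slides: the class name SqrRootContract, the method name sqrRoot and the tolerance are assumptions made for illustration, and the postcondition is checked only approximately because of floating-point rounding.

```java
// Illustrative sketch of a function contract; names and tolerance are assumptions.
final class SqrRootContract {
    // Precondition (user's obligation): n > 0.
    // Postcondition (programmer's obligation): ret * ret is (approximately) n.
    static double sqrRoot(double n) {
        if (!(n > 0)) {
            throw new IllegalArgumentException("precondition violated: n must be > 0, got " + n);
        }
        double ret = Math.sqrt(n);
        // Floating-point arithmetic makes the postcondition hold only approximately.
        if (Math.abs(ret * ret - n) > 1e-9 * Math.max(1.0, n)) {
            throw new IllegalStateException("postcondition violated for n = " + n);
        }
        return ret;
    }
}
```

Calling sqrRoot(-1.0) breaks the user's side of the agreement, so the function is free to reject the call instead of computing anything.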
Taking advantage of contracts

Without a precondition:
Problem: given a sequence s and an element x, check if x belongs to s.
Input: a sequence s = [e1, …, en] of elements and an element x
Output: if x belongs to s, index i such that ei is x; -1 if x does not belong to s

With a precondition:
Problem: given a sequence s and an element x, check if x belongs to s.
Input: a sequence s = [e1, …, en] of elements and an element x
Precondition: s is sorted
Output: if x belongs to s, index i such that ei is x; -1 if x does not belong to s
Linear search still works!
But can we exploit the fact that s is sorted to speed up the search?
Linear Search vs Binary Search

linearSearch(seq s, elem x) -> int {
  x_index = -1
  i = 0
  while x_index == -1 and i < length(s) {
    if s[i] == x then x_index = i
    i = i + 1
  }
  return x_index
}

// Assumes s is sorted
binarySearch(seq s, elem x) -> int {
  low = 0
  high = length(s) - 1
  while low <= high {
    // midpoint of the current range [low, high]
    mid = low + (high - low) / 2
    if s[mid] == x then return mid
    else {
      if s[mid] < x then low = mid + 1
      else high = mid - 1
    }
  }
  return -1
}
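The two searches can be written in Java as follows. This is a sketch under stated assumptions: the slides use generic sequences, while this version uses int arrays with 0-origin indexing, and the class name Searches is made up for the example.

```java
// Illustrative Java versions of the two searches; int[] stands in for "seq".
final class Searches {
    // No precondition: works on any array. Returns the first index of x, or -1.
    static int linearSearch(int[] s, int x) {
        int xIndex = -1;
        int i = 0;
        while (xIndex == -1 && i < s.length) {
            if (s[i] == x) xIndex = i;
            i = i + 1;
        }
        return xIndex;
    }

    // Precondition: s is sorted in non-decreasing order.
    static int binarySearch(int[] s, int x) {
        int low = 0;
        int high = s.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // midpoint of [low, high]
            if (s[mid] == x) return mid;
            if (s[mid] < x) low = mid + 1;      // x can only be to the right
            else high = mid - 1;                // x can only be to the left
        }
        return -1;
    }
}
```

Linear search inspects up to length(s) elements, while binary search halves the range at every step. That speed-up is only valid because the caller guarantees the precondition.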
Algorithms
An algorithm is a finite sequence of rigorous, well-defined instructions whose aim is to
solve a specific problem specified in a contract (either formally or informally)

Algorithms describe computations:


• They can be implemented as programs in a programming language or can also
be abstractly described using pseudo-code
• Pseudo-code is an informal notation to abstractly describe programs
• Strong syntactic rules of programming languages are omitted.
• Types, operators, and control structures are used in a flexible way.
• Subtasks can be referred to as functions/procedures (straightforward tasks, or complex
tasks to be further refined/implemented later on)


• It is important to analyze each particular implementation:
• Correctness
• Complexity (cost of execution, memory)
• Data structures: a way to store and organize data to facilitate access and modifications.
Data types
An abstract data type (ADT) is a mathematical model for the data
elements that make up a data type as well as the functions that
operate on these data.

A data structure is a high-level (design-level) implementation of an abstract data type.

Data structures provide specific implementations of ADTs (representation and functions
to operate on the representation). They can be subject to the analysis of their running
time and space properties. For a given ADT, there typically exist multiple alternative
data structure implementations.
Abstract Data types
ADTs and data structures allow us to organize and manipulate data in an
effective manner.

The goal is to reduce the space and time complexities of different tasks.

Some popular ADTs are:

• Lists
• Sets
• Dictionaries
List (ADT)
ADT: List (a linearly organized collection of elements)

• insert(L,e): an element can be inserted at a given position, increasing the size of the collection
• remove(L,e): given a valid position of a given list, the element at that position can be removed
• belongs(L,e): we may ask if an element belongs to a collection, as well as retrieve an element at a given position.

Data structures:
• arrays, linked lists, etc.
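In Java, an ADT can be expressed as an interface and a data structure as a class implementing it. The sketch below is an illustration only: the names SimpleList and ArraySimpleList, the explicit position parameters and the 0-origin positions are assumptions, since the slide gives the operations only informally.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical interface for the List ADT (names and signatures are assumptions).
interface SimpleList<E> {
    void insert(int pos, E e);   // precondition: 0 <= pos <= size()
    E remove(int pos);           // precondition: 0 <= pos < size()
    boolean belongs(E e);
    E get(int pos);              // precondition: 0 <= pos < size()
    int size();
}

// One possible data structure for the ADT: a resizable array.
final class ArraySimpleList<E> implements SimpleList<E> {
    private Object[] items = new Object[4];
    private int size = 0;

    public void insert(int pos, E e) {
        if (pos < 0 || pos > size) throw new IndexOutOfBoundsException("pos=" + pos);
        if (size == items.length) items = Arrays.copyOf(items, 2 * size);
        // Shift the suffix one slot to the right to make room at pos.
        System.arraycopy(items, pos, items, pos + 1, size - pos);
        items[pos] = e;
        size++;
    }

    public E remove(int pos) {
        E removed = get(pos);    // also validates pos
        // Shift the suffix one slot to the left over the removed element.
        System.arraycopy(items, pos + 1, items, pos, size - pos - 1);
        items[--size] = null;
        return removed;
    }

    public boolean belongs(E e) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(items[i], e)) return true;
        }
        return false;
    }

    @SuppressWarnings("unchecked")
    public E get(int pos) {
        if (pos < 0 || pos >= size) throw new IndexOutOfBoundsException("pos=" + pos);
        return (E) items[pos];
    }

    public int size() { return size; }
}
```

A linked-list class could implement the same interface. Code written against SimpleList would not change, but the running time of each operation would.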
Set (ADT)
ADT: SET (unordered collection of elements)

• init/empty: creates an empty set.
• union(A, B): replaces A by A ∪ B.
• intersection(A, B): replaces A by A ∩ B.
• belongs(A, a): returns true if a ∈ A.

Data structures:
• arrays, linked lists, binary search trees, hash tables
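As a rough illustration of one of the listed data structures, here is a set of integers backed by an unsorted singly linked list. The class name LinkedIntSet and the add helper are assumptions that do not appear on the slide. Following the slide, union and intersection modify the receiving set in place.

```java
// Hypothetical set of ints backed by an unsorted singly linked list.
final class LinkedIntSet {
    private static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    private Node head = null;   // init/empty: a new set has no nodes

    boolean belongs(int a) {
        for (Node n = head; n != null; n = n.next) {
            if (n.value == a) return true;
        }
        return false;
    }

    // Helper not listed on the slide; needed to build sets in the example.
    void add(int a) {
        if (!belongs(a)) head = new Node(a, head);
    }

    // Replaces this set by the union of this set and other.
    void union(LinkedIntSet other) {
        for (Node n = other.head; n != null; n = n.next) add(n.value);
    }

    // Replaces this set by the intersection of this set and other.
    void intersection(LinkedIntSet other) {
        Node kept = null;
        for (Node n = head; n != null; n = n.next) {
            if (other.belongs(n.value)) kept = new Node(n.value, kept);
        }
        head = kept;
    }
}
```

With this representation belongs is linear in the size of the set, so union and intersection are quadratic in the worst case. Other data structures for the same ADT can reduce these costs.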
Dictionaries (ADT)
ADT: DICT (handle key/value pairs)

• init/empty: creates an empty dictionary.
• insert(D,k,v): adds a key and value to the dictionary.
• get(D,k): gets the value stored with the key k
• delete(D,k): removes the key/value pair associated with k
• defined?(D,k): returns true if k is a key in the dictionary.

Data structures:
• arrays, linked lists, binary search trees, hash tables, etc.
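The dictionary operations can be sketched in Java with a simple linked list of entries. Everything below is illustrative: the class name ListDict, the String/int key and value types, and the decision that inserting an existing key overwrites its value are assumptions, not something the slide specifies.

```java
import java.util.NoSuchElementException;

// Hypothetical dictionary from String keys to int values, stored as a linked list.
final class ListDict {
    private static final class Entry {
        final String key;
        int value;
        Entry next;
        Entry(String key, int value, Entry next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    private Entry head = null;   // init/empty

    private Entry find(String k) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(k)) return e;
        }
        return null;
    }

    // Assumption: inserting an already defined key overwrites its value.
    void insert(String k, int v) {
        Entry e = find(k);
        if (e != null) e.value = v;
        else head = new Entry(k, v, head);
    }

    boolean isDefined(String k) { return find(k) != null; }

    // Precondition: isDefined(k).
    int get(String k) {
        Entry e = find(k);
        if (e == null) throw new NoSuchElementException("undefined key: " + k);
        return e.value;
    }

    // Removes the pair associated with k; does nothing if k is not defined.
    void delete(String k) {
        if (head == null) return;
        if (head.key.equals(k)) { head = head.next; return; }
        for (Entry e = head; e.next != null; e = e.next) {
            if (e.next.key.equals(k)) { e.next = e.next.next; return; }
        }
    }
}
```

Every operation here is linear in the number of keys.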
First example of Algorithm: insertion sort
• A good algorithm for sorting a small number of elements.
• It works the way you might sort a hand of playing cards.

First example: insertion sort
• We start with an empty left hand and the cards face down on the table.
• We then remove one card at a time from the table and insert it into the correct position in the left hand.
• To find the correct position for a card, we compare it with each of the cards already in the hand, from right to left.
• At all times, the cards held in the left hand are sorted, and these cards were originally the top cards of the pile on the table.
Executing insertion sort algorithm

Note: the left prefix (grey) is always sorted
- We use that property to focus only on inserting the new element in the right place
Insertion sort pseudocode

Note: A[1..j-1] is already sorted and the rest of the array is untouched

insert(A, j)

Shift the elements right until key is no longer smaller than A[i]

At the end of the inner loop either i = 0 (key was smaller than all x in A[1..j-1]) or A[i] <= key

• Try it on <5,2,4,6,1,3>!
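The notes above can be made concrete with a Java sketch. It assumes the usual CLRS formulation of insertion sort, since the pseudocode itself is not reproduced in this text, and it uses 0-origin indexing. The sorted prefix is therefore a[0..j-1] and the inner loop stops at i = -1 instead of i = 0. The class name InsertionSortDemo is made up for the example.

```java
// Illustrative 0-origin Java translation of insertion sort (CLRS style assumed).
final class InsertionSortDemo {
    // Sorts a in place, using only a constant amount of extra storage.
    static void insertionSort(int[] a) {
        for (int j = 1; j < a.length; j++) {
            // Here a[0..j-1] is already sorted and a[j..] is untouched.
            int key = a[j];
            int i = j - 1;
            // Shift elements right until key is no longer smaller than a[i].
            while (i >= 0 && a[i] > key) {
                a[i + 1] = a[i];
                i = i - 1;
            }
            // Either i == -1 (key is smaller than the whole prefix) or a[i] <= key.
            a[i + 1] = key;
        }
    }
}
```

On <5,2,4,6,1,3> the array after each outer iteration is <2,5,4,6,1,3>, <2,4,5,6,1,3>, <2,4,5,6,1,3>, <1,2,4,5,6,3> and finally <1,2,3,4,5,6>.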
Some Observations
• Following Cormen (CLRS), the pseudocode uses 1-origin indexing.
• Careful, in the C language we use 0-origin indexing!
• We use “..” to denote a range within an array in pseudocode
• The array A is sorted in place: the numbers are rearranged within the
array, with at most a constant number outside the array at any time.
• The inner loop strongly relies on the property: A[1..j-1] is already sorted
and the rest of the array remains unchanged
• We call this property a Loop Invariant
Loop invariant
A (logical) condition that must hold at each iteration
• Encodes the assumptions we made for building the loop
• Explains the progress made during the loop
• A sort of inductive hypothesis we resort to when thinking about the loop
• Must be established at the first iteration (initialization)
• Must be maintained at each iteration

Loop precondition (Pc)
Inv
While(B) {
  Inv && B
  loop body
  Inv
}
Inv && not B
Loop postcondition (Qc)
Loop invariant theorem

Loop precondition (Pc)
Inv
While(B) {
  Inv && B
  loop body
  Inv
}
Inv && not B
Loop postcondition (Qc)

Let Pc be the condition on the program state that holds before the loop and Qc the condition expected after the loop.
If the following conditions hold:
1. Initialization: Pc => Inv
2. Maintenance: Assuming Inv && B holds, the execution of the loop body makes
Inv true again
3. Termination: Inv && not B => Qc

Then, if the loop finishes, we can be sure that the loop transforms a state that
satisfies Pc into one that satisfies Qc
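To see the three conditions at work, the sketch below sums an array and checks the invariant at run time at the points where the theorem requires it. The example, the helper names prefixSum and check, and the class name InvariantDemo are all assumptions made for illustration. In practice the invariant is proved on paper rather than executed.

```java
// Illustrative run-time check of a loop invariant (example chosen for this sketch).
final class InvariantDemo {
    // Helper used only to state the invariant: a[0] + ... + a[n-1].
    private static int prefixSum(int[] a, int n) {
        int s = 0;
        for (int k = 0; k < n; k++) s += a[k];
        return s;
    }

    private static void check(boolean cond, String what) {
        if (!cond) throw new IllegalStateException(what + " does not hold");
    }

    static int sum(int[] a) {
        int acc = 0;
        int i = 0;                                // Pc: acc == 0 and i == 0
        // Inv: 0 <= i <= a.length and acc == a[0] + ... + a[i-1]
        check(i <= a.length && acc == prefixSum(a, i), "Initialization");
        while (i < a.length) {                    // B: i < a.length
            acc += a[i];
            i++;
            check(i <= a.length && acc == prefixSum(a, i), "Maintenance");
        }
        // Inv && not B gives i == a.length, hence Qc: acc is the sum of all of a.
        check(acc == prefixSum(a, a.length), "Termination");
        return acc;
    }
}
```

The theorem says nothing about whether the loop finishes. Here termination follows separately from the fact that i increases by one at every iteration.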
Using loop invariants is like mathematical induction

To prove that a property holds, you prove a base case and an inductive step.

• Showing that the invariant holds before the first iteration is like the base case.

• Showing that the invariant holds from iteration to iteration is like the inductive step.

• The termination part differs from the usual use of mathematical induction, in which
the inductive step is used infinitely. We stop the “induction” when the loop
terminates.

• We can show the three parts in any order.

Recipe to reason about (and show the correctness of) an algorithm

Initialization/Maintenance/Termination pattern
• For programs with loops
• Find the invariant of the loop
• Use the invariant as an inductive hypothesis for creating the loop body
• Use the Invariant Theorem to prove the loop is correct
• For recursive programs
• Establish the base case
• Use an inductive hypothesis for building the recursive case
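For the recursive half of the recipe, here is a hedged Java sketch of a recursive binary search. The example and the class name RecursiveSearch are assumptions chosen for illustration. The base case is the empty range, and the recursive case relies on the inductive hypothesis that the function is correct on any strictly smaller range.

```java
// Illustrative recursive search; int[] and the names are assumptions.
final class RecursiveSearch {
    // Precondition: s is sorted. Returns an index of x within s[low..high], or -1.
    static int search(int[] s, int x, int low, int high) {
        if (low > high) return -1;               // base case: empty range
        int mid = low + (high - low) / 2;
        if (s[mid] == x) return mid;
        // Inductive hypothesis: search is correct on any strictly smaller range.
        if (s[mid] < x) return search(s, x, mid + 1, high);
        return search(s, x, low, mid - 1);
    }

    static int search(int[] s, int x) {
        return search(s, x, 0, s.length - 1);
    }
}
```

Both recursive calls work on a range that is strictly smaller than [low, high], which is what justifies using the inductive hypothesis and also guarantees termination.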
Correctness of insertion sort (outer loop)
Outer Loop Invariant I: perm(A, A0) and 0<=j<=|A|+1 and A[j..|A|] = A0[j..|A|]
and isSorted(A[1..j-1])

Loop structure: j=2, then repeatedly insert(A, j) followed by j=j+1, until j > |A|.

• Initialization: j=2. A[1..j-1] is the single element A[1] (trivially sorted). √
• Maintenance: Assume j=j0. A[1..j0-1] is sorted and A[j0]=A0[j0]. If the inner loop
works well then A[1..j0] is sorted. Later, j:=j0+1, so A[1..j-1] is sorted. √
• Termination: The outer for loop ends when j > |A|, which occurs when j=|A|+1.
Therefore, assuming I and j=|A|+1, isSorted(A[1..|A|]) and perm(A,A0). √
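The outer invariant can also be checked mechanically while the sort runs. The sketch below is an assumption-laden illustration, not the course's code: it uses 0-origin indexing, so j starts at 1 and the sorted prefix is a[0..j-1], and the helper names isPerm, suffixUnchanged and isSortedPrefix are invented for this example.

```java
import java.util.Arrays;

// Illustrative insertion sort that checks the outer loop invariant at run time.
final class CheckedInsertionSort {
    // True if a[0..n-1] is in non-decreasing order.
    static boolean isSortedPrefix(int[] a, int n) {
        for (int k = 1; k < n; k++) {
            if (a[k - 1] > a[k]) return false;
        }
        return true;
    }

    // True if a and b contain the same elements with the same multiplicities.
    static boolean isPerm(int[] a, int[] b) {
        int[] x = a.clone();
        int[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    // True if a[from..] still equals the original a0[from..].
    static boolean suffixUnchanged(int[] a, int[] a0, int from) {
        for (int k = from; k < a.length; k++) {
            if (a[k] != a0[k]) return false;
        }
        return true;
    }

    static void checkOuterInvariant(int[] a, int[] a0, int j) {
        boolean inv = isPerm(a, a0) && j <= a.length
                && suffixUnchanged(a, a0, j) && isSortedPrefix(a, j);
        if (!inv) throw new IllegalStateException("outer invariant broken at j=" + j);
    }

    static void sort(int[] a) {
        if (a.length < 2) return;            // nothing to do; the loop would not run
        int[] a0 = a.clone();                // A0: the original contents
        int j = 1;
        checkOuterInvariant(a, a0, j);       // Initialization
        while (j < a.length) {
            int key = a[j];
            int i = j - 1;
            while (i >= 0 && a[i] > key) {   // inner loop: insert(A, j)
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
            j = j + 1;
            checkOuterInvariant(a, a0, j);   // Maintenance
        }
        // Termination: j == a.length, so the invariant says all of a is sorted.
    }
}
```

The checks make each call much slower than the sort itself, so this form is only useful for testing or for convincing yourself that the invariant is the right one.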
Correctness of insertion sort (inner loop)
Outer Loop Invariant I: perm(A, A0) and 0<=j<=|A|+1 and A[j..|A|] = A0[j..|A|]
and isSorted(A[1..j-1])

Inner Loop Invariant I2: I and 0<=i<j and A[1..i] = A0[1..i] and isSorted(A[1..i])
and A[i+2..j] = A0[i+1..j-1] and key <= x forall x in A0[i+1..j-1]

• Initialization: i = j-1, A[1..j-1] is sorted (outer invariant), key = A[j], and
A[i+2..j] = A[j+1..j] = A0[j..j-1] = [] (so the forall holds trivially over an empty range)

• Maintenance: since A[i] > key, we can shift A[i] right (to A[i+1]) and maintain the forall
• 5: [2, 5, 4, 6, 1, 3] -> [2, 5, 5, 6, 1, 3]
• This breaks the invariant, isSorted(A[1..i])
• 6: i := i0 - 1 reestablishes the invariant

• Termination:
• i = 0: A[2..j] = A0[1..j-1] and key <= x forall x in A0[1..j-1]
➔ we can safely do A[1] = key
• key >= A[i]: we know isSorted(A[1..i]) and A[i+2..j] = A0[i+1..j-1] and
key <= x forall x in A0[i+1..j-1] and A[1..i] = A0[1..i], so key can be placed in A[i+1]
Main pseudocode conventions
• Indentation indicates block structure (you can also use {} ).
• Looping constructs are like in C, C++. We assume that the loop variable in
a for loop is still defined when the loop exits.
• // indicates that the remainder of the line is a comment.
• Variables are local, unless otherwise specified.
• We often use objects, which have attributes. For an attribute attr of object
x, we write x.attr. (Equivalent to x→attr in C).
• Objects are treated as pointers/references. If x and y denote objects, then
the assignment y=x makes x and y reference the same object. It does not
cause attributes of one object to be copied to another.
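The last convention, that objects are treated as references, matches how Java behaves. The short sketch below illustrates it with a made-up class Cell and attribute attr. The names are hypothetical and only serve the example.

```java
// Illustrative object with one attribute (names are hypothetical).
final class Cell {
    int attr;
    Cell(int attr) { this.attr = attr; }
}

final class AliasingDemo {
    static int demo() {
        Cell x = new Cell(1);
        Cell y = x;      // y and x now reference the same object; nothing is copied
        y.attr = 42;     // the change is visible through x as well
        return x.attr;   // 42
    }
}
```

To get an independent copy, the attributes have to be copied explicitly, for example with new Cell(x.attr).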
Working with Java
• Similar to C syntax but some differences
• Classes to organize, create types (object oriented)
• Interfaces to specify abstract data types
• Automatic memory management

• We will take a look in the tutorial workshops

• Try to install Java 17 (and use Visual Studio Code or another IDE)
End of part 1
