ADS Notes Unit1
Chap 1
1. Primitive Data Structures are the data structures consisting of the numbers
and the characters that come built into programming languages.
2. These data structures can be manipulated or operated directly by machine-
level instructions.
3. Basic data types like Integer, Float, Character, and Boolean come under the
Primitive Data Structures.
4. These data types are also called Simple data types, as they contain
values that cannot be divided further.
A data structure that preserves a linear connection among its data elements is
known as a Linear Data Structure. The data is arranged linearly, where each
element has a successor and a predecessor, except for the first and the last
elements. However, this is not necessarily true of memory, as the arrangement
may not be sequential.
Based on memory allocation, the Linear Data Structures are further classified into
two types:
1. Static Data Structures: The data structures having a fixed size are known as
Static Data Structures. The memory for these data structures is allocated at
compile time, and their size cannot be changed by the user after compilation;
however, the data stored in them can be altered.
The Array is the best example of a Static Data Structure, as it has a
fixed size while its data can be modified later.
2. Dynamic Data Structures: The data structures having a dynamic size are
known as Dynamic Data Structures. The memory of these data structures is
allocated at the run time, and their size varies during the run time of the code.
Moreover, the user can change the size as well as the data elements stored in
these data structures at the run time of the code.
Linked Lists, Stacks, and Queues are common examples of dynamic data
structures
The following is the list of Linear Data Structures that we generally use:
1. Arrays
An Array is a data structure used to collect multiple data elements of the same data
type into one variable. Instead of storing multiple values of the same data types in
separate variable names, we could store all of them together in one variable. This
does not mean that every value of the same data type in a program must be united
into one array of that data type. But there will often be times when specific
variables of the same data type are related to one another in a way that makes
an array appropriate.
An Array is a list of elements where each element has a unique place in the list. The
data elements of the array share the same variable name; however, each carries a
different index number called a subscript. We can access any data element from the
list with the help of its location in the list. Thus, the key feature of arrays to
understand is that the data is stored in contiguous memory locations, making it
possible for users to traverse the data elements of the array using their
respective indexes.
Figure 3. An Array
a. We can store a list of data elements belonging to the same data type.
b. The array also helps store the data elements of a binary tree of a fixed size.
c. An array also acts as storage for matrices.
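The points above can be sketched in a few lines of Python; a list stands in for a fixed-size array here, and the element values are illustrative assumptions:

```python
def make_array():
    """A list used as a fixed-size array of one data type."""
    scores = [72, 85, 91, 60]   # four elements, same type, contiguous indexes 0..3
    scores[1] = 88              # static structure: size is fixed, but data can be modified
    return scores

arr = make_array()
first = arr[0]                  # any element is reachable directly via its subscript
last = arr[3]
```

Index-based access works in constant time precisely because the elements sit in contiguous memory locations.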
2. Linked Lists
A Linked List is another example of a linear data structure used to store a collection
of data elements dynamically. Data elements in this data structure are represented
by Nodes, connected using links or pointers. Each node contains two fields: the
information field holds the actual data, and the pointer field holds the
address of the next node in the list. The pointer of the last node of the linked
list is a null pointer, as it points to nothing. Unlike Arrays, the user can
dynamically adjust the size of a Linked List as per the requirements.
a. Singly Linked List: A Singly Linked List is the most common type of Linked
List. Each node has data and a pointer field containing an address to the next
node.
b. Doubly Linked List: A Doubly Linked List consists of an information field and
two pointer fields. The information field contains the data. The first pointer field
contains an address of the previous node, whereas another pointer field
contains a reference to the next node. Thus, we can go in both directions
(backward as well as forward).
c. Circular Linked List: The Circular Linked List is similar to the Singly Linked
List. The only key difference is that the last node contains the address of the
first node, forming a circular loop in the Circular Linked List.
a. Linked Lists help us implement stacks, queues, binary trees, and graphs
of predefined size.
b. We can also implement the Operating System's functions for dynamic memory
management.
c. A Circular Linked List is also helpful in a slide show, where the user needs to go
back to the first slide after the last slide is presented.
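A minimal singly linked list sketch in Python (the class layout and the example values are illustrative assumptions, not the textbook's code):

```python
class Node:
    """A node with an information field and a pointer field to the next node."""
    def __init__(self, data):
        self.data = data
        self.next = None        # the last node's pointer stays None (null pointer)

def to_list(head):
    """Traverse from head, following next pointers, collecting each node's data."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out

# Build the list 10 -> 20 -> 30 by linking nodes through their pointer fields.
head = Node(10)
head.next = Node(20)
head.next.next = Node(30)
```

Growing the list is just allocating a new node and updating one pointer, which is why the size can change freely at run time.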
3. Stacks
A Stack is a Linear Data Structure that follows the LIFO (Last In, First Out) principle,
allowing insertion and deletion only from one end of the Stack, i.e., the Top.
Stacks can be implemented with contiguous memory (an Array) or non-contiguous
memory (a Linked List). Real-life examples of Stacks are piles of books, a
deck of cards, piles of money, and many more.
Figure 5. A Real-life Example of Stack
The above figure represents a real-life example of a Stack, where operations
are performed from one end only, like the insertion and removal of new books from
the top of the Stack. It implies that insertion and deletion in the Stack can be
done only from its top. We can access only the top of the Stack at any
given time.
Figure 6. A Stack
a. A Stack is also utilized as an auxiliary storage structure for function calls, nested
operations, and deferred/postponed functions.
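The LIFO behaviour described above can be sketched as follows (a minimal illustration; the class and the "book" values are assumptions):

```python
class Stack:
    """LIFO stack: insertion and deletion happen only at the Top."""
    def __init__(self):
        self._items = []
    def push(self, item):
        self._items.append(item)    # insert at the top
    def pop(self):
        return self._items.pop()    # remove from the top (last in, first out)
    def peek(self):
        return self._items[-1]      # only the top is accessible at any time

s = Stack()
s.push("book1")
s.push("book2")                     # book2 now sits on top of book1
```

Popping now returns "book2" first, exactly like removing the topmost book from a pile.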
4. Queues
A Queue is a linear data structure similar to a Stack with some limitations on the
insertion and deletion of the elements. The insertion of an element in a Queue is
done at one end, and the removal is done at the other (opposite) end. Thus, we can
conclude that the Queue data structure follows the FIFO (First In, First Out) principle to
manipulate the data elements. Queues can be implemented using Arrays,
Linked Lists, or Stacks. Some real-life examples of Queues are a line at the ticket
counter, an escalator, a car wash, and many more.
Figure 7. A Real-life Example of Queue
The above image is a real-life illustration of a movie ticket counter that can help us
understand the Queue, where the customer who comes first is always served first
and the customer arriving last is served last. Both ends of the Queue
are open and execute different operations. Another example is a food-court line,
where a customer joins at the rear end and is removed from the
front end after receiving the service they asked for.
Figure 8. A Queue
a. Queues are responsible for CPU scheduling, Job scheduling, and Disk
scheduling.
b. Queues are also responsible for handling interrupts generated by the User
Applications for the CPU.
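The FIFO behaviour can be sketched with Python's double-ended queue (the "customer" values are illustrative assumptions):

```python
from collections import deque

queue = deque()
queue.append("customer1")    # enqueue at the rear end
queue.append("customer2")
first = queue.popleft()      # dequeue from the front end: first in, first out
```

The first customer to join the line is the first one served, mirroring the ticket-counter example.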
Non-Linear Data Structures are data structures where the data elements are not
arranged in sequential order. Here, the insertion and removal of data are not feasible
in a linear manner. There exists a hierarchical relationship between the individual
data items.
The following is the list of Non-Linear Data Structures that we generally use:
1. Trees
The Tree data structure is a specialized method to arrange and collect data in the
computer to be utilized more effectively. It contains a central node, structural nodes,
and sub-nodes connected via edges. We can also say that the tree data structure
consists of roots, branches, and leaves connected.
Figure 9. A Tree
a. Binary Tree: A Tree data structure where each parent node can have at most
two children is termed a Binary Tree.
b. Binary Search Tree: A Binary Search Tree is a Tree data structure where we
can easily maintain a sorted list of numbers.
c. AVL Tree: An AVL Tree is a self-balancing Binary Search Tree where each
node maintains extra information known as a Balance Factor whose value is
either -1, 0, or +1.
a. Trees are also responsible for parsing expressions and statements in the
compilers of different programming languages.
b. We can use Trees to store data keys for indexing in a Database Management
System (DBMS).
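The parent/child structure above can be sketched with a minimal binary-tree node in Python (the class and the example values are illustrative assumptions); the height helper is the quantity that the AVL balance factor later compares:

```python
class TreeNode:
    """A binary tree node: at most two children, left and right."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def height(node):
    """Height of a binary (sub)tree: length of the longest root-to-leaf path;
    an empty tree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

# Root 1 with children 2 and 3; node 2 has one child, 4.
root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
```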
2. Graphs
A Graph is a non-linear data structure consisting of a set of vertices (V)
connected by a set of edges (E), formally written as:
G = (V, E)
Depending upon the position of the vertices and edges, the Graphs can be
classified into different types:
a. Null Graph: A Graph with an empty set of edges is termed a Null Graph.
b. Trivial Graph: A Graph having only one vertex is termed a Trivial Graph.
c. Simple Graph: A Graph with neither self-loops nor multiple edges is known
as a Simple Graph.
d. Connected Graph: A Graph with at least one path between every pair of
vertices is termed a Connected Graph.
e. Disconnected Graph: A Graph where no path exists between
at least one pair of vertices is termed a Disconnected Graph.
f. Regular Graph: A Graph where all vertices have the same degree is termed
a Regular Graph.
g. Complete Graph: A Graph in which there is an edge between every
pair of vertices is known as a Complete Graph.
h. Cycle Graph: A Graph is said to be a Cycle Graph if it has at least three
vertices and its edges form a single cycle.
i. Cyclic Graph: A Graph is said to be Cyclic if and only if at least one cycle
exists in it.
j. Finite Graph: A Graph with a finite number of vertices and edges is known as
a Finite Graph.
k. Infinite Graph: A Graph with an infinite number of vertices and edges is
known as an Infinite Graph.
l. Bipartite Graph: A Graph whose vertices can be divided into two independent
sets A and B, such that every edge connects a vertex in set A to a vertex in
set B, is termed a Bipartite Graph.
m. Euler Graph: A Graph is said to be an Euler Graph if and only if all its
vertices have even degree.
a. Graphs are also used in utility networks in order to identify the problems
posed to local or municipal corporations.
b. Graphs are also used to make document link maps of websites in order to
display the connectivity between the pages through hyperlinks.
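A common way to represent G = (V, E) in code is an adjacency list, where each vertex maps to its neighbours. The sketch below (vertex names are illustrative assumptions) also reads off a vertex's degree, the quantity the Regular and Complete Graph definitions above rely on:

```python
# G = (V, E) as an adjacency list: vertex -> list of adjacent vertices.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],   # every pair is connected: a complete graph on 3 vertices
}

def degree(g, v):
    """Degree of vertex v: the number of edges incident to it."""
    return len(g[v])
```

Here every vertex has degree 2, so this graph is regular as well as complete.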
In the following section, we will discuss the different types of operations that we can
perform to manipulate data in every data structure:
a. Compile-time
b. Run-time
For example, the malloc() function is used in the C Language to allocate
memory for a data structure at run time.
8. Selection: Selection means selecting a particular data from the available
data. We can select any particular data by specifying conditions inside the
loop.
9. Update: The Update operation allows us to update or modify the data in the
data structure. We can also update any particular data by specifying some
conditions inside the loop, like the Selection operation.
10. Splitting: The Splitting operation allows us to divide data into various
subparts, decreasing the overall process completion time.
Time Complexity
To travel from House A to House B, the above modes of transport are possible, but
each involves a trade-off between cost and speed. We do not prefer the rocket, even
though it is the fastest, because of its cost. Similarly, we do not prefer the bicycle,
even though it is cheap, because speed is compromised.
In the same way, an algorithm must be evaluated on these parameters to classify it
as a best-case, average-case, or worst-case algorithm.
1. Big-O Notation
We define an algorithm's worst-case time complexity using Big-O notation,
which describes the set of functions that grow slower than or at the same rate as the
given expression. It expresses the maximum amount of time an algorithm
requires over all input values.
2. Omega Notation
It defines the best case of an algorithm's time complexity; the Omega notation
describes the set of functions that grow faster than or at the same rate as the
given expression. It expresses the minimum amount of time an algorithm
requires over all input values.
3. Theta Notation
It defines the average case of an algorithm's time complexity; the Theta notation
is used when the set of functions lies in both O(expression) and
Omega(expression). This is how we define the average-case time complexity
of an algorithm.
Some searching and sorting algorithms and their time complexities:
Space complexity
Space complexity refers to the total amount of memory space used by an
algorithm/program, including the space of input values for execution.
Method for Calculating Space and Time Complexity
Methods for Calculating Time Complexity
To calculate time complexity, you must consider each line of code in the program.
Consider the multiplication function as an example. Now, calculate the time
complexity of the multiply function:
1. mul <- 1
2. i <- 1
3. while i <= n do
4.     mul <- mul * i
5.     i <- i + 1
6. end while
Let T(n) be the algorithm's time complexity. Lines 1 and 2 have a time
complexity of O(1). Line 3 represents a loop, so lines 4
and 5 are repeated n times. As a result, the time complexity of lines 4 and 5 is O(n).
Finally, adding the time complexity of all the lines yields the overall time complexity
of the multiply function: T(n) = O(n).
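The pseudocode above can be written as runnable Python, assuming the intended loop body is mul * i (a running product); the comments map each line to its cost:

```python
def multiply(n):
    """Iterative product 1 * 2 * ... * n; the loop body runs n times, so T(n) = O(n)."""
    mul = 1             # line 1: O(1)
    i = 1               # line 2: O(1)
    while i <= n:       # line 3: loop condition, checked n + 1 times
        mul = mul * i   # line 4: executed n times -> O(n) in total
        i = i + 1       # line 5: executed n times -> O(n) in total
    return mul
```

Summing the per-line costs gives the overall O(n) bound derived in the text.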
The iterative method gets its name because it calculates an iterative algorithm's time
complexity by parsing it line by line and adding the complexity.
Aside from the iterative method, several other techniques are used in various cases.
The recursive method, for example, is an excellent way to calculate the time complexity
of recursive solutions, using recursion trees or the substitution method. The Master
Theorem is another popular method for calculating time complexity.
Methods for Calculating Space Complexity
In this section, you will go over how to calculate space complexity using an example.
Here is an example that computes the product of the elements of an array:
1. int mul <- 1, i <- 1
2. while i <= n do
3.     mul <- mul * array[i]
4.     i <- i + 1
5. end while
6. return mul
Let S(n) denote the algorithm's space complexity. In most systems, an integer
occupies 4 bytes of memory, so the number of allocated bytes gives the
space complexity.
Line 1 allocates memory space for two integers, giving 4 bytes × 2 = 8 bytes. Line 2
represents a loop. Lines 3 and 4 assign values to already existing variables, so no
extra space needs to be set aside. The return statement in line 6 allocates one more
memory space. As a result, S(n) = 4 × 2 + 4 = 12 bytes.
Because the algorithm also uses an array of n integers, the final
space complexity is S(n) = 4n + 12 = O(n).
As you progress through this tutorial, you will see some differences between space
and time complexity.
Conclusion
Thus, while time complexity and space complexity share similarities in their role in
evaluating algorithm efficiency and scalability, they fundamentally differ in the
specific resources they measure and their implications on algorithm design and
performance.
------------------------------------------------------------------------------------------------------------
Binary Search Tree
A Binary Search Tree is a data structure used in computer science for organizing
and storing data in a sorted manner. Each node in a Binary Search Tree has at most
two children, a left child and a right child, with the left child containing values less
than the parent node and the right child containing values greater than the parent
node. This hierarchical structure allows for efficient searching, insertion,
and deletion operations on the data stored in the tree.
Binary Search Tree
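A minimal BST sketch in Python (the helper names and example keys are illustrative assumptions, not the textbook's code); the in-order traversal demonstrates the "sorted manner" property described above:

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None     # subtree of keys smaller than self.key
        self.right = None    # subtree of keys greater than self.key

def insert(root, key):
    """Insert key while preserving the BST ordering property."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    """In-order traversal (left, node, right) visits keys in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
```

Searching follows the same comparisons as insertion, discarding half the remaining tree at each step when the tree is balanced.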
AVL Trees
An AVL tree is defined as a self-balancing Binary Search Tree (BST) in which the
difference between the heights of the left and right subtrees of any node cannot be
more than one.
The difference between the heights of the left subtree and the right subtree for any
node is known as the balance factor of the node.
The AVL tree is named after its inventors, Georgy Adelson-Velsky and Evgenii
Landis, who published it in their 1962 paper “An algorithm for the organization of
information”.
The tree rotations involve rearranging the tree structure without changing the order
of elements. The positions of the nodes of a subtree are interchanged. There are
four types of AVL rotations:
1. Left Rotation (LL Rotation)
In a left rotation, a node's right child becomes the new root, while the original node
becomes the left child of the new root. The new root's left child becomes the original
node's right child.
2. Right Rotation (RR Rotation)
In a right rotation, a node's left child becomes the new root, while the original node
becomes the right child of the new root. The new root's right child becomes the
original node's left child.
3. Left-Right Rotation (LR Rotation)
Step 1: START
Step 5: Else, perform tree rotations according to the insertion performed. Once the
tree is balanced, go to step 6.
Step 6: END
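The two single rotations described above can be sketched in Python (a minimal illustration; the node class, helper names, and the 15/20/25 example are assumptions, not the textbook's code):

```python
class AVLNode:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def balance_factor(n):
    """H(left subtree) - H(right subtree); must stay in {-1, 0, +1}."""
    return height(n.left) - height(n.right)

def rotate_left(x):
    """x's right child y becomes the new root; x becomes y's left child."""
    y = x.right
    x.right = y.left    # y's old left child becomes x's new right child
    y.left = x
    return y

def rotate_right(x):
    """x's left child y becomes the new root; x becomes y's right child."""
    y = x.left
    x.left = y.right    # y's old right child becomes x's new left child
    y.right = x
    return y

# Right-heavy chain 15 -> 20 -> 25: balance factor of 15 is -2,
# which a single left rotation repairs.
root = AVLNode(15, right=AVLNode(20, right=AVLNode(25)))
root = rotate_left(root)
```

After the rotation, 20 is the root with 15 and 25 as children, and every balance factor is back in range. Double rotations (LR, RL) are just one of these rotations applied to the child followed by the other applied to the node itself.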
The above tree is not AVL because the differences between the heights of the left
and right subtrees for 8 and 12 are greater than 1.
Insert 9
Insert 15
Insert 20
Imbalance at node 15: H(LST) − H(RST) = 0 − 2 = −2,
which is not in the range {−1, 0, +1}.
Insert 8
Insert 7
Imbalance at node 8
Insert :13
Imbalance at 15 because of 8
With an LR rotation on 15-8-9: the subtree is left-heavy, so perform a left rotation
followed by a right (RR) rotation.
Insert 10
AVL Deletion
AVL Deletion Example:
Delete 88
Delete 99
Left heavy -RR rotation
Delete 22
1. Time Complexity:
Operation    Best Case    Average Case    Worst Case
Search       O(log n)     O(log n)        O(log n)
Insertion    O(log n)     O(log n)        O(log n)
Deletion     O(log n)     O(log n)        O(log n)
Disadvantages of AVL Trees:
1. Implementing it is challenging.
2. Certain procedures have high constant factors.
3. Compared to Red-Black trees, AVL trees are less common.
4. Because of their rather rigid balance, AVL trees have complex insertion and
removal procedures, as additional rotations are performed.
5. They require more processing for balancing.
-----------------------------------------------------------------------------------------------------
M-way Search Tree
An M-way search tree is a generalization of a Binary Search Tree in which each
node holds up to M−1 keys in sorted order and has at most M children.
Insert 14,
Insert 32
Insert
Insert 75, 35, 36:
Deletion of M-way tree
B Tree
A B-tree is a self-balancing search tree in which all the leaf nodes are at the same
level, which allows for efficient searching, insertion, and deletion of records.
Because all the leaf nodes are at the same level, every record is reached in the
same number of node visits, regardless of where it sits in the data set.
Characteristics of B-Tree:
B-trees have several important characteristics that make them useful for storing and
retrieving large amounts of data efficiently. Some of the key characteristics of B-trees
are:
Balanced: B-trees are balanced, meaning that all leaf nodes are at the same
level. This ensures that the time required to access data in the tree remains
constant, regardless of the size of the data set.
Self-balancing: B-trees are self-balancing, which means that as new data is
inserted or old data is deleted, the tree automatically adjusts to maintain its
balance.
Multiple keys per node: B-trees allow multiple keys to be stored in each node.
This allows for efficient use of memory and reduces the height of the tree,
which in turn reduces the number of disk accesses required to retrieve data.
Ordered: B-trees maintain the order of the keys, which makes searching and
range queries efficient.
Efficient for large data sets: B-trees are particularly useful for storing and
retrieving large amounts of data, as they minimize the number of disk
accesses required to find a particular piece of data.
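As a rough illustration of how multiple sorted keys per node keep lookups short, here is a sketch of B-tree search in Python (the node layout and the example keys are assumptions for illustration; insertion and node splitting are omitted):

```python
import bisect

class BTreeNode:
    """A node holding several sorted keys, with children between/around them."""
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted list of keys in this node
        self.children = children or []      # empty list for a leaf node

def btree_search(node, key):
    """Scan the node's sorted keys, then descend into the matching child."""
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return True                         # key found in this node
    if not node.children:
        return False                        # leaf reached: key is absent
    return btree_search(node.children[i], key)

# Root [12] with two leaf children, [6, 8] and [15, 19, 23].
root = BTreeNode([12], [BTreeNode([6, 8]), BTreeNode([15, 19, 23])])
```

Each node visited rules out many keys at once, which is why B-trees minimize disk accesses on large data sets.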
Application of B-Tree:
B-trees are commonly used in applications where large amounts of data need to be
stored and retrieved efficiently. Some of the specific applications of B-trees include:
Databases: B-trees are widely used in databases to store indexes that allow
for efficient searching and retrieval of data.
File systems: B-trees are used in file systems to organize and store files
efficiently.
Operating systems: B-trees are used in operating systems to manage
memory efficiently.
Network routers: B-trees are used in network routers to efficiently route
packets through the network.
DNS servers: B-trees are used in Domain Name System (DNS) servers to
store and retrieve information about domain names.
Compiler symbol tables: B-trees are used in compilers to store symbol tables
that allow for efficient compilation of code.
Advantages of B-Tree:
B-trees have several advantages over other data structures for storing and retrieving
large amounts of data. Some of the key advantages of B-trees include:
Sequential Traversing: As the keys are kept in sorted order, the tree can be
traversed sequentially.
Minimize disk reads: It is a hierarchical structure and thus minimizes disk
reads.
Partially full blocks: The B-tree has partially full blocks which speed up
insertion and deletion.
Disadvantages of B-Tree:
B Tree Algorithm
Insert 12
Insert 23
Insert 6
But the node is full, so take the median of 6, 12, 23.
12 is the median, so push it up and make the left keys the LST and the right keys the RST.
Insert 8
Insert 15
Insert 19
19 is the median, so push it up to the previous node, where space is available. Make
15 the LST and 23 the RST.
Insert 45:
Insert 1:
There is space, so 1 fits.
Insert 4: there is space, so 4 fits as well.
Insert 7:
There is space in the node containing 8, so place 7.
Insert 5
To the left of 6, we have 1 & 4.
So the node becomes 1, 4, 5.
Split the node at 4.
Final Answer
-----------------------------------------------------------------------------------------------------
B- Tree Deletion Example:
Delete 32
Simply delete 32 because it is a leaf node and this is case 1.1, so just delete the key.
Delete 14
Case 1.2.1: the largest key of the left sibling is 12, but if it is borrowed, that sibling
is left with fewer than the minimum number of keys, so it cannot be used.
Case 1.2.2: the smallest key of the right sibling is 22, and after it is borrowed, that
sibling still has more than the minimum number of keys, so we can use it.
Now move 22 up into the parent in place of 21, and shift 21 down to the child node.
Delete 15
Merge the left sibling with the parent key and delete the target node; the link to the
left of 13 is removed.
But here 22 is a single-key node with fewer than the minimum number of keys, so we
merge further.
The root node can hold 67 as a single key.
Delete 56:
The largest key in the left subtree is 55; it replaces 56, and then 55 is deleted from the leaf it came from.
Delete 34
Borrowing from the LST would leave it with fewer than the minimum number of keys, so we borrow from the RST instead: its smallest key, 41.
Delete 41
Both siblings have fewer than the minimum number of keys, so we merge both with the
parent key, giving 23, 31, 41, 43, 44, and then delete 41.
Height = 1
-----------------------------------------------------------------------------------------------------------