DSA Module 1
DSA Module 1
Module 1
Basic Concepts
❖ Data structure
In Computer Science, a data structure is a way of organizing information, so that it is easier to use.
Data structures determine the way in which information can be stored in computer and used. If the
focus of use is on the things that can be done, people often talk about an abstract data type (ADT). Data
structures are often optimized for certain operations. Finding the best data structure when solving
problem is an important part of programming. Programs that use the right data structure are easier to
write, and work better.
➢ Need of data structure
● It gives different level of organization data.
● It tells how data can be stored and accessed in its elementary level.
● Provide operation on group of data, such as adding an item, looking up
highest priority item.
● Provide a means to manage huge amount of data efficiently.
● Provide fast searching and sorting of data.
It can be defined as a collection of data items together with the operations on the data. The word
“abstract” refers to the fact that the data and the basic operations defined on it are being studied
independently of how they are implemented. It involves what can be done with the data, not
how has to be done. For ex, in the below figure the user would be involved in checking that what
can be done with the data collected not how it has to be done
ADT’s are entities that are definition of data and operations but do not have
implementation details.
Specifies the logical properties of data type or data structure.
The operations that can be performed on that data.
Refers to the mathematical concept that governs them.
They are not concerned with the implementation details like space and time efficiency
➔ Examples include:
● A queue is a first-in first-out list. Variations are Deque and Priority queue.
● A set can store certain values, without any particular order, and with no repeated values.
● A stack is a last-in, first out.
● A tree is a hierarchical structure.
● A graph.
● A hash or dictionary or map or Map/Associative array/Dictionary is a more flexible
variation on a record, in which name-value pairs can be added and deleted freely.
● A smart pointer is the abstract counterpart to a pointer.
In computer science, an abstract data type (ADT) is a mathematical model for data types where
a data type is defined by its behavior from the point of view of a user of the data, specifically in
terms of possible values, possible operations on data of this type, and the behavior of these
operations. This contrasts with data structures, which are concrete representations of data, and
are the point of view of an implementer, not a user.
The notion of a data structure in the abstract needs to be treated differently from whatever is
used to implement the structure. The abstract notion of a data structure is defined in terms of
theoperations we plan to perform on the data.
Considering both the organization of data and the expected operations on the data, leads to the
notion of an abstract data type. An abstract data type in a theoretical construct that consists of
data as well as the operations to be performed on the data while hiding implementation.
For example, if we want to read a file, we wrote the code to read the physical file device. That
is, we may have to write the same code over and over again. So we created what is known today
as an ADT. We wrote the code to read a file and placed it in a library for a programmer to use.
As another example, the code to read from a keyboard is an ADT. It has a data structure, character
and set of operations that can be used to read that data structure.
For example, a stack is a typical abstract data type. Items stored in a stack can only be added
and removed in certain order – the last item added is the first item removed. We call these
operations, pushing and popping. In this definition, we haven’t specified have items are stored on
the stack, or how the items are pushed and popped. We have only specified the valid operations
that can be performed.
Classification of Data Structure:
Data structures are normally divided into two broad categories.
1. Primitive Data Structures (Built-in)
2. Non primitive Data Structure (User defined)
These are basic structures and are directly operated upon by the machine instructions. These, in
general, have different representations on different computers. Integers, floating-point
numbers, character constants, string constants, pointers etc. fall in this category.
The primitive data types are the basic data types that are available in most of the programming
languages. The primitive data types are used to represent single values.
Eg: 12, 90
● Float and Double: This is used to represent a number with decimal point.
These are more complicated data structures. These are derived from the primitive data
structures. The non-primitive data structures emphasize on structuring of a group of
homogeneous (same type) or heterogeneous (different type) data items. Arrays, structures,
lists and files are examples.
The data appearing in our data structures are processed by means of certain operations. In fact,
the particular data structure that one chooses for a given situation depends largely on the
frequency with which specific operations are performed.
● Arrays
● Structure
● Union
● linked list
● Stacks
● Queue etc.
Q. Enlist Difference between primitive data structure and non-primitive data
structure
2. Primitive data are only single values; they have not special capabilities.
3. The examples of Primitive data types are given byte, short, int, long, float, double, char etc.
4. The integer reals, logic data character data pointer and reference are primitive data
structures data structure that normally are directly operated upon by machine level
instructions are known as primitive structure and data type.
1. A non-primitive data type is something else such as an array structure or class is known as
thenon-primitive data type.
2. The data type that are derived from primary data types are known as non-primitive data type.
3. The non-primitive data types are used to store the group of values.
The basic operations that are performed on data structures are as follows:
1. Traversing: Accessing each record exactly once so that certain items in the record maybe
processed. (This accessing and processing is sometimes called “visiting” the record).
2. Searching: Searching operation finds the presence of the desired data item in the list of
data item. It may also find the locations of all elements that satisfy certain conditions.
3. Inserting: Inserting means addition of a new data element in a data structure.
4. Deleting: Deleting means removal of a data element from a data structure.
5. Sorting: Sorting is the process of arranging all data items in a data structure in aparticular
order say for example, either in ascending order or in descending order.
6. Merging: Combining the records of two different sorted files into a single sorted file.
7. Updation: It update or modifies the data in data Structure.
8. Splitting: It is process of partitioning Single list to multiple list.
9. Destroy: It destroys the memory space allocated for specified data structure.
10. Selection: It deals with accessing a particular data within a data structure.
1. Arrays
An array is defined as a set of finite number of homogeneous elements or data items. It means an
array can contain one type of data only, either all integers, all floating- point numbers, or all
characters.
Declaration of arrays is as follows:
int A[10];
where int specifies the data type or type of elements array stores. “A” is the name of the array, and
the number specified inside the square brackets is the number of elements an array can store,this
is also called size or length of array.
2. The number of elements that can be stored in an array i.e., the size of array or its length is given
by the following equation :(upper bound – lower bound) + 1
For the above array, it would be (9-0) + 1 = 10. Where 0 is the lower bound of array, and 9 is the
upper bound of array.
Arrays can always be read or written through loop. If we read a one-dimensional array, it
3.
requires one loop for reading and other for writing (printing) the array. For example:
If we are reading or writing two-dimensional array it would require two loops. And
similarly, the array of n dimension would require n loops.
Some common operations performed on arrays are:
1. Creation of an array.
2. Traversing an array (accessing array elements).
3. Modification of an element.
4. Merging of arrays.
2. Linked Lists
A linked list is a linear collection of data elements, called node pointing to the next nodes by means
of pointers. Each node is divided into two parts: the first part containing the information of the
element, and the second part called the link or next pointer containing the address of the next
node in the list. Technically, each such element is referred to as a node, therefore a list can be
defined as a collection of nodes as shown in Fig. (2) below:
3. Stacks
A stack is a non-primitive linear data structure. It is an ordered list in which addition of new data
item and deletion of already existing data item is done from only one end, known as Top of Stack
(TOS) . As all the deletion and insertion in a stack is done from top of the stack, the last added
element will be the first to be removed from the stack. Due to this reason, the stack is also called
4. Queues
Queue is a linear data structure that permits insertion of an element at one end and deletion of an
element at the other end. The end at which the deletion of an element take place is called front,
and the end at which insertion of a new element can take place is called rear. The deletion or
insertion of elements can take place only at the front and rear end of the list respectively.
The first element that gets added into the queue is the first one to get removed from the list. Hence,
Queue is also referred to as First-In-First-Out (FIFO) list. The name ‘ Queue’ comes from the
everyday use of the term. Consider a railway reservation booth, at which we have to getinto the
reservation queue. New customers got into the queue from the rear end, whereas the customers
who get their seats reserved leave the queue from the front end. It means the customers are
serviced in the order in which they arrive the service center (i.e. first come first serve type of
service). The same characteristics apply to our Queue. Fig. (3) shows the pictorial representation
of a Queue.
In Fig (3), 10 is the first element and 80 is the last element added to the Queue. Similarly, 10
would be the first element to get removed and 80 would be the last element to get removed.
5. Trees
A Tree can be defined as a finite set of data items (nodes). Tree is a non-linear type of data
structure in which data items are arranged of stored in a sorted sequence. Trees represent the
hierarchical relationship between various elements. In trees :
1. There is a special data item at the top of hierarchy called the Root of the Tree.
2. The remaining data items are partitioned into number of mutually exclusive (i.e. disjoint)
subsets, each of which is itself, a tree, which is called the subtree.
3. The tree always grows in length towards bottom in data structures, unlike natural trees
which grows upwards.
The tree structure organizes the data into branches, which relate the information. A tree is shown
in Fig. (4).
6. Graphs
Data sometimes contain a relationship between pairs of elements which is not necessarily
hierarchical in nature. Geometrical arrangement are very important while working with real life
projects. For example, let us suppose there are five villages separated by a river. Now we want to
construct bridges to connect these villages as shown in Fig. (5)
We can reduce the landmass of villages to a dot and we can change the shape of bridges.
This will not change geometric arrangement of paths to connect different villages. We
can draw a similar geometry of our project as follows :
We have represented villages by dots (which are called vertex or node) and bridges by lines
which are called edges. This type of drawing is called graph. Hence a graph can be defined as a
ordered set (V,E), where V(G) represents the set of all elements called vertices and E(G)
represents the edges between these vertices.
Fig. (6) Shows a graph, for which V(G)={ v1, v2, v3, v4, v5 } and E(G) = { b1, b2, b3, b4}.
Algorithms:
Algorithm is a step-by-step procedure to solve a particular function i.e., it is a set of
instructions written to carry out certain tasks and the data structure is the way of
organizing the data withtheir logical relationship maintained.
An algorithm is a finite sequence of instructions, each of which has a clear
meaning and can be performed with a finite amount of effort in a finite length of time.
No matter what the input values may be, an algorithm terminates after executing a finite
number of instructions.
An algorithm is defined as a step-by-step procedure or method for solving a
problem by acomputer in a finite number of steps. Steps of an algorithm definition may
include branching or repetition depending upon what problem the algorithm is being
developed for. While defining an algorithm steps are written in human understandable
Notes by Prof. Ankita Mandore
CSE (Data Science)
DSCE
DAYANANDA SAGAR COLLEGE OF ENGINEERING
(An autonomous Institution affiliated by VTU, Belagavi)
Characteristic of Algorithm
● Definiteness (Unambiguous) − Algorithm should be clear and unambiguous. Each of its steps (or
phases), and their inputs/outputs should be clear, and Every statement should have single and exact
meaning. You cannot write statement which cannot understandable.
● Input − An algorithm should have 0 or more well-defined inputs.
● Output − An algorithm should have 1 or more well-defined outputs
● and should match the desired output.
● Finiteness − if we trace out the instructions of an algorithm, then for all cases the algorithm will
terminate after a finite number of steps.
● Feasibility − should be feasible with the available resources.
● Effectiveness: Every step must be accurate. Don’t write unnecessary step which are not required.
Example
Algorithms tell the programmers how to code the program. Alternatively, thealgorithm can be
written as −
Notes by Prof. Ankita Mandore
CSE (Data Science)
DSCE
DAYANANDA SAGAR COLLEGE OF ENGINEERING
(An autonomous Institution affiliated by VTU, Belagavi)
Recursion:
The process in which a function calls itself directly or indirectly is called recursion and
the corresponding function is called a recursive function. Using a recursive algorithm,
certain problems can be solved quite easily. Examples of such problems are Towers of
Hanoi (TOH), Inorder/Preorder/Postorder Tree Traversals, DFS of Graph, etc. A
recursive function solves a particular problem by calling a copy of itself and solving
smaller subproblems of the original problems. Many more recursive calls can be
generated as and when required. It is essential to know that we should provide a certain
case in order to terminate this recursion process. So we can say that every time the
function calls itself with a simpler version of the original problem.
Need of Recursion
Recursion is an amazing technique with the help of which we can reduce the length of
our code and make it easier to read and write. It has certain advantages over the
iteration technique which will be discussed later. A task that can be defined with its
similar subtask, recursion is one of the best solutions for it. For example; The Factorial
of a number.
What is the difference between direct and indirect recursion?
A function fun is called direct recursive if it calls the same function fun. A function fun
is called indirect recursive if it calls another function say fun_new and fun_new calls
fun directly or indirectly.
directRecFun();
// Some code...
}
indirectRecFun2();
// Some code...
}
void indirectRecFun2()
{
// Some code...
indirectRecFun1();
// Some code...}
Factorial Using Recursion:
1. 5! = 5*4*3*2*1 = 120
2. 3! = 3*2*1 = 6
Here, 5! is pronounced as "5 factorial", it is also called "5 bang" or "5 shriek".
There are many ways to write the factorial program in c language. Let's see the 2 ways to
write the factorial program.
#include<stdio.h>
int fact(int);
int main()
{
int x,n;
printf(" Enter the Number to Find Factorial :");
scanf("%d",&n);
x=fact(n);
printf(" Factorial of %d is %d",n,x);
return 0;
}
int fact(int n)
{
Notes by Prof. Ankita Mandore
CSE (Data Science)
DSCE
DAYANANDA SAGAR COLLEGE OF ENGINEERING
(An autonomous Institution affiliated by VTU, Belagavi)
if(n==0)
return(1);
return(n*fact(n-1));
}
Assume that we have an array named “Num” of size “n,” and we have to find an element
“key” in that array.
Step 1: START
Step 2: Initialize Left =0 and Right = n-1.
Step 3: Find the middle using the formula
Middle = Left + (Right - Left)/2.
Step 4: If Middle = key,
Then return “Middle” and print “element found.”
Step 5: If Middle > Key,
Then set Right = Middle-1 and GOTO step 3.
Step 6: If Middle < Key,
Then set Left = Middle + 1 and GOTO step 3.
Step 7: Repeat steps 3 to 6 till Left >= Right.
Step 8: If Left > Right
Return -1 and print “element not found.”
Step 9: STOP
int main() {
int size = 0, key = 0, found = 0;
printf("Enter the size of the array: ");
scanf("%d", & size);
int num[size];
printf("Enter the elements of the array in ascending order: ");
for (int i = 0; i < size; i++) {
scanf("%d", & num[i]);
}
printf("Enter the element to be searched: ");
scanf("%d", & key);
found = b_search(num, 0, size - 1, key);
if (found == -1) {
printf("Element is not present in the array");
}
else {
printf("Element found at index %d", found);
}
return 0;
}
Notes by Prof. Ankita Mandore
CSE (Data Science)
DSCE
DAYANANDA SAGAR COLLEGE OF ENGINEERING
(An autonomous Institution affiliated by VTU, Belagavi)
Tower of Hanoi
Tower of Hanoi is a mathematical puzzle where we have three rods (A, B, and C)
and N disks. Initially, all the disks are stacked in decreasing value of diameter i.e., the
smallest disk is placed on the top and they are on rod A. The objective of the puzzle is
to move the entire stack to another rod (here considered C), obeying the following
simple rules:
Only one disk can be moved at a time.
Each move consists of taking the upper disk from one of the stacks and placing it
on top of another stack i.e. a disk can only be moved if it is the uppermost disk on a
stack.
No disk may be placed on top of a smaller disk.
Tower of Hanoi using Recursion:
The idea is to use the helper node to reach the destination using recursion. Below is the
pattern for this problem:
Shift ‘N-1’ disks from ‘A’ to ‘B’, using C.
Shift last disk from ‘A’ to ‘C’.
Shift ‘N-1’ disks from ‘B’ to ‘C’, using A.
int main()
{
int no_of_disks ;
scanf("%d", &no_of_disks);
toH(no_of_disks, 'A','C','B');
return 0;
}
In more detail, a stack is a data structure that follows the Last-In-First-Out (LIFO)
principle. This means that the last item added to the stack is the first one to be removed.
When a recursive call is made, the current state of the function, including the values of
its variables and its position in the program, is pushed onto the stack. This is known as a
stack frame. The function then calls itself with new arguments, and a new stack frame is
created.
The stack is crucial in managing recursive calls because it allows the program to
'remember' where it left off each time a function calls itself. When the base case of the
recursion is reached and the function starts returning, the stack begins to unwind. Each
stack frame is popped off the stack, and the function resumes execution where it left off,
with all its variables restored to their previous state.
Without a stack, recursive calls would be much more difficult to manage. The program
would need some other way to keep track of the state of each recursive call and to
resume execution correctly after each return. The stack provides a simple and efficient
way to handle this.
However, it's important to note that the stack has a limited size. If a program makes too
many recursive calls and the stack overflows, this can lead to a runtime error. This is
one of the potential drawbacks of recursion and something that programmers need to
be aware of.
The stack plays a vital role in executing recursive calls. It provides a way to manage the
state of each call, to resume execution correctly after each return, and to handle the
potentially large number of calls involved in a recursive process.
Notes By- Prof. Ankita Mandore
CSE (Data Science)
DSCE.
The linear array discussed in the previous chapter allowed one to insert or deletes
elements at any place in the list - at the beginning, at the end, or in the middle. There
are certain frequent situations in computer science when one wants to restrict insertion
or deletion so that they can take place only at the beginning or at the end of the list, but
not in the middle. Two of data structure that are useful in such situations are stacks
and queue.
STACK:-
A stack is a list of elements in which an element may be inserted or deleted only at one
end, called the top of the stack. e.g A stack of dishes.
0P "
|0 |o
>|m
Observe that an item may be added or removed only from the top of the stack. This
means in that the last item to be added to a stack is the first item to be removed.
Accordingly the stack are also called as last in first out (LIFO) lists.
If a stack contains a single item and the stack is popped, the resulting stack contains
no items and is called the empty stack.
Primary operations: Push and Pop
Push
m Add an element to the top of the stack
1Pop
m Remove the element at the top of the stack
=l
emply stack push an element push another
Primitive Operations on Stack:
: ;
i ed v;m:k acks.
Special terminology is used for two basic operations associat s\:vh ich e
a) Create — Creates a new stack. This operation creates a new sta Ply.
b) PUSH - is the term used to insert an element into a sttackk.
¢) POP - is the term used to delete an element from a stack. : :
d) ISEMPTY- Checks whether a stack is empty. This operations fetut';gz ZeRcL;Eslf the
stack is empty and false otherwise. This is required for the pop opera € we
cannot pop from an empty stack. : 2 h th
e) ISOVERFLOW (STACK OVERFLOW)- This is the situation when the stack
becomes full, and no more elements can be pushed onto stack.
In executing the procedure Push; one must first test Whether there is
for the new item if not then we room in the stack
have the condition known an ove
executing the procedure pop one rflow. Similarly in
must first test whether there is
stack to be deleted, if not, then we an element in the
have the condition known as und
erflow.
Application Of Stack:- application. Stacks
use d in ope rat ing sys tems, by compiler and by
Stacks are widely expressions etc.
also use d in pro ced ure call , to eval |uation of arithmetic
are
Some of the applications are:
s
. Recursion, subroutine call
evaluation.
» Expression conversion and sion)
tching parenthese: s in expres
« Well formed parentheses (ma
n
. Decimal to binary conversio
Postfix Expression:-
Transforming (Conversion) Infix Expression into
Q into its postfix expression P.
The following algorithm transforms the infix expression
s and left parenthesis. The
The algorithm uses a stack to temporarily hold operator
operands from Q
postfix expression P will be constructed from left to right using the
ing a left
and the operators, which are removed from stack. We begin by push
m is
parenthesis onto stack and adding a right parenthesis at the end of Q. The algorith
completed when stack is empty.
1. Push '(' onto stack and add ')’ at the end of Q.
2. Scan Q from left to right and repeat step 3 to 6 for each element of Q until the
stack is empty.
3. If an operand is encountered, add it to P.
4. If a left parenthesis is encountered, Push it onto stack.
5. If an operator (x) is encountered then,
a) Repeatedly pop from stack and add to i
precedence as or higher than (x) R e (e saime
b) Add (x) to stack.
[End of if ]
6. If a right parenthesis is encountered, then
a) Repeatedly pop from stack and add to P each operator (on
the top of stack)
until a left parenthesis is encountered : ‘
b) Remove the left parenthesis. [Do not add the left parenthesis to P)
[End of If]
[End of step 2 loop]
7. Exit