CPE 322 Data Structure and Algorithm
LECTURE NOTE
ON
INTRODUCTION TO DATA STRUCTURE
A data structure is a way of storing and organizing a group of data elements in the computer so that
they can be used efficiently. Some examples of data structures are arrays, linked lists, stacks, and
queues. Data structures are widely used in almost every area of computer science, e.g. operating
systems, compiler design, artificial intelligence, graphics, and many more.
A data structure helps you to understand the relationship of one data element to another and to organize
the elements within memory. Sometimes the organization is simple. For example, a list of the names of the
months in a year is a linear data structure, while a list of historical places in the world can be
organized as a non-linear data structure. A data structure helps you to analyze data, store it, and
organize it in a logical and mathematical manner.
Data and Information
1. Raw facts gathered about a condition, event, idea, entity, or anything else, which are bare and
unorganized, are called data. Information refers to facts concerning a particular event or subject that
have been refined by processing.
2. Data are simple text and numbers, while information is processed and interpreted data.
Data is in an unorganized form, i.e. it consists of randomly collected facts and figures that must be
processed before conclusions can be drawn. When data is organized, it becomes information, which presents
the data in a better way and gives meaning to it.
Primitive Data Structures
Primitive data structures are the basic data structures that are operated on directly by machine
instructions. They have different representations on different computers. For example, integer and
character are primitive data types, and many languages also provide string as a built-in type.
Programmers can use these data types when creating variables in their programs.
Primitive data structures or types are the most basic data units available in programming languages.
They include
integers: whole numbers—positive, negative, or zero, for example, 1, -30, 100, etc.;
floats: decimal numbers with a fractional part, like 3.14, -0.5, 7.0;
characters: individual characters, for example, “A,” “d,” and “1”;
strings: sequences of characters; and
booleans: binary values that can be either true or false, commonly used for conditional expressions.
Primitive data structures represent simple values that cannot be broken down further into smaller
components. They have a fixed size and format, and this predictability helps optimize memory usage.
Primitive data types are also consistent across various programming languages, with only slight
variations in naming or specific implementations.
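As an illustration of the types listed above, the following C sketch declares one variable of each kind and prints its size. The function name `show_primitives` and the variable names are hypothetical, chosen only for this example. It assumes a C99 compiler, where `bool` comes from `<stdbool.h>` and a string is represented as a sequence of characters rather than as a separate built-in type.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Declares one variable of each primitive kind and prints its value and size.
   Sizes are platform-dependent, so no particular output is assumed. */
void show_primitives(void)
{
    int   whole    = -30;      /* integer */
    float fraction = 3.14f;    /* float */
    char  letter   = 'A';      /* character */
    bool  flag     = true;     /* boolean (C99, from <stdbool.h>) */
    const char *text = "CPE";  /* string: in C, chars ending in '\0' */

    printf("int    %d occupies %zu bytes\n", whole, sizeof whole);
    printf("float  %.2f occupies %zu bytes\n", fraction, sizeof fraction);
    printf("char   %c occupies %zu bytes\n", letter, sizeof letter);
    printf("bool   %d occupies %zu bytes\n", flag, sizeof flag);
    printf("string %s has %zu characters\n", text, strlen(text));
}
```

Calling `show_primitives()` from a program's `main` function prints one line per type; the sizes reported may differ from one computer to another.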
Non-primitive data structures are built from primitive data structures and are used to organize and
manage a collection of data efficiently. They can hold different data types and support more
complex operations such as searching, sorting, insertion, and deletion.
Non-primitive data structures fall into two large categories: linear and non-linear structures.
Arrays: An array is a collection of data items of the same type; each data item is called an element
of the array. The data type of the elements may be any valid data type such as char, int, float, or
double. The elements of an array share the same variable name, but each one carries a different index
number, known as a subscript. An array can be one-dimensional, two-dimensional, or multidimensional.
Linked List: A linked list is a linear data structure used to maintain a list in memory. It can
be seen as a collection of nodes stored at non-contiguous memory locations. Each node of the list
contains a pointer to its adjacent node.
Stack: A stack is a linear list in which insertions and deletions are allowed only at one end, called
the top. A stack is an abstract data type (ADT) and can be implemented in most programming languages.
It is named a stack because it behaves like a real-world stack, for example a pile of plates or a deck
of cards.
Queue: A queue is a linear list in which elements can be inserted only at one end, called the rear, and
deleted only at the other end, called the front. Like a stack, it is an abstract data type. Because a
queue is open at both ends, it follows the First-In-First-Out (FIFO) order for storing data items.
Graphs: A graph can be defined as a pictorial representation of a set of elements (represented by
vertices) connected by links known as edges. A graph differs from a tree in that a graph can contain
cycles, while a tree cannot.
ALGORITHMS
An algorithm is a procedure with well-defined steps for solving a particular problem. It is a finite
set of instructions, written in order, to accomplish a predefined task. An algorithm is not a
complete program or code; it is just the solution (logic) of a problem, which can be represented as
an informal description, a flowchart, or pseudocode.
Characteristics of an Algorithm
Sometimes there is more than one way to solve a problem. We need to learn how to compare the
performance of different algorithms and choose the best one for a particular problem. While analysing
an algorithm, we mostly consider time complexity and space complexity. The time complexity of an
algorithm quantifies the amount of time it takes to run as a function of the length of the input.
Similarly, the space complexity of an algorithm quantifies the amount of space or memory it needs to run
as a function of the length of the input.
The performance of an algorithm is measured on the basis of the following properties:
Example: Design an algorithm to multiply two numbers x and y and display the result in z.
Step 1: START
Step 2: Declare three integers x, y and z
Step 3: Define the values of x and y
Step 4: Multiply the values of x and y
Step 5: Store the output of Step 4 in z
Step 6: Print z
Step 7: STOP
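The steps above can be sketched in C as follows. This is only one possible translation; the function names `multiply` and `print_product` are assumptions made for this example and do not come from the notes.

```c
#include <stdio.h>

/* Steps 2 to 5: take x and y, multiply them, and store the product in z. */
int multiply(int x, int y)
{
    int z = x * y;
    return z;
}

/* Step 6: print z. */
void print_product(int x, int y)
{
    printf("%d\n", multiply(x, y));
}
```

The algorithm itself stays the same whatever language is used; only the way each step is written changes.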
Summary
Each programming language provides various data types and each data type is
represented differently within the computer’s memory.
The memory requirement of a data type determines the permissible range of
values for that data type.
The data types can be classified into several categories, including primitive data
types and composite data types.
The data types provided by a programming language are known as primitive data
types or in-built data types.
In addition to primitive and composite data types, programming languages allow
the user to define new data types (or user-defined data types) as per their
requirements.
Generally, handling small problems is much easier than handling comparatively
larger problems.
The size of each module is kept as small as possible and if required, other
modules are invoked from it.
Second, a well-designed modular program has modules that are independent of each
other’s implementation, which makes the program easily modifiable.
An abstract data type (ADT) is an extension of a modular design in a way that the
set of operations of an ADT are defined at a formal, logical level, and nowhere in
ADT’s definition, it is mentioned how these operations are implemented.
The basic idea of an ADT is that the implementation of its set of operations is
written once in the program, and any part of the program that needs to perform an
operation on the ADT does so by invoking the required operation.
If there is a need to change the implementation details of an ADT, the change will
be completely transparent to the programs using it.
The logical or mathematical model used to organize the data in main memory is
called a data structure.
These features should be kept in mind while choosing a data structure for a
particular situation.
The choice of a data structure depends on its simplicity and effectiveness in
processing of data.
Data structures are divided into two categories, namely, linear data structure and
non-linear data structure.
A linear data structure is one in which the elements form a sequence. It means each element
in the structure, except the first and the last, has a unique predecessor and a unique successor.
A finite collection of homogeneous elements is termed an array.
The elements of an array are always stored in contiguous memory locations irrespective of
the array size.
A stack is a linear list of data elements in which the addition of a new element or the deletion
of an element occurs only at one end.
A queue is a linear data structure in which the addition or insertion of a new element occurs at
one end, called ‘Rear’, and deletion of an element occurs at other end, called ‘Front’.
A tree consists of multiple nodes, with each node containing zero, one or more pointers to other
nodes called child nodes.
Keywords
Traversing: It means accessing all the data elements one by one to process all or
some of them.
Substitution method: In this method, a reasonable guess for the solution is made
and it is proved through mathematical induction.
Recursion tree: In this method, recurrences are represented as a tree whose nodes
indicate the cost that is incurred at the various levels of recursion.
Searching: It is the process of finding the location of a given data element in the data
structure.
Insertion: It means adding a new data element in the data structure. A new element
can be inserted anywhere in the structure, such as in the beginning, in the end, or in
the middle.
Deletion: It means removing any existing data element from the data structure.
Sorting: It is the process of arranging all the elements of a data structure in a logical
order such as ascending or descending order.
Merging: It is the process of combining the elements of two sorted data structures
into a single sorted data structure.
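As an example of one of these operations, merging can be sketched in C for two sorted integer arrays. The function name `merge_sorted` and its parameters are hypothetical; the sketch assumes that both inputs are already in ascending order and that the caller provides an output array with room for all the elements.

```c
#include <stddef.h>

/* Merge sorted arrays a[0..na-1] and b[0..nb-1] into out,
   which must have room for na + nb elements. */
void merge_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* Copy whatever remains in either array. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element of the two inputs is copied exactly once, so the work grows with the total number of elements.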
LINEAR DATA STRUCTURES
In a linear data structure, the elements are arranged sequentially in a particular order at one level. Each
element has one predecessor and one successor, except for the first and last elements. This
arrangement allows for a single uninterrupted run or iteration through the structure, starting from one
end and progressing to the other.
Linear data structures are typically straightforward and efficient for basic operations like adding,
retrieving, or deleting. However, as the program becomes more complex, the limitations of this category
become apparent. Even though linear data structures allow for single-run traversal, finding an element
can still be slow because you may have to visit each element one by one from the beginning. The time
taken therefore increases linearly with the number of elements stored.
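The linear growth described above can be seen in a simple linear search. The sketch below is illustrative only; `linear_search` is a hypothetical name, and the data is assumed to be held in a plain integer array.

```c
#include <stddef.h>

/* Return the index of key in a[0..n-1], or -1 if it is absent.
   In the worst case every one of the n elements is visited. */
int linear_search(const int *a, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;
    return -1;
}
```

If the key is the last element or is not present at all, the loop runs n times, so doubling the number of elements roughly doubles the work.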
ARRAY DATA STRUCTURE: EFFECTIVE DATA STORAGE AND RETRIEVAL
An array is a collection of elements of the same data type placed in contiguous memory locations, next
to each other without gaps. Each element can be referenced individually by adding an index to the
name of the array.
An array is homogeneous, meaning that all data within the structure is of the same data type (only
integers, only characters, or another of the types described earlier). Other linear data structures can be
homogeneous or heterogeneous depending on the programming language and their implementation.
In an array, each element is associated with an index, a numerical identifier that indicates the element's
position. Indexing begins at 0 for the first element and increments sequentially up to the array size minus
one. For example, in an array of size 5, the indices range from 0 to 4.
Compared to other linear structures, a key advantage here is that accessing any element takes the
same time regardless of its position or the array’s size.
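The reason access time does not depend on position can be sketched in C: the address of an element is computed directly from the start of the array and the index. The function names `element_at` and `last_element` are assumptions made for this example.

```c
#include <stddef.h>

/* Read the element at a given index. The address is computed as
   array + index, so the cost does not depend on the index. */
int element_at(const int *array, size_t index)
{
    return array[index];   /* equivalent to *(array + index) */
}

/* Indices run from 0 to size - 1, so the last element is at size - 1.
   The caller must pass a size of at least 1. */
int last_element(const int *array, size_t size)
{
    return array[size - 1];
}
```

Neither function loops over the array, which is what distinguishes array access from walking through a structure element by element.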
Depending on the programming language, arrays can be fixed or flexible in length. For example, in Java,
the size of an array is fixed when it is created, while Python lists behave as dynamic arrays that grow
as needed.
Arrays are commonly used in algorithms that require random access to elements, such as searching and
sorting. They are also effective for storing lists of information (like dates or addresses), performing
mathematical computations, and image processing. Additionally, arrays serve as the foundation for more
complex data structures.
Initialization
Types of Arrays
A single-dimensional array is declared in the form
data_type array_name[size];
where
data_type = data type of elements to be stored in array
array_name = name of the array
size = the size of the array indicating that the lower bound of the array is 0 and the upper
bound is size-1. Hence, the value of the subscript ranges from 0 to size-1.
For example, the statement int array[10] declares an integer array of ten elements, indexed from 0 to 9.
Once the compiler reads a single-dimensional array declaration, it reserves a specific amount of memory
for the array. Because the size is fixed in the declaration, the amount of memory needed is known at
compile time, before the program is executed.
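A minimal sketch of such a declaration in use is given below. The macro `SIZE` and the function `sum_of_squares` are hypothetical names introduced for this example; the expression `sizeof array / sizeof array[0]` is a common C idiom for recovering the number of elements of an array declared in the same scope.

```c
#include <stddef.h>

#define SIZE 10

/* Declare int array[10], fill indices 0..9 with squares, return their sum. */
int sum_of_squares(void)
{
    int array[SIZE];                                /* indices 0 to SIZE-1 */
    size_t count = sizeof array / sizeof array[0];  /* number of elements */
    int sum = 0;

    for (size_t i = 0; i < count; i++) {
        array[i] = (int)(i * i);
        sum += array[i];
    }
    return sum;
}
```

The loop condition uses `i < count`, so the subscript never goes past the upper bound size-1.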
Multi-Dimensional Arrays
Multi-dimensional arrays can be described as ‘arrays of arrays’. A multi-dimensional array of dimension
n is a collection of elements that are accessed with the help of n subscript values. Most high-level
languages, including C, support arrays with more than one dimension. However, the maximum number of
dimensions is compiler dependent.
A two-dimensional array is one in which two subscript values are used to access an array element. They
are useful when the elements being processed are to be arranged in rows and columns (matrix form).
For example, the statement int a[3][3] declares an integer array of three rows and three columns.
Once a compiler reads a two-dimensional array declaration, it allocates a specific
amount of memory for this array.
Once a two-dimensional array is declared and initialized, its elements can be accessed at any time. As
with one-dimensional arrays, the elements are accessed by using the name of the array together with
subscript values. The only difference is that two subscript values are used instead of one. The first
subscript indicates the row number and the second subscript indicates the column number of the
two-dimensional array.
A program to illustrate the traversal of a matrix (two-dimensional array) and finding the
sum of its elements is as follows:
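#include <stdio.h>
#define MAX 10
/* Function prototype */
void traverse(int [][MAX], int, int);
int main(void)
{
    int ARR[MAX][MAX], i, j, m, n;
    printf("Enter the number of rows and columns of a matrix A: ");
    /* Stop if the input is not two numbers between 1 and MAX */
    if (scanf("%d%d", &m, &n) != 2 || m < 1 || m > MAX || n < 1 || n > MAX) return 1;
    printf("Enter the elements of matrix A: \n");
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            scanf("%d", &ARR[i][j]);
    traverse(ARR, m, n);
    return 0;
}
/* The definition of traverse() is missing from these notes. The version below is
   an assumed reconstruction: it visits every element and prints the sum. */
void traverse(int ARR[][MAX], int m, int n)
{
    int i, j, sum = 0;
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            sum += ARR[i][j];
    printf("Sum of the elements: %d\n", sum);
}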
The output of the program is as follows:
STACK DATA STRUCTURE: MANAGING UNDO AND ACCESSING RECENT ELEMENTS
A stack operates on the principle of Last In, First Out (LIFO), where inserting and retrieving data is
possible from only one end. The last element added to the stack is the first one to be removed.
A stack has two primary operations: push and pop. Pushing adds a new element, making it
the top of the stack. Popping removes the topmost element first, exposing the next one as the new top of
the stack. Regardless of the stack size, pushing or popping an element takes the same amount of time
because there is always only one place to do the operation: the top of the stack.
Stacks are particularly useful for managing data when the order of operations is important and when you
need to access the most recently added elements first. A clear example is undo mechanisms in text
editors, where each action performed by a user is recorded as a command and pushed onto the stack.
When you initiate an undo operation, commands are popped from the stack, cancelling the previous
actions and restoring the document to its previous state.
When a stack is organized as an array, a variable named Top is used to point to the top element of the
stack. Initially, the value of Top is set as -1 to indicate an empty stack. Before inserting a new element
onto a stack, it is necessary to test the condition of overflow. Overflow occurs when a stack is full and
there is no space for a new element and an attempt is made to push a new element. If a stack is not full
then the push operation can be performed successfully. To push an item onto a stack, Top is incremented
by one and the element is inserted at that position.
Similarly, before removing the top element from the stack, it is necessary to check the condition of
underflow. Underflow occurs when a stack is empty and an attempt is made to pop an element. If a stack
is not empty, POP operation can be performed successfully. To POP (or remove) an element from a stack,
the element at the top of the stack is assigned to a local variable and then Top is decremented by one.
The total number of elements in a stack at a given point of time can be calculated from the value of Top
as follows:
Number of elements = Top + 1
Overflow: It occurs when a stack is full and there is no space for a new element and an attempt is
made to push a new element.
Stack: It is an abstract data type that serves as a collection of elements, with two principal operations:
push, which adds an element to the collection, and pop, which removes the most recently added
element that was not yet removed.
Underflow: An error condition that occurs when an item is called for from the stack, but the stack is
empty. Contrast with stack overflow.
Operations on Stack
Push: Adds an item in the stack. If the stack is full, then it is said to be an Overflow condition.
Algorithm Push(x)
{
    if (TOP == n-1)
    {
        Print "stack is full"
    }
    else
    {
        TOP = TOP + 1
        S[TOP] = x
    }
}
Pop: Removes an item from the stack. The items are popped in the reversed order in which they are
pushed. If the stack is empty, then it is said to be an Underflow condition.
Algorithm Pop()
{
    if (TOP == -1)
    {
        Print "stack is empty"
    }
    else
    {
        y = S[TOP]
        TOP = TOP - 1
    }
}
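The Push and Pop algorithms can be sketched in C as an array-based stack. This is one possible implementation under stated assumptions: the names `struct Stack`, `stack_push`, `stack_pop`, and `stack_size` and the capacity of 5 are hypothetical, and overflow and underflow are reported through a `false` return value instead of a printed message.

```c
#include <stdbool.h>

#define STACK_CAPACITY 5   /* n in the pseudocode; value chosen for illustration */

struct Stack {
    int items[STACK_CAPACITY];
    int top;               /* index of the top element, -1 when empty */
};

void stack_init(struct Stack *s)
{
    s->top = -1;
}

/* Returns false on overflow (stack full), true otherwise. */
bool stack_push(struct Stack *s, int x)
{
    if (s->top == STACK_CAPACITY - 1)
        return false;
    s->items[++s->top] = x;   /* increment Top, then store x there */
    return true;
}

/* Returns false on underflow (stack empty); otherwise stores the
   popped value in *y and decrements Top. */
bool stack_pop(struct Stack *s, int *y)
{
    if (s->top == -1)
        return false;
    *y = s->items[s->top--];
    return true;
}

/* Number of elements = Top + 1. */
int stack_size(const struct Stack *s)
{
    return s->top + 1;
}
```

Both operations touch only the element at Top, which is why their cost does not depend on how many elements the stack holds.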
QUEUE DATA STRUCTURE: SEQUENTIAL PROCESSING OF TASKS OR DATA
A queue is a linear data structure that is somewhat similar to a stack. Unlike a stack, however, a queue
is open at both its ends. One end of a queue is always used to insert data (called enqueue) and the other
is used to remove data (called dequeue).
A queue operates on the principle of First In, First Out (FIFO), where the first element added to the queue
is the first one to be removed. In this case, unlike with a stack, inserting and retrieving data is done from
different ends. In a queue, the elements are added at the back (enqueue) and removed from the front
(dequeue). Similar to a stack, adding and removing an element takes the same time regardless of the
queue size. The end of the queue from which the element is deleted is known as the Front and the end
at which a new element is added is known as the Rear.
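The enqueue and dequeue operations can be sketched in C as follows. The notes do not prescribe an implementation, so this sketch assumes a fixed-size array used circularly; the names `struct Queue`, `enqueue`, `dequeue`, and the capacity of 4 are hypothetical.

```c
#include <stdbool.h>

#define QUEUE_CAPACITY 4   /* value chosen for illustration */

struct Queue {
    int items[QUEUE_CAPACITY];
    int front;   /* index of the next element to remove */
    int count;   /* number of stored elements */
};

void queue_init(struct Queue *q)
{
    q->front = 0;
    q->count = 0;
}

/* Add x at the Rear. Returns false if the queue is full. */
bool enqueue(struct Queue *q, int x)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    int rear = (q->front + q->count) % QUEUE_CAPACITY;  /* first free slot */
    q->items[rear] = x;
    q->count++;
    return true;
}

/* Remove the element at the Front into *x. Returns false if empty. */
bool dequeue(struct Queue *q, int *x)
{
    if (q->count == 0)
        return false;
    *x = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;  /* wrap around the array */
    q->count--;
    return true;
}
```

Wrapping the indices with the remainder operator lets freed slots at the start of the array be reused, so neither operation has to shift elements.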
Queues help process tasks or data in the order they were received. An example would be programs
sending their print jobs, typically with only one printer available to process them sequentially. This printer
handles each job in order, one at a time. Additionally, in event-driven programming, where events are
processed as they occur, queues preserve the correct sequence of actions in the system.
LINKED LIST DATA STRUCTURE: FLEXIBILITY WITH DYNAMICALLY GROWING DATA
A linked list consists of elements called nodes, each containing both data and a pointer (reference) to
the next node in the sequence. The first node is called the head, and the last one has a null reference,
indicating the end of the list. Each node can be placed at any available memory location, with the
references between nodes enabling the traversal of the list.
A linked list is a sequence of items held together by links: each node contains an item and a
connection to the next node. The linked list is the second most-used data structure after the array.
A linked list is a linear collection of homogeneous elements called nodes. Successive nodes of a linked
list need not occupy adjacent memory locations. The linear order between nodes is maintained by means
of pointers. In linked lists, insertion or deletion of nodes does not require shifting of existing nodes as
in the case of arrays; nodes can be inserted or deleted merely by adjusting the pointers or links.
The common types of linked list are:
Singly-linked list
Doubly-linked list
Circular-linked list
Singly-Linked Lists
A singly-linked list is also known as a linear linked list. In it, each node consists of two fields, viz. ‘data’
and ‘next’. The ‘data’ field contains the data and the ‘next’ field contains the address of memory location
where the subsequent node is stored. The last node of the singly-linked list contains NULL in its ‘next’
field which indicates the end of the list.
A linked list contains a list pointer variable ‘Head’ that stores the address of the first node of the list.
If ‘Head’ contains NULL, the list is called an empty list or a null list.
Since each node of the list contains only a single pointer, which points to the next node and not to the
previous node, the list can be traversed in only one direction. Hence, it is also referred to as a one-way list.
Operations
A number of operations can be performed on singly-linked lists. These operations include traversing,
searching, inserting and deleting nodes, reversing, sorting, and merging linked lists.
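A few of these operations can be sketched in C. The names `struct Node`, `insert_front`, `list_length`, `list_find`, and `list_free` are assumptions made for this example, and the data is assumed to be of type int.

```c
#include <stdlib.h>

struct Node {
    int data;            /* the 'data' field */
    struct Node *next;   /* address of the next node, NULL at the end */
};

/* Insert a new node at the beginning. Returns the new Head,
   or the old Head unchanged if memory cannot be allocated. */
struct Node *insert_front(struct Node *head, int value)
{
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->data = value;
    node->next = head;
    return node;
}

/* Traverse from Head to the NULL link, counting the nodes. */
int list_length(const struct Node *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Search: return the first node holding value, or NULL if none does. */
struct Node *list_find(struct Node *head, int value)
{
    while (head != NULL && head->data != value)
        head = head->next;
    return head;
}

/* Delete every node and return NULL, which represents the empty list. */
struct Node *list_free(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
    return NULL;
}
```

Inserting at the front only changes two pointers, whereas inserting at the front of an array would require shifting every existing element.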
Doubly-Linked Lists
In a singly-linked list, each node contains a pointer to the next node and it has no information about its
previous node. Thus, one can traverse only in one direction, i.e., from beginning to end. However,
sometimes it is required to traverse in the backward direction, i.e., from end to beginning. This can be
implemented by maintaining an additional pointer in each node of the list that points to the previous node.
This type of linked list is called a doubly-linked list.
Each node of a doubly-linked list consists of three fields—prev, info, and next. The info field contains the
data, the prev field contains the address of the previous node, and the next field contains the address of
the next node.
Since a doubly-linked list allows traversing in both forward and backward directions, it is also referred to
as a two-way list.
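The three-field node and backward traversal can be sketched in C as follows. The names `struct DNode`, `dlist_append`, and `dlist_backward` are hypothetical, and the info field is assumed to hold an int.

```c
#include <stdlib.h>

struct DNode {
    struct DNode *prev;   /* address of the previous node, NULL at the start */
    int info;             /* the data */
    struct DNode *next;   /* address of the next node, NULL at the end */
};

/* Append value after the current last node (tail may be NULL for an
   empty list). Returns the new last node, or NULL if allocation fails,
   in which case the existing list is left unchanged. */
struct DNode *dlist_append(struct DNode *tail, int value)
{
    struct DNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->info = value;
    node->prev = tail;
    node->next = NULL;
    if (tail != NULL)
        tail->next = node;
    return node;
}

/* Walk from end to beginning using the prev links, copying at most
   max values into out. Returns the number of values copied. */
int dlist_backward(const struct DNode *tail, int *out, int max)
{
    int n = 0;
    for (; tail != NULL && n < max; tail = tail->prev)
        out[n++] = tail->info;
    return n;
}
```

The extra prev pointer costs memory in every node, but it is what makes the end-to-beginning walk possible without starting again from the first node.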
Circular Linked List
A circular linked list is a special type of linked list where all the nodes are connected to form a
circle. The last node connects back to the first, forming a loop. Unlike a regular linked list, which
ends with a node pointing to NULL, the last node in a circular linked list points back to the first
node. This means that you can keep traversing the list without ever reaching a NULL value.
Circular linked lists are especially helpful for tasks like scheduling and managing playlists,
allowing for smooth navigation.
We can create a circular linked list from both singly linked lists and doubly linked lists. So,
circular linked lists are basically of two types:
In Circular Singly Linked List, each node has just one pointer called the "next" pointer. The next
pointer of the last node points back to the first node and this results in forming a circle. In this type
of Linked list, we can only move through the list in one direction.
In circular doubly linked list, each node has two pointers prev and next, similar to doubly linked
list. The prev pointer points to the previous node and the next points to the next node. Here, in
addition to the last node storing the address of the first node, the first node will also store the
address of the last node.
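A circular singly linked list can be sketched in C as follows. The sketch assumes the list is tracked through a pointer to its last node, which is one common convention and not something the notes specify; the names `struct CNode`, `clist_append`, and `clist_sum` are hypothetical.

```c
#include <stdlib.h>

struct CNode {
    int data;
    struct CNode *next;   /* never NULL: the last node points to the first */
};

/* Insert value after the last node; last is NULL for an empty list.
   Returns the new last node, or the old one if allocation fails. */
struct CNode *clist_append(struct CNode *last, int value)
{
    struct CNode *node = malloc(sizeof *node);
    if (node == NULL)
        return last;
    node->data = value;
    if (last == NULL) {
        node->next = node;         /* a single node points to itself */
    } else {
        node->next = last->next;   /* new last points back to the first */
        last->next = node;
    }
    return node;
}

/* Visit each node exactly once. Since there is no NULL link, the walk
   stops when it returns to the first node. */
int clist_sum(const struct CNode *last)
{
    if (last == NULL)
        return 0;
    const struct CNode *first = last->next;
    const struct CNode *p = first;
    int sum = 0;
    do {
        sum += p->data;
        p = p->next;
    } while (p != first);
    return sum;
}
```

The do-while loop is the important detail: a loop that waits for NULL, as in an ordinary linked list, would never terminate here.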
Representation of Circular Doubly Linked List
Note: In this article, we will use the singly linked list to explain the working of
circular linked lists.