Chapter 1
1.1 Introduction
1.2 Logical and Physical Files
1.2.1 File
1.2.2 File Structure
1.2.3 Logical and Physical Files Definitions
Introduction :-
File:-
A file is a collection of records. For example, a telephone book.
A file is a named collection of related information that is recorded on secondary storage such as magnetic disks,
magnetic tapes and optical disks.
Sequential File Organization
In this method, the records of a file are stored one after another in a sequential manner.
In sequential file organization, records either follow the order of insertion or are arranged in
ascending or descending order according to the search key / key field.
• First record in the order is placed at the beginning of the file.
• Second record is stored right after the first and the third after the second and so on.
Types of Sequential File Organization
1. Pile File Organization
2. Sorted File Organization
Pile File Method:
It is quite a simple method. In this method, we store the records in a sequence, i.e.,
one after another. The records are stored in the order in which they are
inserted into the table.
To update or delete a record, the record is first searched for in the
memory blocks. When it is found, it is marked for deletion, and in the case of an
update the new version of the record is inserted.
Insertion of the new record:
Suppose we have a sequence of records R1, R3 and so on up to R9 and R8,
where each record is simply a row in the table. Suppose we
want to insert a new record R2 in the sequence; it will be placed at
the end of the file, after R8.
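The pile file behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `PileFile`, the in-memory list standing in for the file, and the `deleted` flag are all hypothetical and not part of any real DBMS.

```python
class PileFile:
    """Toy pile file: records are kept strictly in arrival order."""

    def __init__(self):
        # Each entry is [key, data, deleted_flag]; the list stands in for the file.
        self.records = []

    def insert(self, key, data):
        # New records always go at the end of the file.
        self.records.append([key, data, False])

    def find(self, key):
        # Linear scan from the start; records marked as deleted are skipped.
        for pos, (k, _, deleted) in enumerate(self.records):
            if k == key and not deleted:
                return pos
        return None

    def update(self, key, new_data):
        # Mark the old record as deleted, then append the new version.
        pos = self.find(key)
        if pos is None:
            return False
        self.records[pos][2] = True
        self.insert(key, new_data)
        return True

    def live_keys(self):
        return [k for k, _, deleted in self.records if not deleted]


pile = PileFile()
for key in ["R1", "R3", "R9", "R8"]:
    pile.insert(key, "row for " + key)
pile.insert("R2", "row for R2")   # lands at the end, not between R1 and R3
print(pile.live_keys())           # ['R1', 'R3', 'R9', 'R8', 'R2']
```

Note that R2 stays at the end of the file; nothing is moved to make room for it.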
Sorted File Method:
In this method, the new record is always inserted at the file's end, and then the
sequence is sorted in ascending or descending order. Sorting of records is based on the
primary key or any other key.
When a record is modified, the record is updated first and the file is then
sorted again, so that the updated record ends up in the right place.
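A minimal sketch of the sorted file method follows, assuming integer keys and an in-memory list in place of the file. The names `SortedFile` and `update_key` are hypothetical, and re-sorting the whole list after every change mirrors the description above rather than what an optimized system would do.

```python
class SortedFile:
    """Toy sorted file: append at the end, then re-sort on the key."""

    def __init__(self, descending=False):
        self.records = []            # list of (key, data) tuples
        self.descending = descending

    def _sort(self):
        self.records.sort(key=lambda rec: rec[0], reverse=self.descending)

    def insert(self, key, data):
        # Step 1: place the new record at the file's end.
        self.records.append((key, data))
        # Step 2: sort the sequence on the key.
        self._sort()

    def update_key(self, old_key, new_key):
        # Update the record, then sort so it ends up in the right place.
        for i, (k, data) in enumerate(self.records):
            if k == old_key:
                self.records[i] = (new_key, data)
                self._sort()
                return True
        return False

    def keys(self):
        return [k for k, _ in self.records]


sorted_file = SortedFile()
for key in [1, 3, 6, 7]:
    sorted_file.insert(key, "row %d" % key)
sorted_file.insert(2, "row 2")
print(sorted_file.keys())        # [1, 2, 3, 6, 7]
sorted_file.update_key(7, 4)
print(sorted_file.keys())        # [1, 2, 3, 4, 6]
```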
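Insertion of the new record:
A new record, say R2, is first placed at the end of the file. The file is then sorted, which
moves R2 to its correct position in the sequence.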
Pros of sequential file organization
• It is a fast and efficient method for handling huge amounts of data.
• In this method, files can be easily stored on cheaper storage media such as
magnetic tapes.
• This method suits workloads in which most of the records have to be accessed, such as grade
calculation for students or generating salary slips.
Cons of sequential file organization
• It wastes time because we cannot jump directly to the particular record that is required;
we have to move through the file sequentially.
• The sorted file method takes more time and space for sorting the records.
Heap File Organization
It is the simplest and most basic type of organization. It works with data blocks. In
heap file organization, the records are inserted at the file's end. Inserting
records does not require any sorting or ordering of records.
When a data block is full, the new record is stored in some other block. This
new data block need not be the very next data block; the DBMS can select any data
block in the memory to store new records. The heap file is also known as an
unordered file.
In the file, every record has a unique id, and every page in a file is of the same
size. It is the DBMS's responsibility to store and manage the new records.
Insertion of new record
Suppose we have five records R1, R3, R6, R4 and R5 in a heap and suppose we
want to insert a new record R2 in the heap. If data block 3 is full, then R2 will be
inserted in any data block selected by the DBMS, let's say data block 1.
If we want to search, update or delete data in heap file organization, then we
need to traverse the data from the start of the file until we get the requested record.
If the database is very large, then searching, updating or deleting a record will be
time-consuming because there is no sorting or ordering of records: we may have to
check all the data before we find the requested record.
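The following sketch illustrates heap file insertion and search. It assumes a fixed block capacity and a simple placement policy (the first block with free space); a real DBMS is free to choose blocks differently. The class `HeapFile` and its methods are hypothetical names used only for this example, and blocks are numbered from 0 here.

```python
class HeapFile:
    """Toy heap (unordered) file made of fixed-capacity data blocks."""

    def __init__(self, block_capacity=2):
        self.block_capacity = block_capacity
        self.blocks = [[]]           # each block is a list of (key, data)

    def insert(self, key, data):
        # Assumed policy: use the first block that still has free space.
        for block_no, block in enumerate(self.blocks):
            if len(block) < self.block_capacity:
                block.append((key, data))
                return block_no
        # Every block is full, so allocate a new one.
        self.blocks.append([(key, data)])
        return len(self.blocks) - 1

    def search(self, key):
        # No ordering exists, so scan block by block from the start.
        blocks_read = 0
        for block in self.blocks:
            blocks_read += 1
            for k, data in block:
                if k == key:
                    return data, blocks_read
        return None, blocks_read

    def delete(self, key):
        for block in self.blocks:
            for i, (k, _) in enumerate(block):
                if k == key:
                    del block[i]
                    return True
        return False


heap = HeapFile(block_capacity=2)
for key in ["R1", "R3", "R6", "R4", "R5"]:
    heap.insert(key, "row for " + key)
heap.delete("R3")                              # frees a slot in block 0
placed_in = heap.insert("R2", "row for R2")    # reuses that free slot
data, blocks_read = heap.search("R5")
print(placed_in, blocks_read)                  # 0 3
```

Finding R5 requires reading every block, which is the cost the text above describes for large heap files.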
Pros of Heap file organization
• It is a very good method of file organization for bulk insertion. If a large
amount of data needs to be loaded into the database at one time, then this method is
best suited.
• In the case of a small database, fetching and retrieving of records is faster than with
sequential file organization.
Cons of Heap file organization
• This method is inefficient for large databases because it takes time to search for or modify
a record.
Hash File Organization
Hash file organization computes a hash function on some fields of the
records. The hash function's output determines the location of the disk block where the
records are to be placed.
• When a record has to be retrieved using the hash key columns, the address is
generated, and the whole record is retrieved using that address.
• In the same way, when a new record has to be inserted, the address is
generated using the hash key and the record is inserted directly. The same process is
applied in the case of delete and update.
• In this method, there is no need to search or sort the entire file. Each
record is stored at the address produced by the hash function, so records appear scattered
across memory rather than in any sorted order.
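As a rough illustration of the steps above, the sketch below uses a modulo hash function on an integer key to pick a bucket (standing in for a disk block). The class `HashFile`, the choice of `key % num_buckets`, and the sample keys are assumptions made for this example only.

```python
class HashFile:
    """Toy hash file: the hash of the key selects the bucket (disk block)."""

    def __init__(self, num_buckets=4):
        self.num_buckets = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]

    def address(self, key):
        # Assumed hash function: integer key modulo the number of buckets.
        return key % self.num_buckets

    def insert(self, key, data):
        # Generate the address and place the record there directly.
        self.buckets[self.address(key)].append((key, data))

    def get(self, key):
        # Only the addressed bucket is examined, never the whole file.
        for k, data in self.buckets[self.address(key)]:
            if k == key:
                return data
        return None

    def delete(self, key):
        bucket = self.buckets[self.address(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False


hash_file = HashFile(num_buckets=4)
for key in [103, 104, 110, 107]:
    hash_file.insert(key, "row %d" % key)
print(hash_file.address(110))    # 2
print(hash_file.get(110))        # row 110
```

Keys 103 and 107 both map to bucket 3 in this sketch; here they simply share the bucket, whereas real systems use dedicated collision-handling schemes.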
Indexing
• An index is a type of data structure. It is used to locate and access the data
in a database table quickly.
• The first column of the index is the search key, which contains a copy of the
primary key or candidate key of the table. The values of the search key are stored
in sorted order so that the corresponding data can be accessed easily.
• The second column of the index is the data reference. It contains a set of
pointers holding the address of the disk block where the value of the particular key
can be found.
Indexing Methods
Primary Index
• If the index is created on the basis of the primary key of the table, then it is known as
primary indexing. Primary keys are unique to each record, so each key value corresponds to
exactly one record (a 1:1 relation).
• As primary keys are stored in sorted order, the searching operation is
quite efficient.
Primary indexing in a DBMS is further divided into two types.
• Dense Index
• Sparse Index
Dense Index
• The dense index contains an index record for every search key value
in the data file. It makes searching faster.
• In this, the number of records in the index table is the same as the
number of records in the main table.
• It needs more space to store the index records themselves. Each index record
holds a search key and a pointer to the actual record on the disk.
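A dense index can be sketched as two parallel sorted lists: search keys and pointers. In this illustration a pointer is assumed to be a (block number, slot) pair, and the sample data in `data_blocks` is invented for the example.

```python
from bisect import bisect_left

# Hypothetical data file: three blocks of (key, name) rows.
data_blocks = [
    [(10, "Asha"), (20, "Bilal")],
    [(30, "Chen"), (40, "Dara")],
    [(50, "Elif"), (60, "Femi")],
]


def build_dense_index(blocks):
    # One index entry per record: search key plus pointer (block_no, slot).
    entries = []
    for block_no, block in enumerate(blocks):
        for slot, (key, _) in enumerate(block):
            entries.append((key, (block_no, slot)))
    entries.sort()                       # keep search keys in sorted order
    keys = [key for key, _ in entries]
    pointers = [ptr for _, ptr in entries]
    return keys, pointers


def dense_lookup(keys, pointers, blocks, key):
    # Binary search on the sorted keys, then follow the pointer.
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        block_no, slot = pointers[pos]
        return blocks[block_no][slot]
    return None


keys, pointers = build_dense_index(data_blocks)
print(len(keys))                                       # 6
print(dense_lookup(keys, pointers, data_blocks, 40))   # (40, 'Dara')
```

The index holds six entries for six records, which shows the extra space a dense index costs.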
Sparse Index
• In the sparse index, index records appear only for a few items of the data file. Each item
points to a block.
• Instead of pointing to each record in the main table, the index
points to records in the main table at intervals; a lookup goes to the block and scans it
for the required record.
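A sparse index can be sketched by keeping one entry per block, here the first key of each block. This assumes the data file is sorted on the search key; the sample data and the function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file, sorted on the key: three blocks of (key, name) rows.
data_blocks = [
    [(10, "Asha"), (20, "Bilal")],
    [(30, "Chen"), (40, "Dara")],
    [(50, "Elif"), (60, "Femi")],
]


def build_sparse_index(blocks):
    # One index entry per non-empty block: its first key and its block number.
    anchor_keys, block_nos = [], []
    for block_no, block in enumerate(blocks):
        if block:
            anchor_keys.append(block[0][0])
            block_nos.append(block_no)
    return anchor_keys, block_nos


def sparse_lookup(anchor_keys, block_nos, blocks, key):
    # Find the last index entry whose key is <= the search key.
    pos = bisect_right(anchor_keys, key) - 1
    if pos < 0:
        return None                      # key is smaller than every anchor
    # Scan only that one block for the record.
    for k, data in blocks[block_nos[pos]]:
        if k == key:
            return (k, data)
    return None


anchor_keys, block_nos = build_sparse_index(data_blocks)
print(anchor_keys)                                                # [10, 30, 50]
print(sparse_lookup(anchor_keys, block_nos, data_blocks, 40))     # (40, 'Dara')
```

Three index entries cover six records, at the price of a short scan inside the chosen block.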