Assignment 6

The document discusses hashing, its advantages, and various hashing functions, including the Division Method, Multiplication Method, and Mid-Square Method. It also covers hash clashes, their resolution techniques such as chaining, open addressing, and rehashing, as well as concepts like primary and secondary clustering. Additionally, it provides examples of inserting keys into hash tables using linear probing and chaining methods.

Uploaded by

sunilvirdi225
Copyright
© All Rights Reserved

ASSIGNMENT 6

Q1. What is hashing, and what are its advantages? Explain any three hashing functions.

Hashing is the process of converting data (such as a string, a number, or even a file) into a fixed-size value, called a
"hash", using a mathematical algorithm known as a hash function. The same input always yields the same hash, but two
different inputs can occasionally yield the same hash (a collision), so a hash is not guaranteed to be unique. Hashing is
useful for fast data retrieval, password storage, and data-integrity checks. In hash-based data structures such as hash
tables, hashing allows for fast data access and search operations.

Advantages of Hashing:

1. Efficient Data Retrieval: Hashing maps each key directly to an index in the table. This allows search, insert, and
delete operations to run in constant time, O(1), on average, which is efficient in scenarios involving large data
sets. The worst case is O(n), when many keys collide at the same index.
2. Data Integrity and Security: Cryptographic hash functions are widely used in security for password storage and data
verification. Even a minor change in the input results in a completely different hash, so unauthorized changes to
data can be detected.
3. Space Efficiency: A hash table stores its entries in a single array whose size is proportional to the number of
items, giving fast access without the bookkeeping required by more complex tree structures.

Three Common Hashing Functions:

1. Division Method (Modulus Method):
o This hashing function divides the key by the table size m and returns the remainder as the hash value.
o Formula: h(k) = k % m, where k is the key and m is the size of the hash table.
o Example: If k = 123 and m = 10, the hash value is 123 % 10 = 3.
o This method is simple and fast, but a poor choice of m leads to clustering. With m = 10, for instance, only
the last decimal digit of the key affects the hash, so m is typically chosen to be a prime number.
2. Multiplication Method:
o This method multiplies the key by a constant A (where 0 < A < 1) and takes the fractional part of the
product, which is then multiplied by the table size m and rounded down to get the hash value.
o Formula: h(k) = floor(m * ((k * A) % 1)), where % 1 extracts the fractional part of the
product.
o Example: With A = 0.618033, k = 123, and m = 10, the product k * A is 76.018059, its fractional part is
0.018059, and the hash value is floor(10 * 0.018059) = floor(0.18059) = 0.
o This method is less susceptible to clustering and tends to distribute hash values more evenly.
3. Mid-Square Method:
o This function squares the key and extracts the middle portion of the result as the hash value.
o Steps: Square the key, then take as many middle digits of the squared value as are needed to index a
table of size m.
o Example: If k = 12, then 12² = 144. For a hash table of size 10, the middle digit 4 is used as the hash value.
o This method works well in some cases but may be less effective for large datasets or for keys whose
squares share the same middle digits.
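
The three methods above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the rule used to pick the "middle" digits in the mid-square method is an assumption, since the text does not fix one.

```python
import math

def division_hash(k: int, m: int) -> int:
    """Division method: remainder of the key divided by the table size."""
    return k % m

def multiplication_hash(k: int, m: int, a: float = 0.618033) -> int:
    """Multiplication method: scale the fractional part of k * a by m."""
    fractional = (k * a) % 1          # fractional part of the product
    return math.floor(m * fractional)

def mid_square_hash(k: int, digits: int = 1) -> int:
    """Mid-square method: take `digits` middle digits of k squared.

    Assumption: for an even split the window is shifted left by one position.
    """
    squared = str(k * k)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])

print(division_hash(123, 10))        # 3
print(multiplication_hash(123, 10))  # 0
print(mid_square_hash(12))           # 4
```

The printed values match the worked examples for k = 123 and k = 12 given in the list.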

Q2. List the features of a good hash function.

Here are the key features of a good hash function, simplified:

1. Uniform Distribution: Distributes data evenly to reduce collisions.
2. Deterministic: Always produces the same output for the same input.
3. Fast Computation: Allows quick hashing for large datasets.
4. Avalanche Effect: Small input changes lead to big changes in the output.

These qualities help ensure both performance and security in hashing applications.
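
Uniform distribution and determinism can be checked empirically. The sketch below counts how many keys land in each bucket under h(k) = k % m; the helper name `bucket_loads` and the choice of test keys (multiples of 10) are illustrative assumptions.

```python
from collections import Counter

def bucket_loads(keys, m):
    """Count how many keys land in each of the m buckets under h(k) = k % m."""
    counts = Counter(k % m for k in keys)
    return [counts.get(i, 0) for i in range(m)]

keys = range(0, 1000, 10)   # 100 keys, all multiples of 10

loads_10 = bucket_loads(keys, 10)
loads_11 = bucket_loads(keys, 11)

print(loads_10)  # [100, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(loads_11)  # [10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
print(bucket_loads(keys, 10) == loads_10)  # True: same input, same output
```

With m = 10 every key collides in bucket 0, while the prime size m = 11 spreads the same keys almost evenly.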

Q3. Explain hash clash and its resolving techniques.
A hash clash (or hash collision) occurs when two different keys produce the same hash value, resulting in both keys being
mapped to the same index in a hash table. Collisions are inevitable in hashing since multiple keys can potentially map to a
limited number of indices in the hash table. Resolving collisions effectively is crucial to maintain the efficiency of
hash-based data structures.

Techniques to Resolve Hash Collisions:

1. Chaining (Separate Chaining):
o In chaining, each index of the hash table points to a list (or chain) of all entries that hash to the same
index.
o When a collision occurs, the new entry is simply added to the list at that index.
o Advantages: Easy to implement; handles large numbers of collisions effectively.
o Disadvantages: Performance degrades with many collisions, as each index becomes a list that takes
longer to search.
2. Open Addressing:
o In open addressing, all elements are stored within the hash table itself, and no additional data structures
(like lists) are used. Instead, when a collision occurs, the algorithm searches for the next available slot
based on a specific probing sequence.
o Types of Open Addressing:
▪ Linear Probing: When a collision occurs, it checks the next slot in sequence (index + 1, index
+ 2, etc., wrapping around at the end of the table) until an empty slot is found.
▪ Quadratic Probing: Instead of checking the next slot in sequence, it checks slots at
quadratically growing offsets (index + 1², index + 2², etc., modulo the table size), reducing
primary clustering.
▪ Double Hashing: Uses a second hash function to determine the jump size when a collision
occurs, making the probing sequence depend on the key and reducing clustering.
o Advantages: Keeps data within the main hash table, avoiding additional memory for lists.
o Disadvantages: Sensitive to load factors, and performance may degrade as the table fills up.
3. Rehashing:
o When the hash table becomes too full or collisions occur frequently, the hash table is resized, and all
existing entries are rehashed into a larger table.
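
The three open-addressing strategies differ only in the order in which slots are examined. The following sketch lists that order for a hypothetical table of size 7 and a key whose home index is 2; the function names and the fixed step of 3 for double hashing are assumptions made for illustration.

```python
def linear_probe(home, m):
    """Slots examined by linear probing, starting at the home index."""
    return [(home + i) % m for i in range(m)]

def quadratic_probe(home, m):
    """Slots examined by quadratic probing (offsets 0, 1, 4, 9, ...)."""
    return [(home + i * i) % m for i in range(m)]

def double_hash_probe(home, step, m):
    """Slots examined by double hashing; `step` stands in for hash2(key)."""
    return [(home + i * step) % m for i in range(m)]

print(linear_probe(2, 7))          # [2, 3, 4, 5, 6, 0, 1]
print(quadratic_probe(2, 7))       # [2, 3, 6, 4, 4, 6, 3]
print(double_hash_probe(2, 3, 7))  # [2, 5, 1, 4, 0, 3, 6]
```

Note that the quadratic sequence revisits slots and reaches only four of the seven positions here, which is one reason open-addressing tables are kept well below full.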

Q4. Define Hash Clash. Explain Primary Clustering, secondary clustering, rehashing and double hashing.

Hash Clash (Hash Collision) is a situation in hashing where two different keys produce the same hash value and are
mapped to the same index in a hash table. Collisions are common because a hash function maps a large set of possible keys
to a smaller set of indices.

Types of Clustering

1. Primary Clustering:
o Primary clustering occurs when the collision resolution method places colliding keys in adjacent slots,
so that runs of contiguous occupied slots form. Any key that hashes anywhere into such a run, not only to
the same index, must probe to the end of the run and then makes it longer.
o This commonly happens in linear probing, where, after a collision, the algorithm checks the next
sequential slot (index + 1) until it finds an empty space.
o Problem: Primary clustering creates "clusters" or groups of occupied slots that get larger as more
elements are inserted, increasing the likelihood of collisions for future inserts in nearby slots.
2. Secondary Clustering:
o Secondary clustering happens when keys that hash to the same initial position follow the same probing
sequence, leading to repeated clustering patterns for keys with similar hashes.
o This can occur in quadratic probing, where, despite a non-linear search pattern, keys with the same
hash still end up in predictable positions.
o Problem: It can still lead to clusters in specific areas of the hash table, though not as severely as primary
clustering.
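
Primary clustering can be observed directly by counting how many slots each insertion examines. The sketch below assumes a table of size 10 with h(k) = k % 10 and linear probing; the keys 10, 20, 30, and 11 are chosen only for illustration.

```python
def insert_linear(table, key):
    """Insert key with linear probing; return the number of slots examined."""
    m = len(table)
    home = key % m
    for i in range(m):
        slot = (home + i) % m
        if table[slot] is None:
            table[slot] = key
            return i + 1
    raise RuntimeError("hash table is full")

table = [None] * 10
probes = {key: insert_linear(table, key) for key in (10, 20, 30, 11)}

print(table)   # [10, 20, 30, 11, None, None, None, None, None, None]
print(probes)  # {10: 1, 20: 2, 30: 3, 11: 3}
```

Key 11 has a different home index (1) from 10, 20, and 30 (all 0), yet it still needs three probes because it lands inside the cluster those keys created.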

Collision Resolution Techniques

1. Rehashing:
o Rehashing involves creating a new, larger hash table and rehashing all the existing elements into it when
the current table becomes too full or collisions are too frequent.
o How It Works: The hash function is adjusted to the new table size (for example, k % m with the new m), and
each element in the old table is reinserted into the new one according to the new hash function. This
reduces the load factor, minimizing collisions.
o Advantages: Reduces clustering and improves access time.
o Disadvantages: Computationally expensive as all elements need to be rehashed.
2. Double Hashing:
o Double hashing uses a second hash function to calculate the interval for probing, reducing clustering
issues by introducing more randomness in slot selection after a collision.
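o How It Works: The i-th slot probed after a collision is (hash1(key) + i * hash2(key)) % m, where i is the
number of attempts so far, hash2(key) is the second hash function, and m is the table size. hash2(key)
must never evaluate to 0, otherwise the probe would stay on the same slot.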
o Advantages: Significantly reduces both primary and secondary clustering by spreading out keys more
effectively.
o Disadvantages: Slightly more complex to implement, and finding a suitable second hash function can be
challenging.
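
Rehashing as described above can be sketched as follows. This is a minimal illustration that assumes linear probing, h(k) = k % m, and a growth from 5 to 11 slots; the helper names are hypothetical.

```python
def insert_linear(table, key):
    """Insert key using h(k) = k % m with linear probing."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return
    raise RuntimeError("hash table is full")

def rehash(old_table, new_size):
    """Build a larger table and reinsert every key with the new modulus."""
    new_table = [None] * new_size
    for key in old_table:
        if key is not None:
            insert_linear(new_table, key)
    return new_table

old = [None] * 5
for key in (12, 22, 32, 7):
    insert_linear(old, key)

new = rehash(old, 11)
print(old)  # [7, None, 12, 22, 32]
print(new)  # [22, 12, None, None, None, None, None, 7, None, None, 32]
```

In the small table all four keys share home index 2 and form one cluster; after rehashing, each key sits at its own home index and the load factor drops from 0.8 to about 0.36.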

Q5. What is a collision? Explain collision resolution techniques with suitable examples.

Collision in hashing occurs when two distinct keys produce the same hash value, leading them to map to the same index in a
hash table. Since only one item can be stored per index in a standard hash table, this conflict needs to be resolved to maintain
data integrity and retrieval efficiency.

Collision Resolution Techniques

1. Chaining (Separate Chaining):
o In chaining, each index in the hash table points to a linked list (or chain) of entries that hash to the same
index.
o When a collision occurs, the new entry is simply added to the list at that index.
o Example:
▪ Suppose we have a hash table of size 5 and a hash function h(k) = k % 5.
▪ If we insert keys 12, 22, and 32, they all hash to index 2 (12 % 5 = 2, 22 % 5 = 2, 32
% 5 = 2).
▪ Using chaining, all three keys (12, 22, and 32) will be stored at index 2 in a linked list.
o Pros: Simple to implement, handles many collisions effectively.
o Cons: If the linked list at any index grows too large, search operations can slow down.
2. Open Addressing:
o In open addressing, all elements are stored within the hash table itself. When a collision occurs, a
probing sequence is used to find the next available slot within the table.
o Types of Open Addressing:
▪ Linear Probing: When a collision occurs, the algorithm checks the next slot (index + 1, index
+ 2, etc.) until it finds an empty one.
▪ Example: Using a hash table of size 5 and h(k) = k % 5, keys 12 and 22 both hash to
index 2. After 12 is stored at index 2, inserting 22 finds index 2 occupied, so linear
probing checks index 3 next.
▪ Quadratic Probing: Instead of moving to the next slot in sequence, it searches for an empty
slot using a quadratic formula (index + 1², index + 2², etc.).
▪ Example: For keys that hash to index 2 in a table of size 5, quadratic probing would try
index 3 ((2 + 1²) % 5), then index 1 ((2 + 2²) % 5), and so on.
▪ Double Hashing: Uses a second hash function to determine the interval for probing, creating a
more varied probing sequence.
▪ Example: If the primary hash function is h1(k) = k % 5 and the secondary hash
function is h2(k) = 1 + (k % 4), then key 22 has h1(22) = 2 and h2(22) = 3, so its
probe sequence is 2, then 0 ((2 + 3) % 5), then 3 ((2 + 6) % 5), and so on.
o Pros: Keeps all data within the main hash table, avoiding extra memory.
o Cons: Performance degrades as the table fills up, and load factor becomes critical for efficient
operations.
3. Rehashing:
o When the hash table becomes too full (typically at a certain load factor), a new, larger table is created,
and all elements are rehashed and inserted into it.
o Example: If a hash table of size 5 exceeds a 70% load factor (that is, once it holds 4 entries), it might be
resized to 10. All entries would be rehashed and reinserted according to the hash function for the new size.
o Pros: Reduces the load factor and clustering, improving efficiency.
o Cons: Expensive operation, as all elements must be rehashed, which can be computationally intensive.
4. Hash Function Modification:
o Modifying the hash function or using a second hash function can reduce the probability of collisions.
o Example: If a simple modulus function causes many collisions, a hash function with a multiplier or
more complex formula can distribute values more evenly.
o Pros: Fewer collisions if designed well.
o Cons: Complex to implement and may require fine-tuning for different datasets.
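
As a concrete instance of double hashing, the sketch below inserts the keys 12, 22, and 32 (which all hash to index 2) into a table of size 5, using the example functions h1(k) = k % 5 and h2(k) = 1 + (k % 4) from the list above. The function name `insert_double_hashing` is hypothetical.

```python
def h1(k):
    """Primary hash: home index in a table of size 5."""
    return k % 5

def h2(k):
    """Secondary hash: probe step, always between 1 and 4 (never 0)."""
    return 1 + (k % 4)

def insert_double_hashing(table, key):
    """Insert key by probing (h1 + i * h2) % m; return the slot used."""
    m = len(table)
    for i in range(m):
        slot = (h1(key) + i * h2(key)) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found on this probe sequence")

table = [None] * 5
slots = [insert_double_hashing(table, k) for k in (12, 22, 32)]

print(slots)  # [2, 0, 3]
print(table)  # [22, None, 12, 32, None]
```

Although the three keys collide at index 2, their steps differ (h2 gives 1, 3, and 1), so 22 and 32 move to different slots instead of queuing behind one another.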

Q6. The keys 12, 18, 13, 2, 3, 23, 5 and 15 are inserted into an initially empty hash table of length 10 using open
addressing with hash function h(k) = k mod 10 and linear probing. What is the resultant hash table?

Let's insert each key into the hash table of length 10 using open addressing with linear probing and the hash function
h(k) = k mod 10. The table initially starts empty, and if a collision occurs, we use linear probing
to find the next available slot.

Given keys: 12, 18, 13, 2, 3, 23, 5, and 15

Hash function: h(k) = k mod 10

Let's insert each key step-by-step:

1. Insert 12:
o h(12) = 12 mod 10 = 2
o Slot 2 is empty, so insert 12 at index 2.
2. Insert 18:
o h(18) = 18 mod 10 = 8
o Slot 8 is empty, so insert 18 at index 8.
3. Insert 13:
o h(13) = 13 mod 10 = 3
o Slot 3 is empty, so insert 13 at index 3.
4. Insert 2:
o h(2) = 2 mod 10 = 2
o Slot 2 is already occupied by 12, so we use linear probing to find the next available slot.
o Slot 3 is also occupied, so we check slot 4, which is empty.
o Insert 2 at index 4.
5. Insert 3:
o h(3) = 3 mod 10 = 3
o Slot 3 is occupied by 13, so we use linear probing.
o Slot 4 is occupied by 2, so we check slot 5, which is empty.
o Insert 3 at index 5.
6. Insert 23:
o h(23) = 23 mod 10 = 3
o Slot 3 is occupied by 13. Using linear probing, we check slots 4 and 5, which are occupied, and then slot 6,
which is empty.
o Insert 23 at index 6.
7. Insert 5:
o h(5) = 5 mod 10 = 5
o Slot 5 is occupied by 3, so we use linear probing.
o Slot 6 is occupied, so we check slot 7, which is empty.
o Insert 5 at index 7.
8. Insert 15:
o h(15) = 15 mod 10 = 5
o Slot 5 is occupied by 3. Using linear probing, we check slots 6, 7, and 8, which are occupied, and finally
slot 9, which is empty.
o Insert 15 at index 9.

Final Hash Table

Index  Key
0      (empty)
1      (empty)
2      12
3      13
4      2
5      3
6      23
7      5
8      18
9      15
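
The insertion steps above can be replayed with a short simulation. This is a sketch under the stated assumptions of Q6 (table length 10, h(k) = k mod 10, linear probing); the function name `build_linear_probing_table` is hypothetical.

```python
def build_linear_probing_table(keys, m=10):
    """Insert keys into a table of size m using h(k) = k % m and linear probing."""
    table = [None] * m
    for key in keys:
        slot = key % m
        # Assumes there are fewer keys than slots, so a free slot always exists.
        while table[slot] is not None:
            slot = (slot + 1) % m
        table[slot] = key
    return table

table = build_linear_probing_table([12, 18, 13, 2, 3, 23, 5, 15])
for index, key in enumerate(table):
    print(index, key if key is not None else "-")
```

The resulting list is `[None, None, 12, 13, 2, 3, 23, 5, 18, 15]`: indices 0 and 1 stay empty, and the remaining slots hold the keys derived step by step above.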


Q7. The integers given below are to be inserted in a hash table with 5 locations using chaining to
resolve collisions. Construct hash table and use simplest hash function.
1,2,3,4,5,10,21,22,33,34,15,32,31,48,49,50

To construct the hash table with chaining to resolve collisions, we'll use the simplest hash function, which is:

h(k) = k mod 5

This means each integer will be placed in a slot based on the remainder when divided by 5. Since collisions are resolved by
chaining, each slot in the table will contain a linked list to store multiple values if needed.

Let's go through each integer and insert it into the appropriate slot in the hash table.

Given integers: 1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32, 31, 48, 49, 50

Step-by-Step Insertion

1. h(1) = 1 mod 5 = 1
o Insert 1 in slot 1.
2. h(2) = 2 mod 5 = 2
o Insert 2 in slot 2.
3. h(3) = 3 mod 5 = 3
o Insert 3 in slot 3.
4. h(4) = 4 mod 5 = 4
o Insert 4 in slot 4.
5. h(5) = 5 mod 5 = 0
o Insert 5 in slot 0.
6. h(10) = 10 mod 5 = 0
o Slot 0 already has 5, so add 10 to the chain at slot 0.
7. h(21) = 21 mod 5 = 1
o Slot 1 already has 1, so add 21 to the chain at slot 1.
8. h(22) = 22 mod 5 = 2
o Slot 2 already has 2, so add 22 to the chain at slot 2.
9. h(33) = 33 mod 5 = 3
o Slot 3 already has 3, so add 33 to the chain at slot 3.
10. h(34) = 34 mod 5 = 4
o Slot 4 already has 4, so add 34 to the chain at slot 4.
11. h(15) = 15 mod 5 = 0
o Slot 0 has a chain of [5, 10], so add 15 to the chain at slot 0.
12. h(32) = 32 mod 5 = 2
o Slot 2 has a chain of [2, 22], so add 32 to the chain at slot 2.
13. h(31) = 31 mod 5 = 1
o Slot 1 has a chain of [1, 21], so add 31 to the chain at slot 1.
14. h(48) = 48 mod 5 = 3
o Slot 3 has a chain of [3, 33], so add 48 to the chain at slot 3.
15. h(49) = 49 mod 5 = 4
o Slot 4 has a chain of [4, 34], so add 49 to the chain at slot 4.
16. h(50) = 50 mod 5 = 0
o Slot 0 has a chain of [5, 10, 15], so add 50 to the chain at slot 0.

Final Hash Table with Chaining

Slot  Chain
0     5 -> 10 -> 15 -> 50
1     1 -> 21 -> 31
2     2 -> 22 -> 32
3     3 -> 33 -> 48
4     4 -> 34 -> 49
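
The chained table can also be built programmatically. The sketch below assumes h(k) = k mod 5 as in the steps above and uses Python lists in place of linked lists, appending each new key to the end of its chain; the function name `build_chained_table` is hypothetical.

```python
def build_chained_table(keys, m=5):
    """Insert keys into m chains using h(k) = k % m; new keys go to the chain's end."""
    table = [[] for _ in range(m)]
    for key in keys:
        table[key % m].append(key)
    return table

keys = [1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32, 31, 48, 49, 50]
table = build_chained_table(keys)
for slot, chain in enumerate(table):
    print(slot, " -> ".join(map(str, chain)))
```

Slot 0 ends up with the chain 5, 10, 15, 50, and slots 1 to 4 each hold three keys, so a search examines at most four entries.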
