Discover the power of algorithms

Complexities of algorithm, thier applicability. Optimization techniques associated with diffrent algos.

Discover Power of Blogging

What is Blogging , is it a Dream or Passion or Award or Making Money ?

Think Diffrent, be creative

Let's see how much conclusion one can draw from it. This will help testing your creativity.

Discover the power of technology

Technolgy, Programming, Optimization , Gadgets and more...

Discover the power of Blogging

Google widgets and gadgets.

Showing posts with label Hashing. Show all posts
Showing posts with label Hashing. Show all posts

Sep 10, 2011

What is Hashing , HashTable, Hash Function and its collision resolution strategies

Hashing is the technique used for performing almost constant time search in case of insertion, deletion and find operation. Taking a very simple example of it, an array with its index as key is the example of hash table.
So each index (key) can be used for accessing the value in a constant search time. This mapping key must be simple to compute and must helping in identifying the associated value. Function which helps us in generating such kind of key-value mapping is known as Hash Function.

Hash Table a.k.a Hash Map is a data structure which uses hash function to generate key corresponding to the associated value.

lets look at some sample hash function for strings

Folding Method:-
int h(String x, int D)
{
     int i, sum;
     for (sum=0, i=0; i<x.length(); i++)
         sum+= (int)x.charAt(i);
     return (sum%D);
}

Cyclic Shift :-
static long hashCode(String key, int D)
{
  int h=0;
  for (int i=0, i<key.length(); i++)
  {
        h = (h << 4) | ( h >> 27);
        h += (int) key.charAt(i);
  }
  return h%D;
}



good link for hash function on string : click here

Coming to very important part of hashing , which is collision resolution. Since its always not possible to design perfect hash function with minimal overhead which would generate unique key. To address this problem following are the two main collision resolving techniques :-
1) Open Hashing also known as separate chaining 
2) Closed Hashing also known as open addressing

Lets understand the difference between them
1) Open Hashing :- In this strategy collision is resolved by keeping the conflicting element in a list. That is to keep all element in a list which generate same hash.
Open Hashing

From above figure its clear that how collision get resolved by keeping a linked list.

2) Closed Hashing :- In this strategy collision is resolved by placing the conflicting element near to the slot generated by the hash function. Associated with closed hashing is a rehash strategy:
     “If we try to place x in bucket h(x) and find it occupied, find alternative location h1(x), h2(x), etc. Try each in order, if none empty table is full,”
Lets take an example to understand it

HASH_TABLE_SIZE = 8
Input data :- a,b,c,d   Hash for them H(a) = 0, H(b) = 3, H(c) = 7 and  H(d) = 3

Now as 'c' and 'd' has same hash, where to insert 'd' then ?
Finding position using linear hashing :
h1(d) = (h(d)+1)%8 = 4%8 = 4

Adding 1 to hash function of h(d) we get new position 4, and slot 4 is currently non occupied. So entering d at position 4. In this way Closed hashing works.

Disadvantage of closed hashing is that it consumes more space as compared to open hashing 
also it has less flexibility in accommodating for duplicate hash element.
Major advantage of closed hashing is that it reduces the overhead of introducing new data structure and reduces cost of new memory allocation per new element insertion.