CMPS 340 File Processing
Extendible Hashing insertion/deletion examples

Suppose that we are using an extendible hash table with bucket size 2 and that our hash function H is such that

   H(ANT)  = 1110...       H(DOG)   = 0101...       H(PIG)  = 1001...
   H(BEAR) = 0010...       H(ELK)   = 1000...       H(RAT)  = 0000...
   H(CAT)  = 1010...       H(GORN)  = 1010...       H(WOLF) = 0111...
   H(COW)  = 0001...       H(MOOSE) = 0001...  

At the present time, the hash table is as follows:

      Directory

         +----+                   +-----+-----+
(000)  0 |  *-+--->-------------> | COW | RAT | (000)
         +----+                   +-----+-----+
(001)  1 |  *-+--->-------------> +------+----+
         +----+                   | BEAR |    | (001) 
(010)  2 |  *-+--->---+           +------+----+
         +----+        \          +-----+-----+
(011)  3 |  *-+--->-----+-------> | DOG |     | (01)
         +----+                   +-----+-----+
(100)  4 |  *-+--->---+
         +----+        \          +-----+-----+
(101)  5 |  *-+--->-----+--->---> | CAT | ELK | (1)
         +----+        /          +-----+-----+
(110)  6 |  *-+--->---+  
         +----+      / 
(111)  7 |  *-+--->-+
         +----+ 

Each bucket has an associated label (or signature) indicating which cells in the directory point to it: namely, all those having an index whose binary representation has the label as a prefix.
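
As an illustration of the prefix rule, the following Python sketch lists the directory cells that point to a bucket with a given label. The function name cells_for_label and the representation of labels as bit strings are assumptions made for this example; they are not part of the exercise.

```python
def cells_for_label(label, dir_depth):
    """Return the directory indices whose dir_depth-bit binary form has `label` as a prefix."""
    free_bits = dir_depth - len(label)
    if free_bits < 0:
        raise ValueError("label is longer than the directory depth")
    # The matching indices form one contiguous block: the label followed by
    # every possible setting of the remaining free bits.
    first = (int(label, 2) if label else 0) << free_bits
    return list(range(first, first + (1 << free_bits)))


print(cells_for_label("000", 3))  # [0]
print(cells_for_label("01", 3))   # [2, 3]
print(cells_for_label("1", 3))    # [4, 5, 6, 7]
```

A bucket whose label is k bits shorter than the directory's depth is therefore pointed to by 2^k consecutive cells.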

For each of the following operations, apply it to the hash table above (not to the result of applying the previous operations) and show the hash table that results.

      (a)  Insert WOLF.              (b)  Insert ANT.
      (c)  Insert GORN.              (d)  Delete DOG. 
      (e)  Delete RAT.               (f)  Delete CAT.
      (g)  Insert MOOSE.

SOLUTIONS:

(a) Insert WOLF. Because H(WOLF) = 0111..., WOLF belongs in the bucket with label 01, where it fits quite nicely alongside DOG. (Illustration omitted.)


(b) Insert ANT. This causes overflow of the bucket with label 1, and thus that bucket is split into buckets with labels 10 and 11, into which CAT and ELK are placed appropriately, after which we attempt to insert ANT again. Because 10 is a prefix of both H(CAT) and H(ELK), both of these animals are placed into the bucket with label 10, leaving the 11 bucket empty. Insertion of ANT now goes smoothly, as it belongs in the 11 bucket.

         +----+                  +-----+-----+
(000)  0 |  *-+--->------------> | COW | RAT | (000)
         +----+                  +-----+-----+
(001)  1 |  *-+--->------------> +------+---+
         +----+                  | BEAR |   |  (001)
(010)  2 |  *-+--->---+          +------+---+
         +----+        \         +-----+---+
(011)  3 |  *-+--->-----+------> | DOG |   | (01)
         +----+                  +-----+---+
(100)  4 |  *-+--->---+              
         +----+        \         +-----+-----+
(101)  5 |  *-+--->-----+------> | CAT | ELK | (10)
         +----+                  +-----+-----+
(110)  6 |  *-+--->---+
         +----+        \         +-----+---+
(111)  7 |  *-+--->-----+------> | ANT |   | (11)
         +----+                  +-----+---+
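
The split step in (b) can be sketched in Python as follows. This is only an illustration under stated assumptions: the name split_bucket, the use of bit strings for labels, and the dictionary H (which holds just the four hash bits listed earlier) are all choices made for this example.

```python
# First four hash bits, copied from the table of H values above.
H = {"ANT": "1110", "CAT": "1010", "ELK": "1000"}


def split_bucket(label, keys):
    """Split the bucket `label` into label+'0' and label+'1' by hash prefix."""
    low, high = label + "0", label + "1"
    return {
        low: [k for k in keys if H[k].startswith(low)],
        high: [k for k in keys if H[k].startswith(high)],
    }


halves = split_bucket("1", ["CAT", "ELK"])
print(halves)  # {'10': ['CAT', 'ELK'], '11': []}

# ANT (1110...) now belongs in the empty 11 bucket.
halves["11"].append("ANT")
print(halves)  # {'10': ['CAT', 'ELK'], '11': ['ANT']}
```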

(c) Insert GORN. This causes overflow of the bucket with label 1, and thus that bucket is split into buckets with labels 10 and 11, into which CAT and ELK are placed appropriately, after which we attempt to insert GORN again. Because 10 is a prefix of both H(CAT) and H(ELK), both of these animals are placed into the bucket with label 10, leaving the 11 bucket empty. Because 10 is also a prefix of H(GORN), the second attempt overflows the 10 bucket, which is therefore split into buckets with labels 100 and 101. ELK is placed into the former and CAT into the latter. Attempting to insert GORN once again, we find room for it in the 101 bucket. (The directory need not grow, because no new label is longer than the directory's depth of three.)

         +----+                  +-----+-----+
(000)  0 |  *-+--->------------> | COW | RAT | (000)
         +----+                  +-----+-----+
(001)  1 |  *-+--->------------> +------+---+
         +----+                  | BEAR |   |  (001)
(010)  2 |  *-+--->---+          +------+---+
         +----+        \         +-----+---+
(011)  3 |  *-+--->-----+------> | DOG |   | (01)
         +----+                  +-----+---+
(100)  4 |  *-+--->------------> +-----+---+
         +----+                  | ELK |   | (100)
(101)  5 |  *-+--->-----+        +-----+---+
         +----+          \       +-----+------+
(110)  6 |  *-+--->---+   +----> | CAT | GORN | (101)
         +----+        \         +-----+------+
(111)  7 |  *-+--->-----+------> +---+---+
         +----+                  |   |   | (11)
                                 +---+---+
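
The "split, then try again" loop in (c) might be sketched as below. The sketch assumes a simplified representation that is not part of the exercise: the table is a dictionary from bucket label to list of keys, the directory is tracked only through its depth, and H holds just the four hash bits listed earlier (so the sketch refuses to split a bucket whose label already uses all four). The names insert, table, and BUCKET_SIZE are hypothetical.

```python
BUCKET_SIZE = 2

# First four hash bits, copied from the table of H values above.
H = {
    "ANT": "1110", "BEAR": "0010", "CAT": "1010", "COW": "0001",
    "DOG": "0101", "ELK": "1000", "GORN": "1010", "MOOSE": "0001",
    "PIG": "1001", "RAT": "0000", "WOLF": "0111",
}


def insert(buckets, dir_depth, key):
    """Insert key into buckets (label -> list of keys); return the new directory depth."""
    h = H[key]
    while True:
        # Exactly one bucket label is a prefix of any given hash value.
        label = next(lab for lab in buckets if h.startswith(lab))
        if len(buckets[label]) < BUCKET_SIZE:
            buckets[label].append(key)
            return dir_depth
        if len(label) == len(h):
            raise ValueError("ran out of known hash bits")
        if len(label) == dir_depth:
            dir_depth += 1  # the directory would double in size here
        # Split the full bucket and redistribute its keys, then retry.
        old_keys = buckets.pop(label)
        for child in (label + "0", label + "1"):
            buckets[child] = [k for k in old_keys if H[k].startswith(child)]


table = {"000": ["COW", "RAT"], "001": ["BEAR"], "01": ["DOG"], "1": ["CAT", "ELK"]}
depth = insert(table, 3, "GORN")
print(depth)                       # 3
print(table["100"], table["101"])  # ['ELK'] ['CAT', 'GORN']
print(table["11"])                 # []
```

The loop runs three times for GORN: two splits (of the 1 bucket, then of the 10 bucket) followed by a successful insertion into the 101 bucket.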

(d) Delete DOG. Remove DOG from the 01 bucket. As there are no sibling buckets with which to combine it, we simply leave the 01 bucket empty. (Only a bucket with label 00 could be a "sibling" to the bucket with label 01, and there is no such bucket.) (Illustration omitted.)
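
The sibling test used here can be written as a short Python sketch. The function name sibling and the use of bit strings for labels are assumptions made for illustration.

```python
def sibling(label):
    """Return the label that differs from `label` only in its last bit."""
    flipped = "1" if label[-1] == "0" else "0"
    return label[:-1] + flipped


labels = {"000", "001", "01", "1"}   # bucket labels in the original table
print(sibling("01"))                 # 00
print(sibling("01") in labels)       # False: nothing to merge with
print(sibling("000") in labels)      # True: 000 and 001 are siblings
```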


(e) Delete RAT. Remove RAT from the 000 bucket. As the 000 and 001 buckets are "siblings" and the total number of entries in the two of them is now two, we can merge them into a 00 bucket containing COW and BEAR. Because the maximum length of any bucket's label is now two, we can halve the size of the directory, making its depth two. (In real life, we probably wouldn't merge two buckets unless the resulting bucket were somewhat less than full, because otherwise the resulting bucket would be likely to undergo a split in the near future.)

       +----+                  +-----+------+
(00) 0 |  *-+--->------------> | COW | BEAR | (00)
       +----+                  +-----+------+
(01) 1 |  *-+--->------------> +-----+---+
       +----+                  | DOG |   | (01)
(10) 2 |  *-+--->---+          +-----+---+
       +----+        \         +-----+-----+
(11) 3 |  *-+--->-----+------> | CAT | ELK | (1)
       +----+                  +-----+-----+
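
Deletion with a single merge and a directory shrink, as in (e), might be sketched like this. The representation (a dictionary from bucket label to list of keys) and the names delete, table, and BUCKET_SIZE are assumptions for this example. The sketch merges whenever the combined entries fit in one bucket, which is the eager policy used in this solution rather than the more cautious one a real system would likely prefer.

```python
BUCKET_SIZE = 2

# First four hash bits, copied from the table of H values above.
H = {"BEAR": "0010", "CAT": "1010", "COW": "0001",
     "DOG": "0101", "ELK": "1000", "RAT": "0000"}


def delete(buckets, key):
    """Remove key from buckets (label -> list of keys); return the resulting directory depth."""
    label = next(lab for lab in buckets if H[key].startswith(lab))
    buckets[label].remove(key)
    if label:
        parent = label[:-1]
        zero, one = parent + "0", parent + "1"
        # Merge only if the sibling exists and the combined keys fit in one bucket.
        if (zero in buckets and one in buckets
                and len(buckets[zero]) + len(buckets[one]) <= BUCKET_SIZE):
            buckets[parent] = buckets.pop(zero) + buckets.pop(one)
    # The directory needs only as many bits as the longest remaining label.
    return max(len(lab) for lab in buckets)


table = {"000": ["COW", "RAT"], "001": ["BEAR"], "01": ["DOG"], "1": ["CAT", "ELK"]}
depth = delete(table, "RAT")
print(depth)         # 2
print(table["00"])   # ['COW', 'BEAR']
```

Applied to the original table, the same function leaves the depth at three when DOG or CAT is deleted, because neither the 01 bucket nor the 1 bucket has a sibling.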

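(f) Delete CAT. Remove CAT from the 1 bucket. There is no sibling bucket, so that is all we can do. (Only a bucket with label 0 could be a "sibling" to the bucket with label 1, and there is no such bucket.) (Illustration omitted.)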


(g) Insert MOOSE. This causes overflow of the bucket with label 000. Because this bucket's depth (the length of its label) is 3, which equals the depth of the directory (DIR_DEPTH), we double the size of the directory, making each entry in the new directory point to the correct bucket. Then we split the overflowing bucket into buckets with labels 0000 and 0001, into which RAT and COW, respectively, are placed. Then we attempt once more to insert MOOSE. This time, MOOSE fits nicely alongside COW in the 0001 bucket.

          +----+                   +-----+---+
(0000)  0 |  *-+--->-------------> | RAT |   | (0000)
          +----+                   +-----+---+
(0001)  1 |  *-+--->-------------> +-----+-------+
          +----+                   | COW | MOOSE | (0001) 
(0010)  2 |  *-+--->---+           +-----+-------+
          +----+        \          +------+---+
(0011)  3 |  *-+--->-----+-------> | BEAR |   | (001)
          +----+                   +------+---+
(0100)  4 |  *-+--->---+          
          +----+       v
(0101)  5 |  *-+--->---+  
          +----+       v           +-----+---+
(0110)  6 |  *-+--->---+---------> | DOG |   | (01)
          +----+       ^           +-----+---+
(0111)  7 |  *-+--->---+
          +----+
(1000)  8 |  *-+--->--+
          +----+      v
(1001)  9 |  *-+--->--+
          +----+      v
(1010) 10 |  *-+--->--+---->----> +-----+-----+
          +----+      ^           | CAT | ELK | (1)
(1011) 11 |  *-+--->--+           +-----+-----+
          +----+      ^   
(1100) 12 |  *-+--->--+ 
          +----+      ^
(1101) 13 |  *-+--->--+
          +----+      ^
(1110) 14 |  *-+--->--+
          +----+      ^
(1111) 15 |  *-+--->--+
          +----+
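
How each entry of the doubled directory finds "the correct bucket" can be sketched in Python. The sketch assumes the directory is stored as a list in which cell i holds the label of the bucket it points to; that representation and the name double_directory are choices made for this example.

```python
def double_directory(directory):
    """Return a directory twice as long; old cell i becomes new cells 2i and 2i+1."""
    return [label for label in directory for _ in range(2)]


# Depth-3 directory before inserting MOOSE: cell i holds the label of its bucket.
old = ["000", "001", "01", "01", "1", "1", "1", "1"]
new = double_directory(old)

# Split the overflowing 000 bucket: cells 0 and 1 now point to separate buckets.
new[0], new[1] = "0000", "0001"
print(new[:4])   # ['0000', '0001', '001', '001']
print(new[4:8])  # ['01', '01', '01', '01']
print(new[8:])   # eight copies of '1'
```

Old cell i maps to new cells 2i and 2i+1 because the directory is indexed by the leading bits of the hash value, so the extra bit is appended on the right.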