# Hash Tables

A hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. [Wikipedia (opens new window)]

A great introductory video on hash tables and hash functions can be found here: Hash Tables and Hash Functions (opens new window)

# Hash Function

address (position) = key (value) MOD n (number of available adresses)

Objectives:

  • Minimize collisions
  • Uniform distribution of hash vlaues
  • Easy to calculate
  • Collisions should be resolvable

# Hash Collision

Hash function returns same address for different keys.

The fuller the hash table is, the more probable is a collision.

Each collision is computationally expensive.

Load factor: items / total size

Design hash table bigger than required to probably have less collisions.

Dynamic implementation: automatically increase hash table size if the load factor reaches a certain threshold.

# Hash Collision Resolution

# Open addressing

The index at which an object is stored in the hash table is not completely determined by its hash code. Instead, the index may vary depending on what's already in the hash table.

Every location is open for every item.

Linear Probing

  • Insert: use linear search to find next free location where item will be placed
  • Lookup: same procedure, look up hash value and continue linear search until item is found.

Plus 3 rehash

Same as Linear Probing, but only look at every third location.

Quadratic probing

Next position to try insertion is (failed attempts)^2.

Double Hashing

Uses a second hash function in case of a collision to determine a new insertion position.

# Closed Addressing

Every object is stored at the index the hash function determines for this object.

Chaining

every position is a linked list of elements that hash to this position. chaining is a closed addressing scheme (every item goes to one particular position. // advantage: worst case is much better