Facebook Pixel

What is Hashing?

Keys and Values

A hash table stores data as key-value pairs. The key identifies what you're looking for. The value is the data you want.

  • Customer lookup: Key = customer ID, Value = customer name
  • Word count: Key = word, Value = count
  • Phone book: Key = name, Value = phone number

You search by key. The hash table returns the value instantly.

The Hash Function

A hash function transforms a key into a number. This number tells you where to find the value in an array.

Hash Function Black Box

The hash function takes any key—a number, string, or object—and returns an integer. This integer becomes an array index.

Example: Hash function receives "Alice" and returns 42. The hash table stores Alice's data at array index 42.

Hash Code to Index

Hash codes are large numbers. Arrays have limited size. The modulo operation (%) converts any hash code to a valid array index.

index = hash_code % array_capacity

If the array has 10 slots, % 10 gives indices 0 through 9. Hash code 5,789,604,752 becomes index 2.

Key to Index Flow

1# Python's built-in hash function
2key = "Alice"
3hash_code = hash(key)
4print(hash_code)  # Large number like 5789604752029982571
5
6# Use hash code to find array index
7capacity = 10
8index = hash_code % capacity  # Index between 0-9

Python provides hash() for built-in types. The % operator converts large numbers to array indices.

Direct Access via Hash

Now you can jump directly to data:

  1. Hash the key to get a number
  2. Convert the number to an array index with %
  3. Look up the value at that index

Searching for "Alice"? Hash "Alice" to get index 2. Check array[2]. Done in O(1) time.

Hash Table Array

The Magic of O(1) Lookup

No matter how many items you store, lookups take constant time:

  • 10 items: 1 step (hash + array access)
  • 10,000 items: 1 step
  • 10,000,000 items: 1 step

Compare this to linear search that takes 5,000,000 steps on average for 10,000,000 items.

Collisions

One problem: different keys can produce the same index. Such as "Alice" and "Bob" both hash to index 2. This is a collision.

Hash tables handle collisions. We'll cover collision handling in a later article. For now, know that hash tables still achieve O(1) average performance despite collisions.

Invest in Yourself
Your new job is waiting. 83% of people that complete the program get a job offer. Unlock unlimited access to all content and features.
Go Pro