The Lookup Problem

Finding Data Fast

You have a list of 10,000 customer records. A customer calls asking about order #7842. You need to find that order. How long does it take?

If your data is in an array, you check each element one by one until you find order #7842. This is linear search. On average, you check half the array. With 10,000 records, that's 5,000 comparisons.

Linear Search is Slow

Linear search takes O(n) time. As your data grows, search time grows proportionally:

1,000 items: ~500 comparisons average
10,000 items: ~5,000 comparisons average
1,000,000 items: ~500,000 comparisons average

This gets slow fast. Modern applications handle millions of records. Users expect instant results.

Performance Scaling

1# Linear search through customer orders
2orders = [1234, 5678, 7842, 9012, ...]  # 10,000 orders
3
4target = 7842
5for i in range(len(orders)):
6    if orders[i] == target:
7        print(f"Found at index {i}")
8        break
9# Checked 3 items before finding target

Python checks each element sequentially until finding the target.

The Direct Access Dream

What if you could jump directly to the data you need? Like knowing exactly which file cabinet drawer contains a document, instead of opening every drawer.

Arrays give you this power when you know the index. orders[2] is instant—O(1). But you need to know that order #7842 lives at index 2.

Direct Access

The problem: order numbers don't match array indices. Order #7842 might be at index 2, index 5,000, or anywhere.

Hash Tables: The Solution

Hash tables solve this problem. They transform your search key (like order #7842) into an array index. This makes lookups nearly instant—O(1) on average.

Instead of checking 5,000 items, you check one. The hash table calculates where to look.

The next articles explain how hash tables achieve this magic.