705. Design HashSet
Problem Description
The task is to design a data structure MyHashSet
that simulates the behavior of a set without using any built-in hash table libraries. A hash set is a collection of unique elements. The MyHashSet
class should support three operations:
add(key)
: Inserts the valuekey
into the HashSet.contains(key)
: Returnstrue
if thekey
exists in the HashSet andfalse
otherwise.remove(key)
: Removes thekey
from the HashSet. If thekey
does not exist, no action should be taken.
The challenge is to implement these functionalities manually, ensuring that the operations are as efficient as possible.
Intuition
To implement a HashSet, we can use an array where the indices represent the potential keys and the values at those indices represent whether the key is present in the set or not. Given the constraints, we can assume the set will only need to handle non-negative integer keys.
Here's the approach we can take:
-
Initialize: We create a large enough array to accommodate all possible keys. In this case, we initialize a Boolean array of size 1000001 (since the possible key range is from 0 to 1000000) and set all values to
False
indicating that initially, no keys are present in the HashSet. -
Add: To add a key, we simply set the value at the index corresponding to the key in the array to
True
. -
Remove: To remove a key, we set the value at the index corresponding to the key in the array back to
False
. -
Contains: To check if a key is in the set, we return the value at the index corresponding to the key in the array, which is
True
if the key is present andFalse
otherwise.
This approach is very direct and efficient, with all operations having a constant time complexity of O(1), which means the operation time does not depend on the size of the data in the HashSet.
Learn more about Linked List patterns.
Solution Approach
The solution involves three key steps that correspond to the three methods of the MyHashSet
class: add
, remove
, and contains
. Here’s how each method is implemented:
-
Initialization (
__init__
method): The solution begins by initializing an array (or list in Python) nameddata
to have a size of 1000001. This size is selected to ensure that any key within the expected range (0 to 1000000) can be directly mapped to an index of this array. Each element in thedata
array initially holds the valueFalse
, indicating that no keys are present in the HashSet to begin with.def __init__(self): self.data = [False] * 1000001
-
Adding a Key (
add
method): Adding a key simply involves updating the value at the index equal to the key toTrue
. This operation designates that the key is now present within the HashSet.def add(self, key: int) -> None: self.data[key] = True
-
Removing a Key (
remove
method): To remove a key, the solution sets the index equal to the key back toFalse
. This signifies that the key is no longer within the HashSet. If the key doesn't exist, this operation still sets the value toFalse
, which has no effect as the value is alreadyFalse
.def remove(self, key: int) -> None: self.data[key] = False
-
Checking Existence of a Key (
contains
method): To determine if a key is present in the HashSet, the method simply returns the Boolean value at the index equal to that key. If the value isTrue
, the key is present; otherwise, it's absent.def contains(self, key: int) -> bool: return self.data[key]
Each of the methods add
, remove
, and contains
operate in O(1) time which is constant time complexity. This efficiency is achieved as the solution directly accesses the array index without any iteration or searching overhead, making these operations extremely fast for any size of data held in the HashSet.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach using a small example of operations being performed on the MyHashSet
data structure:
-
Initialization: Upon creation of a
MyHashSet
object, an arraydata
of size 1000001 is initialized with all values set toFalse
.myHashSet = MyHashSet()
initializes theMyHashSet
with an empty set.
-
Adding Keys: Suppose we want to add the keys
3
,5
, and8
to the HashSet.myHashSet.add(3)
setsdata[3]
toTrue
.myHashSet.add(5)
setsdata[5]
toTrue
.myHashSet.add(8)
setsdata[8]
toTrue
.- The array now has
True
at indices3
,5
, and8
, corresponding to the added keys.
-
Containing Key: We want to check if the keys
3
and7
are in the HashSet.myHashSet.contains(3)
returnsTrue
becausedata[3]
isTrue
, indicating that the key3
is present in the HashSet.myHashSet.contains(7)
returnsFalse
becausedata[7]
is stillFalse
, indicating that the key7
is not present in the HashSet.
-
Removing a Key: Now we decide to remove the key
5
from the HashSet.myHashSet.remove(5)
setsdata[5]
back toFalse
.- If we now call
myHashSet.contains(5)
, it will returnFalse
, indicating that the key5
is no longer in the HashSet.
This example shows how each operation is executed with constant time complexity as it involves a single step of accessing or setting a value at a specific index in the array.
Solution Implementation
1class MyHashSet:
2 def __init__(self):
3 # Initialize an array with 1000001 elements, setting all to False.
4 # The index represents the key, and the value at that index
5 # represents the presence (True) or absence (False) of the key.
6 self.data = [False] * 1000001
7
8 def add(self, key: int) -> None:
9 # Add the key to the hash set by setting the value at the index 'key' to True.
10 self.data[key] = True
11
12 def remove(self, key: int) -> None:
13 # Remove the key from the hash set by setting the value at the index 'key' to False.
14 self.data[key] = False
15
16 def contains(self, key: int) -> bool:
17 # Check if the key is in the hash set by returning the value at the index 'key'.
18 return self.data[key]
19
20# Usage:
21# Create an instance of MyHashSet
22# my_hash_set = MyHashSet()
23# Add a key
24# my_hash_set.add(some_key)
25# Remove a key
26# my_hash_set.remove(some_key)
27# Check if a key exists
28# is_present = my_hash_set.contains(some_key)
29
1// Class representing a simple hash set data structure for integers
2class MyHashSet {
3 // Boolean array to store the presence of an integer in the set
4 // The array is sized to store the maximum value + 1 as an index
5 private boolean[] data;
6
7 // Constructor initializes the data array
8 public MyHashSet() {
9 data = new boolean[1000001]; // Set all values to 'false' by default
10 }
11
12 // Method to add an integer to the set
13 // Sets the array value at the index 'key' to 'true'
14 public void add(int key) {
15 data[key] = true;
16 }
17
18 // Method to remove an integer from the set
19 // Sets the array value at the index 'key' to 'false'
20 public void remove(int key) {
21 data[key] = false;
22 }
23
24 // Method to check if an integer is present in the set
25 // Returns 'true' if the value at the index 'key' is 'true', otherwise 'false'
26 public boolean contains(int key) {
27 return data[key];
28 }
29}
30
31/**
32 * Usage:
33 * MyHashSet hashSet = new MyHashSet();
34 * hashSet.add(key); // Adds the item 'key' to the hash set
35 * hashSet.remove(key); // Removes 'key' from the set if it's present
36 * boolean doesContain = hashSet.contains(key); // Returns 'true' if 'key' is present in the set, otherwise 'false'
37 */
38
1class MyHashSet {
2private:
3 // Using a fixed-size array to store the presence of keys
4 bool data[1000001];
5
6public:
7 // Constructor initializes the hash set
8 MyHashSet() {
9 // Set all values in the data array to false initially
10 memset(data, false, sizeof(data));
11 }
12
13 // Inserts a key into the hash set
14 void add(int key) {
15 data[key] = true;
16 }
17
18 // Removes a key from the hash set, if it exists
19 void remove(int key) {
20 data[key] = false;
21 }
22
23 // Checks if a key exists in the hash set
24 bool contains(int key) const {
25 return data[key];
26 }
27};
28
29// Usage example (not part of the class definition):
30// MyHashSet* myHashSet = new MyHashSet();
31// myHashSet->add(key);
32// myHashSet->remove(key);
33// bool doesContain = myHashSet->contains(key);
34
1// Define a global data storage for the HashSet. The `!` indicates that
2// the variable will be definitely assigned later.
3let hashSetData: boolean[];
4
5// Initialize the data storage for the HashSet with false values.
6// This is a setup function that needs to be called to create the storage.
7function initializeMyHashSet(): void {
8 // Allocate an array of boolean values with default value `false`
9 // The size is 10^6 + 1 to hold values within the range [0, 10^6]
10 hashSetData = new Array(10 ** 6 + 1).fill(false);
11}
12
13// Add a key to the HashSet by setting the value at the index `key` to true
14function addKeyToHashSet(key: number): void {
15 hashSetData[key] = true;
16}
17
18// Remove a key from the HashSet by setting the value at the index `key` to false
19function removeKeyFromHashSet(key: number): void {
20 hashSetData[key] = false;
21}
22
23// Check if a key exists in the HashSet by returning the value at the index `key`
24function containsKeyInHashSet(key: number): boolean {
25 return hashSetData[key];
26}
27
28// Example usage of the global HashSet implementation:
29// Initialize the HashSet before making any operations
30initializeMyHashSet();
31
32// Add a key to the hash set
33addKeyToHashSet(1);
34addKeyToHashSet(2);
35
36// Check if the hash set contains a key
37let containsKey1 = containsKeyInHashSet(1); // should return true
38let containsKey3 = containsKeyInHashSet(3); // should return false
39
40// Remove a key from the hash set
41removeKeyFromHashSet(2);
42
Time and Space Complexity
Time Complexity
add
operation has a time complexity ofO(1)
because it accesses the array index directly and sets the value toTrue
.remove
operation also has a time complexity ofO(1)
for the same reason, accessing the array index directly and setting the value toFalse
.contains
operation has a time complexity ofO(1)
since it involves a single array lookup.
Each of the above operations perform in constant time irrespective of the number of elements in the hash set.
Space Complexity
- The space complexity is
O(N)
whereN = 1000001
. This is because a fixed array of size1000001
is allocated to store the elements of the hash set, independent of the actual number of elements stored at any time.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following array represent a max heap?
Recommended Readings
Linked List Cycle Given a linked list with potentially a loop determine whether the linked list from the first node contains a cycle in it For bonus points do this with constant space Parameters nodes The first node of a linked list with potentially a loop Result Whether there is a loop contained
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!