128. Longest Consecutive Sequence
Problem Description
You are given an unsorted array of integers called nums
. Your task is to find the length of the longest sequence of consecutive integers that can be formed using the elements from the array.
For example, if you have the array [100, 4, 200, 1, 3, 2]
, the longest consecutive sequence would be [1, 2, 3, 4]
, which has a length of 4.
Key requirements:
- The elements don't need to be consecutive in the original array - they just need to exist in the array
- You need to find consecutive integers (like 1, 2, 3, 4...)
- Return the length of the longest such sequence
- Your algorithm must run in
O(n)
time complexity, wheren
is the number of elements in the array
The challenge is to achieve this linear time complexity, which means you cannot sort the array first (as sorting would take O(n log n)
time).
Intuition
To achieve O(n)
time complexity, we need to avoid sorting. The key insight is to use a hash set for O(1)
lookups to check if a number exists in our array.
The main idea is to build consecutive sequences incrementally. For each number x
in the array, we can ask: "What's the longest consecutive sequence that starts from x
?" To find this, we keep checking if x+1
, x+2
, x+3
, and so on exist in our set until we find a number that doesn't exist.
However, there's a subtle optimization here. If we've already processed a number as part of a sequence, we don't need to process it again. So when we find consecutive numbers starting from x
, we remove them from our set to avoid redundant work.
The clever part of this solution is using dynamic programming with memoization. We use a hash table d
to store the length of the consecutive sequence for each processed endpoint. When we process a number x
and find its sequence extends to y-1
(meaning y
is the first number not in the sequence), we can say:
- The length of the sequence starting at
x
is(y - x)
plus any sequence that might continue fromy
- This is captured as
d[x] = d[y] + y - x
This way, if we later encounter a number that connects to a previously processed sequence, we can reuse the already computed length instead of recalculating it.
By processing each number at most once (removing it from the set when processed) and using hash operations that are O(1)
, we maintain the overall O(n)
time complexity.
Solution Approach
We implement the solution using hash tables to achieve O(n)
time complexity. Here's the step-by-step approach:
-
Initialize Data Structures:
- Create a hash set
s
containing all elements fromnums
forO(1)
lookup - Initialize
ans = 0
to track the maximum sequence length found - Create a hash table
d
usingdefaultdict(int)
to store the length of consecutive sequences
- Create a hash set
-
Process Each Element: For each element
x
in the array:a. Find the Sequence Endpoint:
- Start with
y = x
- While
y
exists in sets
:- Remove
y
froms
(to avoid reprocessing) - Increment
y
by 1
- Remove
- After the loop,
y
is the first number not in the sequence starting fromx
b. Calculate Sequence Length:
- The sequence from
x
extends toy-1
- The length is calculated as:
d[x] = d[y] + y - x
y - x
gives the length of the current sequenced[y]
adds any previously computed sequence starting fromy
c. Update Maximum:
- Update
ans = max(ans, d[x])
- Start with
-
Return Result: Return
ans
as the length of the longest consecutive sequence
Key Insights:
- By removing elements from set
s
as we process them, each element is examined exactly once, ensuringO(n)
time complexity - The hash table
d
enables us to chain sequences together efficiently. If we previously found a sequence starting at positiony
, we can reuse that result - The
defaultdict(int)
ensuresd[y]
returns 0 ify
hasn't been processed yet, which correctly handles standalone sequences
Example Walkthrough:
For nums = [100, 4, 200, 1, 3, 2]
:
- Processing
100
: finds sequence [100], length = 1 - Processing
4
: finds sequence [4], length = 1 (1,2,3 not processed yet) - Processing
200
: finds sequence [200], length = 1 - Processing
1
: finds sequence [1,2,3,4], removes all four from set,d[1] = d[5] + 5 - 1 = 0 + 4 = 4
- Processing
3
and2
: already removed from set, skipped in while loop - Result:
ans = 4
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with nums = [100, 4, 200, 1, 3, 2]
:
Initial Setup:
- Set
s = {100, 4, 200, 1, 3, 2}
- Hash table
d = {}
(empty) - Maximum length
ans = 0
Processing each element:
Step 1: Process x = 100
- Start with
y = 100
- Check if 100 is in set? Yes → Remove 100 from set, increment y to 101
- Check if 101 is in set? No → Stop
- Calculate:
d[100] = d[101] + (101 - 100) = 0 + 1 = 1
- Update:
ans = max(0, 1) = 1
- Set now:
s = {4, 200, 1, 3, 2}
Step 2: Process x = 4
- Start with
y = 4
- Check if 4 is in set? Yes → Remove 4 from set, increment y to 5
- Check if 5 is in set? No → Stop
- Calculate:
d[4] = d[5] + (5 - 4) = 0 + 1 = 1
- Update:
ans = max(1, 1) = 1
- Set now:
s = {200, 1, 3, 2}
Step 3: Process x = 200
- Start with
y = 200
- Check if 200 is in set? Yes → Remove 200 from set, increment y to 201
- Check if 201 is in set? No → Stop
- Calculate:
d[200] = d[201] + (201 - 200) = 0 + 1 = 1
- Update:
ans = max(1, 1) = 1
- Set now:
s = {1, 3, 2}
Step 4: Process x = 1
- Start with
y = 1
- Check if 1 is in set? Yes → Remove 1 from set, increment y to 2
- Check if 2 is in set? Yes → Remove 2 from set, increment y to 3
- Check if 3 is in set? Yes → Remove 3 from set, increment y to 4
- Check if 4 is in set? No (already removed) → Stop
- Calculate:
d[1] = d[4] + (4 - 1) = 1 + 3 = 4
- Note:
d[4] = 1
from Step 2, representing the sequence [4] - We're connecting sequences [1,2,3] + [4] = [1,2,3,4]
- Note:
- Update:
ans = max(1, 4) = 4
- Set now:
s = {}
(empty)
Step 5: Process x = 3
- Start with
y = 3
- Check if 3 is in set? No (already removed in Step 4) → Stop immediately
- No updates needed
Step 6: Process x = 2
- Start with
y = 2
- Check if 2 is in set? No (already removed in Step 4) → Stop immediately
- No updates needed
Final Result: ans = 4
The longest consecutive sequence is [1, 2, 3, 4] with length 4. The algorithm efficiently found this by processing each number once and connecting the sequences through the hash table d
.
Solution Implementation
1class Solution:
2 def longestConsecutive(self, nums: List[int]) -> int:
3 # Create a set for O(1) lookup of numbers
4 remaining_numbers = set(nums)
5
6 # Initialize the maximum sequence length
7 max_length = 0
8
9 # Dictionary to cache the length of sequences starting from each number
10 sequence_lengths = {}
11
12 # Process each number in the original array
13 for current_num in nums:
14 # Skip if this number was already processed as part of another sequence
15 if current_num not in remaining_numbers:
16 continue
17
18 # Find the end of the consecutive sequence starting from current_num
19 sequence_end = current_num
20 while sequence_end in remaining_numbers:
21 # Remove processed numbers to avoid reprocessing
22 remaining_numbers.remove(sequence_end)
23 sequence_end += 1
24
25 # Calculate the length of the sequence from current_num to sequence_end-1
26 # If sequence_end is already in the dictionary, we can reuse its cached length
27 current_length = (sequence_end - current_num) + sequence_lengths.get(sequence_end, 0)
28 sequence_lengths[current_num] = current_length
29
30 # Update the maximum length found so far
31 max_length = max(max_length, current_length)
32
33 return max_length
34
1class Solution {
2 public int longestConsecutive(int[] nums) {
3 // Create a set containing all numbers for O(1) lookup
4 Set<Integer> numSet = new HashSet<>();
5 for (int num : nums) {
6 numSet.add(num);
7 }
8
9 int maxLength = 0;
10 // Map to store the length of consecutive sequence starting from each number
11 Map<Integer, Integer> sequenceLengthMap = new HashMap<>();
12
13 for (int num : nums) {
14 // Start from current number and find the end of consecutive sequence
15 int currentNum = num;
16
17 // Keep incrementing while consecutive numbers exist in the set
18 while (numSet.contains(currentNum)) {
19 numSet.remove(currentNum);
20 currentNum++;
21 }
22
23 // currentNum is now the first number NOT in the sequence
24 // Calculate and store the length of sequence starting from num
25 // If currentNum already has a cached result, add it to avoid recalculation
26 int sequenceLength = (currentNum - num) + sequenceLengthMap.getOrDefault(currentNum, 0);
27 sequenceLengthMap.put(num, sequenceLength);
28
29 // Update the maximum length found so far
30 maxLength = Math.max(maxLength, sequenceLengthMap.get(num));
31 }
32
33 return maxLength;
34 }
35}
36
1class Solution {
2public:
3 int longestConsecutive(vector<int>& nums) {
4 // Create a set containing all unique numbers for O(1) lookup
5 unordered_set<int> numSet(nums.begin(), nums.end());
6
7 // Track the maximum consecutive sequence length found
8 int maxLength = 0;
9
10 // Map to store the length of consecutive sequence starting from each number
11 unordered_map<int, int> sequenceLengths;
12
13 // Process each number in the input array
14 for (int currentNum : nums) {
15 // Start from current number and find consecutive sequence
16 int nextNum = currentNum;
17
18 // Keep incrementing while consecutive numbers exist in the set
19 while (numSet.count(nextNum)) {
20 // Remove processed number from set to avoid reprocessing
21 numSet.erase(nextNum);
22 nextNum++;
23 }
24
25 // Calculate length of sequence starting from currentNum
26 // If nextNum already has a cached sequence length, add it to current calculation
27 // Otherwise, just use the difference (nextNum - currentNum)
28 int currentSequenceLength = (nextNum - currentNum);
29 if (sequenceLengths.count(nextNum)) {
30 currentSequenceLength += sequenceLengths[nextNum];
31 }
32
33 // Store the sequence length for current starting number
34 sequenceLengths[currentNum] = currentSequenceLength;
35
36 // Update maximum length if current sequence is longer
37 maxLength = max(maxLength, sequenceLengths[currentNum]);
38 }
39
40 return maxLength;
41 }
42};
43
1function longestConsecutive(nums: number[]): number {
2 // Create a set from the input array for O(1) lookup
3 const numSet = new Set(nums);
4
5 // Track the maximum length of consecutive sequence found
6 let maxLength = 0;
7
8 // Map to store the length of consecutive sequences starting from each number
9 const sequenceLengthMap = new Map<number, number>();
10
11 // Iterate through each number in the array
12 for (const num of nums) {
13 // Start from the current number and find consecutive sequence
14 let currentNum = num;
15
16 // Keep incrementing while consecutive numbers exist in the set
17 while (numSet.has(currentNum)) {
18 // Remove processed number from set to avoid reprocessing
19 numSet.delete(currentNum);
20 currentNum++;
21 }
22
23 // Calculate and store the length of sequence starting from 'num'
24 // If a sequence already exists starting from 'currentNum', add its length
25 const existingLength = sequenceLengthMap.get(currentNum) || 0;
26 const totalLength = existingLength + (currentNum - num);
27 sequenceLengthMap.set(num, totalLength);
28
29 // Update the maximum length found so far
30 maxLength = Math.max(maxLength, sequenceLengthMap.get(num)!);
31 }
32
33 return maxLength;
34}
35
Time and Space Complexity
The time complexity is O(n)
, where n
is the length of the array nums
.
Although there is a nested while loop structure, each element is processed at most twice - once when it's encountered in the outer loop and once when it's removed from the set in the inner while loop. Since each element can only be removed from the set once, the total number of operations across all iterations is bounded by 2n
, which simplifies to O(n)
.
The space complexity is O(n)
, where n
is the length of the array nums
.
This is because:
- The set
s
stores at mostn
unique elements from the input array - The dictionary
d
can store at mostn
key-value pairs (one for each unique starting point of a consecutive sequence) - Both data structures together require
O(n)
space
Common Pitfalls
Pitfall 1: Not Handling the Starting Point Correctly
The Problem: A common mistake is trying to find sequences by only looking forward from each number, without considering whether the current number is actually the start of a sequence. This can lead to redundant work and incorrect sequence building.
Incorrect Approach:
# WRONG: Processing every number as a potential sequence member
for num in nums:
count = 1
current = num
while current + 1 in num_set:
current += 1
count += 1
max_length = max(max_length, count)
This approach has a critical flaw: it processes subsequences multiple times. For example, with [1, 2, 3, 4]
, it would check sequences starting from 2, 3, and 4, not just from 1.
The Solution:
Only start building a sequence if the current number is actually the beginning of a sequence (i.e., num - 1
is not in the set):
for num in nums:
# Only start a sequence if this is the beginning
if num - 1 not in num_set:
current = num
count = 1
while current + 1 in num_set:
current += 1
count += 1
max_length = max(max_length, count)
Pitfall 2: Modifying the Set While Iterating Over It
The Problem: Attempting to iterate over the set while simultaneously removing elements from it:
# WRONG: This will raise RuntimeError for num in remaining_numbers: # Iterating over the set while num in remaining_numbers: remaining_numbers.remove(num) # Modifying during iteration num += 1
The Solution: Iterate over the original array instead of the set, then check membership and remove from the set:
# CORRECT: Iterate over the original array for num in nums: if num not in remaining_numbers: continue # Now safe to modify remaining_numbers while num in remaining_numbers: remaining_numbers.remove(num) num += 1
Pitfall 3: Overcomplicated Sequence Chaining Logic
The Problem:
The dictionary caching mechanism (sequence_lengths
) in the provided solution adds unnecessary complexity. The line:
current_length = (sequence_end - current_num) + sequence_lengths.get(sequence_end, 0)
This attempts to chain sequences, but since we're removing elements from the set as we process them, sequence_lengths.get(sequence_end, 0)
will almost always return 0, making the dictionary redundant.
The Solution: Simplify by removing the unnecessary dictionary:
class Solution:
def longestConsecutive(self, nums: List[int]) -> int:
if not nums:
return 0
num_set = set(nums)
max_length = 0
for num in num_set:
# Only start counting if this is the beginning of a sequence
if num - 1 not in num_set:
current_num = num
current_length = 1
# Count consecutive numbers
while current_num + 1 in num_set:
current_num += 1
current_length += 1
max_length = max(max_length, current_length)
return max_length
This cleaner approach maintains O(n) complexity while being more readable and less error-prone.
What is an advantages of top-down dynamic programming vs bottom-up dynamic programming?
Recommended Readings
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!