Facebook Pixel

340. Longest Substring with At Most K Distinct Characters 🔒

MediumHash TableStringSliding Window
Leetcode Link

Problem Description

You are given a string s and an integer k. Your task is to find the length of the longest substring within s that contains at most k distinct characters.

A substring is a contiguous sequence of characters within the string. For example, if s = "eceba" and k = 2, you need to find the longest substring that has no more than 2 different characters.

Let's break down what this means:

  • You need to examine all possible substrings of the input string s
  • Count the number of distinct/unique characters in each substring
  • Keep only those substrings where the distinct character count is at most k (less than or equal to k)
  • Return the length of the longest valid substring

For instance:

  • If s = "eceba" and k = 2, the substring "ece" has 2 distinct characters ('e' and 'c'), making it valid with length 3
  • If s = "aa" and k = 1, the entire string has only 1 distinct character ('a'), so the answer would be 2

The solution uses a sliding window technique with a hash table to efficiently track the distinct characters in the current window. The window expands by moving the right boundary and contracts from the left when the distinct character count exceeds k. The final answer is calculated as the length of the string minus the final position of the left boundary, which gives us the maximum valid window size encountered.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we need to find the longest substring with a certain property, the sliding window technique often comes to mind. The key insight here is that we want to maintain a window that always satisfies our constraint (at most k distinct characters).

Think about it this way: if we have a valid window with at most k distinct characters, we want to expand it as much as possible to maximize its length. We keep expanding the right boundary by including more characters. However, once we include a character that causes us to exceed k distinct characters, we need to shrink from the left.

The clever part of this solution is recognizing that we don't need to track the maximum length explicitly. Instead, we can maintain a window that grows to the maximum possible size and then slides along the string while maintaining that size (or growing larger if possible).

Here's the key observation: once we find a valid window of a certain size, we never need to consider windows smaller than that size again. So when we need to shrink the window due to having too many distinct characters, we only shrink it by exactly one position from the left. This ensures our window either maintains its current size or grows larger.

By the end of the iteration, the window size (which is len(s) - l where l is the left boundary) represents the maximum valid window we've encountered. The hash table cnt helps us efficiently track how many distinct characters are in our current window, and we remove characters from it when their count drops to zero as we slide the left boundary forward.

This approach is optimal because we visit each character at most twice (once when expanding right, once when shrinking from left), giving us O(n) time complexity.

Solution Approach

The solution uses a sliding window technique combined with a hash table to efficiently solve this problem. Let's walk through the implementation step by step:

Data Structures Used:

  • A hash table cnt (Counter in Python) to track the frequency of each character in the current window
  • A variable l to mark the left boundary of the sliding window

Algorithm Steps:

  1. Initialize the window: Start with l = 0 as the left boundary and an empty Counter cnt to track character frequencies.

  2. Expand the window: Iterate through each character c in the string s (this implicitly represents the right boundary of our window):

    for c in s:
        cnt[c] += 1

    Add each character to our frequency counter.

  3. Check constraint and shrink if needed: After adding a character, check if we've exceeded k distinct characters:

    if len(cnt) > k:
        cnt[s[l]] -= 1
        if cnt[s[l]] == 0:
            del cnt[s[l]]
        l += 1
    • If len(cnt) > k, we have too many distinct characters
    • Decrease the count of the character at position l (left boundary)
    • If that character's count becomes 0, remove it from the hash table entirely
    • Move the left boundary one position to the right
  4. Calculate the result: After processing all characters, the maximum window size is:

    return len(s) - l

    This works because the window maintains the maximum valid size it has seen. The final window spans from position l to the end of the string.

Why this works:

  • The window never shrinks below the maximum valid size found so far
  • When we encounter too many distinct characters, we only shrink by exactly one position
  • This ensures we're always maintaining or growing toward the maximum possible window size
  • The time complexity is O(n) since each character is visited at most twice
  • The space complexity is O(k) for storing at most k+1 distinct characters in the hash table

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with s = "eceba" and k = 2:

Initial State:

  • l = 0 (left boundary)
  • cnt = {} (empty counter)

Step 1: Process 'e' at index 0

  • Add 'e' to counter: cnt = {'e': 1}
  • Check: len(cnt) = 1 ≤ 2 ✓ (no shrinking needed)
  • Current window: "e" (from index 0 to 0)

Step 2: Process 'c' at index 1

  • Add 'c' to counter: cnt = {'e': 1, 'c': 1}
  • Check: len(cnt) = 2 ≤ 2 ✓ (no shrinking needed)
  • Current window: "ec" (from index 0 to 1)

Step 3: Process 'e' at index 2

  • Add 'e' to counter: cnt = {'e': 2, 'c': 1}
  • Check: len(cnt) = 2 ≤ 2 ✓ (no shrinking needed)
  • Current window: "ece" (from index 0 to 2)

Step 4: Process 'b' at index 3

  • Add 'b' to counter: cnt = {'e': 2, 'c': 1, 'b': 1}
  • Check: len(cnt) = 3 > 2 ✗ (need to shrink!)
  • Shrink from left:
    • Decrease count of s[0] which is 'e': cnt = {'e': 1, 'c': 1, 'b': 1}
    • Move l from 0 to 1
  • Current window: "ceb" (from index 1 to 3)

Step 5: Process 'a' at index 4

  • Add 'a' to counter: cnt = {'e': 1, 'c': 1, 'b': 1, 'a': 1}
  • Check: len(cnt) = 4 > 2 ✗ (need to shrink!)
  • Shrink from left:
    • Decrease count of s[1] which is 'c': cnt = {'e': 1, 'c': 0, 'b': 1, 'a': 1}
    • Since cnt['c'] = 0, delete 'c': cnt = {'e': 1, 'b': 1, 'a': 1}
    • Move l from 1 to 2
  • Current window: "eba" (from index 2 to 4)
  • Check again: len(cnt) = 3 > 2 ✗ (still need to shrink!)
  • Shrink from left again:
    • Decrease count of s[2] which is 'e': cnt = {'e': 0, 'b': 1, 'a': 1}
    • Since cnt['e'] = 0, delete 'e': cnt = {'b': 1, 'a': 1}
    • Move l from 2 to 3
  • Final window: "ba" (from index 3 to 4)

Result Calculation:

  • Final l = 3
  • Result = len(s) - l = 5 - 3 = 2

Wait, this doesn't match our expected answer of 3. Let me reconsider the algorithm...

Actually, I need to correct my understanding. The algorithm only shrinks by exactly one position when we exceed k distinct characters. Let me redo step 5:

Step 5: Process 'a' at index 4

  • Add 'a' to counter: cnt = {'e': 1, 'c': 1, 'b': 1, 'a': 1}
  • Check: len(cnt) = 4 > 2 ✗ (need to shrink!)
  • Shrink from left (only once):
    • Decrease count of s[1] which is 'c': cnt = {'e': 1, 'c': 0, 'b': 1, 'a': 1}
    • Since cnt['c'] = 0, delete 'c': cnt = {'e': 1, 'b': 1, 'a': 1}
    • Move l from 1 to 2
  • Current window: "eba" (from index 2 to 4)

Result Calculation:

  • Final l = 2
  • Result = len(s) - l = 5 - 2 = 3 ✓

The algorithm found that the longest substring with at most 2 distinct characters has length 3, which corresponds to "ece" (one of the valid substrings).

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
5        # Initialize left pointer for sliding window
6        left = 0
7      
8        # Counter to track character frequencies in current window
9        char_count = Counter()
10      
11        # Iterate through string with implicit right pointer
12        for char in s:
13            # Add current character to window
14            char_count[char] += 1
15          
16            # If window has more than k distinct characters, shrink from left
17            if len(char_count) > k:
18                # Decrease count of leftmost character
19                char_count[s[left]] -= 1
20              
21                # Remove character from counter if count reaches 0
22                if char_count[s[left]] == 0:
23                    del char_count[s[left]]
24              
25                # Move left pointer forward
26                left += 1
27      
28        # Return length of longest valid substring (from left to end)
29        return len(s) - left
30
1class Solution {
2    public int lengthOfLongestSubstringKDistinct(String s, int k) {
3        // Map to store character frequencies in the current window
4        Map<Character, Integer> charFrequency = new HashMap<>();
5      
6        // Left pointer of the sliding window
7        int leftPointer = 0;
8      
9        // Convert string to char array for easier access
10        char[] characters = s.toCharArray();
11      
12        // Iterate through each character with right pointer (implicitly)
13        for (char currentChar : characters) {
14            // Add current character to the window and increment its frequency
15            charFrequency.merge(currentChar, 1, Integer::sum);
16          
17            // If we have more than k distinct characters, shrink the window from left
18            if (charFrequency.size() > k) {
19                // Decrement the frequency of the character at left pointer
20                char leftChar = characters[leftPointer];
21                if (charFrequency.merge(leftChar, -1, Integer::sum) == 0) {
22                    // Remove the character from map if its frequency becomes 0
23                    charFrequency.remove(leftChar);
24                }
25                // Move left pointer to shrink the window
26                leftPointer++;
27            }
28        }
29      
30        // The maximum window size is the total length minus the left pointer position
31        // This works because we only move left pointer when necessary to maintain at most k distinct chars
32        return characters.length - leftPointer;
33    }
34}
35
1class Solution {
2public:
3    int lengthOfLongestSubstringKDistinct(string s, int k) {
4        // Map to store character frequencies in the current window
5        unordered_map<char, int> charCount;
6      
7        // Left pointer of the sliding window
8        int left = 0;
9      
10        // Iterate through each character in the string (right pointer)
11        for (char& currentChar : s) {
12            // Add current character to the window
13            ++charCount[currentChar];
14          
15            // If we have more than k distinct characters, shrink the window from left
16            if (charCount.size() > k) {
17                // Decrease count of the leftmost character
18                if (--charCount[s[left]] == 0) {
19                    // Remove character from map if its count becomes 0
20                    charCount.erase(s[left]);
21                }
22                // Move left pointer to the right
23                ++left;
24            }
25        }
26      
27        // The longest substring length is the remaining window size
28        return s.size() - left;
29    }
30};
31
1/**
2 * Finds the length of the longest substring that contains at most k distinct characters
3 * @param s - The input string to search within
4 * @param k - The maximum number of distinct characters allowed
5 * @returns The length of the longest valid substring
6 */
7function lengthOfLongestSubstringKDistinct(s: string, k: number): number {
8    // Map to store character frequencies in the current window
9    const charFrequencyMap: Map<string, number> = new Map();
10  
11    // Left pointer of the sliding window
12    let leftPointer: number = 0;
13  
14    // Iterate through each character using the right pointer (implicit)
15    for (const currentChar of s) {
16        // Add current character to the window and update its frequency
17        charFrequencyMap.set(currentChar, (charFrequencyMap.get(currentChar) ?? 0) + 1);
18      
19        // If we have more than k distinct characters, shrink the window from left
20        if (charFrequencyMap.size > k) {
21            // Get the character at the left pointer
22            const leftChar: string = s[leftPointer];
23          
24            // Decrease frequency of the leftmost character
25            charFrequencyMap.set(leftChar, charFrequencyMap.get(leftChar)! - 1);
26          
27            // Remove character from map if its frequency becomes 0
28            if (charFrequencyMap.get(leftChar) === 0) {
29                charFrequencyMap.delete(leftChar);
30            }
31          
32            // Move left pointer to shrink the window
33            leftPointer++;
34        }
35    }
36  
37    // Return the length of the valid window (total length - left pointer position)
38    return s.length - leftPointer;
39}
40

Time and Space Complexity

Time Complexity: O(n), where n is the length of the string s.

The algorithm uses a single pass through the string with the right pointer (implicit in the for loop). The left pointer l moves at most n times throughout the entire execution. Each character is visited at most twice - once when it enters the window and once when it exits. All operations inside the loop (adding to counter, checking counter size, deleting from counter) take O(1) time when using a hash map. Therefore, the overall time complexity is O(n).

Space Complexity: O(k), where k is the maximum number of distinct characters allowed.

The space is primarily used by the Counter (hash map) cnt, which stores at most k + 1 distinct characters at any point. When the counter exceeds k distinct characters, the algorithm immediately removes one character to maintain at most k distinct characters. In practice, the counter size is bounded by min(k, 26) for lowercase English letters or min(k, 256) for extended ASCII, but asymptotically it's O(k).

Common Pitfalls

Pitfall 1: Misunderstanding the Window Size Calculation

The Problem: Many developers incorrectly assume they need to track the maximum window size explicitly with a variable like max_length, updating it every time they find a valid window:

# Incorrect approach that seems intuitive but adds unnecessary complexity
max_length = 0
for right, char in enumerate(s):
    char_count[char] += 1
    while len(char_count) > k:
        # shrink window
        ...
    max_length = max(max_length, right - left + 1)  # Extra tracking

Why This Happens: This stems from the standard sliding window template where we explicitly track the maximum. However, this specific solution uses a clever optimization.

The Solution: The given approach maintains an invariant: the window never shrinks below the maximum size seen so far. When we violate the constraint (too many distinct characters), we only move the left pointer by one position, effectively sliding the entire window forward rather than shrinking it. This means len(s) - left at the end gives us the maximum window size automatically.

Pitfall 2: Incorrectly Handling Character Removal from Counter

The Problem: Forgetting to remove characters with zero count from the counter:

# Incorrect - leaves zero-count entries in counter
if len(char_count) > k:
    char_count[s[left]] -= 1
    # Missing: del char_count[s[left]] when count is 0
    left += 1

Why This Happens: The counter will still consider characters with count 0 as "distinct" when checking len(char_count), leading to incorrect distinct character counts.

The Solution: Always remove entries with zero count to maintain accurate distinct character tracking:

if char_count[s[left]] == 0:
    del char_count[s[left]]

Pitfall 3: Using While Loop Instead of If Statement

The Problem: Using a while loop to shrink the window until valid:

# Less optimal approach
while len(char_count) > k:
    char_count[s[left]] -= 1
    if char_count[s[left]] == 0:
        del char_count[s[left]]
    left += 1

Why This Matters: While this still works correctly, it may shrink the window more than necessary, potentially missing the maximum window size. The single if statement ensures we maintain the largest window size seen so far by only incrementing left by 1 when needed.

The Solution: Use if instead of while to maintain the sliding window at its maximum valid size:

if len(char_count) > k:  # Only shrink by one position
    # shrink logic
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More