340. Longest Substring with At Most K Distinct Characters
Problem Description
In this problem, we are given a string s
and an integer k
. Our task is to find the length of the longest substring (which is a contiguous block of characters in the string) within s
that contains at most k
distinct characters.
A distinct character means that no matter how many times the character appears in the substring, it is counted only once. For example, if k = 2
, the substring "aabbc" has 3 distinct characters ('a', 'b', and 'c'), thus, it does not meet the requirement.
The goal here is to achieve this while maximizing the length of the substring. A substring could range from containing a single character up to the length of the entire string, if k
is sufficiently large to cover all distinct characters in the string s
.
Intuition
The core intuition behind the solution is to use the two-pointer technique, or more specifically, the sliding window approach. Here is the thinking process that leads us to this approach:
- We need to examine various substrings of
s
efficiently without having to re-scan the string repeatedly. - We can start with a window (or two pointers) at the beginning of the string and expand the window to the right until we exceed
k
distinct characters within the window. - Upon exceeding
k
distinct characters, we need to move the left side of the window to the right to potentially drop one of the distinct characters out of the count. - We keep track of the count of distinct characters in the current window using a data structure, such as a counter (dictionary), that is updated when the window is adjusted.
- As we expand and contract the window, we keep a record of the maximum window size (i.e., substring length) that has appeared so far that contains at most
k
distinct characters. This is the number we want to return.
Now, the implementation uses a counter and two indices, i
and j
, where i
is the end of the sliding window, and j
is the start. We iterate over the string with i
:
- We include the character at position
i
in the current window by incrementing its count in the counter. - If adding the current character has led to more than
k
distinct characters in the window, we incrementj
, effectively reducing the size of the window from the left, until the number of distinct characters drops tok
or lower. - At each step, we calculate the length of the current window (
i - j + 1
) and update the answer if it's the largest such length we have seen so far. - This process continues until
i
has reached the end of the string.
The sliding window is moved across the string efficiently to identify the longest substring that satisfies the condition of having at most k
distinct characters, thus leading us to the solution.
Learn more about Sliding Window patterns.
Solution Approach
The solution code implements the sliding window pattern using two pointers to keep track of the current window within the string s
. This pattern is completed using the Python Counter
from the collections module as the key data structure to keep counts of each character in the current window.
Here is how it is done step by step:
-
We initialize a
Counter
object, which is a dictionary that will keep track of the counts of individual characters in the current sliding window. -
Two pointers,
i
andj
, are created.i
is used to traverse the strings
character by character from the start to the end, whilej
is used to keep track of the start of the sliding window.j
starts at 0 and moves rightward to narrow the window whenever necessary. -
We initialize a variable
ans
to keep track of the maximum length of the substring found that meets the criteria. -
We start iterating over the characters of the string
s
, withi
acting as the end boundary of the window. For each characterc
at indexi
, we:-
Increment its count in the
Counter
by 1 (cnt[c] += 1
). -
Check if the number of distinct characters in the
Counter
has exceededk
. This is done by evaluatinglen(cnt) > k
. As long as this condition is true, meaning we have more thank
distinct characters, we need to shrink the window from the left side by increasingj
.- In the inner
while
loop, decrease the count of the character at the beginning of the window (cnt[s[j]] -= 1
). - If the count of that character drops to 0, we remove it from the
Counter
(cnt.pop(s[j])
), as it's no longer part of the window. - We increment
j
to shrink the window from the left.
- In the inner
-
-
After adjusting the size of the window (if it was needed), we calculate the length of the current window (
i - j + 1
) and updateans
to be the maximum of its current value and this newly calculated length. -
After iterating through all characters of
s
, the process concludes, and the final value ofans
contains the length of the longest substring with at mostk
distinct characters. This value is returned as the result.
Through this method, we avoid unnecessary re-scans of the string and implement an O(n) time complexity algorithm, where n
is the length of the string s
. By dynamically adjusting the sliding window and using the Counter
to track distinct characters, we achieve an efficient approach to solving the problem as outlined.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's assume we have a string s = "aabcabb"
, and the integer k = 2
. We want to find the length of the longest substring with at most k
distinct characters.
Here is a step-by-step walkthrough of the solution approach applied to this example:
-
Initialize the Counter and Pointers: Create a
Counter
object,cnt
, to count characters in the current window. Seti
,j
to 0 to represent the start and end of the sliding window, andans
to 0 as the longest substring length found so far. -
Start Iterating with Pointer
i
: Movei
from the start to the end of the strings
.i = 0
:cnt = {'a': 1}
, substring isa
,ans = 1
.i = 1
:cnt = {'a': 2}
, substring isaa
,ans = 2
.i = 2
:cnt = {'a': 2, 'b': 1}
, substring isaab
,ans = 3
. -
Exceeding
k
Distinct Characters: At the next character, we check if the number of distinct characters will exceedk
.i = 3
: We attempt to insertc
intocnt
. Doing so would increase distinct character count to3
(cnt = {'a': 2, 'b': 1, 'c': 1}
), which is greater thank
. We need to movej
right until we get at mostk
distinct characters. -
Shrinking the Window:
Start incrementing
j
, decreasing the count of the characters[j]
incnt
, and remove the character fromcnt
if its count reaches 0.j = 1
: We decrease the count ofa
and updatecnt = {'a': 1, 'b': 1, 'c': 1}
. We still have more thank
distinct characters.j = 2
: We decrease the count ofa
and remove it fromcnt
(cnt = {'b': 1, 'c': 1}
). Now we havek
distinct characters.After shrinking, the substring is now
bc
(s[2:4]
), andans
remains3
. -
Continue Process:
i = 4
: Adda
tocnt
,cnt = {'b': 1, 'c': 1, 'a': 1}
, substring isbca
,ans
remains3
.i = 5
: Addb
tocnt
,cnt = {'b': 2, 'c': 1, 'a': 1}
. We have exceededk
again, so start movingj
.j = 3
: Reduce count ofb
,cnt = {'b': 1, 'c': 1, 'a': 1}
. Still too many distinct characters.j = 4
: Removec
,cnt = {'b': 1, 'a': 1}
, and now the substring isab
(s[4:6]
), updateans
to2
.i = 6
: Addb
,cnt = {'b': 2, 'a': 1}
, and substring isabb
(s[4:7]
). Now, updateans
to3
, as this is the length of the current window. -
Finish Iteration:
After the final iteration,
ans
contains the length of the longest satisfactory substring, which is3
in this case, representing the substringaab
orabb
.
By using the sliding window technique, the algorithm efficiently finds the longest substring that satisfies the given constraint with a time complexity of O(n), where n is the length of the string s
.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
5 # Initialize a counter to keep track of the frequency of each character
6 char_count = Counter()
7
8 # Length of the input string
9 n = len(s)
10
11 # Initialize the answer and the start index (j) of the current window
12 max_length = start_index = 0
13
14 # Enumerate over the characters of the string with index (i) and character (char)
15 for i, char in enumerate(s):
16 # Increment the count of the current character
17 char_count[char] += 1
18
19 # If the number of distinct characters exceeds k, shrink the window
20 while len(char_count) > k:
21 # Decrement the count of the character at the start index
22 char_count[s[start_index]] -= 1
23
24 # If the count goes to zero, remove the character from the counter
25 if char_count[s[start_index]] == 0:
26 del char_count[s[start_index]]
27
28 # Move the start index forward
29 start_index += 1
30
31 # Update the maximum length found so far
32 max_length = max(max_length, i - start_index + 1)
33
34 # Return the maximum length of substring with at most k distinct characters
35 return max_length
36
37# The code can now be run with an instance of the Solution class
38# For example: Solution().lengthOfLongestSubstringKDistinct("eceba", 2) should return 3
39
1import java.util.HashMap;
2import java.util.Map;
3
4class Solution {
5
6 public int lengthOfLongestSubstringKDistinct(String s, int k) {
7 // Map to store the frequency of each character in the current window
8 Map<Character, Integer> charCountMap = new HashMap<>();
9 int n = s.length();
10 int longestSubstringLength = 0; // variable to store the length of the longest substring
11 int left = 0; // left pointer for the sliding window
12
13 // Iterate through the string using the right pointer of the sliding window
14 for (int right = 0; right < n; ++right) {
15 // Step 1: Update the count of the current character
16 char currentChar = s.charAt(right);
17 charCountMap.put(currentChar, charCountMap.getOrDefault(currentChar, 0) + 1);
18
19 // Step 2: Shrink the window from the left if count map has more than 'k' distinct characters
20 while (charCountMap.size() > k) {
21 char leftChar = s.charAt(left);
22 charCountMap.put(leftChar, charCountMap.get(leftChar) - 1);
23 // Remove the character from map when count becomes zero
24 if (charCountMap.get(leftChar) == 0) {
25 charCountMap.remove(leftChar);
26 }
27 left++; // shrink the window from the left
28 }
29
30 // Step 3: Update the longest substring length if the current window is larger
31 longestSubstringLength = Math.max(longestSubstringLength, right - left + 1);
32 }
33
34 return longestSubstringLength; // Return the length of the longest substring found
35 }
36}
37
1#include <string>
2#include <unordered_map>
3#include <algorithm>
4
5class Solution {
6public:
7 // Finds the length of the longest substring with at most k distinct characters
8 int lengthOfLongestSubstringKDistinct(std::string s, int k) {
9 std::unordered_map<char, int> charCountMap; // Map to store character counts
10
11 int maxSubstringLength = 0; // Maximum length of substring found
12 int leftPointer = 0; // Left pointer for sliding window
13 int n = s.size(); // Length of the input string
14
15 for (int rightPointer = 0; rightPointer < n; ++rightPointer) {
16 // Increase char count for the current position
17 charCountMap[s[rightPointer]]++;
18
19 // If we have more than k distinct chars, contract the window from the left
20 while (charCountMap.size() > k) {
21 // Decrease the count of the char at the left pointer
22 charCountMap[s[leftPointer]]--;
23
24 // If the count drops to 0, remove it from the map
25 if (charCountMap[s[leftPointer]] == 0) {
26 charCountMap.erase(s[leftPointer]);
27 }
28
29 // Move the left pointer to the right
30 ++leftPointer;
31 }
32
33 // Update maxSubstringLength if we've found a larger window
34 maxSubstringLength = std::max(maxSubstringLength, rightPointer - leftPointer + 1);
35 }
36
37 return maxSubstringLength; // Return the max length found
38 }
39};
40
1// Imports a generic map class to emulate the unordered_map feature in C++
2import { Map } from "typescript-collections";
3
4// Finds the length of the longest substring with at most k distinct characters
5function lengthOfLongestSubstringKDistinct(s: string, k: number): number {
6 let charCountMap = new Map<char, number>(); // Map to store character counts
7
8 let maxSubstringLength = 0; // Maximum length of substring found
9 let leftPointer = 0; // Left pointer for sliding window
10 let n = s.length; // Length of the input string
11
12 for (let rightPointer = 0; rightPointer < n; rightPointer++) {
13 // Increase char count for the current character
14 let count = charCountMap.getValue(s[rightPointer]) || 0;
15 charCountMap.setValue(s[rightPointer], count + 1);
16
17 // If we have more than k distinct characters, contract the window from the left
18 while (charCountMap.size() > k) {
19 // Decrease the count of the character at the left pointer
20 let leftCharCount = charCountMap.getValue(s[leftPointer]) || 0;
21 charCountMap.setValue(s[leftPointer], leftCharCount - 1);
22
23 // If the count drops to 0, remove it from the map
24 if (charCountMap.getValue(s[leftPointer]) === 0) {
25 charCountMap.remove(s[leftPointer]);
26 }
27
28 // Move the left pointer to the right
29 leftPointer++;
30 }
31
32 // Update maxSubstringLength if we've found a larger window
33 maxSubstringLength = Math.max(maxSubstringLength, rightPointer - leftPointer + 1);
34 }
35
36 return maxSubstringLength; // Return the max length found
37}
38
Time and Space Complexity
The given code snippet defines a function that determines the length of the longest substring with no more than k
distinct characters in a given string s
.
Time Complexity
The time complexity of the function can be analyzed as follows:
- There are two pointers (
i
andj
) that traverse the strings
only once. The outer loop runs for all characters from position 0 ton - 1
wheren
is the length of the strings
, hence contributing O(n) to the time complexity. - Inside the loop, there is a
while
loop that shrinks the sliding window from the left when the number of distinct characters exceedsk
. Thiswhile
loop does not run for each element ins
multiple times. Each character is removed from the window only once, accounting for another O(n) over the whole run of the algorithm.
Therefore, the total time complexity of the algorithm is O(n)
where n
is the number of characters in the input string s
.
Space Complexity
The space complexity of the function can be analyzed as follows:
- A
Counter
object is used to keep track of the frequency of each character in the current window. In the worst case, the counter could store a distinct count for every character in the strings
. However, since the number of distinct characters is limited byk
, theCounter
would hold at mostk
elements.
Therefore, the space complexity of the algorithm is O(k)
where k
is the maximum number of distinct characters that the substring can have.
Learn more about how to find time and space complexity quickly using problem constraints.
You are given an array of intervals where intervals[i] = [start_i, end_i]
represent the start and end of the ith
interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!
The solution is wrong. It gives the longest string where each character does not occur more than k times. But that is not what we need to answer.
What we need is longest substring with at most k distinct characters. It means after k characters we need to start moving to the left, until in out unique characters set there are k elements again.