3144. Minimum Substring Partition of Equal Character Frequency
Problem Description
Given a string s
, you need to partition it into one or more balanced substrings. A balanced string is defined as a string where each character appears the same number of times.
For example, if s = "ababcc"
:
-
Valid partitions include:
("abab", "c", "c")
,("ab", "abc", "c")
, and("ababcc")
- In
"abab"
: 'a' appears 2 times, 'b' appears 2 times (balanced) - In
"c"
: 'c' appears 1 time (balanced) - In
"abc"
: each character appears exactly 1 time (balanced) - In
"ababcc"
: 'a' appears 2 times, 'b' appears 2 times, 'c' appears 2 times (balanced)
- In
-
Invalid partitions include:
("a", "bab", "cc")
,("aba", "bc", "c")
, and("ab", "abcc")
- In
"bab"
: 'b' appears 2 times, 'a' appears 1 time (not balanced) - In
"aba"
: 'a' appears 2 times, 'b' appears 1 time (not balanced) - In
"abcc"
: 'a' appears 1 time, 'b' appears 1 time, 'c' appears 2 times (not balanced)
- In
The task is to find the minimum number of balanced substrings that you can partition the string s
into. Each partition must cover the entire string without overlapping, and every substring in the partition must be balanced.
Intuition
The key insight is that this is an optimal partitioning problem where we need to make decisions about where to split the string. At each position, we face a choice: should we extend the current substring or start a new partition?
Think of it this way: starting from any position i
in the string, we can try to form a balanced substring by extending it to various endpoints. For each valid balanced substring from position i
to j
, we can make a cut at position j+1
and then solve the remaining subproblem starting from j+1
.
This naturally leads to a recursive approach with overlapping subproblems. If we define dfs(i)
as the minimum number of partitions needed for the substring starting from index i
to the end, then:
- For each possible endpoint
j
fromi
ton-1
, we check ifs[i:j+1]
forms a balanced substring - If it does, we have found one partition, and we need
1 + dfs(j+1)
total partitions - We take the minimum across all valid choices
The challenge is efficiently checking if a substring is balanced. A substring is balanced when all characters in it appear the same number of times - meaning if we track the frequency of each frequency value, we should have exactly one unique frequency.
For example, in "aabbcc"
, we have frequencies: a:2, b:2, c:2
. All characters appear 2 times, so there's only one unique frequency value (2), making it balanced. In contrast, "aabc"
has frequencies: a:2, b:1, c:1
, with two different frequency values (1 and 2), so it's not balanced.
By maintaining two hash tables - one for character counts (cnt
) and another for frequency of frequencies (freq
) - we can efficiently update and check the balance condition as we extend our substring. When freq
has size 1, all characters in the current substring appear the same number of times, indicating a balanced substring.
Learn more about Dynamic Programming patterns.
Solution Approach
The solution uses memoized search (dynamic programming with recursion) combined with hash tables to efficiently find the minimum number of balanced partitions.
Implementation Details:
-
Main Recursive Function
dfs(i)
:- Represents the minimum number of balanced substrings needed to partition
s[i:]
(from indexi
to the end) - Base case: If
i >= n
, return0
(no more characters to process) - Uses
@cache
decorator for memoization to avoid recomputing the same subproblems
- Represents the minimum number of balanced substrings needed to partition
-
Two Hash Tables for Balance Checking:
cnt
: Tracks the frequency of each character in the current substringfreq
: Tracks how many characters have each frequency value
For example, if substring is
"aabbcc"
:cnt = {'a': 2, 'b': 2, 'c': 2}
freq = {2: 3}
(three characters appear 2 times each)
-
Iterative Extension Process:
- For each starting position
i
, try all possible endpointsj
fromi
ton-1
- For each new character
s[j]
added to the current substring:- Update frequency tracking:
- If
cnt[s[j]]
already exists, decrementfreq[cnt[s[j]]]
- If
freq[cnt[s[j]]]
becomes 0, remove it fromfreq
- Increment
cnt[s[j]]
- Increment
freq[cnt[s[j]]]
(the new frequency count)
- If
- Update frequency tracking:
- Check balance condition: If
len(freq) == 1
, the substring is balanced- This means all characters appear with the same frequency
- Calculate the cost:
1 + dfs(j + 1)
(one partition plus the minimum partitions for the rest) - Track the minimum cost across all valid endpoints
- For each starting position
-
Optimization Strategy:
- Initialize
ans = n - i
(worst case: each character forms its own partition) - Only update
ans
when finding a better solution:if t < ans: ans = t
- The memoization ensures each state
dfs(i)
is computed only once
- Initialize
-
Final Answer:
- Call
dfs(0)
to get the minimum partitions for the entire string
- Call
The time complexity is O(n²)
for trying all possible substrings, with each substring taking O(1)
amortized time to check balance using the frequency tables. The space complexity is O(n)
for the recursion stack and memoization cache.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with s = "aabbcc"
.
Initial Setup:
- String:
"aabbcc"
(indices 0-5) - We start with
dfs(0)
to find minimum partitions for the entire string
Step 1: dfs(0) - Starting from index 0
We'll try extending from index 0 to create balanced substrings:
-
Try substring [0:1] = "a":
cnt = {'a': 1}
,freq = {1: 1}
- Balanced? Yes (only one frequency value)
- Cost:
1 + dfs(1)
(need to compute dfs(1))
-
Try substring [0:2] = "aa":
cnt = {'a': 2}
,freq = {2: 1}
- Balanced? Yes
- Cost:
1 + dfs(2)
(need to compute dfs(2))
-
Try substring [0:3] = "aab":
cnt = {'a': 2, 'b': 1}
,freq = {2: 1, 1: 1}
- Balanced? No (two different frequencies)
- Skip this option
-
Try substring [0:4] = "aabb":
cnt = {'a': 2, 'b': 2}
,freq = {2: 2}
- Balanced? Yes (both chars appear 2 times)
- Cost:
1 + dfs(4)
(need to compute dfs(4))
-
Try substring [0:5] = "aabbc":
cnt = {'a': 2, 'b': 2, 'c': 1}
,freq = {2: 2, 1: 1}
- Balanced? No
- Skip this option
-
Try substring [0:6] = "aabbcc":
cnt = {'a': 2, 'b': 2, 'c': 2}
,freq = {2: 3}
- Balanced? Yes (all three chars appear 2 times)
- Cost:
1 + dfs(6) = 1 + 0 = 1
(base case: dfs(6) = 0)
Step 2: Compute dfs(4) - Starting from index 4 ("cc")
-
Try substring [4:5] = "c":
- Balanced? Yes
- Cost:
1 + dfs(5)
-
Try substring [4:6] = "cc":
- Balanced? Yes
- Cost:
1 + dfs(6) = 1 + 0 = 1
So dfs(4) = 1
Step 3: Compute dfs(2) - Starting from index 2 ("bbcc")
- Try substring [2:3] = "b": Balanced, cost =
1 + dfs(3)
- Try substring [2:4] = "bb": Balanced, cost =
1 + dfs(4) = 1 + 1 = 2
- Try substring [2:6] = "bbcc":
cnt = {'b': 2, 'c': 2}
, Balanced, cost =1 + 0 = 1
So dfs(2) = 1
Step 4: Compute dfs(1) - Starting from index 1 ("abbcc")
- Various attempts would show that the best option is likely partitioning into balanced substrings
- After checking all possibilities,
dfs(1) = 2
Final Result:
Going back to dfs(0)
:
- Option 1: "a" + rest →
1 + dfs(1) = 1 + 2 = 3
- Option 2: "aa" + rest →
1 + dfs(2) = 1 + 1 = 2
- Option 3: "aabb" + rest →
1 + dfs(4) = 1 + 1 = 2
- Option 4: "aabbcc" →
1 + dfs(6) = 1 + 0 = 1
The minimum is 1, achieved by taking the entire string as one balanced partition.
This demonstrates how the algorithm:
- Explores all possible balanced substrings starting from each position
- Uses memoization to avoid recomputing subproblems
- Tracks character frequencies efficiently using two hash tables
- Finds the optimal partition by choosing the minimum cost at each step
Solution Implementation
1from functools import cache
2from collections import defaultdict
3
4class Solution:
5 def minimumSubstringsInPartition(self, s: str) -> int:
6 @cache
7 def dfs(start_index: int) -> int:
8 """
9 Find minimum number of balanced substrings starting from start_index.
10 A balanced substring has all characters appearing the same number of times.
11
12 Args:
13 start_index: Starting position in string s
14
15 Returns:
16 Minimum number of balanced substrings needed from start_index to end
17 """
18 # Base case: reached end of string
19 if start_index >= string_length:
20 return 0
21
22 # Track character counts in current substring
23 char_count = defaultdict(int)
24 # Track frequency of each count value (how many chars have each count)
25 count_frequency = defaultdict(int)
26
27 # Initialize with worst case (each remaining char is its own substring)
28 min_partitions = string_length - start_index
29
30 # Try all possible end positions for current substring
31 for end_index in range(start_index, string_length):
32 current_char = s[end_index]
33
34 # Update frequency map when character count changes
35 if char_count[current_char] > 0:
36 # Remove old count from frequency map
37 count_frequency[char_count[current_char]] -= 1
38 if count_frequency[char_count[current_char]] == 0:
39 del count_frequency[char_count[current_char]]
40
41 # Increment character count
42 char_count[current_char] += 1
43
44 # Add new count to frequency map
45 count_frequency[char_count[current_char]] += 1
46
47 # Check if current substring is balanced (all chars have same count)
48 if len(count_frequency) == 1:
49 # Calculate partitions: 1 for current + minimum for remaining
50 partitions_with_current = 1 + dfs(end_index + 1)
51 if partitions_with_current < min_partitions:
52 min_partitions = partitions_with_current
53
54 return min_partitions
55
56 string_length = len(s)
57 return dfs(0)
58
1class Solution {
2 private int stringLength;
3 private char[] charArray;
4 private Integer[] memoization;
5
6 public int minimumSubstringsInPartition(String s) {
7 stringLength = s.length();
8 memoization = new Integer[stringLength];
9 this.charArray = s.toCharArray();
10 return findMinPartitions(0);
11 }
12
13 private int findMinPartitions(int startIndex) {
14 // Base case: if we've processed the entire string
15 if (startIndex >= stringLength) {
16 return 0;
17 }
18
19 // Check memoization cache
20 if (memoization[startIndex] != null) {
21 return memoization[startIndex];
22 }
23
24 // Track character counts in current substring
25 int[] characterCounts = new int[26];
26
27 // Track frequency of each count value (how many characters have each count)
28 Map<Integer, Integer> countFrequency = new HashMap<>(26);
29
30 // Initialize with worst case (each remaining character as separate substring)
31 int minPartitions = stringLength - startIndex;
32
33 // Try all possible end positions for current substring
34 for (int endIndex = startIndex; endIndex < stringLength; ++endIndex) {
35 int charIndex = charArray[endIndex] - 'a';
36
37 // Update frequency map for the old count of this character
38 if (characterCounts[charIndex] > 0) {
39 // Decrease frequency of the old count
40 if (countFrequency.merge(characterCounts[charIndex], -1, Integer::sum) == 0) {
41 countFrequency.remove(characterCounts[charIndex]);
42 }
43 }
44
45 // Increment character count
46 ++characterCounts[charIndex];
47
48 // Update frequency map for the new count of this character
49 countFrequency.merge(characterCounts[charIndex], 1, Integer::sum);
50
51 // Check if all characters in current substring have same frequency
52 if (countFrequency.size() == 1) {
53 // Valid partition: recurse for remaining string
54 minPartitions = Math.min(minPartitions, 1 + findMinPartitions(endIndex + 1));
55 }
56 }
57
58 // Store result in memoization cache and return
59 return memoization[startIndex] = minPartitions;
60 }
61}
62
1class Solution {
2public:
3 int minimumSubstringsInPartition(string s) {
4 int n = s.size();
5
6 // dp[i] = minimum number of substrings needed to partition s[i..n-1]
7 // -1 indicates not yet computed
8 int dp[n];
9 memset(dp, -1, sizeof(dp));
10
11 // Recursive function with memoization to find minimum partitions
12 auto dfs = [&](this auto&& dfs, int startIdx) -> int {
13 // Base case: reached end of string
14 if (startIdx >= n) {
15 return 0;
16 }
17
18 // Return memoized result if already computed
19 if (dp[startIdx] != -1) {
20 return dp[startIdx];
21 }
22
23 // Initialize with worst case: each character is a separate substring
24 dp[startIdx] = n - startIdx;
25
26 // charCount[i] = frequency of character ('a' + i) in current substring
27 int charCount[26]{};
28
29 // frequencyMap: maps frequency value to count of characters with that frequency
30 // Used to check if all characters have the same frequency
31 unordered_map<int, int> frequencyMap;
32
33 // Try all possible end positions for current substring
34 for (int endIdx = startIdx; endIdx < n; ++endIdx) {
35 int charIndex = s[endIdx] - 'a';
36
37 // Update frequency map: remove old frequency count
38 if (charCount[charIndex] > 0) {
39 frequencyMap[charCount[charIndex]]--;
40 if (frequencyMap[charCount[charIndex]] == 0) {
41 frequencyMap.erase(charCount[charIndex]);
42 }
43 }
44
45 // Update character count and frequency map with new frequency
46 charCount[charIndex]++;
47 frequencyMap[charCount[charIndex]]++;
48
49 // If all characters have same frequency (frequencyMap has only 1 unique value)
50 // this is a valid balanced substring
51 if (frequencyMap.size() == 1) {
52 // Try partitioning here: 1 (current substring) + remaining partitions
53 dp[startIdx] = min(dp[startIdx], 1 + dfs(endIdx + 1));
54 }
55 }
56
57 return dp[startIdx];
58 };
59
60 // Start partitioning from index 0
61 return dfs(0);
62 }
63};
64
1/**
2 * Finds the minimum number of substrings needed to partition string s
3 * where each substring has all characters appearing the same number of times
4 * @param s - Input string containing only lowercase letters
5 * @returns Minimum number of balanced substrings in partition
6 */
7function minimumSubstringsInPartition(s: string): number {
8 const stringLength: number = s.length;
9 // Memoization array for dynamic programming (-1 indicates uncomputed)
10 const memo: number[] = Array(stringLength).fill(-1);
11
12 /**
13 * Recursive function to find minimum partitions starting from index
14 * @param startIndex - Starting index for current substring
15 * @returns Minimum number of partitions from startIndex to end
16 */
17 const findMinPartitions = (startIndex: number): number => {
18 // Base case: reached end of string
19 if (startIndex >= stringLength) {
20 return 0;
21 }
22
23 // Return memoized result if already computed
24 if (memo[startIndex] !== -1) {
25 return memo[startIndex];
26 }
27
28 // Map to track count of each character in current substring
29 const charCountMap: Map<number, number> = new Map();
30 // Map to track frequency of each count value (how many chars have that count)
31 const countFrequencyMap: Map<number, number> = new Map();
32
33 // Initialize with worst case (each character as separate substring)
34 memo[startIndex] = stringLength - startIndex;
35
36 // Try all possible endpoints for substring starting at startIndex
37 for (let endIndex = startIndex; endIndex < stringLength; ++endIndex) {
38 // Convert character to 0-based index (a=0, b=1, ..., z=25)
39 const charIndex = s.charCodeAt(endIndex) - 97;
40
41 // Update count frequency map: remove old count frequency
42 const oldCount = charCountMap.get(charIndex);
43 if (oldCount !== undefined && countFrequencyMap.has(oldCount)) {
44 const oldFrequency = countFrequencyMap.get(oldCount)!;
45 countFrequencyMap.set(oldCount, oldFrequency - 1);
46 // Remove entry if frequency becomes zero
47 if (countFrequencyMap.get(oldCount) === 0) {
48 countFrequencyMap.delete(oldCount);
49 }
50 }
51
52 // Update character count for current character
53 const newCount = (charCountMap.get(charIndex) || 0) + 1;
54 charCountMap.set(charIndex, newCount);
55
56 // Update count frequency map: add new count frequency
57 const newFrequency = (countFrequencyMap.get(newCount) || 0) + 1;
58 countFrequencyMap.set(newCount, newFrequency);
59
60 // Check if current substring is balanced (all chars have same count)
61 if (countFrequencyMap.size === 1) {
62 // Try partitioning here and recurse for remaining string
63 memo[startIndex] = Math.min(
64 memo[startIndex],
65 1 + findMinPartitions(endIndex + 1)
66 );
67 }
68 }
69
70 return memo[startIndex];
71 };
72
73 // Start the recursive search from index 0
74 return findMinPartitions(0);
75}
76
Time and Space Complexity
Time Complexity: O(n²)
The algorithm uses dynamic programming with memoization. The dfs(i)
function is called for each starting position from 0
to n-1
, resulting in at most n
unique states. For each state dfs(i)
, the function iterates through all possible ending positions from i
to n-1
, which takes O(n-i)
time. Within each iteration, the operations on the cnt
and freq
dictionaries take O(1)
time since we're dealing with at most 26 distinct characters. Therefore, the total time complexity is:
- Sum from
i=0
ton-1
of(n-i)
=n + (n-1) + ... + 1
=n(n+1)/2
=O(n²)
Space Complexity: O(n × |Σ|)
The space complexity consists of:
- Memoization cache: The
@cache
decorator stores results for each unique starting positioni
, requiringO(n)
space for the cached return values. - Recursion stack: The maximum recursion depth is
O(n)
in the worst case when each character forms its own partition. - Local variables per recursive call: Each call maintains:
cnt
dictionary: stores character counts, at most|Σ| = 26
entriesfreq
dictionary: stores frequency counts, at most|Σ| = 26
entries
Since the recursion depth can be O(n)
and each level maintains dictionaries of size O(|Σ|)
, the total space complexity is O(n × |Σ|)
, where |Σ| = 26
for lowercase English letters.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Balance Check - Using Character Count Instead of Frequency Uniformity
A common mistake is checking if a substring is balanced by only looking at character counts directly, rather than verifying all characters have the same count.
Incorrect approach:
# WRONG: Just checking if counts exist
if len(char_count) > 0: # This doesn't ensure balance!
partitions = 1 + dfs(end_index + 1)
Why it fails: For substring "aabbc", char_count = {'a': 2, 'b': 2, 'c': 1}
. The substring is NOT balanced because 'c' appears 1 time while others appear 2 times.
Correct approach:
# RIGHT: Check if all characters have the same frequency
if len(count_frequency) == 1: # All chars appear same number of times
partitions = 1 + dfs(end_index + 1)
2. Forgetting to Update Frequency Map When Character Count Changes
When adding a character to the current substring, its count changes. You must remove the old frequency before adding the new one.
Incorrect approach:
# WRONG: Only adding new frequency without removing old char_count[current_char] += 1 count_frequency[char_count[current_char]] += 1 # Old frequency still exists!
Why it fails: If 'a' had count 2 and becomes 3, count_frequency
would incorrectly show both {2: 1, 3: 1}
instead of just {3: 1}
.
Correct approach:
# RIGHT: Remove old frequency first, then add new if char_count[current_char] > 0: count_frequency[char_count[current_char]] -= 1 if count_frequency[char_count[current_char]] == 0: del count_frequency[char_count[current_char]] char_count[current_char] += 1 count_frequency[char_count[current_char]] += 1
3. Not Cleaning Up Zero-Frequency Entries
Leaving zero-valued entries in count_frequency
will break the balance check.
Incorrect approach:
# WRONG: Leaving zero entries in frequency map count_frequency[char_count[current_char]] -= 1 # Not removing when it becomes 0!
Why it fails: With count_frequency = {2: 0, 3: 2}
, len(count_frequency)
returns 2, incorrectly indicating imbalance when all characters actually appear 3 times.
Correct approach:
# RIGHT: Remove zero entries count_frequency[char_count[current_char]] -= 1 if count_frequency[char_count[current_char]] == 0: del count_frequency[char_count[current_char]]
4. Resetting Data Structures Outside the Loop
Creating char_count
and count_frequency
outside the inner loop causes them to accumulate data incorrectly across different substring attempts.
Incorrect approach:
def dfs(start_index):
char_count = defaultdict(int)
count_frequency = defaultdict(int)
for end_index in range(start_index, string_length):
# char_count keeps accumulating even when trying different partitions!
Correct approach:
def dfs(start_index):
# Fresh data structures for each starting position
char_count = defaultdict(int)
count_frequency = defaultdict(int)
for end_index in range(start_index, string_length):
# Builds substring incrementally from start_index
What is an advantages of top-down dynamic programming vs bottom-up dynamic programming?
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!