Facebook Pixel

2405. Optimal Partition of String

MediumGreedyHash TableString
Leetcode Link

Problem Description

You are given a string s that you need to split into substrings. Each substring must have all unique characters - meaning no character can appear more than once within a single substring.

Your task is to find the minimum number of substrings needed to partition the entire string following this rule. Every character in the original string must belong to exactly one substring in your partition.

For example, if you have a string like "abacaba", you could partition it as ["ab", "ac", "aba"] which gives 3 substrings, where each substring contains only unique characters.

The solution uses a greedy approach with bit manipulation. It tracks which characters have appeared in the current substring using a bitmask. The variable mask uses its bits to represent the 26 lowercase letters - if the i-th bit is 1, it means the i-th letter of the alphabet has appeared in the current substring.

As we iterate through the string:

  • Convert each character to a number from 0-25 (representing a-z)
  • Check if this character has already appeared in the current substring by examining the corresponding bit in mask
  • If the character is already present (bit is 1), we must start a new substring: increment the answer counter and reset the mask
  • Otherwise, mark this character as present by setting its corresponding bit in the mask using bitwise OR operation

The algorithm ensures we create the minimum number of substrings by making each substring as long as possible while maintaining the uniqueness constraint.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we want to minimize the number of substrings, which means we should make each substring as long as possible. Think of it like packing items into boxes - to use the fewest boxes, we should fill each box as much as we can before moving to the next one.

When building a substring, we can keep adding characters as long as they haven't appeared in the current substring yet. The moment we encounter a duplicate character, we have no choice but to start a new substring. This greedy strategy works because delaying the start of a new substring would only make our substrings shorter without any benefit.

To efficiently track which characters have appeared in the current substring, we can use a bitmask. Since there are only 26 lowercase letters, we can represent them using 26 bits in an integer. Each bit position corresponds to a letter (a = bit 0, b = bit 1, etc.). This gives us O(1) time to check if a character has appeared and O(1) time to mark a character as seen.

The bit manipulation approach is elegant here:

  • Checking if character x exists: mask >> x & 1 shifts the mask right by x positions and checks if the least significant bit is 1
  • Adding character x to our set: mask |= 1 << x sets the x-th bit to 1 using bitwise OR

By greedily extending each substring until we hit a duplicate and using efficient bit operations to track characters, we achieve an optimal solution that runs in O(n) time with O(1) space (since the bitmask size is fixed at 26 bits regardless of input size).

Learn more about Greedy patterns.

Solution Approach

Following the greedy strategy, we implement the solution using bit manipulation to track characters efficiently.

Initialization:

  • Set ans = 1 to count the number of substrings (we start with at least one substring)
  • Set mask = 0 as our bitmask to track which characters have appeared in the current substring

Processing each character:

  1. Convert character to index: For each character c in the string, we convert it to an integer x between 0 and 25 using ord(c) - ord("a"). This gives us the bit position to check in our mask.

  2. Check for duplicate: We check if character x already exists in the current substring using mask >> x & 1:

    • mask >> x shifts the mask right by x positions
    • & 1 checks if the least significant bit is 1
    • If the result is 1, the character is a duplicate
  3. Handle duplicate case: When we find a duplicate:

    • Increment ans by 1 (start a new substring)
    • Reset mask = 0 to clear all tracked characters for the new substring
  4. Add character to current substring: Whether it's a duplicate or not, we mark the current character as seen in our (potentially new) substring using mask |= 1 << x:

    • 1 << x creates a number with only the x-th bit set to 1
    • |= performs bitwise OR to set that bit in our mask

Example walkthrough with "abacaba":

  • Process 'a' (x=0): mask=0, no duplicate, set bit 0 β†’ mask=1
  • Process 'b' (x=1): mask=1, no duplicate, set bit 1 β†’ mask=3
  • Process 'a' (x=0): mask=3, bit 0 is set (duplicate!), ans=2, reset mask=0, set bit 0 β†’ mask=1
  • Process 'c' (x=2): mask=1, no duplicate, set bit 2 β†’ mask=5
  • Process 'a' (x=0): mask=5, bit 0 is set (duplicate!), ans=3, reset mask=0, set bit 0 β†’ mask=1
  • Process 'b' (x=1): mask=1, no duplicate, set bit 1 β†’ mask=3
  • Process 'a' (x=0): mask=3, bit 0 is set (duplicate!), ans=4, reset mask=0, set bit 0 β†’ mask=1

The algorithm returns ans = 4 as the minimum number of substrings needed.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with the string s = "abaca".

Initial State:

  • ans = 1 (we start counting from 1 substring)
  • mask = 0 (binary: 00000, no characters seen yet)

Step 1: Process 'a'

  • Convert 'a' to index: x = 0 (since 'a' is the 0th letter)
  • Check if 'a' exists: mask >> 0 & 1 = 0 & 1 = 0 (not a duplicate)
  • Add 'a' to mask: mask |= 1 << 0 β†’ mask = 1 (binary: 00001)
  • Current substring: "a"

Step 2: Process 'b'

  • Convert 'b' to index: x = 1
  • Check if 'b' exists: mask >> 1 & 1 = 0 & 1 = 0 (not a duplicate)
  • Add 'b' to mask: mask |= 1 << 1 β†’ mask = 3 (binary: 00011)
  • Current substring: "ab"

Step 3: Process 'a'

  • Convert 'a' to index: x = 0
  • Check if 'a' exists: mask >> 0 & 1 = 3 >> 0 & 1 = 3 & 1 = 1 (duplicate found!)
  • Since duplicate: ans = 2, reset mask = 0
  • Add 'a' to new mask: mask |= 1 << 0 β†’ mask = 1 (binary: 00001)
  • Started new substring: "a"

Step 4: Process 'c'

  • Convert 'c' to index: x = 2
  • Check if 'c' exists: mask >> 2 & 1 = 1 >> 2 & 1 = 0 & 1 = 0 (not a duplicate)
  • Add 'c' to mask: mask |= 1 << 2 β†’ mask = 5 (binary: 00101)
  • Current substring: "ac"

Step 5: Process 'a'

  • Convert 'a' to index: x = 0
  • Check if 'a' exists: mask >> 0 & 1 = 5 >> 0 & 1 = 5 & 1 = 1 (duplicate found!)
  • Since duplicate: ans = 3, reset mask = 0
  • Add 'a' to new mask: mask |= 1 << 0 β†’ mask = 1 (binary: 00001)
  • Started new substring: "a"

Result: The string "abaca" is partitioned into 3 substrings: ["ab", "ac", "a"], each containing only unique characters.

Solution Implementation

1class Solution:
2    def partitionString(self, s: str) -> int:
3        # Initialize partition count to 1 (minimum partitions needed)
4        partition_count = 1
5      
6        # Bitmask to track which characters are in current partition
7        # Each bit position represents a letter (a=0, b=1, ..., z=25)
8        character_bitmask = 0
9      
10        # Process each character in the string
11        for character in s:
12            # Convert character to its position (0-25)
13            char_position = ord(character) - ord('a')
14          
15            # Check if this character already exists in current partition
16            # Right shift the mask by char_position bits and check if LSB is 1
17            if (character_bitmask >> char_position) & 1:
18                # Character already exists, need to start a new partition
19                partition_count += 1
20                # Reset the bitmask for the new partition
21                character_bitmask = 0
22          
23            # Add current character to the partition by setting its bit
24            # Left shift 1 by char_position and OR with existing mask
25            character_bitmask |= (1 << char_position)
26      
27        # Return total number of partitions needed
28        return partition_count
29
1class Solution {
2    public int partitionString(String s) {
3        // Initialize partition count to 1 (at least one partition needed)
4        int partitionCount = 1;
5      
6        // Bitmask to track which characters are in the current partition
7        // Each bit position represents a letter (0='a', 1='b', ..., 25='z')
8        int characterBitmask = 0;
9      
10        // Iterate through each character in the string
11        for (int i = 0; i < s.length(); i++) {
12            // Convert character to its position (0-25)
13            int charPosition = s.charAt(i) - 'a';
14          
15            // Check if this character already exists in current partition
16            // by checking if the bit at charPosition is set to 1
17            if ((characterBitmask >> charPosition & 1) == 1) {
18                // Character found in current partition, start a new partition
19                partitionCount++;
20              
21                // Reset bitmask for the new partition
22                characterBitmask = 0;
23            }
24          
25            // Add current character to the partition by setting its bit to 1
26            characterBitmask |= (1 << charPosition);
27        }
28      
29        return partitionCount;
30    }
31}
32
1class Solution {
2public:
3    int partitionString(string s) {
4        // Initialize partition count to 1 (at least one partition needed)
5        int partitionCount = 1;
6      
7        // Bitmask to track which characters are present in the current partition
8        // Each bit position represents a character ('a' = bit 0, 'b' = bit 1, etc.)
9        int characterMask = 0;
10      
11        // Iterate through each character in the string
12        for (char& currentChar : s) {
13            // Convert character to its position (0-25 for 'a'-'z')
14            int charPosition = currentChar - 'a';
15          
16            // Check if this character already exists in the current partition
17            // by checking if the corresponding bit is set
18            if ((characterMask >> charPosition) & 1) {
19                // Character already exists, start a new partition
20                partitionCount++;
21              
22                // Reset the mask for the new partition
23                characterMask = 0;
24            }
25          
26            // Add the current character to the partition by setting its bit
27            characterMask |= (1 << charPosition);
28        }
29      
30        return partitionCount;
31    }
32};
33
1/**
2 * Partitions a string into minimum number of substrings where each substring contains unique characters
3 * @param s - The input string containing only lowercase English letters
4 * @returns The minimum number of substrings needed
5 */
6function partitionString(s: string): number {
7    // Initialize the count of partitions and a bitmask to track seen characters
8    let partitionCount: number = 1;
9    let characterBitmask: number = 0;
10  
11    // Iterate through each character in the string
12    for (const character of s) {
13        // Calculate the bit position for the current character (0 for 'a', 1 for 'b', etc.)
14        const bitPosition: number = character.charCodeAt(0) - 97;
15      
16        // Check if the current character has already been seen in the current partition
17        if ((characterBitmask >> bitPosition) & 1) {
18            // Character is repeated, start a new partition
19            partitionCount++;
20            // Reset the bitmask for the new partition
21            characterBitmask = 0;
22        }
23      
24        // Mark the current character as seen by setting its corresponding bit
25        characterBitmask |= 1 << bitPosition;
26    }
27  
28    return partitionCount;
29}
30

Time and Space Complexity

Time Complexity: O(n), where n is the length of the string s.

The algorithm iterates through each character in the string exactly once. For each character, it performs constant-time operations:

  • Converting character to its position (ord(c) - ord("a")) - O(1)
  • Checking if a bit is set (mask >> x & 1) - O(1)
  • Setting a bit (mask |= 1 << x) - O(1)
  • Potentially incrementing counter and resetting mask - O(1)

Since we visit each of the n characters once with O(1) operations per character, the total time complexity is O(n).

Space Complexity: O(1)

The algorithm uses only a fixed amount of extra space:

  • ans: an integer counter - O(1)
  • mask: an integer used as a bit mask - O(1)
  • x: a temporary variable for the character position - O(1)

The bit mask mask uses at most 26 bits (for lowercase English letters a-z), which is a constant amount of space regardless of input size. The space usage does not scale with the input string length n, resulting in O(1) space complexity.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting to Add Character After Starting New Partition

A critical mistake is forgetting to add the current character to the new partition's bitmask after detecting a duplicate and resetting the mask.

Incorrect Implementation:

if (character_bitmask >> char_position) & 1:
    partition_count += 1
    character_bitmask = 0
    # MISTAKE: Forgetting to add current character to new partition

Why This Fails: When we encounter a duplicate character and start a new partition, that character should be the first character of the new partition. If we don't add it, the bitmask remains at 0, and the next occurrence of the same character won't be detected as a duplicate.

Example with "aab":

  • Process 'a': mask=0, no duplicate, set bit 0 β†’ mask=1
  • Process 'a': mask=1, duplicate found, reset mask=0 (but forget to add 'a')
  • Process 'b': mask=0, no duplicate, set bit 1 β†’ mask=2
  • Result: We'd incorrectly partition as ["a", "ab"] instead of ["a", "a", "b"]

Correct Solution: Always add the current character to the bitmask, regardless of whether it caused a new partition:

if (character_bitmask >> char_position) & 1:
    partition_count += 1
    character_bitmask = 0

# This line MUST execute for every character
character_bitmask |= (1 << char_position)

2. Initializing Answer to 0 Instead of 1

Starting with partition_count = 0 is another common error.

Why This Fails: The string always needs at least one partition. Even a single character string like "a" requires one partition. Starting from 0 would give incorrect results for all inputs.

Correct Approach: Always initialize to 1 since we're counting partitions, not partition boundaries:

partition_count = 1  # Start with one partition

3. Using Wrong Bit Operations for Checking

Confusing the bit check operation can lead to incorrect duplicate detection.

Incorrect Approaches:

# Wrong: Using AND without shifting first
if character_bitmask & char_position:  

# Wrong: Shifting in wrong direction
if (character_bitmask << char_position) & 1:

# Wrong: Using wrong mask value
if character_bitmask & (1 << char_position) == 0:  # Logic inverted

Correct Approach: The proper way to check if bit at position x is set:

if (character_bitmask >> char_position) & 1:

Or alternatively:

if character_bitmask & (1 << char_position):
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More