Facebook Pixel

1239. Maximum Length of a Concatenated String with Unique Characters

MediumBit ManipulationArrayStringBacktracking
Leetcode Link

Problem Description

You are given an array of strings called arr. Your task is to create a new string s by concatenating a subsequence of strings from arr, with the constraint that all characters in the final string s must be unique (no character can appear more than once).

A subsequence means you can pick any subset of strings from arr while maintaining their original order. For example, if arr = ["un", "iq", "ue"], you could pick ["un", "iq"] or ["iq", "ue"] or just ["iq"], but you cannot rearrange them.

The goal is to find the maximum possible length of such a string s.

For example:

  • If arr = ["un", "iq", "ue"], you could concatenate all three to get "unique" with length 6, where all characters are distinct.
  • If arr = ["cha", "r", "act", "ers"], you could concatenate "cha" and "ers" to get "chaers" with length 6, but you cannot use "cha" and "act" together because they both contain the character 'a'.
  • Individual strings in arr that already have duplicate characters within themselves (like "aa") cannot be used at all.

The solution uses bit manipulation where each string is represented as a 26-bit integer. Each bit position corresponds to a letter (bit 0 for 'a', bit 1 for 'b', etc.). A bit is set to 1 if that character appears in the string. The algorithm builds up all possible valid combinations by checking if strings share any common characters using bitwise AND operations, and combines valid strings using bitwise OR operations.

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • No: The problem involves an array of strings and subsequence selection, not nodes and edges as in graph problems.

Need to solve for kth smallest/largest?

  • No: We're looking for the maximum length, not the kth smallest or largest element.

Involves Linked Lists?

  • No: The problem works with an array of strings, not linked lists.

Does the problem have small constraints?

  • Yes: The problem typically has small constraints. We need to explore all possible subsequences of the array, and since we're dealing with selecting or not selecting each string (2^n possibilities), this suggests the input size is limited. Additionally, we're only dealing with lowercase letters (26 characters maximum).

Brute force / Backtracking?

  • Yes: We need to explore all possible combinations of strings from the array while maintaining the constraint that all characters must be unique. This is a classic backtracking scenario where we:
    • Try including each string in our subsequence
    • Check if it maintains unique characters
    • If valid, continue exploring with remaining strings
    • If not valid or we've explored all options, backtrack and try different combinations

Conclusion: The flowchart correctly leads us to use a Backtracking approach. The problem requires us to explore all possible subsequences of strings while maintaining the uniqueness constraint, which is perfectly suited for backtracking where we build solutions incrementally and abandon paths that violate our constraints.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we think about this problem, we need to find all valid combinations of strings where no character repeats across the concatenated result. This screams backtracking - we want to explore all possibilities but smartly prune invalid paths.

The key insight is that we can represent each string as a set of characters. If a string has duplicate characters within itself (like "aa"), we immediately know it's invalid and can skip it. For valid strings, we need to check if combining them with our current selection creates any character conflicts.

But here's where the bit manipulation optimization comes in: instead of using actual sets or arrays to track characters, we can use a 26-bit integer as a "character mask". Why? Because we only have 26 lowercase letters. Each bit position represents a letter (bit 0 for 'a', bit 1 for 'b', etc.). A 1 in a position means that character is present.

This representation gives us powerful operations:

  • To check if two strings share any characters: use bitwise AND (x & y). If the result is 0, they have no common characters.
  • To combine two strings: use bitwise OR (x | y).
  • To count characters in a string: count the number of 1-bits (using bit_count()).

The algorithm builds up all possible valid combinations iteratively. We start with an empty combination (represented as 0). For each string in the array:

  1. Convert it to its bit representation
  2. Check if it's valid (no duplicate characters within itself)
  3. For each existing valid combination, check if we can add this string (no character conflicts)
  4. If we can add it, create a new combination and add it to our collection

This approach is essentially dynamic programming with bit manipulation - we build larger solutions from smaller ones, using bits to efficiently track and check character usage. The maximum number of set bits across all combinations gives us our answer.

Learn more about Backtracking patterns.

Solution Approach

The solution uses state compression with bit manipulation to efficiently track and combine character sets. Let's walk through the implementation step by step:

1. Initialize the state array:

s = [0]

We start with an array s containing only 0, representing an empty subsequence with no characters selected.

2. Process each string in the array: For each string t in arr, we first convert it to its bit representation:

x = 0
for b in map(lambda c: ord(c) - 97, t):
    if x >> b & 1:
        x = 0
        break
    x |= 1 << b
  • Convert each character to its position (0-25) using ord(c) - 97
  • Check if bit b is already set using x >> b & 1
  • If the character already exists (duplicate within the string), set x = 0 to mark it as invalid
  • Otherwise, set the bit at position b using x |= 1 << b

3. Combine with existing valid states: If the string is valid (x != 0), we try to combine it with all existing valid combinations:

if x:
    s.extend((x | y) for y in s if (x & y) == 0)
  • For each existing state y in s
  • Check if x and y have no common characters using (x & y) == 0
  • If they don't overlap, create a new state x | y (union of both character sets)
  • Add all new valid combinations to s

4. Find the maximum length:

return max(x.bit_count() for x in s)
  • For each state in s, count the number of set bits using bit_count()
  • The number of set bits equals the number of unique characters
  • Return the maximum count

Example walkthrough: If arr = ["un", "iq", "ue"]:

  • Start: s = [0]
  • Process "un": x = 0b10000010000000000000 (bits for 'u' and 'n' set)
    • Combine with 0: s = [0, 0b10000010000000000000]
  • Process "iq": x = 0b1000100000000000 (bits for 'i' and 'q' set)
    • Combine with 0 and previous state: s now has 4 states
  • Process "ue": x = 0b10010000000 (bits for 'u' and 'e' set)
    • Can combine with states not containing 'u' or 'e'
  • Final: Count bits in each state, return maximum (6 for "unique")

The time complexity is O(2^n) in the worst case where n is the length of arr, as we might generate all possible subsequences. The space complexity is also O(2^n) for storing all valid states.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's trace through the algorithm with arr = ["a", "abc", "d", "de", "def"]:

Initial State:

  • s = [0] (empty subsequence, no characters selected)

Process "a":

  • Convert to bits: x = 0b1 (only bit 0 set for 'a')
  • No duplicates within string, so valid
  • Combine with existing states:
    • Can combine with 0: 0 | 0b1 = 0b1
  • s = [0, 0b1]
  • States represent: ["", "a"]

Process "abc":

  • Convert to bits: x = 0b111 (bits 0,1,2 set for 'a','b','c')
  • No duplicates, valid
  • Combine with existing states:
    • Can combine with 0: 0 | 0b111 = 0b111
    • Cannot combine with 0b1 (both have 'a'): 0b111 & 0b1 = 0b1 ≠ 0
  • s = [0, 0b1, 0b111]
  • States represent: ["", "a", "abc"]

Process "d":

  • Convert to bits: x = 0b1000 (bit 3 set for 'd')
  • Valid string
  • Combine with existing states:
    • Can combine with 0: gives 0b1000
    • Can combine with 0b1: gives 0b1001 ("a" + "d" = "ad")
    • Can combine with 0b111: gives 0b1111 ("abc" + "d" = "abcd")
  • s = [0, 0b1, 0b111, 0b1000, 0b1001, 0b1111]
  • States represent: ["", "a", "abc", "d", "ad", "abcd"]

Process "de":

  • Convert to bits: x = 0b11000 (bits 3,4 set for 'd','e')
  • Valid string
  • Combine with existing states:
    • Can combine with 0: gives 0b11000
    • Can combine with 0b1: gives 0b11001 ("a" + "de" = "ade")
    • Can combine with 0b111: gives 0b11111 ("abc" + "de" = "abcde")
    • Cannot combine with 0b1000 (both have 'd')
    • Cannot combine with 0b1001 (both have 'd')
    • Cannot combine with 0b1111 (both have 'd')
  • s = [0, 0b1, 0b111, 0b1000, 0b1001, 0b1111, 0b11000, 0b11001, 0b11111]

Process "def":

  • Convert to bits: x = 0b111000 (bits 3,4,5 set for 'd','e','f')
  • Valid string
  • Combine with existing states:
    • Can combine with 0: gives 0b111000
    • Can combine with 0b1: gives 0b111001 ("a" + "def" = "adef")
    • Can combine with 0b111: gives 0b111111 ("abc" + "def" = "abcdef")
    • Cannot combine with others (all contain 'd' or 'e')
  • Final states include 0b111111 representing "abcdef"

Find Maximum:

  • Count bits in each state
  • 0b111111 has 6 bits set (characters a,b,c,d,e,f)
  • Answer: 6 (string "abcdef" from concatenating "abc" and "def")

Solution Implementation

1class Solution:
2    def maxLength(self, arr: List[str]) -> int:
3        # Initialize with empty string represented as 0 (no characters selected)
4        valid_combinations = [0]
5      
6        for string in arr:
7            # Convert string to bitmask representation
8            bitmask = 0
9          
10            # Check if string has duplicate characters while building bitmask
11            for char in string:
12                bit_position = ord(char) - ord('a')  # Get position (0-25) for character
13              
14                # If this bit is already set, string has duplicate characters
15                if (bitmask >> bit_position) & 1:
16                    bitmask = 0  # Mark as invalid
17                    break
18              
19                # Set the bit for this character
20                bitmask |= (1 << bit_position)
21          
22            # If string is valid (no duplicates), try combining with existing combinations
23            if bitmask:
24                # Create new combinations by merging with existing valid combinations
25                new_combinations = []
26                for existing_mask in valid_combinations:
27                    # Check if there's no character overlap (bitwise AND should be 0)
28                    if (bitmask & existing_mask) == 0:
29                        new_combinations.append(bitmask | existing_mask)
30              
31                # Add all new valid combinations to our list
32                valid_combinations.extend(new_combinations)
33      
34        # Return the maximum number of unique characters (count of set bits)
35        return max(bin(mask).count('1') for mask in valid_combinations)
36
1class Solution {
2    public int maxLength(List<String> arr) {
3        // List to store valid character bitmasks of concatenated strings
4        // Each integer represents a set of unique characters using bit manipulation
5        List<Integer> validBitmasks = new ArrayList<>();
6        validBitmasks.add(0); // Start with empty set (no characters selected)
7      
8        int maxLength = 0;
9      
10        // Process each string in the input array
11        for (String currentString : arr) {
12            // Convert current string to bitmask representation
13            int currentBitmask = 0;
14          
15            // Check if current string has all unique characters
16            for (char ch : currentString.toCharArray()) {
17                int bitPosition = ch - 'a'; // Get bit position (0-25 for 'a'-'z')
18              
19                // If bit is already set, string has duplicate characters
20                if ((currentBitmask >> bitPosition & 1) == 1) {
21                    currentBitmask = 0; // Mark as invalid
22                    break;
23                }
24              
25                // Set the bit for this character
26                currentBitmask |= 1 << bitPosition;
27            }
28          
29            // Only process strings with unique characters
30            if (currentBitmask > 0) {
31                // Try combining with all existing valid bitmasks
32                // Iterate backwards to avoid modifying list while iterating
33                for (int i = validBitmasks.size() - 1; i >= 0; i--) {
34                    int existingBitmask = validBitmasks.get(i);
35                  
36                    // Check if no common characters (bitwise AND should be 0)
37                    if ((currentBitmask & existingBitmask) == 0) {
38                        // Combine the two bitmasks
39                        int combinedBitmask = currentBitmask | existingBitmask;
40                        validBitmasks.add(combinedBitmask);
41                      
42                        // Update maximum length (count of set bits)
43                        maxLength = Math.max(maxLength, Integer.bitCount(combinedBitmask));
44                    }
45                }
46            }
47        }
48      
49        return maxLength;
50    }
51}
52
1class Solution {
2public:
3    int maxLength(vector<string>& arr) {
4        // Store all possible combinations of unique character sets as bitmasks
5        // Initially contains 0 (empty set)
6        vector<int> validCombinations = {0};
7        int maxLen = 0;
8      
9        // Process each string in the input array
10        for (const string& str : arr) {
11            // Convert current string to bitmask representation
12            int currentMask = 0;
13          
14            // Check if string has all unique characters and build its bitmask
15            for (char ch : str) {
16                int bitPosition = ch - 'a';
17              
18                // If character already exists in current string, it's invalid
19                if ((currentMask >> bitPosition & 1) == 1) {
20                    currentMask = 0;  // Mark as invalid
21                    break;
22                }
23              
24                // Set the bit for this character
25                currentMask |= 1 << bitPosition;
26            }
27          
28            // Only process valid strings (with unique characters)
29            if (currentMask > 0) {
30                // Try to combine with all existing valid combinations
31                // Iterate backwards to avoid processing newly added combinations
32                for (int i = validCombinations.size() - 1; i >= 0; --i) {
33                    int existingMask = validCombinations[i];
34                  
35                    // Check if no common characters exist between current and existing
36                    if ((currentMask & existingMask) == 0) {
37                        // Create new combination by merging the two masks
38                        int newCombination = currentMask | existingMask;
39                        validCombinations.push_back(newCombination);
40                      
41                        // Update maximum length (count set bits)
42                        maxLen = max(maxLen, __builtin_popcount(newCombination));
43                    }
44                }
45            }
46        }
47      
48        return maxLen;
49    }
50};
51
1/**
2 * Finds the maximum length of a concatenated string with unique characters
3 * @param arr - Array of strings to concatenate
4 * @returns Maximum length of concatenated string with all unique characters
5 */
6function maxLength(arr: string[]): number {
7    // Array to store valid character bitmasks, initialized with 0 (empty string)
8    const validBitmasks: number[] = [0];
9    let maxLen = 0;
10  
11    // Process each string in the input array
12    for (const str of arr) {
13        let bitmask = 0;
14      
15        // Convert string to bitmask representation
16        for (const char of str) {
17            // Get bit position (0-25) for character 'a' to 'z'
18            const bitPosition = char.charCodeAt(0) - 97;
19          
20            // Check if character already exists in current string (duplicate)
21            if ((bitmask >> bitPosition) & 1) {
22                bitmask = 0;
23                break;
24            }
25          
26            // Set bit for current character
27            bitmask |= 1 << bitPosition;
28        }
29      
30        // If string has all unique characters (valid bitmask)
31        if (bitmask > 0) {
32            // Try combining with all existing valid bitmasks
33            for (let i = validBitmasks.length - 1; i >= 0; i--) {
34                const existingBitmask = validBitmasks[i];
35              
36                // Check if no common characters (no bit overlap)
37                if ((bitmask & existingBitmask) === 0) {
38                    // Create combined bitmask
39                    const combinedBitmask = bitmask | existingBitmask;
40                    validBitmasks.push(combinedBitmask);
41                  
42                    // Update maximum length
43                    maxLen = Math.max(maxLen, bitCount(combinedBitmask));
44                }
45            }
46        }
47    }
48  
49    return maxLen;
50}
51
52/**
53 * Counts the number of set bits (1s) in a number using Brian Kernighan's algorithm
54 * @param num - Integer to count bits for
55 * @returns Number of set bits
56 */
57function bitCount(num: number): number {
58    // Subtract pairs of bits
59    num = num - ((num >>> 1) & 0x55555555);
60  
61    // Add pairs of bits
62    num = (num & 0x33333333) + ((num >>> 2) & 0x33333333);
63  
64    // Add nibbles (4-bit groups)
65    num = (num + (num >>> 4)) & 0x0f0f0f0f;
66  
67    // Add bytes
68    num = num + (num >>> 8);
69  
70    // Add 16-bit values
71    num = num + (num >>> 16);
72  
73    // Mask to get final count (max 32 bits, so result fits in 6 bits)
74    return num & 0x3f;
75}
76

Time and Space Complexity

Time Complexity: O(2^n + L)

The algorithm iterates through each string in the array (n strings). For each string, it:

  • Processes each character to build a bitmask, taking O(length of string) time
  • For valid strings (with unique characters), it extends the list s by iterating through all existing bitmasks and checking compatibility using bitwise AND operation

In the worst case where all strings have unique characters and are mutually compatible, the list s can grow exponentially. After processing k strings, s can contain up to 2^k elements. When processing the (k+1)-th string, we potentially add another 2^k elements, iterating through all existing ones.

The total operations are:

  • Processing all characters across all strings: O(L) where L is the sum of lengths of all strings
  • Building all possible combinations: O(2^n) as we can have at most 2^n different subsequences
  • The final bit_count() operation on all elements: O(2^n)

Therefore, the time complexity is O(2^n + L).

Space Complexity: O(2^n)

The space is dominated by the list s which stores all possible valid combinations as bitmasks. In the worst case, where all strings can be combined with each other, the list s can grow to contain 2^n elements (representing all possible subsequences of the array). Each element is an integer bitmask, taking O(1) space.

Therefore, the space complexity is O(2^n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting to Handle Strings with Internal Duplicates

One of the most common mistakes is not properly checking if individual strings in the input array contain duplicate characters within themselves. For example, if arr = ["aa", "bb"], both strings should be completely excluded from consideration since they have internal duplicates.

Incorrect approach:

# Wrong: Only checking duplicates between different strings
for string in arr:
    bitmask = 0
    for char in string:
        bit_position = ord(char) - ord('a')
        bitmask |= (1 << bit_position)  # Just sets bits without checking
    # This would incorrectly allow "aa" to be processed

Correct approach:

for string in arr:
    bitmask = 0
    for char in string:
        bit_position = ord(char) - ord('a')
        if (bitmask >> bit_position) & 1:  # Check if bit already set
            bitmask = 0  # Mark entire string as invalid
            break
        bitmask |= (1 << bit_position)

2. Modifying the List While Iterating Over It

Another common pitfall is trying to modify the valid_combinations list while iterating through it, which can lead to unexpected behavior or infinite loops.

Incorrect approach:

if bitmask:
    for existing_mask in valid_combinations:
        if (bitmask & existing_mask) == 0:
            valid_combinations.append(bitmask | existing_mask)  # Dangerous!

Correct approach:

if bitmask:
    new_combinations = []
    for existing_mask in valid_combinations:
        if (bitmask & existing_mask) == 0:
            new_combinations.append(bitmask | existing_mask)
    valid_combinations.extend(new_combinations)  # Add after iteration

3. Not Including the Empty Subsequence Initially

The algorithm relies on starting with [0] to represent the empty subsequence. Without this, you won't be able to include individual strings as valid subsequences.

Incorrect:

valid_combinations = []  # Wrong: No base case

Correct:

valid_combinations = [0]  # Start with empty subsequence

4. Using Wrong Bit Counting Methods

When calculating the final result, using string manipulation instead of efficient bit counting can lead to errors or inefficiency.

Less reliable approach:

# Could have issues with leading zeros or formatting
return max(len(format(mask, 'b').replace('0', '')) for mask in valid_combinations)

Better approach:

# More direct and reliable
return max(bin(mask).count('1') for mask in valid_combinations)
# Or in Python 3.10+: mask.bit_count()

5. Integer Overflow Concerns (Language-Specific)

While not an issue in Python, in languages with fixed-size integers, using 32-bit integers might cause issues if you accidentally shift bits beyond position 25. Always ensure you're working within the valid range (0-25 for lowercase letters).

Prevention:

bit_position = ord(char) - ord('a')
assert 0 <= bit_position <= 25, "Character out of range"
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement priority queue?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More