1239. Maximum Length of a Concatenated String with Unique Characters
Problem Description
You are given an array of strings called arr
. Your task is to create a new string s
by concatenating a subsequence of strings from arr
, with the constraint that all characters in the final string s
must be unique (no character can appear more than once).
A subsequence means you can pick any subset of strings from arr
while maintaining their original order. For example, if arr = ["un", "iq", "ue"]
, you could pick ["un", "iq"]
or ["iq", "ue"]
or just ["iq"]
, but you cannot rearrange them.
The goal is to find the maximum possible length of such a string s
.
For example:
- If
arr = ["un", "iq", "ue"]
, you could concatenate all three to get"unique"
with length 6, where all characters are distinct. - If
arr = ["cha", "r", "act", "ers"]
, you could concatenate"cha"
and"ers"
to get"chaers"
with length 6, but you cannot use"cha"
and"act"
together because they both contain the character'a'
. - Individual strings in
arr
that already have duplicate characters within themselves (like"aa"
) cannot be used at all.
The solution uses bit manipulation where each string is represented as a 26-bit integer. Each bit position corresponds to a letter (bit 0 for 'a', bit 1 for 'b', etc.). A bit is set to 1 if that character appears in the string. The algorithm builds up all possible valid combinations by checking if strings share any common characters using bitwise AND operations, and combines valid strings using bitwise OR operations.
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- No: The problem involves an array of strings and subsequence selection, not nodes and edges as in graph problems.
Need to solve for kth smallest/largest?
- No: We're looking for the maximum length, not the kth smallest or largest element.
Involves Linked Lists?
- No: The problem works with an array of strings, not linked lists.
Does the problem have small constraints?
- Yes: The problem typically has small constraints. We need to explore all possible subsequences of the array, and since we're dealing with selecting or not selecting each string (2^n possibilities), this suggests the input size is limited. Additionally, we're only dealing with lowercase letters (26 characters maximum).
Brute force / Backtracking?
- Yes: We need to explore all possible combinations of strings from the array while maintaining the constraint that all characters must be unique. This is a classic backtracking scenario where we:
- Try including each string in our subsequence
- Check if it maintains unique characters
- If valid, continue exploring with remaining strings
- If not valid or we've explored all options, backtrack and try different combinations
Conclusion: The flowchart correctly leads us to use a Backtracking approach. The problem requires us to explore all possible subsequences of strings while maintaining the uniqueness constraint, which is perfectly suited for backtracking where we build solutions incrementally and abandon paths that violate our constraints.
Intuition
When we think about this problem, we need to find all valid combinations of strings where no character repeats across the concatenated result. This screams backtracking - we want to explore all possibilities but smartly prune invalid paths.
The key insight is that we can represent each string as a set of characters. If a string has duplicate characters within itself (like "aa"
), we immediately know it's invalid and can skip it. For valid strings, we need to check if combining them with our current selection creates any character conflicts.
But here's where the bit manipulation optimization comes in: instead of using actual sets or arrays to track characters, we can use a 26-bit integer as a "character mask". Why? Because we only have 26 lowercase letters. Each bit position represents a letter (bit 0
for 'a', bit 1
for 'b', etc.). A 1
in a position means that character is present.
This representation gives us powerful operations:
- To check if two strings share any characters: use bitwise AND (
x & y
). If the result is 0, they have no common characters. - To combine two strings: use bitwise OR (
x | y
). - To count characters in a string: count the number of 1-bits (using
bit_count()
).
The algorithm builds up all possible valid combinations iteratively. We start with an empty combination (represented as 0
). For each string in the array:
- Convert it to its bit representation
- Check if it's valid (no duplicate characters within itself)
- For each existing valid combination, check if we can add this string (no character conflicts)
- If we can add it, create a new combination and add it to our collection
This approach is essentially dynamic programming with bit manipulation - we build larger solutions from smaller ones, using bits to efficiently track and check character usage. The maximum number of set bits across all combinations gives us our answer.
Learn more about Backtracking patterns.
Solution Approach
The solution uses state compression with bit manipulation to efficiently track and combine character sets. Let's walk through the implementation step by step:
1. Initialize the state array:
s = [0]
We start with an array s
containing only 0
, representing an empty subsequence with no characters selected.
2. Process each string in the array:
For each string t
in arr
, we first convert it to its bit representation:
x = 0
for b in map(lambda c: ord(c) - 97, t):
if x >> b & 1:
x = 0
break
x |= 1 << b
- Convert each character to its position (0-25) using
ord(c) - 97
- Check if bit
b
is already set usingx >> b & 1
- If the character already exists (duplicate within the string), set
x = 0
to mark it as invalid - Otherwise, set the bit at position
b
usingx |= 1 << b
3. Combine with existing valid states:
If the string is valid (x != 0
), we try to combine it with all existing valid combinations:
if x: s.extend((x | y) for y in s if (x & y) == 0)
- For each existing state
y
ins
- Check if
x
andy
have no common characters using(x & y) == 0
- If they don't overlap, create a new state
x | y
(union of both character sets) - Add all new valid combinations to
s
4. Find the maximum length:
return max(x.bit_count() for x in s)
- For each state in
s
, count the number of set bits usingbit_count()
- The number of set bits equals the number of unique characters
- Return the maximum count
Example walkthrough:
If arr = ["un", "iq", "ue"]
:
- Start:
s = [0]
- Process "un":
x = 0b10000010000000000000
(bits for 'u' and 'n' set)- Combine with 0:
s = [0, 0b10000010000000000000]
- Combine with 0:
- Process "iq":
x = 0b1000100000000000
(bits for 'i' and 'q' set)- Combine with 0 and previous state:
s
now has 4 states
- Combine with 0 and previous state:
- Process "ue":
x = 0b10010000000
(bits for 'u' and 'e' set)- Can combine with states not containing 'u' or 'e'
- Final: Count bits in each state, return maximum (6 for "unique")
The time complexity is O(2^n) in the worst case where n is the length of arr
, as we might generate all possible subsequences. The space complexity is also O(2^n) for storing all valid states.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's trace through the algorithm with arr = ["a", "abc", "d", "de", "def"]
:
Initial State:
s = [0]
(empty subsequence, no characters selected)
Process "a":
- Convert to bits:
x = 0b1
(only bit 0 set for 'a') - No duplicates within string, so valid
- Combine with existing states:
- Can combine with
0
:0 | 0b1 = 0b1
- Can combine with
s = [0, 0b1]
- States represent:
["", "a"]
Process "abc":
- Convert to bits:
x = 0b111
(bits 0,1,2 set for 'a','b','c') - No duplicates, valid
- Combine with existing states:
- Can combine with
0
:0 | 0b111 = 0b111
- Cannot combine with
0b1
(both have 'a'):0b111 & 0b1 = 0b1 ≠ 0
- Can combine with
s = [0, 0b1, 0b111]
- States represent:
["", "a", "abc"]
Process "d":
- Convert to bits:
x = 0b1000
(bit 3 set for 'd') - Valid string
- Combine with existing states:
- Can combine with
0
: gives0b1000
- Can combine with
0b1
: gives0b1001
("a" + "d" = "ad") - Can combine with
0b111
: gives0b1111
("abc" + "d" = "abcd")
- Can combine with
s = [0, 0b1, 0b111, 0b1000, 0b1001, 0b1111]
- States represent:
["", "a", "abc", "d", "ad", "abcd"]
Process "de":
- Convert to bits:
x = 0b11000
(bits 3,4 set for 'd','e') - Valid string
- Combine with existing states:
- Can combine with
0
: gives0b11000
- Can combine with
0b1
: gives0b11001
("a" + "de" = "ade") - Can combine with
0b111
: gives0b11111
("abc" + "de" = "abcde") - Cannot combine with
0b1000
(both have 'd') - Cannot combine with
0b1001
(both have 'd') - Cannot combine with
0b1111
(both have 'd')
- Can combine with
s = [0, 0b1, 0b111, 0b1000, 0b1001, 0b1111, 0b11000, 0b11001, 0b11111]
Process "def":
- Convert to bits:
x = 0b111000
(bits 3,4,5 set for 'd','e','f') - Valid string
- Combine with existing states:
- Can combine with
0
: gives0b111000
- Can combine with
0b1
: gives0b111001
("a" + "def" = "adef") - Can combine with
0b111
: gives0b111111
("abc" + "def" = "abcdef") - Cannot combine with others (all contain 'd' or 'e')
- Can combine with
- Final states include
0b111111
representing "abcdef"
Find Maximum:
- Count bits in each state
0b111111
has 6 bits set (characters a,b,c,d,e,f)- Answer: 6 (string "abcdef" from concatenating "abc" and "def")
Solution Implementation
1class Solution:
2 def maxLength(self, arr: List[str]) -> int:
3 # Initialize with empty string represented as 0 (no characters selected)
4 valid_combinations = [0]
5
6 for string in arr:
7 # Convert string to bitmask representation
8 bitmask = 0
9
10 # Check if string has duplicate characters while building bitmask
11 for char in string:
12 bit_position = ord(char) - ord('a') # Get position (0-25) for character
13
14 # If this bit is already set, string has duplicate characters
15 if (bitmask >> bit_position) & 1:
16 bitmask = 0 # Mark as invalid
17 break
18
19 # Set the bit for this character
20 bitmask |= (1 << bit_position)
21
22 # If string is valid (no duplicates), try combining with existing combinations
23 if bitmask:
24 # Create new combinations by merging with existing valid combinations
25 new_combinations = []
26 for existing_mask in valid_combinations:
27 # Check if there's no character overlap (bitwise AND should be 0)
28 if (bitmask & existing_mask) == 0:
29 new_combinations.append(bitmask | existing_mask)
30
31 # Add all new valid combinations to our list
32 valid_combinations.extend(new_combinations)
33
34 # Return the maximum number of unique characters (count of set bits)
35 return max(bin(mask).count('1') for mask in valid_combinations)
36
1class Solution {
2 public int maxLength(List<String> arr) {
3 // List to store valid character bitmasks of concatenated strings
4 // Each integer represents a set of unique characters using bit manipulation
5 List<Integer> validBitmasks = new ArrayList<>();
6 validBitmasks.add(0); // Start with empty set (no characters selected)
7
8 int maxLength = 0;
9
10 // Process each string in the input array
11 for (String currentString : arr) {
12 // Convert current string to bitmask representation
13 int currentBitmask = 0;
14
15 // Check if current string has all unique characters
16 for (char ch : currentString.toCharArray()) {
17 int bitPosition = ch - 'a'; // Get bit position (0-25 for 'a'-'z')
18
19 // If bit is already set, string has duplicate characters
20 if ((currentBitmask >> bitPosition & 1) == 1) {
21 currentBitmask = 0; // Mark as invalid
22 break;
23 }
24
25 // Set the bit for this character
26 currentBitmask |= 1 << bitPosition;
27 }
28
29 // Only process strings with unique characters
30 if (currentBitmask > 0) {
31 // Try combining with all existing valid bitmasks
32 // Iterate backwards to avoid modifying list while iterating
33 for (int i = validBitmasks.size() - 1; i >= 0; i--) {
34 int existingBitmask = validBitmasks.get(i);
35
36 // Check if no common characters (bitwise AND should be 0)
37 if ((currentBitmask & existingBitmask) == 0) {
38 // Combine the two bitmasks
39 int combinedBitmask = currentBitmask | existingBitmask;
40 validBitmasks.add(combinedBitmask);
41
42 // Update maximum length (count of set bits)
43 maxLength = Math.max(maxLength, Integer.bitCount(combinedBitmask));
44 }
45 }
46 }
47 }
48
49 return maxLength;
50 }
51}
52
1class Solution {
2public:
3 int maxLength(vector<string>& arr) {
4 // Store all possible combinations of unique character sets as bitmasks
5 // Initially contains 0 (empty set)
6 vector<int> validCombinations = {0};
7 int maxLen = 0;
8
9 // Process each string in the input array
10 for (const string& str : arr) {
11 // Convert current string to bitmask representation
12 int currentMask = 0;
13
14 // Check if string has all unique characters and build its bitmask
15 for (char ch : str) {
16 int bitPosition = ch - 'a';
17
18 // If character already exists in current string, it's invalid
19 if ((currentMask >> bitPosition & 1) == 1) {
20 currentMask = 0; // Mark as invalid
21 break;
22 }
23
24 // Set the bit for this character
25 currentMask |= 1 << bitPosition;
26 }
27
28 // Only process valid strings (with unique characters)
29 if (currentMask > 0) {
30 // Try to combine with all existing valid combinations
31 // Iterate backwards to avoid processing newly added combinations
32 for (int i = validCombinations.size() - 1; i >= 0; --i) {
33 int existingMask = validCombinations[i];
34
35 // Check if no common characters exist between current and existing
36 if ((currentMask & existingMask) == 0) {
37 // Create new combination by merging the two masks
38 int newCombination = currentMask | existingMask;
39 validCombinations.push_back(newCombination);
40
41 // Update maximum length (count set bits)
42 maxLen = max(maxLen, __builtin_popcount(newCombination));
43 }
44 }
45 }
46 }
47
48 return maxLen;
49 }
50};
51
1/**
2 * Finds the maximum length of a concatenated string with unique characters
3 * @param arr - Array of strings to concatenate
4 * @returns Maximum length of concatenated string with all unique characters
5 */
6function maxLength(arr: string[]): number {
7 // Array to store valid character bitmasks, initialized with 0 (empty string)
8 const validBitmasks: number[] = [0];
9 let maxLen = 0;
10
11 // Process each string in the input array
12 for (const str of arr) {
13 let bitmask = 0;
14
15 // Convert string to bitmask representation
16 for (const char of str) {
17 // Get bit position (0-25) for character 'a' to 'z'
18 const bitPosition = char.charCodeAt(0) - 97;
19
20 // Check if character already exists in current string (duplicate)
21 if ((bitmask >> bitPosition) & 1) {
22 bitmask = 0;
23 break;
24 }
25
26 // Set bit for current character
27 bitmask |= 1 << bitPosition;
28 }
29
30 // If string has all unique characters (valid bitmask)
31 if (bitmask > 0) {
32 // Try combining with all existing valid bitmasks
33 for (let i = validBitmasks.length - 1; i >= 0; i--) {
34 const existingBitmask = validBitmasks[i];
35
36 // Check if no common characters (no bit overlap)
37 if ((bitmask & existingBitmask) === 0) {
38 // Create combined bitmask
39 const combinedBitmask = bitmask | existingBitmask;
40 validBitmasks.push(combinedBitmask);
41
42 // Update maximum length
43 maxLen = Math.max(maxLen, bitCount(combinedBitmask));
44 }
45 }
46 }
47 }
48
49 return maxLen;
50}
51
52/**
53 * Counts the number of set bits (1s) in a number using Brian Kernighan's algorithm
54 * @param num - Integer to count bits for
55 * @returns Number of set bits
56 */
57function bitCount(num: number): number {
58 // Subtract pairs of bits
59 num = num - ((num >>> 1) & 0x55555555);
60
61 // Add pairs of bits
62 num = (num & 0x33333333) + ((num >>> 2) & 0x33333333);
63
64 // Add nibbles (4-bit groups)
65 num = (num + (num >>> 4)) & 0x0f0f0f0f;
66
67 // Add bytes
68 num = num + (num >>> 8);
69
70 // Add 16-bit values
71 num = num + (num >>> 16);
72
73 // Mask to get final count (max 32 bits, so result fits in 6 bits)
74 return num & 0x3f;
75}
76
Time and Space Complexity
Time Complexity: O(2^n + L)
The algorithm iterates through each string in the array (n
strings). For each string, it:
- Processes each character to build a bitmask, taking
O(length of string)
time - For valid strings (with unique characters), it extends the list
s
by iterating through all existing bitmasks and checking compatibility using bitwise AND operation
In the worst case where all strings have unique characters and are mutually compatible, the list s
can grow exponentially. After processing k
strings, s
can contain up to 2^k
elements. When processing the (k+1)
-th string, we potentially add another 2^k
elements, iterating through all existing ones.
The total operations are:
- Processing all characters across all strings:
O(L)
whereL
is the sum of lengths of all strings - Building all possible combinations:
O(2^n)
as we can have at most2^n
different subsequences - The final
bit_count()
operation on all elements:O(2^n)
Therefore, the time complexity is O(2^n + L)
.
Space Complexity: O(2^n)
The space is dominated by the list s
which stores all possible valid combinations as bitmasks. In the worst case, where all strings can be combined with each other, the list s
can grow to contain 2^n
elements (representing all possible subsequences of the array). Each element is an integer bitmask, taking O(1)
space.
Therefore, the space complexity is O(2^n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Forgetting to Handle Strings with Internal Duplicates
One of the most common mistakes is not properly checking if individual strings in the input array contain duplicate characters within themselves. For example, if arr = ["aa", "bb"]
, both strings should be completely excluded from consideration since they have internal duplicates.
Incorrect approach:
# Wrong: Only checking duplicates between different strings
for string in arr:
bitmask = 0
for char in string:
bit_position = ord(char) - ord('a')
bitmask |= (1 << bit_position) # Just sets bits without checking
# This would incorrectly allow "aa" to be processed
Correct approach:
for string in arr:
bitmask = 0
for char in string:
bit_position = ord(char) - ord('a')
if (bitmask >> bit_position) & 1: # Check if bit already set
bitmask = 0 # Mark entire string as invalid
break
bitmask |= (1 << bit_position)
2. Modifying the List While Iterating Over It
Another common pitfall is trying to modify the valid_combinations
list while iterating through it, which can lead to unexpected behavior or infinite loops.
Incorrect approach:
if bitmask: for existing_mask in valid_combinations: if (bitmask & existing_mask) == 0: valid_combinations.append(bitmask | existing_mask) # Dangerous!
Correct approach:
if bitmask: new_combinations = [] for existing_mask in valid_combinations: if (bitmask & existing_mask) == 0: new_combinations.append(bitmask | existing_mask) valid_combinations.extend(new_combinations) # Add after iteration
3. Not Including the Empty Subsequence Initially
The algorithm relies on starting with [0]
to represent the empty subsequence. Without this, you won't be able to include individual strings as valid subsequences.
Incorrect:
valid_combinations = [] # Wrong: No base case
Correct:
valid_combinations = [0] # Start with empty subsequence
4. Using Wrong Bit Counting Methods
When calculating the final result, using string manipulation instead of efficient bit counting can lead to errors or inefficiency.
Less reliable approach:
# Could have issues with leading zeros or formatting
return max(len(format(mask, 'b').replace('0', '')) for mask in valid_combinations)
Better approach:
# More direct and reliable
return max(bin(mask).count('1') for mask in valid_combinations)
# Or in Python 3.10+: mask.bit_count()
5. Integer Overflow Concerns (Language-Specific)
While not an issue in Python, in languages with fixed-size integers, using 32-bit integers might cause issues if you accidentally shift bits beyond position 25. Always ensure you're working within the valid range (0-25 for lowercase letters).
Prevention:
bit_position = ord(char) - ord('a')
assert 0 <= bit_position <= 25, "Character out of range"
Which data structure is used to implement priority queue?
Recommended Readings
Backtracking Template Prereq DFS with States problems dfs_with_states Combinatorial search problems Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions Finding all permutations combinations subsets and solving Sudoku are classic combinatorial problems The time complexity of combinatorial problems often grows rapidly with the size of
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!