1239. Maximum Length of a Concatenated String with Unique Characters
Problem Description
In this problem, you're given an array of strings named arr
. Your task is to create the longest possible string s
by concatenating some or all strings from arr
. However, there's an important rule: the string s
must consist of unique characters only.
In other words, you need to pick a subsequence of strings from arr
such that when these strings are combined, no character appears more than once. The length of the resulting string s
is your main objective, and you must return the maximum possible length. Remember that a subsequence is formed from the original array by potentially removing some elements, but the order of the remaining elements must be preserved.
Flowchart Walkthrough
Let's analyze the problem using the the Flowchart. Here's a step-by-step walkthrough:
-
Is it a graph?
- No: The problem does not involve direct graph concepts like traversal or node edges relationship. It revolves around strings and character arrays.
-
Need to solve for kth smallest/largest?
- No: The question does not ask for kth smallest or largest element. It wants the maximum length of the concatenated string that contains unique characters.
-
Involves Linked Lists?
- No: This problem is related to arrays of strings, not linked lists.
-
Does the problem have small constraints?
- Yes: The problem can have small constraints because the construction of different combinations from a list falls into exponential growth with respect to input size but can still be considered small enough for backtracking.
-
Brute force / Backtracking?
- Yes: Given the requirement for uniqueness and the combination of strings, backtracking is suitable as it allows exploration of all combinations of strings to ensure the maximum length where all characters are unique.
Conclusion: The flowchart leads us to conclude using a backtracking approach, where we recursively try to add each string to the existing set and backtrack when the maximum cannot be enhanced or duplicates are introduced.
Intuition
To find the optimal solution for the maximum length of s
, we use a bit manipulation technique known as "state compression". This technique involves using a 32-bit integer to represent the presence of each letter where each bit corresponds to a letter in the alphabet.
The intuition behind this approach is that it allows us to efficiently check whether two strings contain any common characters. We can do this by performing an AND operation (&
) on their respective compressed states (masks). If the result is zero, it means there are no common characters.
Here's a step-by-step breakdown of how we arrive at the solution:
- Initialize a variable
ans
to zero. This will hold the maximum length found. - Create a list
masks
with a single element, zero, to keep track of all unique character combinations we've seen so far. - Iterate through each string
s
in the arrayarr
.- For each string, create a bit mask
mask
that represents the unique characters in the string. - If the string has duplicate characters, we set
mask
to zero and skip it since such a string cannot contribute to a valids
.
- For each string, create a bit mask
- Iterate through the current masks in
masks
.- We use the AND operation to check if the current string's mask has any overlap with
mask
. If it doesn't (i.e., the result is zero), it means we can concatenate this string without repeating any character. - In such a case, we combine (OR operation
|
) the current mask withmask
and add the new mask to ourmasks
list. - Update
ans
with the count of set bits in the new mask, which represents the length of the unique character string formed up to this point. This is done using the built-inbit_count()
method on the new mask.
- We use the AND operation to check if the current string's mask has any overlap with
- Finally, return the maximum length
ans
found during the process.
The use of state compression and bit manipulation makes the solution efficient, as it simplifies the process of tracking which characters are present in the strings and eliminates the need for complex data structures or string operations.
Learn more about Backtracking patterns.
Solution Approach
The solution uses two significant concepts: bit manipulation for state compression and backtracking to explore all combinations. The algorithm follows these steps:
-
Initialize an integer
ans
with the value0
, which will track the maximum length of a string with unique characters found during the process. -
We begin with an array
masks
that starts with a single element,0
, to record the baseline state (the empty string). -
Loop through each string
s
in the input arrayarr
.-
For each string
s
, we create an integermask
that serves as its unique character identifier. The binary representation ofmask
will have a1
in the position corresponding to a letter (wherea
is the least significant bit, andz
is the most significant). -
As we iterate over each character
c
in the strings
, we calculate the difference between the ASCII value ofc
and that of'a'
to find the corresponding bit position.
i = ord(c) - ord('a')
- We then check if the bit at that position is already set. If so, it means the character has appeared before, and we break the loop, setting
mask
to0
, as this string cannot be part of the result due to the duplicate character.
if mask >> i & 1: mask = 0 break
- Otherwise, we set the bit corresponding to the character in the
mask
.
mask |= 1 << i
- If the current string
s
has any duplicate characters, we disregard thismask
and move on to the next string inarr
.
-
-
Next, for each
mask
computed from the current string, we examine the masks collected so far.- For each existing
m
inmasks
, we ensure there's no overlap betweenm
and the currentmask
using the bitwise AND operation.
if m & mask == 0:
-
If there's no overlap, it means that adding the current string's unique characters to the characters represented by
m
would still result in a string with all unique characters. -
In that case, we combine
m
andmask
using the bitwise OR operation. The result is a new mask representing a unique combination of characters from both masks. We add this new mask tomasks
.
masks.append(m | mask)
- We then calculate the total number of unique characters we have so far with the new mask by counting the number of set bits. In Python, this can be done with the
.bit_count()
method. We update our answerans
if this count is greater than the previous maximum.
ans = max(ans, (m | mask).bit_count())
- For each existing
-
Once we've checked all strings and recorded all possible unique character combinations, we return the value of
ans
, which represents the maximum length of a string with all unique characters we can create by concatenating a subsequence ofarr
.
The use of bit masks elegantly handles the uniqueness constraint, and the iterative approach ensures that all potential combinations are considered without duplication, leading to an efficient solution.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Consider the input array of strings arr
as ["un","iq","ue"]
.
-
Start by initializing
ans
to0
andmasks
to[0]
. -
The first string is
un
.- For
u
,ord('u') - ord('a')
gives us20
. So,mask
becomes0 | (1 << 20)
. - For
n
,ord('n') - ord('a')
gives us13
. So,mask
becomesmask | (1 << 13)
.
After processing
un
, ourmask
is10000100000000000000
in binary, which is1048576
decimal. - For
-
There are no duplicates in
un
, so we move on and create a new combination by OR-ing thismask
with0
frommasks
. We update ourmasks
to be[0, 1048576]
. -
The next string is
iq
.- For
i
,ord('i') - ord('a')
gives us8
. So,mask
becomes0 | (1 << 8)
. - For
q
,ord('q') - ord('a')
gives us16
. So,mask
becomesmask | (1 << 16)
.
After processing
iq
, ourmask
is1000000010000
in binary, which is65792
decimal. - For
-
Check this
mask
against all inmasks
. There's no common bit set with1048576
(previousmask
). Therefore, we can combine them to form a new mask:1048576 | 65792
which in binary is1000010010000
, and in decimal is1111368
. -
Update
ans
with the bit count of the new mask. It has 4 bits set, representing 4 unique characters.ans
becomes4
. -
The masks array becomes
[0, 1048576, 65792, 1111368]
. -
Lastly, we have the string
ue
.- For
u
, there's already a bit set in the previous mask, so we skip the creation of a new mask.
- For
-
There are no more strings to process. We return the
ans
, which is4
. This is the maximum length of a string with all unique characters we can create by concatenating strings fromarr
.
This process of using bit masks allows us to efficiently manage and combine the unique characters from various strings without having to deal with actual string concatenation and character counting.
Solution Implementation
1from typing import List
2
3class Solution:
4 def maxLength(self, arr: List[str]) -> int:
5 max_length = 0 # Variable to store the maximum length of unique characters
6 masks = [0] # List to store the unique character sets as bit masks
7
8 # Iterate through each string in the input list
9 for string in arr:
10 mask = 0 # Initialize mask for current string
11
12 # Check each character in the string
13 for char in string:
14 char_index = ord(char) - ord('a') # Map 'a'-'z' to 0-25
15
16 # If the character is already in the mask, reset mask and break
17 if mask >> char_index & 1:
18 mask = 0
19 break
20 # Add the character to the mask
21 mask |= 1 << char_index
22
23 # If mask is not zero, it means the string had unique characters
24 if mask != 0:
25 # Check the new mask with existing masks for no overlap of characters
26 for existing_mask in masks.copy():
27 if existing_mask & mask == 0:
28 new_mask = existing_mask | mask # Combine the masks
29 masks.append(new_mask) # Append the new mask to masks
30 max_length = max(max_length, bin(new_mask).count('1')) # Update max length
31
32 return max_length # return the maximum length found
33
1class Solution {
2
3 public int maxLength(List<String> arr) {
4 int maxLen = 0; // This will hold the maximum length of unique characters.
5 List<Integer> bitMasks = new ArrayList<>(); // This list will store unique character combinations using bit masking.
6 bitMasks.add(0); // Initialize the list with a bitmask of 0 (no characters).
7
8 // Iterate over each string in the list.
9 for (String s : arr) {
10 int bitMask = 0;
11 // Iterate over the characters in the string to create a bitmask.
12 for (int i = 0; i < s.length(); ++i) {
13 int bitIndex = s.charAt(i) - 'a'; // Convert character to a bitmask index (0-25).
14 // If the character is already in the bitmask (duplicate character),
15 // set the bitmask to 0 and break out of the loop.
16 if (((bitMask >> bitIndex) & 1) == 1) {
17 bitMask = 0;
18 break;
19 }
20 bitMask |= 1 << bitIndex; // Add the character into the bitmask.
21 }
22
23 // If bitmask is 0, the string contains duplicates and should be ignored.
24 if (bitMask == 0) {
25 continue;
26 }
27
28 int currentSize = bitMasks.size();
29 // Iterate over existing masks and combine them with the new mask if possible.
30 for (int i = 0; i < currentSize; ++i) {
31 int combinedMask = bitMasks.get(i);
32 // If there is no collision between the current mask and the new mask,
33 // we can combine them into a new mask.
34 if ((combinedMask & bitMask) == 0) {
35 bitMasks.add(combinedMask | bitMask); // Combine masks by OR operation.
36 maxLen = Math.max(maxLen, Integer.bitCount(combinedMask | bitMask)); // Update maxLen if necessary.
37 }
38 }
39 }
40
41 return maxLen; // Return the maximum length found.
42 }
43}
44
1#include <vector>
2#include <string>
3#include <algorithm>
4
5class Solution {
6public:
7 int maxLength(std::vector<std::string>& arr) {
8 int max_length = 0; // to store the max length of unique characters
9 // masks initially contains only one element: "0", which represents an empty string
10 std::vector<int> masks = {0};
11
12 // Iterate over all strings in the input vector
13 for (std::string& str : arr) {
14
15 int mask = 0; // Bitmask to represent the current string
16
17 // Iterate over each character in the current string
18 for (char& ch : str) {
19 int bitIndex = ch - 'a';
20
21 // Check if this character has already appeared in the string (mask)
22 if ((mask >> bitIndex) & 1) {
23 // If the character repeats, discard this string by setting the mask to 0 and break
24 mask = 0;
25 break;
26 }
27
28 // Add the current character to the mask
29 mask |= 1 << bitIndex;
30 }
31
32 // Only proceed if mask is not zero (valid string without any repeating character)
33 if (mask == 0) continue;
34
35 int currentMasksCount = masks.size();
36 // Iterate over existing combinations of strings represented by masks
37 for (int i = 0; i < currentMasksCount; ++i) {
38 int combinedMask = masks[i];
39
40 // Check if current mask and combined mask have no characters in common
41 if ((combinedMask & mask) == 0) {
42 // Combine current string with the string represented by combinedMask
43 masks.push_back(combinedMask | mask);
44 // Update max_length using the number of unique characters in the new combination
45 max_length = std::max(max_length, __builtin_popcount(combinedMask | mask));
46 }
47 }
48 }
49
50 // Return the maximum length of a string with all unique characters
51 return max_length;
52 }
53};
54
1// Import necessary functions from built-in modules
2import { max } from 'lodash';
3
4// Utility function to count the number of set bits (1s) in the binary representation of a number
5const popCount = (n: number): number => {
6 let count = 0;
7 while (n) {
8 count += n & 1;
9 n >>= 1;
10 }
11 return count;
12};
13
14// Function to compute the max length of a concatenated string of unique characters
15const maxLength = (arr: string[]): number => {
16 let maxLength = 0; // To store the max length of unique characters
17 // Masks initially contains only one element: 0, which represents an empty string
18 let masks: number[] = [0];
19
20 // Iterate over all strings in the input array
21 arr.forEach(str => {
22 let mask = 0; // Bitmask to represent the current string
23
24 // Iterate over each character in the current string
25 for (const ch of str) {
26 let bitIndex = ch.charCodeAt(0) - 'a'.charCodeAt(0);
27
28 // Check if this character has already appeared in the string (mask)
29 if ((mask >> bitIndex) & 1) {
30 // If the character repeats, discard this string by setting the mask to 0 and break
31 mask = 0;
32 break;
33 }
34
35 // Add the current character to the mask
36 mask |= 1 << bitIndex;
37 }
38
39 // Only proceed if mask is not zero (valid string without any repeating character)
40 if (mask === 0) return;
41
42 // Create a copy of the current masks to iterate over
43 const currentMasks = [...masks];
44
45 // Iterate over existing combinations of strings represented by masks
46 currentMasks.forEach(combinedMask => {
47 // Check if the current mask and combined mask have no characters in common
48 if ((combinedMask & mask) === 0) {
49 // Combine current string with the string represented by combinedMask
50 masks.push(combinedMask | mask);
51 // Update maxLength using the number of unique characters in the new combination
52 maxLength = max([maxLength, popCount(combinedMask | mask)])!;
53 }
54 });
55 });
56
57 // Return the maximum length of a string with all unique characters
58 return maxLength;
59};
60
61// Example usage:
62// console.log(maxLength(["un", "iq", "ue"])); // Should print 4
63
Time and Space Complexity
Time Complexity
The time complexity of the code primarily stems from two nested loops: the outer loop iterating over each string s
in arr
, and the inner loop iterating over the set of bitmasks masks
. For every character c
in each string s
, we perform a constant-time operation to check if that character has already been seen by using a bitmask. If it has, we break early and the mask is set to 0
. The worst-case scenario for this operation is O(k)
where k
is the length of the longest string.
Considering the generation of new masks, for every mask
variable generated from string s
, the code iterates through the masks
list to check if there's any overlap with existing masks (m & mask == 0
). In the worst-case scenario, if all characters in arr
are unique and n
is the length of arr
, there could be up to 2^n
combinations as every element in arr
could be included or excluded from the combination.
The bit count operation (.bit_count()
) is generally considered a constant-time operation, but since it's applied for each newly created mask, it doesn't dominate the time complexity.
Therefore, the overall time complexity is O(n * 2^n * k)
.
Space Complexity
The space complexity is dictated by the masks
list which stores the unique combinations of characters that have been seen. In the worst-case scenario, this could be as large as 2^n
where n
is the length of arr
. No other data structure in the code uses space that scales with the size of the input to the same degree.
Therefore, the space complexity of the algorithm is O(2^n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following problems can be solved with backtracking (select multiple)
Recommended Readings
Backtracking Template Prereq DFS with States problems dfs_with_states Combinatorial search problems Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions Finding all permutations combinations subsets and solving Sudoku are classic combinatorial problems The time complexity of combinatorial problems often grows rapidly with the size of
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!