2901. Longest Unequal Adjacent Groups Subsequence II
Problem Description
The problem asks us to select the longest subsequence from an array of indices that represent words and groups. A subsequence in this context is an ordered sequence that may skip elements from the original sequence but maintains the relative order of the remaining elements. There are two primary requirements for the sequence:
- For any pair of adjacent indices in our subsequence (say
i_j
andi_(j + 1)
), the groups that correspond to these indices must be different. In other words,groups[i_j]
must not be equal togroups[i_(j + 1)]
. - The words at these indices must have the same length and a Hamming distance of 1. The Hamming distance is defined as the number of positions at which the two strings of equal length differ.
For example, if we have words ["wheel", "pearl", "world", "word"]
and corresponding groups [1, 2, 1, 3]
, a valid subsequence that follows our conditions might be [0, 2, 3]
which corresponds to the words ["wheel", "world", "word"]
.
The problem is to identify such the longest valid subsequence and return the words corresponding to that subsequence.
Intuition
The solution approach uses dynamic programming, which is a method of solving complex problems by breaking them down into simpler subproblems. The idea is to create an array f
where each f[i]
stores the length of the longest valid subsequence ending at index i. Moreover, g
is an array where each g[i]
stores the index of the predecessor in the subsequence ending with i
.
Initially, we assume that the longest subsequence that ends with each word has a length of 1 (because a subsequence with a single word is always valid) — this is our base case.
The algorithm then checks each word against all the words before it. For two words at index i
and j
, if they belong to different groups and the Hamming distance between them is 1, it means that the word at index i
could potentially follow the word at index j
in a valid subsequence.
The algorithm goes through all possible pairs of words and updates f[i]
, g[i]
, and a variable mx
(which represents the length of the longest subsequence found so far) accordingly. After considering all words, mx
will hold the length of the longest valid subsequence.
Finally, by tracing back from an index i
with the maximum length mx
in f
, using the g
array to find the predecessor indices, the algorithm constructs the subsequence in reverse order. Reversing this sequence gives us the longest required subsequence.
This approach uses the concept of comparing all possible pairs of words (hence the O(n^2 * L) time complexity), and utilizes additional arrays to store the subsequence lengths and the predecessor indices (which accounts for the O(n) space complexity).
Learn more about Dynamic Programming patterns.
Solution Approach
The solution approach employs dynamic programming (DP) to build up a solution by evaluating all sub-problems, represented by each possible subsequence ending at any index in the given arrays. The primary idea is to keep track of two things for each position i
: the length of the longest valid subsequence that ends with the word at position i
, and the index of the word that came just before i
in that subsequence.
Dynamic Arrays: f
and g
Two arrays are used in the DP algorithm:
f
: An array wheref[i]
contains the length of the longest subsequence ending at indexi
.g
: An array whereg[i]
contains the index of the predecessor ofi
in the subsequence.
Both arrays f
and g
are vital for reconstructing the longest subsequence once we've completed the iteration process. We start by initializing each f[i]
value to 1, because a sequence containing only the i-th
element is obviously valid, with a length of 1.
Updating the DP Arrays
We then iterate through all pairs of indices using nested loops. For each index i
, we look at all indices j
that come before it, i.e., j < i
. Two conditions must be satisfied to consider extending the subsequence ending at j
to include i
:
- The groups must be different:
groups[i] != groups[j]
. - The Hamming distance between
words[i]
andwords[j]
must be exactly 1:sum(a != b for a, b in zip(words[i], words[j])) == 1
.
When both conditions are met, we check if the length of the subsequence at j
plus one is greater than the current length at i
(f[i] < f[j] + 1
). If so, we update f[i]
to f[j] + 1
and record the index j
in g[i]
. We also update mx
to keep track of the current maximum length of any subsequence we've found.
Reconstructing the Longest Subsequence
Once we've processed all indices, we can then find out which index corresponds to the length mx
. Starting from this index, we trace back through the g
array, effectively reconstructing the longest subsequence in reverse order. This reverse order is necessary because we start from the end and go backwards to the beginning of the subsequence, following the stored predecessors.
Finally, we reverse the reconstructed subsequence to present it in the correct order, from the first to the last index of the subsequence.
Time Complexity: O(n^2 * L)
The time complexity is O(n^2 * L)
, where n
is the number of elements in the words
and groups
arrays and L
is the maximum length of the strings in words
. This time complexity arises because we perform a nested iteration over all pairs of elements (O(n^2)
) and, for each pair, potentially compare every character of the two strings involved (O(L)
).
Space Complexity: O(n)
The space complexity is O(n)
due to the additional arrays f
and g
that we maintain throughout the algorithm.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's use a small example to illustrate the solution approach with words ["ab", "bc", "ac", "ad"]
and corresponding groups [1, 1, 2, 3]
.
Step 1: Initialize DP arrays
f
: [1, 1, 1, 1] (Each word can be a subsequence by itself)g
: [-1, -1, -1, -1] (No predecessors yet)
Step 2: Evaluating Pairs
- Compare "ab" and "bc":
- Different groups? No, both in group 1. Do not update
f
org
.
- Different groups? No, both in group 1. Do not update
- Compare "ab" and "ac":
- Different groups? Yes.
- Hamming distance = 1? Yes (
"ab"
vs"ac"
). - Update:
f[2]
=f[0]
+ 1 = 2,g[2]
= 0.
- Update
f
: [1, 1, 2, 1], Updateg
: [-1, -1, 0, -1] - Compare "ab" and "ad":
- Different groups? Yes.
- Hamming distance = 1? Yes (
"ab"
vs"ad"
). - Update:
f[3]
=f[0]
+ 1 = 2,g[3]
= 0.
- Update
f
: [1, 1, 2, 2], Updateg
: [-1, -1, 0, 0] - Compare "bc" with all previous entries, no updates since "bc" is in the same group as "ab".
- Compare "ac" and "ad":
- Different groups? Yes.
- Hamming distance = 1? Yes (
"ac"
vs"ad"
). - Update:
f[3]
=f[2]
+ 1 = 3,g[3]
= 2 (since f[2] > f[0]).
- Final update
f
: [1, 1, 2, 3],g
: [-1, -1, 0, 2]
Step 3: Reconstruct the Longest Subsequence
mx
is 3, and it's at index 3, word "ad".- Trace back predecessors: for "ad" (
g[3]
= 2), the predecessor is "ac" (at index 2). - For "ac" (
g[2]
= 0), the predecessor is "ab" (at index 0). - We construct the subsequence in reverse order: [3, 2, 0].
Step 4: Reverse the Subsequence
- After reversing, we get [0, 2, 3].
Step 5: Return Words of Longest Subsequence
- Corresponding words: ["ab", "ac", "ad"].
Through this example, the solution approach successfully identified and constructed the longest valid subsequence where adjacent words are from different groups and have a Hamming distance of 1.
Solution Implementation
1from typing import List
2
3class Solution:
4 def getWordsInLongestSubsequence(self, n: int, words: List[str], groups: List[int]) -> List[str]:
5 # Helper function to check if two strings s and t differ by exactly one character
6 def is_one_letter_diff(s: str, t: str) -> bool:
7 return len(s) == len(t) and sum(a != b for a, b in zip(s, t)) == 1
8
9 # Initialize a list to store the length of the longest subsequence ending at each word
10 lengths = [1] * n
11 # Initialize a list to store the previous index for the longest subsequence
12 previous_index = [-1] * n
13 # Variable to store the length of the longest subsequence found so far
14 max_length = 1
15
16 # Iterate through each word
17 for i, current_group in enumerate(groups):
18 # Compare with every other word before it
19 for j, previous_group in enumerate(groups[:i]):
20 # If the groups are different, and the length can be increased, and the words differ by one letter
21 if (current_group != previous_group and
22 lengths[i] < lengths[j] + 1 and
23 is_one_letter_diff(words[i], words[j])):
24 # Update the length of the longest subsequence ending at the current word
25 lengths[i] = lengths[j] + 1
26 # Update the previous index to the current longest subsequence
27 previous_index[i] = j
28 # Update the maximum length of the longest subsequence found
29 max_length = max(max_length, lengths[i])
30
31 # Initialize a list to store the words of the longest subsequence
32 longest_subsequence = []
33
34 # Iterate through each word to find the end of the longest subsequence
35 for i in range(n):
36 if lengths[i] == max_length:
37 j = i
38 # Backtrack through the previous indices to construct the longest subsequence
39 while j >= 0:
40 longest_subsequence.append(words[j])
41 j = previous_index[j]
42 break
43
44 # Return the longest subsequence in the correct order
45 return longest_subsequence[::-1]
46
1class Solution {
2
3 /**
4 * Finds the longest string subsequence with alternating group numbers and where each adjacent pair of words can only have one character different.
5 *
6 * @param n The number of words in the 'words' array.
7 * @param words An array of strings, representing the words.
8 * @param groups An array of integers, with each element indicating the group number of the corresponding word in the 'words' array.
9 * @return A list of words in the longest subsequence found.
10 */
11 public List<String> getWordsInLongestSubsequence(int n, String[] words, int[] groups) {
12 int[] longestSubsequenceLength = new int[n];
13 int[] previousIndex = new int[n];
14 Arrays.fill(longestSubsequenceLength, 1);
15 Arrays.fill(previousIndex, -1);
16 int maxLength = 1;
17
18 // Dynamic programming to find the longest subsequence
19 for (int i = 0; i < n; ++i) {
20 for (int j = 0; j < i; ++j) {
21 // Check if alternating groups and if it's beneficial to extend the sequence from j to i
22 if (groups[i] != groups[j] && longestSubsequenceLength[i] < longestSubsequenceLength[j] + 1 && canTransform(words[i], words[j])) {
23 longestSubsequenceLength[i] = longestSubsequenceLength[j] + 1;
24 previousIndex[i] = j;
25 maxLength = Math.max(maxLength, longestSubsequenceLength[i]);
26 }
27 }
28 }
29
30 // Constructing the longest subsequence from the dynamic programming table
31 List<String> result = new ArrayList<>();
32 for (int i = 0; i < n; ++i) {
33 if (longestSubsequenceLength[i] == maxLength) {
34 for (int j = i; j >= 0; j = previousIndex[j]) {
35 result.add(words[j]);
36 }
37 break;
38 }
39 }
40
41 // The sequence is constructed in reverse so we need to reverse it to get the correct order
42 Collections.reverse(result);
43 return result;
44 }
45
46 /**
47 * Checks if one string can be transformed into another string by changing exactly one character.
48 *
49 * @param firstWord The first string to be compared.
50 * @param secondWord The second string to be compared.
51 * @return A boolean indicating if the transformation is possible (true) or not (false).
52 */
53 private boolean canTransform(String firstWord, String secondWord) {
54 if (firstWord.length() != secondWord.length()) {
55 return false;
56 }
57 int diffCount = 0;
58 for (int i = 0; i < firstWord.length(); ++i) {
59 if (firstWord.charAt(i) != secondWord.charAt(i)) {
60 diffCount++;
61 }
62 }
63 return diffCount == 1;
64 }
65}
66
1#include <vector>
2#include <string>
3#include <algorithm>
4
5using namespace std;
6
7class Solution {
8public:
9 vector<string> getWordsInLongestSubsequence(int n, vector<string>& words, vector<int>& groups) {
10 // Lambda function to check if two strings differ by exactly one character
11 auto canFormSequence = [](const string& left, const string& right) {
12 if (left.size() != right.size()) {
13 return false;
14 }
15 int diffCount = 0;
16 for (int i = 0; i < left.size(); ++i) {
17 diffCount += left[i] != right[i];
18 }
19 return diffCount == 1;
20 };
21
22 // Vector to store the length of the longest subsequence at each index
23 vector<int> longestSubsequenceLength(n, 1);
24 // Vector to trace back the longest subsequence
25 vector<int> previousIndex(n, -1);
26 int maxLength = 1; // To keep track of the overall max length of subsequences
27
28 // Iterate through words and compare them for potential subsequences
29 for (int i = 0; i < n; ++i) {
30 for (int j = 0; j < i; ++j) {
31 // Check for group mismatch, subsequence length, and if strings can form a sequence
32 if (groups[i] != groups[j] && longestSubsequenceLength[i] < longestSubsequenceLength[j] + 1 && canFormSequence(words[i], words[j])) {
33 // Update the longest subsequence length and previous index at current position
34 longestSubsequenceLength[i] = longestSubsequenceLength[j] + 1;
35 previousIndex[i] = j;
36 // Update the overall max length
37 maxLength = max(maxLength, longestSubsequenceLength[i]);
38 }
39 }
40 }
41
42 // Vector to store the answer (the words in the longest subsequence)
43 vector<string> result;
44 // Iterate over words to find the ones that are part of the longest subsequence
45 for (int i = 0; i < n; ++i) {
46 // If the current word is at the end of the longest subsequence
47 if (longestSubsequenceLength[i] == maxLength) {
48 // Trace back through the indices to build the sequence
49 for (int j = i; j != -1; j = previousIndex[j]) {
50 result.push_back(words[j]);
51 }
52 break; // only need one such sequence, break after finding the first one
53 }
54 }
55
56 // As we traced back, the sequence will be in reverse; we need to reverse it again
57 reverse(result.begin(), result.end());
58
59 return result; // Return the final sequence of words
60 }
61};
62
1// Checks if two strings differ by exactly one character.
2const checkOneCharDiff = (s: string, t: string): boolean => {
3 if (s.length !== t.length) {
4 return false;
5 }
6
7 let diffCount: number = 0;
8 for (let i = 0; i < s.length; ++i) {
9 if (s[i] !== t[i]) {
10 diffCount++;
11 }
12 }
13
14 return diffCount === 1;
15};
16
17// Gets the longest subsequence of words following the specific rules.
18function getWordsInLongestSubsequence(n: number, words: string[], groups: number[]): string[] {
19 // Initialize the longest subsequence length array with 1 for each word.
20 const longestSubSeqLength: number[] = Array(n).fill(1);
21 // Array to track the previous word in the sequence for each word.
22 const prevWordIndex: number[] = Array(n).fill(-1);
23 let maxSequenceLength: number = 1; // Length of the maximum subsequence found.
24
25 // Building up the longest subsequence length and previous word for each word
26 for (let i = 0; i < n; ++i) {
27 for (let j = 0; j < i; ++j) {
28 if (groups[i] !== groups[j] &&
29 longestSubSeqLength[i] < longestSubSeqLength[j] + 1 &&
30 checkOneCharDiff(words[i], words[j])) {
31 longestSubSeqLength[i] = longestSubSeqLength[j] + 1;
32 prevWordIndex[i] = j;
33 maxSequenceLength = Math.max(maxSequenceLength, longestSubSeqLength[i]);
34 }
35 }
36 }
37
38 // Finding and constructing the longest subsequence by tracing back from the end.
39 const longestSubsequence: string[] = [];
40 for (let i = 0; i < n; ++i) {
41 if (longestSubSeqLength[i] === maxSequenceLength) {
42 for (let j = i; j !== -1; j = prevWordIndex[j]) {
43 longestSubsequence.push(words[j]);
44 }
45 break;
46 }
47 }
48
49 // The words are pushed in reverse order, so we need to reverse the array before returning.
50 return longestSubsequence.reverse();
51}
52
Time and Space Complexity
Time Complexity
The time complexity of the given code is O(n^2 * L)
. The code consists of two nested loops where n
is the number of words, and it compares pairs of words where each word has a length of L
. The function check
is called for each pair, which operates in O(L)
time to compare two words and check if their Hamming distance is exactly 1.
However, as mentioned in the reference answer, we can optimize the process by using a wildcard hash table. When using a wildcard hash table, we create all possible variants of a word by replacing each letter once with a wildcard. We use these variants as keys in a hash table that map to indices of words in the original list. This operation takes O(L)
time for each word (because we're iterating through each letter and creating variants) and is done for all n
words, leading to O(n * L)
for this preprocessing step. Subsequently, for each word, we look up its variants in the hash table in O(L)
time to find all potential matches with a Hamming distance of 1. Since each lookup yields a constant number of potential matches on average (assuming close to uniform distribution of words across the different wildcard variants), the average time complexity for processing all words drops.
Thus, while the worst-case time complexity remains the same, in practice, we expect the average complexity to be lower due to fewer pairs of words being compared.
Space Complexity
The space complexity of the original solution is O(n)
. The code creates two lists, f
and g
, both of size n
. There are no additional data structures that grow with the input size; therefore, the space complexity is linear with respect to the number of words.
With the optimization mentioned in the reference answer, we would have to account for the space taken by the wildcard hash table. The hash table would have potentially O(n * L)
keys (since each of n
words contributes L
wildcard variants). However, because each word leads to at most L
keys and the space for indices is shared among keys, the size of the list of indices is O(n)
.
Thus, the optimized solution has a space complexity of O(n + L * n)
, which simplifies to O(n * L)
since L
could be larger than 1 and in this case, dominates the n
term. This represents a trade-off compared to the original solution: better average time efficiency at the cost of increased space usage.
Learn more about how to find time and space complexity quickly using problem constraints.
You are given an array of intervals where intervals[i] = [start_i, end_i]
represent the start and end of the ith
interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!