2901. Longest Unequal Adjacent Groups Subsequence II

Problem Description

The problem asks us to select the longest subsequence from an array of indices that represent words and groups. A subsequence in this context is an ordered sequence that may skip elements from the original sequence but maintains the relative order of the remaining elements. There are two primary requirements for the sequence:

For any pair of adjacent indices in our subsequence (say i_j and i_(j + 1)), the groups that correspond to these indices must be different. In other words, groups[i_j] must not be equal to groups[i_(j + 1)].
The words at these indices must have the same length and a Hamming distance of 1. The Hamming distance is defined as the number of positions at which the two strings of equal length differ.

For example, if we have words ["wheel", "pearl", "world", "word"] and corresponding groups [1, 2, 1, 3], a valid subsequence that follows our conditions might be [0, 2, 3] which corresponds to the words ["wheel", "world", "word"].

The problem is to identify such the longest valid subsequence and return the words corresponding to that subsequence.

Intuition

The solution approach uses dynamic programming, which is a method of solving complex problems by breaking them down into simpler subproblems. The idea is to create an array f where each f[i] stores the length of the longest valid subsequence ending at index i. Moreover, g is an array where each g[i] stores the index of the predecessor in the subsequence ending with i.

Initially, we assume that the longest subsequence that ends with each word has a length of 1 (because a subsequence with a single word is always valid) — this is our base case.

The algorithm then checks each word against all the words before it. For two words at index i and j, if they belong to different groups and the Hamming distance between them is 1, it means that the word at index i could potentially follow the word at index j in a valid subsequence.

The algorithm goes through all possible pairs of words and updates f[i], g[i], and a variable mx (which represents the length of the longest subsequence found so far) accordingly. After considering all words, mx will hold the length of the longest valid subsequence.

Finally, by tracing back from an index i with the maximum length mx in f, using the g array to find the predecessor indices, the algorithm constructs the subsequence in reverse order. Reversing this sequence gives us the longest required subsequence.

This approach uses the concept of comparing all possible pairs of words (hence the O(n^2 * L) time complexity), and utilizes additional arrays to store the subsequence lengths and the predecessor indices (which accounts for the O(n) space complexity).

Learn more about Dynamic Programming patterns.

Solution Approach

The solution approach employs dynamic programming (DP) to build up a solution by evaluating all sub-problems, represented by each possible subsequence ending at any index in the given arrays. The primary idea is to keep track of two things for each position i: the length of the longest valid subsequence that ends with the word at position i, and the index of the word that came just before i in that subsequence.

Dynamic Arrays: `f` and `g`

Two arrays are used in the DP algorithm:

f: An array where f[i] contains the length of the longest subsequence ending at index i.
g: An array where g[i] contains the index of the predecessor of i in the subsequence.

Both arrays f and g are vital for reconstructing the longest subsequence once we've completed the iteration process. We start by initializing each f[i] value to 1, because a sequence containing only the i-th element is obviously valid, with a length of 1.

Updating the DP Arrays

We then iterate through all pairs of indices using nested loops. For each index i, we look at all indices j that come before it, i.e., j < i. Two conditions must be satisfied to consider extending the subsequence ending at j to include i:

The groups must be different: groups[i] != groups[j].
The Hamming distance between words[i] and words[j] must be exactly 1: sum(a != b for a, b in zip(words[i], words[j])) == 1.

When both conditions are met, we check if the length of the subsequence at j plus one is greater than the current length at i (f[i] < f[j] + 1). If so, we update f[i] to f[j] + 1 and record the index j in g[i]. We also update mx to keep track of the current maximum length of any subsequence we've found.

Reconstructing the Longest Subsequence

Once we've processed all indices, we can then find out which index corresponds to the length mx. Starting from this index, we trace back through the g array, effectively reconstructing the longest subsequence in reverse order. This reverse order is necessary because we start from the end and go backwards to the beginning of the subsequence, following the stored predecessors.

Finally, we reverse the reconstructed subsequence to present it in the correct order, from the first to the last index of the subsequence.

Time Complexity: `O(n^2 * L)`

The time complexity is O(n^2 * L), where n is the number of elements in the words and groups arrays and L is the maximum length of the strings in words. This time complexity arises because we perform a nested iteration over all pairs of elements (O(n^2)) and, for each pair, potentially compare every character of the two strings involved (O(L)).

Space Complexity: `O(n)`

The space complexity is O(n) due to the additional arrays f and g that we maintain throughout the algorithm.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's use a small example to illustrate the solution approach with words ["ab", "bc", "ac", "ad"] and corresponding groups [1, 1, 2, 3].

Step 1: Initialize DP arrays

f: [1, 1, 1, 1] (Each word can be a subsequence by itself)
g: [-1, -1, -1, -1] (No predecessors yet)

Step 2: Evaluating Pairs

Compare "ab" and "bc":
- Different groups? No, both in group 1. Do not update f or g.
Compare "ab" and "ac":
- Different groups? Yes.
- Hamming distance = 1? Yes ("ab" vs "ac").
- Update: f[2] = f[0] + 1 = 2, g[2]= 0.
Update f: [1, 1, 2, 1], Update g: [-1, -1, 0, -1]
Compare "ab" and "ad":
- Different groups? Yes.
- Hamming distance = 1? Yes ("ab" vs "ad").
- Update: f[3] = f[0] + 1 = 2, g[3] = 0.
Update f: [1, 1, 2, 2], Update g: [-1, -1, 0, 0]
Compare "bc" with all previous entries, no updates since "bc" is in the same group as "ab".
Compare "ac" and "ad":
- Different groups? Yes.
- Hamming distance = 1? Yes ("ac" vs "ad").
- Update: f[3] = f[2] + 1 = 3, g[3] = 2 (since f[2] > f[0]).
Final update f: [1, 1, 2, 3], g: [-1, -1, 0, 2]

Step 3: Reconstruct the Longest Subsequence

mx is 3, and it's at index 3, word "ad".
Trace back predecessors: for "ad" (g[3] = 2), the predecessor is "ac" (at index 2).
For "ac" (g[2] = 0), the predecessor is "ab" (at index 0).
We construct the subsequence in reverse order: [3, 2, 0].

Step 4: Reverse the Subsequence

After reversing, we get [0, 2, 3].

Step 5: Return Words of Longest Subsequence

Corresponding words: ["ab", "ac", "ad"].

Through this example, the solution approach successfully identified and constructed the longest valid subsequence where adjacent words are from different groups and have a Hamming distance of 1.

Solution Implementation

1from typing import List
2
3class Solution:
4    def getWordsInLongestSubsequence(self, n: int, words: List[str], groups: List[int]) -> List[str]:
5        # Helper function to check if two strings s and t differ by exactly one character
6        def is_one_letter_diff(s: str, t: str) -> bool:
7            return len(s) == len(t) and sum(a != b for a, b in zip(s, t)) == 1
8
9        # Initialize a list to store the length of the longest subsequence ending at each word
10        lengths = [1] * n
11        # Initialize a list to store the previous index for the longest subsequence
12        previous_index = [-1] * n
13        # Variable to store the length of the longest subsequence found so far
14        max_length = 1
15      
16        # Iterate through each word
17        for i, current_group in enumerate(groups):
18            # Compare with every other word before it
19            for j, previous_group in enumerate(groups[:i]):
20                # If the groups are different, and the length can be increased, and the words differ by one letter
21                if (current_group != previous_group and 
22                    lengths[i] < lengths[j] + 1 and 
23                    is_one_letter_diff(words[i], words[j])):
24                    # Update the length of the longest subsequence ending at the current word
25                    lengths[i] = lengths[j] + 1
26                    # Update the previous index to the current longest subsequence
27                    previous_index[i] = j
28                    # Update the maximum length of the longest subsequence found
29                    max_length = max(max_length, lengths[i])
30
31        # Initialize a list to store the words of the longest subsequence
32        longest_subsequence = []
33      
34        # Iterate through each word to find the end of the longest subsequence
35        for i in range(n):
36            if lengths[i] == max_length:
37                j = i
38                # Backtrack through the previous indices to construct the longest subsequence
39                while j >= 0:
40                    longest_subsequence.append(words[j])
41                    j = previous_index[j]
42                break
43
44        # Return the longest subsequence in the correct order
45        return longest_subsequence[::-1]
46

1class Solution {
2
3    /**
4     * Finds the longest string subsequence with alternating group numbers and where each adjacent pair of words can only have one character different.
5     *
6     * @param n      The number of words in the 'words' array.
7     * @param words  An array of strings, representing the words.
8     * @param groups An array of integers, with each element indicating the group number of the corresponding word in the 'words' array.
9     * @return A list of words in the longest subsequence found.
10     */
11    public List<String> getWordsInLongestSubsequence(int n, String[] words, int[] groups) {
12        int[] longestSubsequenceLength = new int[n];
13        int[] previousIndex = new int[n];
14        Arrays.fill(longestSubsequenceLength, 1);
15        Arrays.fill(previousIndex, -1);
16        int maxLength = 1;
17
18        // Dynamic programming to find the longest subsequence
19        for (int i = 0; i < n; ++i) {
20            for (int j = 0; j < i; ++j) {
21                // Check if alternating groups and if it's beneficial to extend the sequence from j to i
22                if (groups[i] != groups[j] && longestSubsequenceLength[i] < longestSubsequenceLength[j] + 1 && canTransform(words[i], words[j])) {
23                    longestSubsequenceLength[i] = longestSubsequenceLength[j] + 1;
24                    previousIndex[i] = j;
25                    maxLength = Math.max(maxLength, longestSubsequenceLength[i]);
26                }
27            }
28        }
29
30        // Constructing the longest subsequence from the dynamic programming table
31        List<String> result = new ArrayList<>();
32        for (int i = 0; i < n; ++i) {
33            if (longestSubsequenceLength[i] == maxLength) {
34                for (int j = i; j >= 0; j = previousIndex[j]) {
35                    result.add(words[j]);
36                }
37                break;
38            }
39        }
40
41        // The sequence is constructed in reverse so we need to reverse it to get the correct order
42        Collections.reverse(result);
43        return result;
44    }
45
46    /**
47     * Checks if one string can be transformed into another string by changing exactly one character.
48     *
49     * @param firstWord  The first string to be compared.
50     * @param secondWord The second string to be compared.
51     * @return A boolean indicating if the transformation is possible (true) or not (false).
52     */
53    private boolean canTransform(String firstWord, String secondWord) {
54        if (firstWord.length() != secondWord.length()) {
55            return false;
56        }
57        int diffCount = 0;
58        for (int i = 0; i < firstWord.length(); ++i) {
59            if (firstWord.charAt(i) != secondWord.charAt(i)) {
60                diffCount++;
61            }
62        }
63        return diffCount == 1;
64    }
65}
66

1#include <vector>
2#include <string>
3#include <algorithm>
4
5using namespace std;
6
7class Solution {
8public:
9    vector<string> getWordsInLongestSubsequence(int n, vector<string>& words, vector<int>& groups) {
10        // Lambda function to check if two strings differ by exactly one character
11        auto canFormSequence = [](const string& left, const string& right) {
12            if (left.size() != right.size()) {
13                return false;
14            }
15            int diffCount = 0;
16            for (int i = 0; i < left.size(); ++i) {
17                diffCount += left[i] != right[i];
18            }
19            return diffCount == 1;
20        };
21
22        // Vector to store the length of the longest subsequence at each index
23        vector<int> longestSubsequenceLength(n, 1);
24         // Vector to trace back the longest subsequence
25        vector<int> previousIndex(n, -1);
26        int maxLength = 1; // To keep track of the overall max length of subsequences
27      
28        // Iterate through words and compare them for potential subsequences
29        for (int i = 0; i < n; ++i) {
30            for (int j = 0; j < i; ++j) {
31                // Check for group mismatch, subsequence length, and if strings can form a sequence
32                if (groups[i] != groups[j] && longestSubsequenceLength[i] < longestSubsequenceLength[j] + 1 && canFormSequence(words[i], words[j])) {
33                    // Update the longest subsequence length and previous index at current position
34                    longestSubsequenceLength[i] = longestSubsequenceLength[j] + 1;
35                    previousIndex[i] = j;
36                    // Update the overall max length
37                    maxLength = max(maxLength, longestSubsequenceLength[i]);
38                }
39            }
40        }
41
42        // Vector to store the answer (the words in the longest subsequence)
43        vector<string> result;
44        // Iterate over words to find the ones that are part of the longest subsequence
45        for (int i = 0; i < n; ++i) {
46            // If the current word is at the end of the longest subsequence
47            if (longestSubsequenceLength[i] == maxLength) {
48                // Trace back through the indices to build the sequence
49                for (int j = i; j != -1; j = previousIndex[j]) {
50                    result.push_back(words[j]);
51                }
52                break; // only need one such sequence, break after finding the first one
53            }
54        }
55
56        // As we traced back, the sequence will be in reverse; we need to reverse it again
57        reverse(result.begin(), result.end());
58      
59        return result; // Return the final sequence of words
60    }
61};
62

1// Checks if two strings differ by exactly one character. 
2const checkOneCharDiff = (s: string, t: string): boolean => {
3  if (s.length !== t.length) {
4    return false;
5  }
6
7  let diffCount: number = 0;
8  for (let i = 0; i < s.length; ++i) {
9    if (s[i] !== t[i]) {
10      diffCount++;
11    }
12  }
13
14  return diffCount === 1;
15};
16
17// Gets the longest subsequence of words following the specific rules.
18function getWordsInLongestSubsequence(n: number, words: string[], groups: number[]): string[] {
19  // Initialize the longest subsequence length array with 1 for each word.
20  const longestSubSeqLength: number[] = Array(n).fill(1);
21  // Array to track the previous word in the sequence for each word.
22  const prevWordIndex: number[] = Array(n).fill(-1);
23  let maxSequenceLength: number = 1; // Length of the maximum subsequence found.
24
25  // Building up the longest subsequence length and previous word for each word
26  for (let i = 0; i < n; ++i) {
27    for (let j = 0; j < i; ++j) {
28      if (groups[i] !== groups[j] && 
29          longestSubSeqLength[i] < longestSubSeqLength[j] + 1 && 
30          checkOneCharDiff(words[i], words[j])) {
31        longestSubSeqLength[i] = longestSubSeqLength[j] + 1;
32        prevWordIndex[i] = j;
33        maxSequenceLength = Math.max(maxSequenceLength, longestSubSeqLength[i]);
34      }
35    }
36  }
37
38  // Finding and constructing the longest subsequence by tracing back from the end.
39  const longestSubsequence: string[] = [];
40  for (let i = 0; i < n; ++i) {
41    if (longestSubSeqLength[i] === maxSequenceLength) {
42      for (let j = i; j !== -1; j = prevWordIndex[j]) {
43        longestSubsequence.push(words[j]);
44      }
45      break;
46    }
47  }
48
49  // The words are pushed in reverse order, so we need to reverse the array before returning.
50  return longestSubsequence.reverse();
51}
52

Time and Space Complexity

Time Complexity

The time complexity of the given code is O(n^2 * L). The code consists of two nested loops where n is the number of words, and it compares pairs of words where each word has a length of L. The function check is called for each pair, which operates in O(L) time to compare two words and check if their Hamming distance is exactly 1.

However, as mentioned in the reference answer, we can optimize the process by using a wildcard hash table. When using a wildcard hash table, we create all possible variants of a word by replacing each letter once with a wildcard. We use these variants as keys in a hash table that map to indices of words in the original list. This operation takes O(L) time for each word (because we're iterating through each letter and creating variants) and is done for all n words, leading to O(n * L) for this preprocessing step. Subsequently, for each word, we look up its variants in the hash table in O(L) time to find all potential matches with a Hamming distance of 1. Since each lookup yields a constant number of potential matches on average (assuming close to uniform distribution of words across the different wildcard variants), the average time complexity for processing all words drops.

Thus, while the worst-case time complexity remains the same, in practice, we expect the average complexity to be lower due to fewer pairs of words being compared.

Space Complexity

The space complexity of the original solution is O(n). The code creates two lists, f and g, both of size n. There are no additional data structures that grow with the input size; therefore, the space complexity is linear with respect to the number of words.

With the optimization mentioned in the reference answer, we would have to account for the space taken by the wildcard hash table. The hash table would have potentially O(n * L) keys (since each of n words contributes L wildcard variants). However, because each word leads to at most L keys and the space for indices is shared among keys, the size of the list of indices is O(n).

Thus, the optimized solution has a space complexity of O(n + L * n), which simplifies to O(n * L) since L could be larger than 1 and in this case, dominates the n term. This represents a trade-off compared to the original solution: better average time efficiency at the cost of increased space usage.

Learn more about how to find time and space complexity quickly using problem constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Question 1 out of 10

You are given an array of intervals where intervals[i] = [start_i, end_i] represent the start and end of the ith interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.

Sort the intervals by their start time, then iterate through the sorted intervals, merging any overlapping intervals.

Brute force through all interval combinations and find the answer with the least total intervals.

Iterate through the intervals in their original order, merging any overlapping intervals.

Use a priority queue to keep track of the intervals, merging any overlapping intervals.

I don’t know