727. Minimum Window Subsequence


Problem Description

In this problem, you are given two strings s1 and s2. Your task is to find the shortest contiguous substring in s1 such that the entire string s2 is a subsequence of that substring. A subsequence is a sequence that can be derived from the other sequence by deleting some or no elements without changing the order of the remaining elements.

For example, if s1 is "abcdebdde" and s2 is "bde", you have to find the shortest part of s1 that contains all the characters in s2 in the same order.

Here are the conditions you need to satisfy:

  • If no such window exists, return an empty string "".
  • If there are multiple substrings of s1 that satisfy the condition, return the one with the left-most starting index.

In the example given, the minimum window in s1 where s2 is a subsequence is "bcde".

Intuition

The key to solving this problem is understanding dynamic programming and the process of subsequence matching. We must scan through s1 and s2 to find all possible match positions in a way that allows us to efficiently calculate the minimum window length where s2 is a subsequence.

The intuition behind the solution is to first find all the matchings between characters of s2 with s1, and keep track of the starting index of the sequence in s1 that matches up to a certain point in s2.

To solve this, we create a 2D array f where f[i][j] represents the starting index in s1 from which we have a matching sequence for s2 up to its j-th character at i-th index of s1.

Once we have this information stored in the f array, we iterate through s1 to find the character that matches the last character of s2. Each time we find such a match, we look into our f array for the starting index and calculate the window size.

The minimum window size is kept updated during the scan, and finally, we have the starting index and the size of our minimum window which we use to return our result.

The entire algorithm takes into account:

  • Dynamic Programming to solve the subsequence matching.
  • Iterating s1 and s2 smartly to minimize unnecessary comparisons.
  • Keeping track of the minimum window during the iteration.
  • Handling edge cases effectively, like when no window exists.

Learn more about Dynamic Programming and Sliding Window patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which two pointer technique does Quick Sort use?

Solution Approach

The solution provided uses a two-dimensional dynamic programming approach:

  1. Initialization:

    • The 2D array f is created with a size of (m+1) x (n+1) where m is the length of s1 and n is the length of s2. This array will help track matches between s1 and s2.
  2. Filling the DP array:

    • We iterate over both strings starting from index 1 (because we've initialized from 0 as part of the dynamic programming table setup), comparing each character of s1 with each character of s2.
    • When a match (a == b) is found:
      • If it is the first character of s2 (j == 1), we record this position i as a starting point of a matching sequence because it could potentially start a new subsequence.
      • For other characters, we copy the starting index from f[i - 1][j - 1], which means we extend the subsequence found until the previous character of s2.
    • When there is no match, we get the starting index from the previous value f[i - 1][j] because we want to retain the starting position of the best match found so far for s2 up to j.
  3. Identifying the minimum window:

    • Now, we look for the last character of s2 in s1 to try and close a potential window.
    • For each matching position i in s1 where s1[i] == s2[n - 1] and a subsequence match has been found (f[i][n] is not zero), we calculate the window size by subtracting the starting index j (which is f[i][n] - 1) from i. We keep track of the smallest window found in variables k for size and p for the starting index.
  4. Returning the result:

    • If we have not found any window (k > m), we return an empty string.
    • Otherwise, we return the substring of s1 starting from p with the length k.

The choice of dynamic programming in this solution is crucial as it eliminates redundant comparisons and stores intermediate results, allowing efficient computation of the final minimum length substring. This algorithm has an O(m * n) time complexity due to the nested loops required to fill the DP table, where m and n are the lengths of s1 and s2, respectively.

The solution is concise and the use of dynamic programming provides optimal substructure and overlapping subproblems, two key characteristics exploited by this paradigm to achieve efficiency. Each entry in the DP table only depends on previously computed values, making the implementation straightforward and logical once the table relationships are understood.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Breadth first search can be used to find the shortest path between two nodes in a directed graph.

Example Walkthrough

Let's illustrate the solution approach using a small example.

Suppose s1 is "axbxcxdx" and s2 is "bcd".

Following the solution approach:

  1. Initialization:

    • We set up a 2D array f with a size of (8+1) x (3+1) since s1 is of length 8 and s2 is of length 3.
  2. Filling the DP array:

    • We iterate over s1 and s2, filling up the f array.
    • Let's iterate over s1: "axbxcxdx" and s2: "bcd".
      • When i = 1 and j = 1, we find that s1[i] != s2[j], so we don't update f[1][1].
      • When i = 2 and j = 1, we find s1[i] == s2[j] ('b' == 'b'), so we update f[2][1] to 2. This marks the start of a possible subsequence.
      • Continuing this process, we find the 'c' of s2 in s1 at position 4, so f[4][2] is updated to 2, indicating that from position 2 in s1, we have 'bc' of s2.
      • We find the 'd' of s2 in s1 at position 6, so f[6][3] is updated to 2, meaning from position 2 in s1, we have the full 'bcd' of s2.
  3. Identifying the minimum window:

    • We scan f looking for the smallest window where s1[i] == s2[3] ('d' in this case) and f[i][3] is not zero.
    • We find this at i = 6 for s2[3] ('d'), where f[i][3] is 2. The window size is 6 - 2 + 1 = 5.
  4. Returning the result:

    • The smallest window has a size of 5 starting at index 2 in s1. So, the result is the substring "bxcxd".

Following this process, we've identified the subsequence 'bcd' within s1 and found the minimum window. The algorithm efficiently computes this by tracking possible starting points for subsequences in s1 and keeping track of these starting points as we match characters of s2. This allows us to quickly compute the length of potential windows without redundant comparisons.

Solution Implementation

1class Solution:
2    def minWindow(self, s1: str, s2: str) -> str:
3        # Length of input strings
4        len_s1, len_s2 = len(s1), len(s2)
5
6        # Initialize a DP table with dimensions (len_s1+1) x (len_s2+1)
7        dp = [[0] * (len_s2 + 1) for _ in range(len_s1 + 1)]
8
9        # Fill the DP table
10        for i, char_s1 in enumerate(s1, 1):
11            for j, char_s2 in enumerate(s2, 1):
12                # If characters match, propagate the match information
13                if char_s1 == char_s2:
14                    dp[i][j] = i if j == 1 else dp[i - 1][j - 1]
15                else:  # If not, inherit the value from previous s1 character
16                    dp[i][j] = dp[i - 1][j]
17
18        # Variables to keep track of the start position and length of the minimum window
19        min_window_start_pos = 0
20        min_window_length = len_s1 + 1
21
22        # Find the minimum window in s1 which contains s2
23        for i, char_s1 in enumerate(s1, 1):
24            # When the last character of s2 is matched in s1 and a match sequence exists
25            if char_s1 == s2[len_s2 - 1] and dp[i][len_s2]:
26                match_start = dp[i][len_s2] - 1
27                window_length = i - match_start
28                # Update the minimum window if a smaller one is found
29                if window_length < min_window_length:
30                    min_window_length = window_length
31                    min_window_start_pos = match_start
32
33        # Check if a valid window was ever found, if not return an empty string
34        return "" if min_window_length > len_s1 else s1[min_window_start_pos: min_window_start_pos + min_window_length]
35
1class Solution {
2    public String minWindow(String s1, String s2) {
3        int s1Length = s1.length(), s2Length = s2.length();
4      
5        // table to store the start index of the window in s1 that ends at i and has s2.charAt(j)
6        int[][] windowStartAtIndex = new int[s1Length + 1][s2Length + 1];
7      
8        // initialize the table with default values
9        for (int i = 0; i <= s1Length; i++) {
10            Arrays.fill(windowStartAtIndex[i], -1);
11        }
12      
13        // fill the table based on the input strings s1 and s2
14        for (int i = 1; i <= s1Length; ++i) {
15            for (int j = 1; j <= s2Length; ++j) {
16                // On matching characters, update the table with the start index of the current window
17                // If it's the first character of s2, the start is the current index in s1
18                // Otherwise, it's the index stored in the previous position of the table
19                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
20                    windowStartAtIndex[i][j] = (j == 1) ? i : windowStartAtIndex[i - 1][j - 1];
21                } else {
22                    // If there's no match, inherit the value from the previous index of s1
23                    windowStartAtIndex[i][j] = windowStartAtIndex[i - 1][j];
24                }
25            }
26        }
27      
28        // position and length of the minimum window
29        int startPosition = 0, minLength = s1Length + 1;
30      
31        // find the smallest window in s1 that has all characters of s2
32        for (int i = 1; i <= s1Length; ++i) {
33            // check if the current position is the end of a valid window, i.e., it matches last character of s2
34            if (s1.charAt(i - 1) == s2.charAt(s2Length - 1) && windowStartAtIndex[i][s2Length] > 0) {
35                int j = windowStartAtIndex[i][s2Length] - 1; // the window's start position in s1
36                int currentLength = i - j; // the length of the current window
37                // update minimum length window, if the current window is smaller
38                if (currentLength < minLength) {
39                    minLength = currentLength;
40                    startPosition = j;
41                }
42            }
43        }
44      
45        // if no valid window is found, return an empty string
46        // otherwise, return the substring from startPosition with minLength
47        return minLength > s1Length ? "" : s1.substring(startPosition, startPosition + minLength);
48    }
49}
50
1#include <cstring> // include this to use memset
2
3class Solution {
4public:
5    string minWindow(string s1, string s2) {
6        int mainStrSize = s1.size(), subStrSize = s2.size();
7        int lastIndex[subStrSize + 1]; // lastIndex[i] will store the last index of subStr's ith character in mainStr
8        memset(lastIndex, 0, sizeof(lastIndex)); // initializes lastIndex array with 0
9
10        for (int i = 1; i <= mainStrSize; ++i) {
11            for (int j = 1; j <= subStrSize; ++j) {
12                if (s1[i - 1] == s2[j - 1]) {
13                    // If characters match, store the index of start of the subsequence
14                    lastIndex[j] = (j == 1) ? i : lastIndex[j - 1];
15                } else {
16                    // If characters don't match, carry forward the last index
17                    lastIndex[j] = lastIndex[j - 1];
18                }
19            }
20        }
21
22        // Initialize variables for storing the start position and the length of the minimum window
23        int startPosition = 0, minLength = mainStrSize + 1;
24
25        // Loop to find the minimum window in s1 which has s2 as a subsequence
26        for (int i = 1; i <= mainStrSize; ++i) {
27            if (s1[i - 1] == s2[subStrSize - 1] && lastIndex[subStrSize]) {
28                int start = lastIndex[subStrSize] - 1; // Find the start position of the subsequence
29                int length = i - start; // Calculate the length of the window
30                if (length < minLength) { // If this is smaller than the previously found minimum
31                    minLength = length; // Update minLength with the new smaller length
32                    startPosition = start; // Update the start position of the minimum window
33                }
34            }
35        }
36
37        // Check if a valid window was found. If minLength is still greater than mainStrSize, no valid window was found
38        return minLength > mainStrSize ? "" : s1.substr(startPosition, minLength);
39    }
40};
41
1function minWindow(source: string, target: string): string {
2    // Lengths of the source and target strings
3    const sourceLength = source.length;
4    const targetLength = target.length;
5
6    // Initialize a 2D array to store the start index of the substring
7    const startIndex: number[][] = Array(sourceLength + 1)
8        .fill(0)
9        .map(() => Array(targetLength + 1).fill(0));
10
11    // Populate the 2D array with the start index of the substring ending with s1[i] and s2[j]
12    for (let i = 1; i <= sourceLength; ++i) {
13        for (let j = 1; j <= targetLength; ++j) {
14            // When characters match, store the start index if it starts with the first character of s2,
15            // else get the index from the previous character in source and target
16            if (source[i - 1] === target[j - 1]) {
17                startIndex[i][j] = j === 1 ? i : startIndex[i - 1][j - 1];
18            } else {
19                // If characters do not match, carry over the start index from the previous character in source
20                startIndex[i][j] = startIndex[i - 1][j];
21            }
22        }
23    }
24
25    // Variables to store the starting point and smallest window size found so far
26    let startingPoint = 0;
27    let smallestWindowSize = sourceLength + 1;
28
29    // Find the smallest window in source that contains all characters of target in order
30    for (let i = 1; i <= sourceLength; ++i) {
31        // Check for the last character match and if there is a valid starting index
32        if (source[i - 1] === target[targetLength - 1] && startIndex[i][targetLength]) {
33            // Calculate the starting index and window size
34            const j = startIndex[i][targetLength] - 1;
35            if (i - j < smallestWindowSize) {
36                smallestWindowSize = i - j;
37                startingPoint = j;
38            }
39        }
40    }
41
42    // If smallest window size is larger than source length, target is not found; return an empty string
43    return smallestWindowSize > sourceLength ? '' : source.slice(startingPoint, startingPoint + smallestWindowSize);
44}
45
Not Sure What to Study? Take the 2-min Quiz:

What is the running time of the following code?

1int sqrt(int n) {
2  for (int guess = 1; guess * guess <= n; guess++) {
3    if (guess * guess == n) {
4      return guess;
5    }
6  }
7  return -1;
8}

Time and Space Complexity

The given Python code snippet is designed to find the smallest window in string s1 which contains all the characters of string s2 in the same order. It utilizes dynamic programming to achieve this.

Time Complexity

The time complexity of the code can be analyzed by looking at the nested loops:

  1. The first pair of nested loops, where the outer loop runs for m iterations (m being the length of s1) and the inner loop runs for n iterations (n being the length of s2), establishes a time complexity of O(m * n) for the dynamic programming table population.

  2. The second pair of nested loops also runs up to m times,and the inner operations are constant time since they're only comparing and updating values based on previously computed results. Therefore, the second nested loop does not exceed O(m) in complexity.

When combined, the time complexity is dictated by the larger of these loops, which is the first one. Therefore, the overall time complexity of the algorithm is O(m * n).

Space Complexity

The space complexity of the code is determined by the size of the dynamic programming table f. Since the table is of size (m + 1) * (n + 1), where m is the length of s1 and n is the length of s2, the space complexity is O(m * n). No additional significant space is consumed since only integer variables for bookkeeping purposes are used outside the table.

In summary, the time complexity is O(m * n) and the space complexity is O(m * n).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

How does quick sort divide the problem into subproblems?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫