2565. Subsequence With the Minimum Score


Problem Description

In this LeetCode problem, we are given two strings s and t. Our goal is to transform string t into a subsequence of string s. A subsequence of a string is a new string derived from the original string by deleting some or no characters, without changing the order of the remaining characters.

We are allowed to remove any number of characters from string t. However, depending on which characters we remove, we will get a different score. If we don't remove any characters, the score is 0. Otherwise, the score is determined by the positions of the removed characters in t:

  • Let left be the position of the first removed character in t, with positions starting at index 0.
  • Let right be the position of the last removed character in t.

The score is then defined as right - left + 1, which is effectively the length of the segment in t that includes all the removed characters.

The objective is to find the minimum possible score by removing characters from t in such a way that t becomes a subsequence of s.

Intuition

To arrive at the solution for this problem, we need to find the shortest segment (if any) of t that can be removed while still allowing t to be a subsequence of s. We can use two pointers to iterate over both s and t and mark down the position of each character of t in s if it exists.

We use an array f to store the leftmost positions of t in s, and another array g to store the rightmost positions of t in s. This helps us in understanding which part of t can be removed to make it a subsequence of s. If a character from t is not in s, we cannot form a subsequence, and the score would potentially be the length of t.

Once we have the leftmost and rightmost positions, we conduct a binary search to find the minimum score. The check function uses the precomputed f and g arrays to verify if by removing a segment of length x from t, it still remains a subsequence of s.

The binary search is applied over a range from 0 to n + 1 (where n is the length of t). This is because the length of the segment removed could be at most the entire string.

The bisect_left function from Python's bisect library is used to find the smallest x for which check(x) is True. This x corresponds to the minimum score that makes t a subsequence of s.

The intuition behind using binary search is that if removing a certain length x allows t to be a subsequence, then any longer length would also work, but we are interested in the smallest such length. Conversely, if length x does not allow t to be a subsequence, then any shorter length will not work either.

Learn more about Two Pointers and Binary Search patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Given an array of 1,000,000 integers that is almost sorted, except for 2 pairs of integers. Which algorithm is fastest for sorting the array?

Solution Approach

The solution approach involves several steps and implements binary search to minimize the score effectively. The following outlines the algorithm, data structures, and patterns used in the provided Python code:

  1. Preprocessing with Two Pointer Technique: The first step is to iterate through the strings s and t using two pointers. As we go through both strings simultaneously, we record two things:

    • The leftmost index in s at which each character of t can be found. This is stored in an array f.
    • The rightmost index in s at which each character of t can be found. This is stored in an array g.
  2. Binary Search: After preprocessing, we use binary search to find the minimum score that allows t to become a subsequence of s. We know that the score represents the number of characters to be deleted from t, which falls within a range starting at 0 (no deletion) up to the length of t (deleting everything). Here's the binary search step:

    • We define a function check that takes a score x as an argument and checks whether t can be made a subsequence of s by removing a segment of that length.
    • The function iterates over the range [0, n], where n is the length of t, to find the starting point k of the segment to be deleted.
    • For each starting point, i is set to k - 1 and j is set to k + x, determining the supposed ends of the segment to be deleted.
    • If i is negative, it means the segment starts before t; in this case, f[i] is set to -1. Similarly, if j is greater than n, it means the segment ends after t, and g[j] is set to a value larger than the length of s.
    • The function checks whether the position l, the last index before the segment starts, is less than r, the first index after the segment ends. If this is true at any point, it means there exists a segment of length x whose removal allows t to be a subsequence of s.
  3. Finding the Minimum Score: To find the minimum score, the code uses the bisect_left function, which effectively performs the binary search:

    • It's called with a range of possible scores ([0, n+1]).
    • The key parameter is the check function, which determines whether a given score makes t a subsequence of s.
    • bisect_left returns the smallest score for which t can still be made a subsequence of s, thus giving us the desired minimum score.

By utilizing these algorithmic steps and the power of binary search, the solution finds the smallest length of a segment that can be removed from t while ensuring t remains a subsequence of s, minimizing the score as required by the problem.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What is the best way of checking if an element exists in an unsorted array once in terms of time complexity? Select the best that applies.

Example Walkthrough

Let's use a small example to illustrate the solution approach.

Suppose s = "abpcplea" and t = "apple". We want to find the minimum score by removing as small a segment as possible from t to make it a subsequence of s.

Step 1 – Preprocessing with Two Pointer Technique

  1. Initialize two arrays f and g of the same length as t to store leftmost and rightmost positions of characters from t in s.
  2. Set two pointers i and j. i goes over t from left to right, j goes over s from left to right.
  3. Find leftmost positions for each character in t and store it in f. For "apple":
    • f[0] (for a) is 0 because 'a' appears at index 0 in s.
    • f[1] (for p) is 2, as 'p' first appears at index 2 in s.
    • f[2] is also 2, since the same 'p' is used in s.
    • f[3] (for l) is 5, as 'l' appears at index 5 in s.
    • f[4] (for e) is 6 because 'e' is found at index 6 in s.
  4. Use the two-pointers technique again but with j going over s from right to left to find rightmost positions, store in g.
    • g[4] (for e) is 6.
    • g[3] (for l) is 5.
    • g[1] and g[2] (for p) is 3, as 'p' last appears at index 3 in s.
    • g[0] (for a) is 0.

Step 2 – Binary Search

  1. Define a check function that, given a score x, checks if t can be a subsequence of s after removing a segment of x characters from t.
  2. Perform a binary search to find the minimum score x such that check returns true.

For our binary search, we will look at possibilities across the length of t+1, which is 6 in this case.

Step 3 – Finding the Minimum Score

Consider a possible score x, say 2. We try to check if removing any 2 consecutive characters from t would still allow it to be a subsequence of s.

Test all possibilities for x = 2:

  1. Remove "ap" from "apple" (t = "ple"). The segment "ple" can be found in "abpcplea" after "bp".
  2. Remove "pp" from "apple" (t = "ale"). The segment "ale" is not found in the order within "abpcplea".
  3. Remove "pl" (t = "ape"). The segment "ape" can be found in "abpcplea".
  4. Remove "le" (t = "app"). The segment "app" can be found in "abpcplea".

It shows that we can remove a segment of length 2 and still have t be a subsequence of s.

To ensure it is the smallest possible score, the binary search algorithm will continue to check smaller values if possible. Since there's no smaller valid score than 2 in this case (besides 0, which doesn't work), our binary search concludes with a score of 2.

Therefore, the output of the process using our example is that the minimum score is 2.

Solution Implementation

1from bisect import bisect_left
2from math import inf
3
4class Solution:
5    def minimumScore(self, source: str, target: str) -> int:
6        # Helper function to check if there is a subsequence of length 'length'
7        # within source that matches target.
8        def is_match_possible(length):
9            for start_index in range(source_length):
10                prev_index = start_index - 1
11                next_index = start_index + length
12                left_bound = prefix_matches[prev_index] if prev_index >= 0 else -1
13                right_bound = suffix_matches[next_index] if next_index < source_length else source_matches + 1
14                if left_bound < right_bound:
15                    return True
16            return False
17
18        source_matches, source_length = len(source), len(target)
19        # Arrays to store the first and last match indices for each character in 'target'
20        prefix_matches = [inf] * source_length
21        suffix_matches = [-1] * source_length
22
23        # Forward pass to populate prefix matches of target in source
24        source_index, target_index = 0, 0
25        while source_index < source_matches and target_index < source_length:
26            if source[source_index] == target[target_index]:
27                prefix_matches[target_index] = source_index
28                target_index += 1
29            source_index += 1
30
31        # Backward pass to populate suffix matches of target in source
32        source_index, target_index = source_matches - 1, source_length - 1
33        while source_index >= 0 and target_index >= 0:
34            if source[source_index] == target[target_index]:
35                suffix_matches[target_index] = source_index
36                target_index -= 1
37            source_index -= 1
38
39        # Using binary search to find the minimum length of subsequence
40        # required such that 'target' is a subsequence of 'source'.
41        # The search range is from 0 to 'source_length' + 1.
42        return bisect_left(range(source_length + 1), True, key=is_match_possible)
43
1class Solution {
2  
3    private int stringLength; // Length of string 's'
4    private int subStringLength; // Length of string 't'
5    private int[] forward; // Array to keep track of the forward pass
6    private int[] backward; // Array to keep track of the backward pass
7  
8    // Method to compute the minimum score for converting 's' into 't'
9    public int minimumScore(String s, String t) {
10        stringLength = s.length();
11        subStringLength = t.length();
12        forward = new int[subStringLength];
13        backward = new int[subStringLength];
14      
15        // Initialize forward and backward arrays with default values
16        for (int i = 0; i < subStringLength; ++i) {
17            forward[i] = Integer.MAX_VALUE; // Using MAX_VALUE instead of 1 << 30 for clarity
18            backward[i] = -1;
19        }
20      
21        // Forward pass to populate 'forward' array
22        for (int i = 0, j = 0; i < stringLength && j < subStringLength; ++i) {
23            if (s.charAt(i) == t.charAt(j)) {
24                forward[j] = i;
25                ++j;
26            }
27        }
28      
29        // Backward pass to populate 'backward' array
30        for (int i = stringLength - 1, j = subStringLength - 1; i >= 0 && j >= 0; --i) {
31            if (s.charAt(i) == t.charAt(j)) {
32                backward[j] = i;
33                --j;
34            }
35        }
36      
37        // Binary search to find the minimum length of substring that is good
38        int left = 0, right = subStringLength;
39        while (left < right) {
40            int mid = (left + right) >> 1;
41            if (check(mid)) {
42                right = mid;
43            } else {
44                left = mid + 1;
45            }
46        }
47      
48        return left;
49    }
50  
51    // Helper method to check if there's a good string of length 'len'
52    private boolean check(int len) {
53        for (int k = 0; k < subStringLength; ++k) {
54            int i = k - 1, j = k + len;
55            int left = i >= 0 ? forward[i] : -1;
56            int right = j < subStringLength ? backward[j] : stringLength + 1;
57          
58            // If there is overlap between the indices, a good substring exists
59            if (left < right) {
60                return true;
61            }
62        }
63        return false;
64    }
65}
66
1class Solution {
2public:
3    int minimumScore(string s, string t) {
4        int sourceSize = s.size();
5        int targetSize = t.size();
6        vector<int> firstOccurrence(targetSize, 1e6);
7        vector<int> lastOccurrence(targetSize, -1);
8      
9        // Populate firstOccurrence with the indices of the first occurrences
10        // of the characters of 't' in 's'.
11        for (int si = 0, ti = 0; si < sourceSize && ti < targetSize; ++si) {
12            if (s[si] == t[ti]) {
13                firstOccurrence[ti] = si;
14                ++ti;
15            }
16        }
17      
18        // Populate lastOccurrence with the indices of the last occurrences
19        // of the characters of 't' in 's', traversing 's' in reverse.
20        for (int si = sourceSize - 1, ti = targetSize - 1; si >= 0 && ti >= 0; --si) {
21            if (s[si] == t[ti]) {
22                lastOccurrence[ti] = si;
23                --ti;
24            }
25        }
26      
27        // Lambda function to check if there's a subsequence of 't' of a given length 'len'
28        // that occurs completely within 's'.
29        auto checkSubsequenceLength = [&](int len) {
30            for (int ti = 0; ti < targetSize; ++ti) {
31                int prevIndex = ti - 1;
32                int nextIndex = ti + len;
33                int left = prevIndex >= 0 ? firstOccurrence[prevIndex] : -1;
34                int right = nextIndex < targetSize ? lastOccurrence[nextIndex] : sourceSize + 1;
35                if (left < right) {
36                    return true;
37                }
38            }
39            return false;
40        };
41      
42        // Perform binary search to find the minimum length of the subsequence
43        // of 't' that occurs in 's'.
44        int leftBoundary = 0, rightBoundary = targetSize;
45        while (leftBoundary < rightBoundary) {
46            int mid = (leftBoundary + rightBoundary) >> 1;
47            if (checkSubsequenceLength(mid)) {
48                rightBoundary = mid;
49            } else {
50                leftBoundary = mid + 1;
51            }
52        }
53      
54        // The leftBoundary holds the minimum length after the binary search ends.
55        return leftBoundary;
56    }
57};
58
1// Return the minimum subsequence length of 't' that occurs in 's'.
2function minimumScore(s: string, t: string): number {
3    const sourceSize: number = s.length;
4    const targetSize: number = t.length;
5    const firstOccurrence: number[] = new Array(targetSize).fill(1e6);
6    const lastOccurrence: number[] = new Array(targetSize).fill(-1);
7
8    // Populate firstOccurrence with the indices of the first occurrences
9    // of the characters of 't' in 's'.
10    for (let si = 0, ti = 0; si < sourceSize && ti < targetSize; ++si) {
11        if (s[si] === t[ti]) {
12            firstOccurrence[ti] = si;
13            ++ti;
14        }
15    }
16
17    // Populate lastOccurrence with the indices of the last occurrences
18    // of the characters of 't' in 's', traversing 's' in reverse.
19    for (let si = sourceSize - 1, ti = targetSize - 1; si >= 0 && ti >= 0; --si) {
20        if (s[si] === t[ti]) {
21            lastOccurrence[ti] = si;
22            --ti;
23        }
24    }
25
26    // Check if there's a subsequence of 't' of a given length 'len'
27    // that occurs completely within 's'.
28    const checkSubsequenceLength = (len: number): boolean => {
29        for (let ti = 0; ti < targetSize; ++ti) {
30            const prevIndex: number = ti - 1;
31            const nextIndex: number = ti + len;
32            const left: number = prevIndex >= 0 ? firstOccurrence[prevIndex] : -1;
33            const right: number = nextIndex < targetSize ? lastOccurrence[nextIndex] : sourceSize + 1;
34            if (left < right) {
35                return true;
36            }
37        }
38        return false;
39    };
40
41    // Perform binary search to find the minimum length of the subsequence
42    // of 't' that occurs in 's'.
43    let leftBoundary: number = 0, rightBoundary: number = targetSize;
44    while (leftBoundary < rightBoundary) {
45        const mid: number = Math.floor((leftBoundary + rightBoundary) / 2);
46        if (checkSubsequenceLength(mid)) {
47            rightBoundary = mid;
48        } else {
49            leftBoundary = mid + 1;
50        }
51    }
52
53    // The leftBoundary holds the minimum length after the binary search ends.
54    return leftBoundary;
55}
56
Not Sure What to Study? Take the 2-min Quiz:

In a binary min heap, the minimum element can be found in:

Time and Space Complexity

The provided code snippet is trying to solve a problem where it is necessary to find the minimum length of a substring of s that contains the string t as a subsequence.

The principal operations of interest in this code from a computational complexity standpoint are the two while loops that iterate over the indices of the strings s and t to create arrays f and g, and the binary search function bisect_left combined with the check function.

Time Complexity

  1. Initializing f and g: This is done in constant time, O(n) for each array, where n is the length of string t.

  2. First while loop: It iterates over all characters in s and t. This runs in O(m + n), where m is the length of string s and n is the length of string t, since each character in both strings is visited at most once.

  3. Second while loop: Similar to the first, this loop runs in O(m + n) time.

  4. Binary search with check function: Binary search runs in O(log n) on the range n + 1. The check function is called in each iteration of the binary search, and it runs in O(n) since it iterates over n elements. Therefore, the combined complexity for the binary search with the check function is O(n * log n).

Combining these, the overall time complexity is O(n + n + (m + n) + (m + n) + n * log n) which simplifies to O(m + n * log n) since n * log n will dominate as n grows.

Space Complexity

  1. Arrays f and g: Both arrays have a space complexity of O(n) each, where n is the length of string t.

  2. Range for binary search: This does not require additional space as the range is not materialized into a list but used as an iterable in bisect_left.

  3. Variables and pointers: Only a constant number of integer variables and pointers are used, which adds an O(1) space complexity.

Overall, the space complexity is the sum of the space taken by f and g, which gives us O(n) + O(n) that simplifies to O(n).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which of these pictures shows the visit order of a depth-first search?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫