1898. Maximum Number of Removable Characters
Problem Description
You are provided with two strings s
and p
, where p
must be a subsequence of s
. There is also an array called removable
with distinct indices from s
. The goal is to find out the maximum number of characters you can remove from s
(using the indices provided in removable
from lowest to highest) so that p
remains a subsequence of s
. The steps for removing characters are as follows: mark characters in s
at indices indicated by the first k
elements of removable
, remove them, and then check if p
remains a subsequence of the modified s
. A subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.
Intuition
The key to solving this problem lies in understanding that once you remove characters from s
, you must check whether each subsequent removal maintains p
as a subsequence. Since removable
contains distinct indices, and we want to maximize k
, we can approach this problem using binary search.
The intuition behind using binary search here is that if removing k
characters still leaves p
as a subsequence, removing fewer than k
characters will also leave p
as a subsequence. Conversely, if p
is not a subsequence after removing k
characters, removing more than k
characters will certainly not leave p
as a subsequence. Therefore, there is a threshold value for k
which we want to find - the maximum k
for which p
remains a subsequence. Binary search allows us to efficiently home in on this threshold by halving the search space with each iteration and identifying the exact point where p
stops being a subsequence.
In the provided solution, the function check(k)
takes a number k
and simulates the process of marking and removing k
characters from s
to verify if p
remains a subsequence. The binary search then adjusts the search interval (left
, right
) based on the outcomes of check(k)
. If p
is still a subsequence after removing k
characters, it means we could possibly remove more, so the search space is adjusted to [mid, right]
. If not, we have removed too many characters, and the search space is adjusted to [left, mid-1]
. The search continues until the maximum k
is found.
Learn more about Binary Search patterns.
Solution Approach
The implementation makes use of a binary search pattern, specifically what's often referred to as "Binary Search Template 2". This is due to the way the mid
point is calculated and how the search space is adjusted. The pattern is conducive when we want to search for the element or condition which requires accessing the current index and its immediate right neighbour in the array.
Here's the breakdown of the solution approach:
-
Binary Search Algorithm: We want to find the maximum value of
k
such that removingk
characters specified by the firstk
indices inremovable
froms
leavesp
as a subsequence. We perform a binary search over the length of the arrayremovable
because the problem has a "yes" or "no" nature (isp
still a subsequence after the removal?) and because the removal condition is monotonic—that is, if removingk
characters works, then removing any less thank
will also work. -
check
Function: This function simulates the removal of characters froms
and checks ifp
is still a subsequence. It uses two pointers,i
for the strings
andj
for the stringp
. As it iterates overs
, it skips over the characters at the indices specified in the firstk
entries ofremovable
. If a non-removable character ins
matches the current character inp
, it moves the pointerj
forward. If the end ofp
is reached (j == n
), thenp
is still a subsequence ofs
after the removals. -
Set Data Structure: A set called
ids
is used to store the indices fromremovable
that we're proposing to remove. This allows constant-time checks to determine if a character at a particular index ins
should be skipped during the subsequence check. -
Calculating
mid
: The variablemid
is calculated as(left + right + 1) // 2
to ensure that the search space moves towards the higher numbers when thecheck(mid)
is successful, ultimately helping to identify the upper bound ofk
. -
Adjusting Search Space: Based on the results of
check(mid)
, the binary search narrows the search space. If removingmid
characters works, it means we should look for a possibly higherk
, so the new search interval is[mid, right]
. If not, we search the space[left, mid - 1]
. The search ends whenleft
andright
meet, which will be the point just beforep
stops being a subsequence, thus giving us the maximumk
.
The combination of these components results in a solution that efficiently pinpoints the largest k
for which p
remains a subsequence in s
, despite the removals.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach.
Suppose:
s = "abcacb"
p = "ab"
removable = [3, 1, 0]
Applying Binary Search
-
Initial Setup: We set up our binary search with bounds
left = 0
andright = len(removable) - 1 = 2
. The goal is to find the maximumk
such thatp
is still a subsequence of the modifieds
after the removal. -
First Iteration:
- Calculate
mid = (left + right + 1) // 2 = (0 + 2 + 1) // 2 = 1
. - Use
check(1)
to simulate the removal of characters at indicesremovable[0]
andremovable[1]
, which are0
and1
. - After removal,
s = "_b_acb"
.p = "ab"
is still a subsequence of this new strings
. - Since
check(1)
is true, we adjust the search range to[mid, right]
which means[1, 2]
.
- Calculate
-
Second Iteration:
- Calculate
mid = (1 + 2 + 1) // 2 = 2
. - Use
check(2)
to simulate the removal of characters at indicesremovable[0]
,removable[1]
, andremovable[2]
, which are0
,1
, and3
. - After removal,
s = "_b__cb"
.p = "ab"
is no longer a subsequence ofs
. - Since
check(2)
is false, we adjust the search range to[left, mid - 1]
which means[1, 1]
.
- Calculate
Since left
and right
are the same, we conclude the binary search. We determined that the maximum k
for which p = "ab"
remains a subsequence of s
after removal is 1
.
Therefore, the maximum number of characters that can be removed from s
while ensuring p
remains a subsequence is 1
(removing the characters at indices [0, 1]
from s
).
Solution Implementation
1from typing import List
2
3class Solution:
4 def maximum_removals(self, string: str, pattern: str, removable: List[int]) -> int:
5 def can_form_pattern_after_removals(k: int) -> bool:
6 # Initialize pointers for pattern and string
7 pattern_index = 0
8 string_index = 0
9 # Convert the list of removable indices up to k into a set for faster look-ups
10 removal_set = set(removable[:k])
11 # Iterate through string and pattern
12 while pattern_index < pattern_length and string_index < string_length:
13 # Check if current index is not removed, and characters match
14 if string_index not in removal_set and string[string_index] == pattern[pattern_index]:
15 # Move to the next character in the pattern
16 pattern_index += 1
17 # Move to the next character in the string
18 string_index += 1
19 # We can form the pattern if we have gone through all characters of the pattern
20 return pattern_index == pattern_length
21
22 # Get string and pattern lengths
23 string_length, pattern_length = len(string), len(pattern)
24 # Set the search space for the binary search
25 left, right = 0, len(removable)
26 # Perform binary search to find the maximum number of removable characters
27 while left < right:
28 mid = (left + right + 1) // 2 # Pick the middle value
29 # Check if the pattern can be formed after removing mid characters
30 if can_form_pattern_after_removals(mid):
31 left = mid # If it is possible, try removing more characters
32 else:
33 right = mid - 1 # Otherwise, try with fewer removable characters
34 return left # The maximum number of removable characters is left
35
36# Explanation:
37# The above code checks the maximum number of indices that can be removed from the string `s`
38# such that it is still possible to find the pattern `p` as a subsequence.
39# It uses binary search to efficiently find this maximum number.
40
1class Solution {
2 // Method to find the maximum number of characters that can be removed from
3 // string s so that string p is still a subsequence of s
4 public int maximumRemovals(String s, String p, int[] removable) {
5 int left = 0, right = removable.length; // Define binary search boundaries
6
7 // Perform binary search to find the maximum index 'mid'
8 while (left < right) {
9 int mid = (left + right + 1) / 2; // Choose a middle point (avoid overflow)
10 // Check if string p is still a subsequence of s after removing mid characters
11 if (checkSubsequence(s, p, removable, mid)) {
12 left = mid; // Move left pointer to mid if p is still a subsequence
13 } else {
14 right = mid - 1; // Move right pointer if p is not a subsequence
15 }
16 }
17 return left; // Returns maximum removables that still keep p as a subsequence of s
18 }
19
20 // Helper method to check if p is a subsequence of s after removing certain characters
21 private boolean checkSubsequence(String s, String p, int[] removable, int mid) {
22 int sLen = s.length(); // Length of string s
23 int pLen = p.length(); // Length of string p
24 int i = 0; // Index for iterating over s
25 int j = 0; // Index for iterating over p
26
27 // Create a hash set to store the indices of characters to be removed
28 Set<Integer> removalSet = new HashSet<>();
29 for (int k = 0; k < mid; k++) {
30 removalSet.add(removable[k]); // Add removable characters' indices up to mid
31 }
32
33 // Iterate over string s and p to check if p is a subsequence
34 while (i < sLen && j < pLen) {
35 // If current index is not in the removal set and characters match, move j
36 if (!removalSet.contains(i) && s.charAt(i) == p.charAt(j)) {
37 j++;
38 }
39 i++; // Always move i to the next character
40 }
41
42 // If we've found the whole string p, return true; otherwise, return false
43 return j == pLen;
44 }
45}
46
1#include <vector>
2#include <string>
3#include <unordered_set>
4
5class Solution {
6public:
7 int maximumRemovals(std::string s, std::string p, std::vector<int>& removable) {
8 int left = 0, right = removable.size();
9
10 // Binary search to find the maximum number of characters that can be removed
11 while (left < right) {
12 // Choose the midpoint of the current range
13 int mid = (left + right + 1) / 2;
14
15 // Check if the subsequence 'p' is still in 's' after removing 'mid' characters
16 if (checkSubsequence(s, p, removable, mid)) {
17 left = mid; // If it is, increase the lower bound of the search range
18 } else {
19 right = mid - 1; // If not, decrease the upper bound of the search range
20 }
21 }
22 return left; // The maximum number of characters we can remove
23 }
24
25 // Helper function to check if 'p' is a subsequence of 's' after removing 'mid' characters
26 bool checkSubsequence(std::string& s, std::string& p, std::vector<int>& removable, int mid) {
27 int sLength = s.size();
28 int pLength = p.size();
29 int i = 0, j = 0; // Pointers for string 's' and the subsequence 'p'
30
31 // Using a set to keep track of the indices that have been removed
32 std::unordered_set<int> removedIndices;
33 for (int k = 0; k < mid; ++k) {
34 removedIndices.insert(removable[k]);
35 }
36
37 // Iterate over both strings and check if 'p' is a subsequence of 's'
38 while (i < sLength && j < pLength) {
39 // If the current character is not removed and matches with `p[j]`,
40 // move the pointer `j` of subsequence `p`
41 if (removedIndices.count(i) == 0 && s[i] == p[j]) {
42 ++j;
43 }
44 ++i; // Always move the pointer `i` of string `s`
45 }
46
47 // If we have iterated through the entire subsequence `p`, it's a subsequence of `s`
48 return j == pLength;
49 }
50};
51
1// Function to determine the maximum number of characters that can be removed
2// from string 's' such that string 'p' is still a subsequence of 's'.
3function maximumRemovals(s: string, p: string, removable: number[]): number {
4 let leftIndex = 0;
5 let rightIndex = removable.length;
6
7 while (leftIndex < rightIndex) {
8 // Midpoint calculation to use as the possible number of characters to be removed
9 let midPoint = Math.floor((leftIndex + rightIndex + 1) / 2);
10
11 // Check if 'p' is still a subsequence after removing 'midPoint' characters
12 if (isSubsequence(s, p, new Set(removable.slice(0, midPoint)))) {
13 leftIndex = midPoint;
14 } else {
15 rightIndex = midPoint - 1;
16 }
17 }
18 // The maximum number of characters that can be removed
19 return leftIndex;
20}
21
22// Function to check if 'sub' is a subsequence of 'str' when indices in 'removedIndices' are removed.
23function isSubsequence(str: string, sub: string, removedIndices: Set<number>): boolean {
24 let strLength = str.length;
25 let subLength = sub.length;
26 let indexStr = 0;
27 let indexSub = 0;
28
29 // Iterate over each character of 'str' and 'sub' to check for subsequence
30 while (indexStr < strLength && indexSub < subLength) {
31 // If the current index is not removed, and characters match, increment 'sub' index
32 if (!removedIndices.has(indexStr) && str.charAt(indexStr) === sub.charAt(indexSub)) {
33 indexSub++;
34 }
35 indexStr++;
36 }
37 // Check if the end of 'sub' was reached, indicating it is a subsequence
38 return indexSub === subLength;
39}
40
Time and Space Complexity
Time Complexity:
The time complexity of the function mainly comes from two parts: the binary search and repeatedly checking whether p
is a subsequence of s
after certain characters have been removed.
The binary search is done over the length of removable
, which has a complexity of O(log k)
where k
is the length of removable
. For each iteration of binary search, the check
function is called, which iterates over the characters of s
and p
.
The check
function can take O(m + k)
time in the worst case, where m
is the length of s
and k
is the length of the prefix of removable
considered in the check function. Although k
changes with each binary search iteration, we need to consider the worst-case scenario where the function checks the entire prefix, hence k
.
Combining these two parts, the overall time complexity is O((m + k) * log k)
.
Space Complexity:
The space complexity is primarily due to the set ids
used in the check
function, which stores the indices that are removable up to k
. This set can grow up to size k
.
Thus, the space complexity is O(k)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the two traversal algorithms (BFS and DFS) can be used to find whether two nodes are connected?
Recommended Readings
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first