301. Remove Invalid Parentheses

HardBreadth-First SearchStringBacktracking

Problem Description

The problem tasks us with correcting a string s that consists of lowercase letters and parentheses. Our goal is to remove the smallest number of parentheses to make the string a valid expression. An expression is considered valid if parentheses are correctly paired and nested. For example, "()()" and "(a)(b)" are valid, but ")(" and "(()" are not. Since there may be multiple ways to remove parentheses and validate the string, we should return a list of all unique valid strings that can result from the minimal number of removals. Order of the output is not important.

Flowchart Walkthrough

First, let's analyze the problem using the Flowchart. Here's a step-by-step breakdown to deduce the appropriate algorithm:

Is it a graph?

No: The problem is not inherently a graph problem as it involves string manipulation to ensure valid parentheses. However, we can frame each alteration of the string as a state in a state-space search, which is effectively a graph traversal problem.

Since thinking of it as a graph could be considered a stretch in traditional terms and the raw structure of the problem doesn't command a direct representation in graph terms, let's consider alternative routes:

Is the problem related to directed acyclic graphs (DAGs)?

No: This problem isn't about dependencies or topological sorting.

Is the problem related to shortest paths?

No: Although we are looking to make minimal changes, it's not in the sense of a path or distance in graph theory, but rather the minimum removals to make the string valid.

Does the problem involve connectivity?

No: The problem doesn't relate specifically to connecting various components like that in a graph, but rather ensuring each component in a set (the string in this case) adheres to a rule (being valid parentheses).

Does the problem have small constraints?

Yes: While the input string may not be particularly short, the exponential growth of possibilities is often reasonably bounded in such problems that we look to manage using clever algorithms like BFS.

With the problem mapped as a state-space search where each state is a version of the string with a certain number of characters removed, and each edge represents the removal of one parenthesis, we need an algorithm capable of exploring these states level by level to ensure the minimal removal of characters:

Conclusion: According to the analysis and adapting BFS to work in the scenario of state-space search (where each state is a possible string), BFS is the suitable algorithm. It allows exploration of all possible states by levels of removal, ensuring that the first valid strings we encounter are the ones with the minimal amount of modification. This fits well as BFS will naturally explore all options of the same level (number of removals) before proceeding deeper, adhering to our minimal removal requirement effectively.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

The solution relies on a Depth-First Search (DFS) approach to generate all possible strings through removing minimum invalid parentheses and test each string for validity. Since we are interested in the least number of removals, there are two important counts: the number of left parentheses that can be removed (l), and the number of right parentheses that can be removed (r). These counts are first determined by iterating through the string and applying simple rules:

Increment the left count (l) when we encounter an opening parenthesis.
If we encounter a closing parenthesis and there are already opening ones to balance it, we decrement the left count. Otherwise, we increment the right count (r).

With these counts, we use DFS to traverse the string and try the following actions:

Skip an opening parenthese if we still have left parentheses to remove (l > 0), and proceed to the next character without adding it to our ongoing string t.
Similarly, skip a closing parenthese if we can still remove right parentheses (r > 0).
Or, take the current character and add it to t, updating the number of parenthesis added (condition lcnt < rcnt is checking that at any stage in the DFS, we should never have more closing parentheses taken than opening ones).

Every time we reach the end of the string, if both left and right counters are zero, it means we've made valid removals, and we add the ongoing string t to our answers set.

With this recursive approach, all valid combinations of the string s with the minimum necessary removals will be explored and added to the answers set, while invalid paths will be pruned early, optimizing the process. At the end of all DFS calls, the set contains all unique valid strings, which is converted to a list and returned.

Learn more about Breadth-First Search and Backtracking patterns.

Solution Approach

The solution begins by defining a DFS function dfs with the following parameters:

i: the current index in the string s.
l: the count of left parentheses that we can still remove.
r: the count of right parentheses that we can still remove.
lcnt: the count of left parentheses that have been added to the ongoing string t.
rcnt: the count of right parentheses that have been added to the ongoing string t.
t: the current state of the string after removals.

Here are the steps in the implementation:

Base Condition: When i equals the length of the string n, it checks if both l and r are zero, indicating that a correct balance has been maintained. If so, it adds the current string t to the set ans.
Pruning: The function returns early without further exploring the current branch if:
- More characters are needed to correct the balance than are remaining in the string (n - i < l + r), or
- There are more right parentheses than left in t (lcnt < rcnt), breaking the balance rule.
Recursion for Removals: The algorithm can choose to skip a parenthesis character if there are still parentheses of that kind left to be removed:
- For a left parenthesis at the current index (s[i] == '('), if l > 0, the function calls itself recursively without including this character and decrements l.
- For a right parenthesis (s[i] == ')'), if r > 0, it does the same while decrementing r.

This allows the exploration of all the possibilities where parentheses are excluded from the final string.

Recursion for Inclusion: The function also calls itself recursively with the current character included in t:
- If the current character is a left parenthesis, it increments lcnt.
- If it's a right parenthesis, it increments rcnt.

This continues the path where the current character is considered part of the valid string.

Before running DFS, we initialize the counts l and r which determine how many of each type of parentheses can be removed, by iterating over s once.

The solution utilizes a set ans to keep track of unique valid strings and a variable n to store the length of the input string s. The set ensures that duplicate solutions are not included providing unique final answers.

After setting initial values for l, r, and n, the DFS function is kicked off and left to explore all valid removal patterns.

After the DFS function has completed its execution, ans will contain all unique strings that can be obtained by removing the minimal amount of parentheses required to make s valid. The last line of the function returns the set as a list. The utilisation of a set is crucial, as it automatically handles the deduplication of valid strings, which is a requirement.

The chosen algorithm and data structures (depth-first search, integers for counters, a string for the current state, and a set for the answers) are all integral to ensuring that the solution is efficient and satisfies the problem's constraints.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's use the string s = "(()))(" as an example to illustrate the solution approach.

Determining Parentheses to Remove: We begin by iterating through the string s and counting the number of invalid left ( and right ) parentheses.

Start with l = r = 0.
For each character in s:
- If (, increment l to indicate we have another left parenthesis that can be balanced later.
- If ) and l > 0, it means there's a left parenthesis that can balance it, so we decrement l.
- If ) and l == 0, there is no left parenthesis to balance it, so we increment r.

Setting Up for DFS: After the first iteration, we know we have l = 0 (no extra left parentheses to remove) and r = 2 (there are two unbalanced right parentheses).
Running DFS:

We start at index i = 0 with the initial counts l = 0 and r = 2. Our current string t is empty, and lcnt = rcnt = 0.
At each step, we make recursive calls considering inclusion or removal of the current parenthesis:
- When i = 0, s[i] is (. We cannot remove a left parenthesis (l = 0), but we can add it to t. So we add it and make a recursive call with lcnt = 1.
- At i = 1, we again have (, and proceed similarly, with lcnt now being 2.
- At i = 2, we encounter ), with lcnt = 2, so we can balance it. We add it to t (recursively continuing with rcnt = 1) and we could also skip it because r > 0. We make two recursive calls, one including it and one excluding it.
- We continue like this until i equals the length of s.

Valid String Condition: Each time we reach the end of s, we check if l and r are both 0. If so, we add t to our set ans. If we took the right path previously, the valid string t becomes "()()".
Continuing Search: The DFS continues until all valid combinations of string s with the minimum parentheses removed are explored. Since our example string could have multiple valid answers, the DFS will eventually find and store them all.
Collecting Results: When the DFS is done, ans will hold all unique valid strings. If we just removed right parentheses in our example, ans could contain strings like "()()".

The set ans will include every unique solution, which we convert to a list and return.

This approach ensures that all possible strings with the minimum number of parentheses removed are considered without duplicates, thus satisfying the problem's requirement of finding all possible unique valid strings.

Solution Implementation

1from typing import List
2
3class Solution:
4    def removeInvalidParentheses(self, s: str) -> List[str]:
5        # Use depth-first search (DFS) to explore possibilities
6        def dfs(index: int, left_to_remove: int, right_to_remove: int, left_count: int, right_count: int, path: str):
7            # Base case: if we have reached the end of the string
8            if index == length:
9                # Check if we no longer have any parentheses to remove and the parentheses are balanced
10                if left_to_remove == 0 and right_to_remove == 0:
11                    # Add the current path to the answer set
12                    valid_expressions.add(path)
13                return
14          
15            # Pruning: if there are more parentheses to remove than remaining characters
16            # or more right parentheses used than left
17            if length - index < left_to_remove + right_to_remove or left_count < right_count:
18                return
19          
20            # Choose to ignore the current character if it's a parenthesis and we need to remove it
21            if s[index] == '(' and left_to_remove > 0:
22                dfs(index + 1, left_to_remove - 1, right_to_remove, left_count, right_count, path)
23            elif s[index] == ')' and right_to_remove > 0:
24                dfs(index + 1, left_to_remove, right_to_remove - 1, left_count, right_count, path)
25          
26            # Choose to keep the current character, updating the count of used parentheses
27            dfs(index + 1, left_to_remove, right_to_remove, left_count + (s[index] == '('), right_count + (s[index] == ')'), path + s[index])
28      
29        # Initial preparations
30        left_to_remove, right_to_remove = 0, 0
31        # First pass to find out the number of unmatched '(' and ')'
32        for char in s:
33            if char == '(':
34                left_to_remove += 1
35            elif char == ')':
36                # Only decrease the count of left parentheses if there's an unmatched left parenthesis
37                if left_to_remove:
38                    left_to_remove -= 1
39                else:
40                    right_to_remove += 1
41      
42        # A set to keep the unique valid expressions
43        valid_expressions = set()
44        length = len(s)
45      
46        # Start the DFS with the initial parameters
47        dfs(0, left_to_remove, right_to_remove, 0, 0, '')
48      
49        # Convert the set into a list and return it
50        return list(valid_expressions)
51

1import java.util.ArrayList;
2import java.util.HashSet;
3import java.util.List;
4import java.util.Set;
5
6class Solution {
7    private String inputString;
8    private int stringLength;
9    private Set<String> validParenthesesSet = new HashSet<>();
10
11    // Function to start the removal of invalid parentheses
12    public List<String> removeInvalidParentheses(String s) {
13        this.inputString = s;
14        this.stringLength = s.length();
15      
16        int leftCount = 0, rightCount = 0;
17        // First pass to count necessary removals
18        for (char ch : s.toCharArray()) {
19            if (ch == '(') {
20                ++leftCount;
21            } else if (ch == ')') {
22                if (leftCount > 0) {
23                    --leftCount;
24                } else {
25                    ++rightCount;
26                }
27            }
28        }
29        // Start the DFS traversal
30        depthFirstSearch(0, leftCount, rightCount, 0, 0, "");
31        // Convert the set of valid combinations to a list
32        return new ArrayList<>(validParenthesesSet);
33    }
34
35    // Helper function to perform DFS, removing invalid parentheses
36    private void depthFirstSearch(int currentIndex, int leftRemovals, int rightRemovals, int leftCount, int rightCount, String currentStr) {
37        // Base case: end of string reached
38        if (currentIndex == stringLength) {
39            if (leftRemovals == 0 && rightRemovals == 0) {
40                validParenthesesSet.add(currentStr);
41            }
42            return;
43        }
44        // Pruning: check if we can skip the current character
45        if (stringLength - currentIndex < leftRemovals + rightRemovals || leftCount < rightCount) {
46            return;
47        }
48      
49        char currentChar = inputString.charAt(currentIndex);
50        // Check if we can discard a left parenthesis
51        if (currentChar == '(' && leftRemovals > 0) {
52            depthFirstSearch(currentIndex + 1, leftRemovals - 1, rightRemovals, leftCount, rightCount, currentStr);
53        }
54        // Check if we can discard a right parenthesis
55        if (currentChar == ')' && rightRemovals > 0) {
56            depthFirstSearch(currentIndex + 1, leftRemovals, rightRemovals - 1, leftCount, rightCount, currentStr);
57        }
58      
59        // Either keep the current character or increase the respective counter
60        int increaseLeft = currentChar == '(' ? 1 : 0;
61        int increaseRight = currentChar == ')' ? 1 : 0;
62        depthFirstSearch(currentIndex + 1, leftRemovals, rightRemovals, leftCount + increaseLeft, rightCount + increaseRight, currentStr + currentChar);
63    }
64}
65

1#include <unordered_set>
2#include <vector>
3#include <string>
4#include <functional>
5
6using std::string;
7using std::vector;
8using std::function;
9using std::unordered_set;
10
11class Solution {
12public:
13    vector<string> removeInvalidParentheses(string s) {
14        unordered_set<string> validExpressionsSet; // Use a set to prevent duplicates
15        int leftRemovalCount = 0, rightRemovalCount = 0; // Counter for the parentheses we need to remove
16        int n = s.size(); // Size of the input string
17
18        // First, find out the number of misplaced left and right parentheses
19        for (char& ch : s) {
20            if (ch == '(') {
21                ++leftRemovalCount;
22            } else if (ch == ')') {
23                if (leftRemovalCount) {
24                    --leftRemovalCount; // A matching pair of parentheses
25                } else {
26                    ++rightRemovalCount; // A misplaced right parenthesis
27                }
28            }
29        }
30
31        // Depth-first search to generate all possible valid expressions
32        function<void(int, int, int, int, int, string)> searchValidExpressions;
33        searchValidExpressions = [&](int index, int leftCount, int rightCount, int leftAccumulated, int rightAccumulated, string currentExpression) {
34            // If we reached the end of the string, check if the expression is valid
35            if (index == n) {
36                if (leftCount == 0 && rightCount == 0) {
37                    // If no more parentheses need to be removed, add to the set
38                    validExpressionsSet.insert(currentExpression);
39                }
40                return;
41            }
42            // Early termination if we don't have enough parentheses left to remove
43            if (n - index < leftCount + rightCount || leftAccumulated < rightAccumulated) {
44                return;
45            }
46            // If it's a left parenthesis and we can remove it, recurse without including it
47            if (s[index] == '(' && leftCount) {
48                searchValidExpressions(index + 1, leftCount - 1, rightCount, leftAccumulated, rightAccumulated, currentExpression);
49            }
50            // If it's a right parenthesis and we can remove it, recurse without including it
51            if (s[index] == ')' && rightCount) {
52                searchValidExpressions(index + 1, leftCount, rightCount - 1, leftAccumulated, rightAccumulated, currentExpression);
53            }
54            // Determine whether the current character is a left or right parenthesis
55            int increaseLeft = s[index] == '(' ? 1 : 0;
56            int increaseRight = s[index] == ')' ? 1 : 0;
57            // Recurse to the next character in the string, including the current character in the expression
58            searchValidExpressions(index + 1, leftCount, rightCount, leftAccumulated + increaseLeft, rightAccumulated + increaseRight, currentExpression + s[index]);
59        };
60
61        // Initialize the recursive depth-first search
62        searchValidExpressions(0, leftRemovalCount, rightRemovalCount, 0, 0, "");
63        // Convert the set to a vector to return the final result
64        return vector<string>(validExpressionsSet.begin(), validExpressionsSet.end());
65    }
66};
67

1type ValidExpressionsSetType = Set<string>;
2
3let validExpressionsSet: ValidExpressionsSetType = new Set<string>();
4let s: string;
5let n: number;
6let leftRemovalCount: number;
7let rightRemovalCount: number;
8
9// First, calculate the number of misplaced left and right parentheses.
10function calculateMisplacedParentheses(input: string): void {
11  s = input;
12  n = s.length;
13  leftRemovalCount = 0;
14  rightRemovalCount = 0;
15
16  for (const ch of s) {
17    if (ch === '(') {
18      leftRemovalCount++;
19    } else if (ch === ')') {
20      if (leftRemovalCount) {
21        leftRemovalCount--; // A matching pair of parentheses.
22      } else {
23        rightRemovalCount++; // A misplaced right parenthesis.
24      }
25    }
26  }
27}
28
29// Perform a depth-first search to generate all possible valid expressions.
30function searchValidExpressions(
31  index: number,
32  leftCount: number,
33  rightCount: number,
34  leftAccumulated: number,
35  rightAccumulated: number,
36  currentExpression: string
37): void {
38  // If we've reached the end of the string, check if the expression is valid.
39  if (index === n) {
40    if (leftCount === 0 && rightCount === 0) {
41      validExpressionsSet.add(currentExpression);
42    }
43    return;
44  }
45
46  // Terminate early if there aren't enough parentheses left to be removed.
47  if (n - index < leftCount + rightCount || leftAccumulated < rightAccumulated) {
48    return;
49  }
50
51  if (s[index] === '(' && leftCount > 0) {
52    // Recurse without including this left parenthesis if we can still remove it.
53    searchValidExpressions(index + 1, leftCount - 1, rightCount, leftAccumulated, rightAccumulated, currentExpression);
54  }
55
56  if (s[index] === ')' && rightCount > 0) {
57    // Recurse without including this right parenthesis if we can still remove it.
58    searchValidExpressions(index + 1, leftCount, rightCount - 1, leftAccumulated, rightAccumulated, currentExpression);
59  }
60
61  let increaseLeft = s[index] === '(' ? 1 : 0;
62  let increaseRight = s[index] === ')' ? 1 : 0;
63
64  // Always include the current character and recurse.
65  searchValidExpressions(
66    index + 1,
67    leftCount,
68    rightCount,
69    leftAccumulated + increaseLeft,
70    rightAccumulated + increaseRight,
71    currentExpression + s[index]
72  );
73}
74
75// Function to initiate the process of removing invalid parentheses.
76function removeInvalidParentheses(input: string): string[] {
77  validExpressionsSet.clear(); // Reset the set of valid expressions.
78  calculateMisplacedParentheses(input); // Find out misplaced parentheses count.
79  searchValidExpressions(0, leftRemovalCount, rightRemovalCount, 0, 0, "");
80
81  // Convert the set of valid expressions to an array to return.
82  return Array.from(validExpressionsSet);
83}
84

Time and Space Complexity

The provided code snippet is a Python implementation of an algorithm that removes the minimum number of invalid parentheses to make the input string valid. The algorithm uses Depth-First Search (DFS) with backtracking.

Time Complexity: Let's denote n as the length of the string s. In the worst case, the DFS will visit every possible combination of parentheses removals. For each character in the input string, the algorithm has up to 3 choices: remove '(', remove ')', or keep the character (if it is a character other than the parenthese) - leading to O(3^n) in the worst case. However, we optimize this complexity by using the l and r counters for the minimum numbers of left and right parentheses that need to be removed, which prunes the recursion tree significantly. Additionally, the lcnt < rcnt condition further prunes invalid states whenever right parentheses count exceed the left parentheses count. Hence, the time complexity is less than O(3^n) but more than O(2^n), making it difficult to pinpoint the exact complexity without a rigorous mathematical derivation. A rough bound could be O(c^n), where c is some constant between 2 and 3.
Space Complexity: The space complexity is O(n) for the recursion stack since in the worst case, the recursive calls can have a maximum call chain of the length of the string. Furthermore, auxiliary space is needed for the string t in the DFS calls and an extra set ans to store the results, which can have up to O(2^n) combinations (each character is either included or excluded). So, while the stack space is linear in the size of the input, the space required for the solution set can be significant. The total space used is O(n * 2^n), which accounts for the stack space and the generated solutions. However, this is a loose upper bound given that many branches of recursion might be pruned, and not all 2^n combinations will be generated or stored (due to invalid sequences being discarded early).

Learn more about how to find time and space complexity quickly using problem constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan: