3170. Lexicographically Minimum String After Removing Stars
Problem Description
You have a string s
that contains lowercase letters and asterisk (*
) characters. Your goal is to remove all asterisks from the string, but there's a special rule for how to do this.
Whenever you encounter an asterisk, you must:
- Delete that asterisk
- Also delete the lexicographically smallest non-asterisk character that appears to the left of this asterisk
- If there are multiple instances of the smallest character, you can delete any one of them
You need to repeat this process until all asterisks are removed from the string.
For example, if you have the string "cab*"
:
- The asterisk at position 3 requires deleting the smallest character to its left
- Among
'c'
,'a'
, and'b'
, the smallest is'a'
- So you delete both the
'*'
and the'a'
, leaving you with"cb"
The solution uses a clever approach:
- It maintains a dictionary
g
where each letter maps to a list of indices where that letter appears - It uses a boolean array
rem
to track which positions should be removed - When encountering an asterisk, it marks that position for removal and then searches through letters
'a'
to'z'
in order to find the first letter that has appeared earlier in the string - For that letter, it removes the rightmost occurrence (the one with the largest index) by popping from the list and marking it for removal
- Finally, it builds the result string by including only the characters at positions not marked for removal
The key insight is that by maintaining indices sorted by character and processing them in lexicographical order, we can efficiently find and remove the correct character each time we encounter an asterisk.
Intuition
When we see an asterisk, we need to find and delete the lexicographically smallest character to its left. The naive approach would be to scan all characters to the left of each asterisk to find the smallest one, but this would be inefficient.
The key observation is that we need two pieces of information efficiently:
- Which characters have appeared so far (to the left of current position)
- For each character type, where they are located
Since we're looking for the lexicographically smallest character, we can think of organizing our data by character type. If we group all indices by their character value, then when we need to find the smallest character, we can simply check characters in alphabetical order ('a'
, then 'b'
, then 'c'
, etc.) until we find one that exists.
But there's another consideration - if multiple instances of the smallest character exist, we can delete any of them. Which one should we choose? The insight here is that we should delete the rightmost occurrence of that character. Why? Because keeping characters that appear earlier gives us more flexibility for future asterisk operations - those earlier characters can potentially be paired with asterisks that appear later in the string.
This leads us to maintain a list of indices for each character. When we encounter a regular character, we append its index to the corresponding list. When we encounter an asterisk, we:
- Iterate through characters
'a'
to'z'
- Find the first character that has at least one occurrence
- Remove the last (rightmost) index from that character's list
By using a boolean array to mark which positions should be deleted rather than actually modifying the string during traversal, we avoid the complexity of shifting indices when characters are removed. We can simply mark positions as "to be deleted" and construct the final string in one pass at the end.
Learn more about Stack, Greedy and Heap (Priority Queue) patterns.
Solution Approach
Let's walk through the implementation step by step:
Data Structures Used:
g = defaultdict(list)
: A dictionary where each key is a character and the value is a list of indices where that character appears in the stringrem = [False] * n
: A boolean array to track which positions should be removed from the final result
Algorithm Steps:
-
Initialize the data structures:
- Create an empty defaultdict
g
that will automatically create empty lists for new keys - Create a boolean array
rem
of sizen
(length of string) initialized withFalse
- Create an empty defaultdict
-
Process each character in the string:
for i, c in enumerate(s):
-
Handle asterisk characters:
- When we encounter a
"*"
at positioni
:- Mark this position for removal:
rem[i] = True
- Search for the lexicographically smallest character to remove:
for a in ascii_lowercase: # iterates 'a' to 'z' if g[a]: # if this character has appeared rem[g[a].pop()] = True # remove its last occurrence break # stop after finding the first (smallest) character
- The
pop()
operation removes and returns the last index from the list - We mark that index in
rem
asTrue
to indicate deletion
- Mark this position for removal:
- When we encounter a
-
Handle non-asterisk characters:
- For regular characters, simply record their position:
g[c].append(i)
- This maintains a list of indices for each character type
- For regular characters, simply record their position:
-
Build the final result:
return "".join(c for i, c in enumerate(s) if not rem[i])
- Iterate through the original string with indices
- Include only characters at positions where
rem[i]
isFalse
- Join them into the final string
Time Complexity: O(n × 26)
= O(n)
where n
is the length of the string. For each character, we might iterate through at most 26 lowercase letters.
Space Complexity: O(n)
for storing the indices in g
and the boolean array rem
.
The elegance of this solution lies in how it avoids actually modifying the string during processing. Instead, it marks positions for deletion and constructs the result in a single pass at the end, making the implementation both clean and efficient.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with the string s = "aaba*c*d"
.
Initial Setup:
g = {}
(empty dictionary)rem = [False, False, False, False, False, False, False, False]
(8 positions, all False)
Processing each character:
Position 0: 'a'
- Not an asterisk, so add to dictionary
g = {'a': [0]}
rem = [False, False, False, False, False, False, False, False]
Position 1: 'a'
- Not an asterisk, so add to dictionary
g = {'a': [0, 1]}
rem = [False, False, False, False, False, False, False, False]
Position 2: 'b'
- Not an asterisk, so add to dictionary
g = {'a': [0, 1], 'b': [2]}
rem = [False, False, False, False, False, False, False, False]
Position 3: 'a'
- Not an asterisk, so add to dictionary
g = {'a': [0, 1, 3], 'b': [2]}
rem = [False, False, False, False, False, False, False, False]
Position 4: '*'
- Mark position 4 for removal:
rem[4] = True
- Search for smallest character: check 'a' first
g['a']
has values[0, 1, 3]
, so it exists- Pop the last index:
g['a'].pop()
returns 3 - Mark position 3 for removal:
rem[3] = True
g = {'a': [0, 1], 'b': [2]}
rem = [False, False, False, True, True, False, False, False]
Position 5: 'c'
- Not an asterisk, so add to dictionary
g = {'a': [0, 1], 'b': [2], 'c': [5]}
rem = [False, False, False, True, True, False, False, False]
Position 6: '*'
- Mark position 6 for removal:
rem[6] = True
- Search for smallest character: check 'a' first
g['a']
has values[0, 1]
, so it exists- Pop the last index:
g['a'].pop()
returns 1 - Mark position 1 for removal:
rem[1] = True
g = {'a': [0], 'b': [2], 'c': [5]}
rem = [False, True, False, True, True, False, True, False]
Position 7: 'd'
- Not an asterisk, so add to dictionary
g = {'a': [0], 'b': [2], 'c': [5], 'd': [7]}
rem = [False, True, False, True, True, False, True, False]
Building the result:
- Include characters at positions where
rem[i]
is False:- Position 0: 'a' (include)
- Position 1: 'a' (skip, rem[1] = True)
- Position 2: 'b' (include)
- Position 3: 'a' (skip, rem[3] = True)
- Position 4: '*' (skip, rem[4] = True)
- Position 5: 'c' (include)
- Position 6: '*' (skip, rem[6] = True)
- Position 7: 'd' (include)
Final result: "abcd"
The solution correctly removed both asterisks and their corresponding smallest characters to the left (the 'a' at position 3 for the first asterisk, and the 'a' at position 1 for the second asterisk).
Solution Implementation
1from collections import defaultdict
2from string import ascii_lowercase
3
4class Solution:
5 def clearStars(self, s: str) -> str:
6 # Dictionary to store indices of each character (excluding '*')
7 # Key: character, Value: list of indices where this character appears
8 char_indices = defaultdict(list)
9
10 # Get the length of the input string
11 n = len(s)
12
13 # Track which indices should be removed from the final result
14 is_removed = [False] * n
15
16 # Process each character in the string
17 for i, char in enumerate(s):
18 if char == "*":
19 # Mark the current star for removal
20 is_removed[i] = True
21
22 # Find and remove the smallest character that appears before this star
23 # Check characters in lexicographical order (a, b, c, ...)
24 for letter in ascii_lowercase:
25 if char_indices[letter]:
26 # Remove the rightmost occurrence of the smallest available character
27 index_to_remove = char_indices[letter].pop()
28 is_removed[index_to_remove] = True
29 break
30 else:
31 # For non-star characters, store their index
32 char_indices[char].append(i)
33
34 # Build the result string by including only characters that weren't removed
35 return "".join(char for i, char in enumerate(s) if not is_removed[i])
36
1class Solution {
2 public String clearStars(String s) {
3 // Create an array of deques, one for each letter (a-z)
4 // Each deque stores indices of occurrences of that letter
5 Deque<Integer>[] letterIndices = new Deque[26];
6 Arrays.setAll(letterIndices, i -> new ArrayDeque<>());
7
8 int length = s.length();
9 // Track which characters should be removed
10 boolean[] isRemoved = new boolean[length];
11
12 // Process each character in the string
13 for (int i = 0; i < length; i++) {
14 if (s.charAt(i) == '*') {
15 // Mark the star for removal
16 isRemoved[i] = true;
17
18 // Find and remove the closest non-star character with smallest value
19 // Search from 'a' to 'z' to find the lexicographically smallest
20 for (int letterIndex = 0; letterIndex < 26; letterIndex++) {
21 if (!letterIndices[letterIndex].isEmpty()) {
22 // Remove the most recent occurrence of this letter
23 int indexToRemove = letterIndices[letterIndex].pop();
24 isRemoved[indexToRemove] = true;
25 break;
26 }
27 }
28 } else {
29 // Store the index of this non-star character
30 int letterIndex = s.charAt(i) - 'a';
31 letterIndices[letterIndex].push(i);
32 }
33 }
34
35 // Build the result string excluding removed characters
36 StringBuilder result = new StringBuilder();
37 for (int i = 0; i < length; i++) {
38 if (!isRemoved[i]) {
39 result.append(s.charAt(i));
40 }
41 }
42
43 return result.toString();
44 }
45}
46
1class Solution {
2public:
3 string clearStars(string s) {
4 // Array of stacks to store indices of each character ('a' to 'z')
5 // charStacks[0] stores indices of 'a', charStacks[1] stores indices of 'b', etc.
6 stack<int> charStacks[26];
7
8 int stringLength = s.length();
9
10 // Track which characters should be removed
11 vector<bool> isRemoved(stringLength, false);
12
13 // Process each character in the string
14 for (int i = 0; i < stringLength; ++i) {
15 if (s[i] == '*') {
16 // Mark the star itself for removal
17 isRemoved[i] = true;
18
19 // Find and remove the closest non-star character to the left
20 // Starting from 'a' (lexicographically smallest)
21 for (int charIndex = 0; charIndex < 26; ++charIndex) {
22 if (!charStacks[charIndex].empty()) {
23 // Mark the top character (rightmost occurrence) for removal
24 isRemoved[charStacks[charIndex].top()] = true;
25 charStacks[charIndex].pop();
26 break; // Remove only one character per star
27 }
28 }
29 } else {
30 // Store the index of non-star character in corresponding stack
31 int charIndex = s[i] - 'a';
32 charStacks[charIndex].push(i);
33 }
34 }
35
36 // Build the result string with non-removed characters
37 string result;
38 for (int i = 0; i < stringLength; ++i) {
39 if (!isRemoved[i]) {
40 result.push_back(s[i]);
41 }
42 }
43
44 return result;
45 }
46};
47
1/**
2 * Removes stars and the closest non-star character to the left of each star
3 * Characters are removed starting with the smallest lexicographical character available
4 * @param s - Input string containing letters and asterisks
5 * @returns String with stars and corresponding characters removed
6 */
7function clearStars(s: string): string {
8 // Create 26 stacks to track indices of each letter (a-z)
9 const letterIndexStacks: number[][] = Array.from({ length: 26 }, () => []);
10
11 const stringLength: number = s.length;
12
13 // Track which characters should be removed
14 const isRemoved: boolean[] = Array(stringLength).fill(false);
15
16 // Process each character in the string
17 for (let currentIndex = 0; currentIndex < stringLength; ++currentIndex) {
18 if (s[currentIndex] === '*') {
19 // Mark the star for removal
20 isRemoved[currentIndex] = true;
21
22 // Find and remove the leftmost occurrence of the smallest available letter
23 for (let letterIndex = 0; letterIndex < 26; ++letterIndex) {
24 if (letterIndexStacks[letterIndex].length > 0) {
25 // Remove the most recent occurrence of this letter
26 const indexToRemove: number = letterIndexStacks[letterIndex].pop()!;
27 isRemoved[indexToRemove] = true;
28 break;
29 }
30 }
31 } else {
32 // Add current letter's index to its corresponding stack
33 const letterCode: number = s.charCodeAt(currentIndex) - 97; // Convert 'a'-'z' to 0-25
34 letterIndexStacks[letterCode].push(currentIndex);
35 }
36 }
37
38 // Build result string by filtering out removed characters
39 return s
40 .split('')
41 .filter((_, index) => !isRemoved[index])
42 .join('');
43}
44
Time and Space Complexity
Time Complexity: O(n × |Σ|)
where n
is the length of string s
and |Σ| = 26
(the size of the lowercase alphabet).
The algorithm iterates through each character in the string once, which takes O(n)
time. For each "*"
character encountered, it searches through all 26 lowercase letters in alphabetical order (for a in ascii_lowercase
) to find the first letter that has a non-empty list in the dictionary g
. In the worst case, this search examines all 26 letters. Since there can be up to n
asterisks in the string, the worst-case time complexity is O(n × 26) = O(n × |Σ|)
.
Space Complexity: O(n)
The algorithm uses:
- A dictionary
g
that stores lists of indices. In the worst case, alln
indices could be stored across all lists, usingO(n)
space. - A boolean array
rem
of sizen
to track which characters should be removed, usingO(n)
space. - The final string construction creates a new string, which in the worst case could be of length
n
, usingO(n)
space.
Therefore, the total space complexity is O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Attempting to Find the Minimum Character Incorrectly
A common mistake is trying to find the minimum character by searching backwards through the string from the asterisk position, which becomes inefficient and complex when handling multiple occurrences of the same character.
Incorrect Approach:
# Wrong: Trying to manually search for minimum
for j in range(i-1, -1, -1):
if s[j] != '*' and not is_removed[j]:
if s[j] < min_char:
min_char = s[j]
min_index = j
Why it fails: This requires tracking which characters have already been removed and comparing characters manually, leading to O(n²) complexity and potential bugs.
Correct Solution: Use the dictionary approach with lexicographical iteration through ascii_lowercase
to automatically find the smallest character in O(1) amortized time.
2. Removing the Wrong Occurrence of the Minimum Character
When multiple instances of the smallest character exist, the problem states "you can delete any one of them." A pitfall is assuming you should remove the first occurrence or not having a consistent strategy.
Incorrect Approach:
# Wrong: Always removing the first occurrence if char_indices[letter]: index_to_remove = char_indices[letter][0] # Takes first char_indices[letter].remove(index_to_remove) # O(n) operation!
Why it fails: Using remove()
or deleting from the beginning of a list is O(n) for each deletion, making the overall complexity O(n²).
Correct Solution: Always remove from the end using pop()
which is O(1), maintaining efficiency while still satisfying the problem requirements.
3. Modifying the String During Processing
Attempting to modify the string directly while iterating through it causes index misalignment and complexity issues.
Incorrect Approach:
# Wrong: Trying to modify string in-place
result = list(s)
for i in range(len(result)):
if result[i] == '*':
# Remove the star and a character
result.pop(i) # Indices shift!
# Now all subsequent indices are wrong
Why it fails: Removing characters shifts all subsequent indices, making it impossible to track positions correctly.
Correct Solution: Use the boolean array is_removed
to mark positions for deletion, then construct the final string in one pass at the end.
4. Not Handling Edge Cases
Failing to handle cases where an asterisk appears but no non-asterisk characters exist to its left.
Incorrect Approach:
# Wrong: Assuming there's always a character to remove for letter in ascii_lowercase: # What if no letters exist before the asterisk? index_to_remove = char_indices[letter].pop() # Could be empty!
Why it fails: According to the problem constraints, there's always a valid character to remove before each asterisk, but defensive programming suggests checking anyway.
Correct Solution: The current implementation handles this correctly with the if char_indices[letter]:
check before attempting to pop.
Depth first search is equivalent to which of the tree traversal order?
Recommended Readings
Stack Intro Imagine you have a pile of books on your desk If you want to add a new book you place it on top If you want to read a book you take it from the top And if you simply want to see which book is on the top you
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
https assets algo monster cover_photos heap svg Priority Queue and Heap What is the relationship between priority queue and heap Priority Queue is an Abstract Data Type and Heap is the concrete data structure we use to implement a priority queue Priority Queue A priority queue is a data structure
Want a Structured Path to Master System Design Too? Don’t Miss This!