Facebook Pixel

3170. Lexicographically Minimum String After Removing Stars

Problem Description

You have a string s that contains lowercase letters and asterisk (*) characters. Your goal is to remove all asterisks from the string, but there's a special rule for how to do this.

Whenever you encounter an asterisk, you must:

  1. Delete that asterisk
  2. Also delete the lexicographically smallest non-asterisk character that appears to the left of this asterisk
  3. If there are multiple instances of the smallest character, you can delete any one of them

You need to repeat this process until all asterisks are removed from the string.

For example, if you have the string "cab*":

  • The asterisk at position 3 requires deleting the smallest character to its left
  • Among 'c', 'a', and 'b', the smallest is 'a'
  • So you delete both the '*' and the 'a', leaving you with "cb"

The solution uses a clever approach:

  • It maintains a dictionary g where each letter maps to a list of indices where that letter appears
  • It uses a boolean array rem to track which positions should be removed
  • When encountering an asterisk, it marks that position for removal and then searches through letters 'a' to 'z' in order to find the first letter that has appeared earlier in the string
  • For that letter, it removes the rightmost occurrence (the one with the largest index) by popping from the list and marking it for removal
  • Finally, it builds the result string by including only the characters at positions not marked for removal

The key insight is that by maintaining indices sorted by character and processing them in lexicographical order, we can efficiently find and remove the correct character each time we encounter an asterisk.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we see an asterisk, we need to find and delete the lexicographically smallest character to its left. The naive approach would be to scan all characters to the left of each asterisk to find the smallest one, but this would be inefficient.

The key observation is that we need two pieces of information efficiently:

  1. Which characters have appeared so far (to the left of current position)
  2. For each character type, where they are located

Since we're looking for the lexicographically smallest character, we can think of organizing our data by character type. If we group all indices by their character value, then when we need to find the smallest character, we can simply check characters in alphabetical order ('a', then 'b', then 'c', etc.) until we find one that exists.

But there's another consideration - if multiple instances of the smallest character exist, we can delete any of them. Which one should we choose? The insight here is that we should delete the rightmost occurrence of that character. Why? Because keeping characters that appear earlier gives us more flexibility for future asterisk operations - those earlier characters can potentially be paired with asterisks that appear later in the string.

This leads us to maintain a list of indices for each character. When we encounter a regular character, we append its index to the corresponding list. When we encounter an asterisk, we:

  1. Iterate through characters 'a' to 'z'
  2. Find the first character that has at least one occurrence
  3. Remove the last (rightmost) index from that character's list

By using a boolean array to mark which positions should be deleted rather than actually modifying the string during traversal, we avoid the complexity of shifting indices when characters are removed. We can simply mark positions as "to be deleted" and construct the final string in one pass at the end.

Learn more about Stack, Greedy and Heap (Priority Queue) patterns.

Solution Approach

Let's walk through the implementation step by step:

Data Structures Used:

  1. g = defaultdict(list): A dictionary where each key is a character and the value is a list of indices where that character appears in the string
  2. rem = [False] * n: A boolean array to track which positions should be removed from the final result

Algorithm Steps:

  1. Initialize the data structures:

    • Create an empty defaultdict g that will automatically create empty lists for new keys
    • Create a boolean array rem of size n (length of string) initialized with False
  2. Process each character in the string:

    for i, c in enumerate(s):
  3. Handle asterisk characters:

    • When we encounter a "*" at position i:
      • Mark this position for removal: rem[i] = True
      • Search for the lexicographically smallest character to remove:
        for a in ascii_lowercase:  # iterates 'a' to 'z'
            if g[a]:  # if this character has appeared
                rem[g[a].pop()] = True  # remove its last occurrence
                break  # stop after finding the first (smallest) character
      • The pop() operation removes and returns the last index from the list
      • We mark that index in rem as True to indicate deletion
  4. Handle non-asterisk characters:

    • For regular characters, simply record their position:
      g[c].append(i)
    • This maintains a list of indices for each character type
  5. Build the final result:

    return "".join(c for i, c in enumerate(s) if not rem[i])
    • Iterate through the original string with indices
    • Include only characters at positions where rem[i] is False
    • Join them into the final string

Time Complexity: O(n × 26) = O(n) where n is the length of the string. For each character, we might iterate through at most 26 lowercase letters.

Space Complexity: O(n) for storing the indices in g and the boolean array rem.

The elegance of this solution lies in how it avoids actually modifying the string during processing. Instead, it marks positions for deletion and constructs the result in a single pass at the end, making the implementation both clean and efficient.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with the string s = "aaba*c*d".

Initial Setup:

  • g = {} (empty dictionary)
  • rem = [False, False, False, False, False, False, False, False] (8 positions, all False)

Processing each character:

Position 0: 'a'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0]}
  • rem = [False, False, False, False, False, False, False, False]

Position 1: 'a'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0, 1]}
  • rem = [False, False, False, False, False, False, False, False]

Position 2: 'b'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0, 1], 'b': [2]}
  • rem = [False, False, False, False, False, False, False, False]

Position 3: 'a'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0, 1, 3], 'b': [2]}
  • rem = [False, False, False, False, False, False, False, False]

Position 4: '*'

  • Mark position 4 for removal: rem[4] = True
  • Search for smallest character: check 'a' first
  • g['a'] has values [0, 1, 3], so it exists
  • Pop the last index: g['a'].pop() returns 3
  • Mark position 3 for removal: rem[3] = True
  • g = {'a': [0, 1], 'b': [2]}
  • rem = [False, False, False, True, True, False, False, False]

Position 5: 'c'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0, 1], 'b': [2], 'c': [5]}
  • rem = [False, False, False, True, True, False, False, False]

Position 6: '*'

  • Mark position 6 for removal: rem[6] = True
  • Search for smallest character: check 'a' first
  • g['a'] has values [0, 1], so it exists
  • Pop the last index: g['a'].pop() returns 1
  • Mark position 1 for removal: rem[1] = True
  • g = {'a': [0], 'b': [2], 'c': [5]}
  • rem = [False, True, False, True, True, False, True, False]

Position 7: 'd'

  • Not an asterisk, so add to dictionary
  • g = {'a': [0], 'b': [2], 'c': [5], 'd': [7]}
  • rem = [False, True, False, True, True, False, True, False]

Building the result:

  • Include characters at positions where rem[i] is False:
    • Position 0: 'a' (include)
    • Position 1: 'a' (skip, rem[1] = True)
    • Position 2: 'b' (include)
    • Position 3: 'a' (skip, rem[3] = True)
    • Position 4: '*' (skip, rem[4] = True)
    • Position 5: 'c' (include)
    • Position 6: '*' (skip, rem[6] = True)
    • Position 7: 'd' (include)

Final result: "abcd"

The solution correctly removed both asterisks and their corresponding smallest characters to the left (the 'a' at position 3 for the first asterisk, and the 'a' at position 1 for the second asterisk).

Solution Implementation

1from collections import defaultdict
2from string import ascii_lowercase
3
4class Solution:
5    def clearStars(self, s: str) -> str:
6        # Dictionary to store indices of each character (excluding '*')
7        # Key: character, Value: list of indices where this character appears
8        char_indices = defaultdict(list)
9      
10        # Get the length of the input string
11        n = len(s)
12      
13        # Track which indices should be removed from the final result
14        is_removed = [False] * n
15      
16        # Process each character in the string
17        for i, char in enumerate(s):
18            if char == "*":
19                # Mark the current star for removal
20                is_removed[i] = True
21              
22                # Find and remove the smallest character that appears before this star
23                # Check characters in lexicographical order (a, b, c, ...)
24                for letter in ascii_lowercase:
25                    if char_indices[letter]:
26                        # Remove the rightmost occurrence of the smallest available character
27                        index_to_remove = char_indices[letter].pop()
28                        is_removed[index_to_remove] = True
29                        break
30            else:
31                # For non-star characters, store their index
32                char_indices[char].append(i)
33      
34        # Build the result string by including only characters that weren't removed
35        return "".join(char for i, char in enumerate(s) if not is_removed[i])
36
1class Solution {
2    public String clearStars(String s) {
3        // Create an array of deques, one for each letter (a-z)
4        // Each deque stores indices of occurrences of that letter
5        Deque<Integer>[] letterIndices = new Deque[26];
6        Arrays.setAll(letterIndices, i -> new ArrayDeque<>());
7      
8        int length = s.length();
9        // Track which characters should be removed
10        boolean[] isRemoved = new boolean[length];
11      
12        // Process each character in the string
13        for (int i = 0; i < length; i++) {
14            if (s.charAt(i) == '*') {
15                // Mark the star for removal
16                isRemoved[i] = true;
17              
18                // Find and remove the closest non-star character with smallest value
19                // Search from 'a' to 'z' to find the lexicographically smallest
20                for (int letterIndex = 0; letterIndex < 26; letterIndex++) {
21                    if (!letterIndices[letterIndex].isEmpty()) {
22                        // Remove the most recent occurrence of this letter
23                        int indexToRemove = letterIndices[letterIndex].pop();
24                        isRemoved[indexToRemove] = true;
25                        break;
26                    }
27                }
28            } else {
29                // Store the index of this non-star character
30                int letterIndex = s.charAt(i) - 'a';
31                letterIndices[letterIndex].push(i);
32            }
33        }
34      
35        // Build the result string excluding removed characters
36        StringBuilder result = new StringBuilder();
37        for (int i = 0; i < length; i++) {
38            if (!isRemoved[i]) {
39                result.append(s.charAt(i));
40            }
41        }
42      
43        return result.toString();
44    }
45}
46
1class Solution {
2public:
3    string clearStars(string s) {
4        // Array of stacks to store indices of each character ('a' to 'z')
5        // charStacks[0] stores indices of 'a', charStacks[1] stores indices of 'b', etc.
6        stack<int> charStacks[26];
7      
8        int stringLength = s.length();
9      
10        // Track which characters should be removed
11        vector<bool> isRemoved(stringLength, false);
12      
13        // Process each character in the string
14        for (int i = 0; i < stringLength; ++i) {
15            if (s[i] == '*') {
16                // Mark the star itself for removal
17                isRemoved[i] = true;
18              
19                // Find and remove the closest non-star character to the left
20                // Starting from 'a' (lexicographically smallest)
21                for (int charIndex = 0; charIndex < 26; ++charIndex) {
22                    if (!charStacks[charIndex].empty()) {
23                        // Mark the top character (rightmost occurrence) for removal
24                        isRemoved[charStacks[charIndex].top()] = true;
25                        charStacks[charIndex].pop();
26                        break;  // Remove only one character per star
27                    }
28                }
29            } else {
30                // Store the index of non-star character in corresponding stack
31                int charIndex = s[i] - 'a';
32                charStacks[charIndex].push(i);
33            }
34        }
35      
36        // Build the result string with non-removed characters
37        string result;
38        for (int i = 0; i < stringLength; ++i) {
39            if (!isRemoved[i]) {
40                result.push_back(s[i]);
41            }
42        }
43      
44        return result;
45    }
46};
47
1/**
2 * Removes stars and the closest non-star character to the left of each star
3 * Characters are removed starting with the smallest lexicographical character available
4 * @param s - Input string containing letters and asterisks
5 * @returns String with stars and corresponding characters removed
6 */
7function clearStars(s: string): string {
8    // Create 26 stacks to track indices of each letter (a-z)
9    const letterIndexStacks: number[][] = Array.from({ length: 26 }, () => []);
10  
11    const stringLength: number = s.length;
12  
13    // Track which characters should be removed
14    const isRemoved: boolean[] = Array(stringLength).fill(false);
15  
16    // Process each character in the string
17    for (let currentIndex = 0; currentIndex < stringLength; ++currentIndex) {
18        if (s[currentIndex] === '*') {
19            // Mark the star for removal
20            isRemoved[currentIndex] = true;
21          
22            // Find and remove the leftmost occurrence of the smallest available letter
23            for (let letterIndex = 0; letterIndex < 26; ++letterIndex) {
24                if (letterIndexStacks[letterIndex].length > 0) {
25                    // Remove the most recent occurrence of this letter
26                    const indexToRemove: number = letterIndexStacks[letterIndex].pop()!;
27                    isRemoved[indexToRemove] = true;
28                    break;
29                }
30            }
31        } else {
32            // Add current letter's index to its corresponding stack
33            const letterCode: number = s.charCodeAt(currentIndex) - 97; // Convert 'a'-'z' to 0-25
34            letterIndexStacks[letterCode].push(currentIndex);
35        }
36    }
37  
38    // Build result string by filtering out removed characters
39    return s
40        .split('')
41        .filter((_, index) => !isRemoved[index])
42        .join('');
43}
44

Time and Space Complexity

Time Complexity: O(n × |Σ|) where n is the length of string s and |Σ| = 26 (the size of the lowercase alphabet).

The algorithm iterates through each character in the string once, which takes O(n) time. For each "*" character encountered, it searches through all 26 lowercase letters in alphabetical order (for a in ascii_lowercase) to find the first letter that has a non-empty list in the dictionary g. In the worst case, this search examines all 26 letters. Since there can be up to n asterisks in the string, the worst-case time complexity is O(n × 26) = O(n × |Σ|).

Space Complexity: O(n)

The algorithm uses:

  • A dictionary g that stores lists of indices. In the worst case, all n indices could be stored across all lists, using O(n) space.
  • A boolean array rem of size n to track which characters should be removed, using O(n) space.
  • The final string construction creates a new string, which in the worst case could be of length n, using O(n) space.

Therefore, the total space complexity is O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Attempting to Find the Minimum Character Incorrectly

A common mistake is trying to find the minimum character by searching backwards through the string from the asterisk position, which becomes inefficient and complex when handling multiple occurrences of the same character.

Incorrect Approach:

# Wrong: Trying to manually search for minimum
for j in range(i-1, -1, -1):
    if s[j] != '*' and not is_removed[j]:
        if s[j] < min_char:
            min_char = s[j]
            min_index = j

Why it fails: This requires tracking which characters have already been removed and comparing characters manually, leading to O(n²) complexity and potential bugs.

Correct Solution: Use the dictionary approach with lexicographical iteration through ascii_lowercase to automatically find the smallest character in O(1) amortized time.

2. Removing the Wrong Occurrence of the Minimum Character

When multiple instances of the smallest character exist, the problem states "you can delete any one of them." A pitfall is assuming you should remove the first occurrence or not having a consistent strategy.

Incorrect Approach:

# Wrong: Always removing the first occurrence
if char_indices[letter]:
    index_to_remove = char_indices[letter][0]  # Takes first
    char_indices[letter].remove(index_to_remove)  # O(n) operation!

Why it fails: Using remove() or deleting from the beginning of a list is O(n) for each deletion, making the overall complexity O(n²).

Correct Solution: Always remove from the end using pop() which is O(1), maintaining efficiency while still satisfying the problem requirements.

3. Modifying the String During Processing

Attempting to modify the string directly while iterating through it causes index misalignment and complexity issues.

Incorrect Approach:

# Wrong: Trying to modify string in-place
result = list(s)
for i in range(len(result)):
    if result[i] == '*':
        # Remove the star and a character
        result.pop(i)  # Indices shift!
        # Now all subsequent indices are wrong

Why it fails: Removing characters shifts all subsequent indices, making it impossible to track positions correctly.

Correct Solution: Use the boolean array is_removed to mark positions for deletion, then construct the final string in one pass at the end.

4. Not Handling Edge Cases

Failing to handle cases where an asterisk appears but no non-asterisk characters exist to its left.

Incorrect Approach:

# Wrong: Assuming there's always a character to remove
for letter in ascii_lowercase:
    # What if no letters exist before the asterisk?
    index_to_remove = char_indices[letter].pop()  # Could be empty!

Why it fails: According to the problem constraints, there's always a valid character to remove before each asterisk, but defensive programming suggests checking anyway.

Correct Solution: The current implementation handles this correctly with the if char_indices[letter]: check before attempting to pop.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Depth first search is equivalent to which of the tree traversal order?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More