316. Remove Duplicate Letters


Problem Description

The problem at hand is to take an input string s and remove duplicate letters from it, such that the resulting string contains each letter only once. However, there's an additional constraint: the resulting string must be the smallest possible string in lexicographical order. Lexicographical order is essentially the dictionary order or alphabetical order. So, if given two strings, the one that comes first in a dictionary is said to be lexicographically smaller. Our task is to ensure that, among all the possible unique permutations of the input string's letters, we pick the one that's smallest lexicographically.

Intuition

The intuition behind the solution is to construct the result string character by character, making sure that:

  • Each letter appears once, and
  • The string is the smallest in lexicographical order.

To accomplish this, we use a stack to keep track of the characters in the result string while iterating over the input string. The algorithm uses two data structures—a stack and a set:

  • The stack stk is used to create the result string.
  • The set vis keeps track of the characters currently in the stack.

As we go through each character in the input string, we perform checks:

  • If the current character is already in vis, we skip adding it to the stack to avoid duplicates.
  • If it's not in vis, we check whether it can replace any characters previously added to the stack to ensure the lexicographically smallest order. We do this by popping characters from the stack that are greater than the current character and appear later in the string (their last occurrence in the string is at an index greater than the current index).

By using the stack in this way, we're able to maintain the smallest lexicographical order while also ensuring we don't skip any characters that must be in the final result because we know where their last appearance is in the string, thanks to the last dictionary.

The process of building the result string incorporates the following steps:

  1. If current character is not in vis, continue to step 2. Otherwise, go to the next character.
  2. While there is a character at the top of the stack that is lexicographically larger than the current character and it appears again at a later index, pop it from the stack and remove it from vis.
  3. Push the current character onto the stack and add it to vis.

Once we finish iterating over the string, the result is constructed by joining all characters in the stack. This result is guaranteed to be void of duplicates and the smallest lexicographical permutation of the input string's characters.

Solution Approach

The solution is implemented in Python, and it employs a stack, a set, and a dictionary comprehensively to ensure that duplicates are removed and the lexicographically smallest order is achieved.

Here's how the implementation breaks down:

  1. We first create a dictionary last which holds the last occurrence index of each character in the string. This helps us to know if a character on the stack can be popped out or not (if a character can be found later in the string, perhaps we want to remove it now to maintain lexicographical order).

    1last = {c: i for i, c in enumerate(s)}
  2. We then initialize an empty list stk which acts as our stack, and an empty set vis which holds the characters that have been added to the stack.

    1stk = []
    2vis = set()
  3. We iterate over the characters and their indices in the string. For each character c:

    • If c is already in vis, we continue to the next character as we want to avoid duplicates in our stack.
    • If c is not in vis, we compare it with the characters currently in the stack and see if we can make our string smaller by removing any characters that are greater than c and are also present later in the string (last[stk[-1]] > i).
    • This is done in a while loop that continues to pop characters from the top of the stack until the top character is smaller than c, or there is no more occurrence of that character later in the string.
    1for i, c in enumerate(s):
    2    if c in vis:
    3        continue
    4    while stk and stk[-1] > c and last[stk[-1]] > i:
    5        vis.remove(stk.pop())
    6    stk.append(c)
    7    vis.add(c)
  4. Finally, we join the characters in the stack and return the resulting string. The stack, at this point, contains the characters of our resulting string in the correct order, satisfying both of our conditions: no duplicates, and smallest lexicographical order.

    1return ''.join(stk)

The key algorithmic patterns used in this solution are a stack to manage the order of characters and a set to track which characters are in the stack to prevent duplicates. The use of a dictionary to keep track of the last occurrences of characters helps to decide whether to pop a character from the stack to maintain the lexicographically smallest order.

In conclusion, the code systematically manages to satisfy the requirements of the problem statement by ensuring unique characters using a set and optimizing for the lexicographically smallest output by intelligently popping from a stack with the aid of a dictionary that keeps track of characters' last positions.

💪
Level Up Your
Algo Skills

Example Walkthrough

Let's walk through the solution with a simple example. Consider the input string s = "bcabc". We want to remove duplicates and return the smallest lexicographical string using the approach described.

  1. We start by constructing the last dictionary to record the last occurrence of each character:

    1last = {'b': 3, 'c': 4, 'a': 2}
  2. We initialize the stk list and vis set:

    1stk = []
    2vis = set()
  3. We iterate over the characters in the string "bcabc":

    • For the first character 'b', it's not in vis, so we add 'b' to stk and vis.

      1stk = ['b']
      2vis = {'b'}
    • Next is 'c', also not in vis, so we add 'c' to stk and vis.

      1stk = ['b', 'c']
      2vis = {'b', 'c'}
    • Then, we encounter 'a'. It's not in vis, but before we add it to stk, we check if we can pop any characters that are lexicographically larger and occur later. 'c' and 'b' are both larger, but 'c' occurs later (last['c'] > i), so we pop 'c' from stk and remove it from vis. 'b' also occurs later, so we pop 'b' as well.

      1stk = []
      2vis = {}

      Now we add 'a'.

      1stk = ['a']
      2vis = {'a'}
    • For the next 'b', it is not in vis, and 'a' in stack is smaller than 'b', so 'b' is simply added to stk and vis.

      1stk = ['a', 'b']
      2vis = {'a', 'b'}
    • Finally, we have another 'c'. It isn't in vis, and 'b' in the stack is smaller than 'c', so again we add 'c' to both stk and vis.

      1stk = ['a', 'b', 'c']
      2vis = {'a', 'b', 'c'}
  4. The iteration is over and we join the stack to get the result:

    1return ''.join(stk)  # 'abc'

The final output is 'abc', which contains no duplicate letters and is the smallest possible string in lexicographical order made from the letters of the original string.

Python Solution

1class Solution:
2    def removeDuplicateLetters(self, s: str) -> str:
3        # Create a dictionary to store the last occurrence of each character
4        last_occurrence = {char: index for index, char in enumerate(s)}
5      
6        # Initialize an empty stack to keep track of the characters in result
7        stack = []
8      
9        # Set to keep track of characters already in the stack to avoid duplicates
10        visited = set()
11      
12        # Iterate over each character and its index in the string
13        for index, char in enumerate(s):
14            # Skip if the character is already in the visited set
15            if char in visited:
16                continue
17
18            # Ensure the top element of the stack is greater than the current character
19            # and the top element occurs later in the string as well
20            while stack and stack[-1] > char and last_occurrence[stack[-1]] > index:
21                # The top element can be removed and thus it is no longer visited.
22                visited.remove(stack.pop())
23
24            # Add the current character to the stack and mark it as visited
25            stack.append(char)
26            visited.add(char)
27
28        # Convert the stack to a string by joining the characters
29        return ''.join(stack)
30

Java Solution

1class Solution {
2    public String removeDuplicateLetters(String s) {
3        int stringLength = s.length();
4        int[] lastIndex = new int[26]; // to store the last index of each character
5      
6        // Fill array with the last position of each character in the string
7        for (int i = 0; i < stringLength; ++i) {
8            lastIndex[s.charAt(i) - 'a'] = i;
9        }
10      
11        Deque<Character> stack = new ArrayDeque<>(); // stack to hold the characters for result
12        int bitmap = 0; // to keep track of which characters are already in stack
13      
14        // Iterate through the string characters
15        for (int i = 0; i < stringLength; ++i) {
16            char currentChar = s.charAt(i);
17            // Check if the current character is already in stack (bit is set)
18            if (((bitmap >> (currentChar - 'a')) & 1) == 1) {
19                continue; // Skip if character is already present
20            }
21          
22            // Ensure characters in stack are in the correct order and remove any that aren't
23            while (!stack.isEmpty() && stack.peek() > currentChar && lastIndex[stack.peek() - 'a'] > i) {
24                bitmap ^= 1 << (stack.pop() - 'a'); // Set the bit to 0 for popped character
25            }
26          
27            stack.push(currentChar); // Add current character to the stack
28            bitmap |= 1 << (currentChar - 'a'); // Set the bit to 1 for current character
29        }
30      
31        StringBuilder resultBuilder = new StringBuilder();
32        // Build the result string from the characters in stack
33        for (char c : stack) {
34            resultBuilder.append(c);
35        }
36      
37        // The order of characters in stack is reversed so we need to reverse the string
38        return resultBuilder.reverse().toString();
39    }
40}
41

C++ Solution

1class Solution {
2public:
3    // Function to remove duplicate letters and return the smallest in lexicographical order string that contains every letter of s exactly once.
4    string removeDuplicateLetters(string s) {
5        int n = s.size();
6      
7        // Initialize an array to store the index of the last occurrence of each character.
8        int lastIndex[26] = {0};
9        for (int i = 0; i < n; ++i) {
10            lastIndex[s[i] - 'a'] = i;
11        }
12      
13        // String to store the answer.
14        string answer;
15      
16        // This will serve as a bitmask to keep track of which characters have been added to the answer.
17        int usedMask = 0;
18      
19        // Iterate through the characters of the string.
20        for (int i = 0; i < n; ++i) {
21            char currentChar = s[i];
22          
23            // If the current character has already been used, skip it.
24            if ((usedMask >> (currentChar - 'a')) & 1) {
25                continue;
26            }
27          
28            // While the answer is not empty, the last character in the answer is greater than the current character,
29            // and the last occurrence of the last character in the answer is after the current position,
30            // we remove the last character from the answer and update the mask.
31            while (!answer.empty() && answer.back() > currentChar && lastIndex[answer.back() - 'a'] > i) {
32                // Remove the last character from the used mask.
33                usedMask ^= 1 << (answer.back() - 'a');
34                // Remove the last character from the answer.
35                answer.pop_back();
36            }
37          
38            // Append the current character to the answer.
39            answer.push_back(currentChar);
40            // Mark the current character as used in the mask.
41            usedMask |= 1 << (currentChar - 'a');
42        }
43      
44        return answer;
45    }
46};
47

Typescript Solution

1// Function to remove duplicate letters and return the smallest 
2// lexicographical order string that contains every letter of s exactly once.
3function removeDuplicateLetters(s: string): string {
4    const n: number = s.length;
5
6    // Initialize an array to store the index of the last occurrence of each character.
7    const lastIndex: number[] = new Array(26).fill(0);
8    for (let i = 0; i < n; ++i) {
9        lastIndex[s.charCodeAt(i) - 'a'.charCodeAt(0)] = i;
10    }
11
12    // String to store the answer.
13    let answer: string = '';
14
15    // This will serve as a bitmask to keep track of which characters have been added to the answer.
16    let usedMask: number = 0;
17
18    // Iterate through the characters of the string.
19    for (let i = 0; i < n; ++i) {
20        const currentChar: string = s[i];
21
22        // If the current character has already been used, skip it.
23        if ((usedMask >> (currentChar.charCodeAt(0) - 'a'.charCodeAt(0))) & 1) {
24            continue;
25        }
26
27        // While the answer is not empty, the last character in the answer is greater than the current character,
28        // and the last occurrence of the last character in the answer is after the current position,
29        // we remove the last character from the answer and update the mask.
30        while (answer && answer[answer.length - 1] > currentChar &&
31               lastIndex[answer.charCodeAt(answer.length - 1) - 'a'.charCodeAt(0)] > i) {
32            // Remove the last character from the used mask.
33            usedMask ^= 1 << (answer.charCodeAt(answer.length - 1) - 'a'.charCodeAt(0));
34            // Remove the last character from the answer.
35            answer = answer.slice(0, -1); // String slice operation to remove last character
36        }
37
38        // Append the current character to the answer.
39        answer += currentChar;
40        // Mark the current character as used in the mask.
41        usedMask |= 1 << (currentChar.charCodeAt(0) - 'a'.charCodeAt(0));
42    }
43
44    return answer;
45}
46

Time and Space Complexity

Time Complexity

The main loop in the given code iterates through each character of the input string s. Operations inside this loop include checking if a character is in the visited set, which is an O(1) operation, and comparing characters which is also O(1). Additionally, the while loop may result in popping elements from the stack. In the worst case, this popping can happen n times throughout the entire for loop, but each element is popped only once. So, the amortized time complexity for popping is O(1). These operations inside the loop do not change the overall linear time traversal of the string. Hence, the time complexity of the code is O(n), where n is the length of the string.

Space Complexity

We use a stack stk, a set vis, and a dictionary last to store the characters. The size of the stack and the set will at most grow to the size of the unique characters in s, which is limited by the number of distinct characters in the alphabet (at most 26 for English letters). Hence, we can consider this to be O(1) space. The dictionary last contains a key for every unique character in the string, so this is also O(1) under the assumption of a fixed character set. Therefore, the overall space complexity is O(1) because the space required does not grow with the size of the input string, but is rather constant due to the fixed character set.

😈
Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫