316. Remove Duplicate Letters
Problem Description
The problem at hand is to take an input string s
and remove duplicate letters from it, such that the resulting string contains each letter only once. However, there's an additional constraint: the resulting string must be the smallest possible string in lexicographical order. Lexicographical order is essentially the dictionary order or alphabetical order. So, if given two strings, the one that comes first in a dictionary is said to be lexicographically smaller. Our task is to ensure that, among all the possible unique permutations of the input string's letters, we pick the one that's smallest lexicographically.
Intuition
The intuition behind the solution is to construct the result string character by character, making sure that:
- Each letter appears once, and
- The string is the smallest in lexicographical order.
To accomplish this, we use a stack to keep track of the characters in the result string while iterating over the input string. The algorithm uses two data structuresโa stack and a set:
- The stack
stk
is used to create the result string. - The set
vis
keeps track of the characters currently in the stack.
As we go through each character in the input string, we perform checks:
- If the current character is already in
vis
, we skip adding it to the stack to avoid duplicates. - If it's not in
vis
, we check whether it can replace any characters previously added to the stack to ensure the lexicographically smallest order. We do this by popping characters from the stack that are greater than the current character and appear later in the string (their last occurrence in the string is at an index greater than the current index).
By using the stack in this way, we're able to maintain the smallest lexicographical order while also ensuring we don't skip any characters that must be in the final result because we know where their last appearance is in the string, thanks to the last
dictionary.
The process of building the result string incorporates the following steps:
- If current character is not in
vis
, continue to step 2. Otherwise, go to the next character. - While there is a character at the top of the stack that is lexicographically larger than the current character and it appears again at a later index, pop it from the stack and remove it from
vis
. - Push the current character onto the stack and add it to
vis
.
Once we finish iterating over the string, the result is constructed by joining all characters in the stack. This result is guaranteed to be void of duplicates and the smallest lexicographical permutation of the input string's characters.
Learn more about Stack, Greedy and Monotonic Stack patterns.
Solution Approach
The solution is implemented in Python, and it employs a stack, a set, and a dictionary comprehensively to ensure that duplicates are removed and the lexicographically smallest order is achieved.
Here's how the implementation breaks down:
-
We first create a dictionary
last
which holds the last occurrence index of each character in the string. This helps us to know if a character on the stack can be popped out or not (if a character can be found later in the string, perhaps we want to remove it now to maintain lexicographical order).1last = {c: i for i, c in enumerate(s)}
-
We then initialize an empty list
stk
which acts as our stack, and an empty setvis
which holds the characters that have been added to the stack.1stk = [] 2vis = set()
-
We iterate over the characters and their indices in the string. For each character
c
:- If
c
is already invis
, we continue to the next character as we want to avoid duplicates in our stack. - If
c
is not invis
, we compare it with the characters currently in the stack and see if we can make our string smaller by removing any characters that are greater thanc
and are also present later in the string (last[stk[-1]] > i
). - This is done in a
while
loop that continues to pop characters from the top of the stack until the top character is smaller thanc
, or there is no more occurrence of that character later in the string.
1for i, c in enumerate(s): 2 if c in vis: 3 continue 4 while stk and stk[-1] > c and last[stk[-1]] > i: 5 vis.remove(stk.pop()) 6 stk.append(c) 7 vis.add(c)
- If
-
Finally, we join the characters in the stack and return the resulting string. The stack, at this point, contains the characters of our resulting string in the correct order, satisfying both of our conditions: no duplicates, and smallest lexicographical order.
1return ''.join(stk)
The key algorithmic patterns used in this solution are a stack to manage the order of characters and a set to track which characters are in the stack to prevent duplicates. The use of a dictionary to keep track of the last occurrences of characters helps to decide whether to pop a character from the stack to maintain the lexicographically smallest order.
In conclusion, the code systematically manages to satisfy the requirements of the problem statement by ensuring unique characters using a set and optimizing for the lexicographically smallest output by intelligently popping from a stack with the aid of a dictionary that keeps track of characters' last positions.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with a simple example. Consider the input string s = "bcabc"
. We want to remove duplicates and return the smallest lexicographical string using the approach described.
-
We start by constructing the
last
dictionary to record the last occurrence of each character:1last = {'b': 3, 'c': 4, 'a': 2}
-
We initialize the
stk
list andvis
set:1stk = [] 2vis = set()
-
We iterate over the characters in the string "bcabc":
-
For the first character
'b'
, it's not invis
, so we add'b'
tostk
andvis
.1stk = ['b'] 2vis = {'b'}
-
Next is
'c'
, also not invis
, so we add'c'
tostk
andvis
.1stk = ['b', 'c'] 2vis = {'b', 'c'}
-
Then, we encounter
'a'
. It's not invis
, but before we add it tostk
, we check if we can pop any characters that are lexicographically larger and occur later.'c'
and'b'
are both larger, but'c'
occurs later (last['c'] > i
), so we pop'c'
fromstk
and remove it fromvis
.'b'
also occurs later, so we pop'b'
as well.1stk = [] 2vis = {}
Now we add
'a'
.1stk = ['a'] 2vis = {'a'}
-
For the next
'b'
, it is not invis
, and 'a' in stack is smaller than 'b', so'b'
is simply added tostk
andvis
.1stk = ['a', 'b'] 2vis = {'a', 'b'}
-
Finally, we have another
'c'
. It isn't invis
, and'b'
in the stack is smaller than'c'
, so again we add'c'
to bothstk
andvis
.1stk = ['a', 'b', 'c'] 2vis = {'a', 'b', 'c'}
-
-
The iteration is over and we join the stack to get the result:
1return ''.join(stk) # 'abc'
The final output is 'abc'
, which contains no duplicate letters and is the smallest possible string in lexicographical order made from the letters of the original string.
Solution Implementation
1class Solution:
2 def removeDuplicateLetters(self, s: str) -> str:
3 # Create a dictionary to store the last occurrence of each character
4 last_occurrence = {char: index for index, char in enumerate(s)}
5
6 # Initialize an empty stack to keep track of the characters in result
7 stack = []
8
9 # Set to keep track of characters already in the stack to avoid duplicates
10 visited = set()
11
12 # Iterate over each character and its index in the string
13 for index, char in enumerate(s):
14 # Skip if the character is already in the visited set
15 if char in visited:
16 continue
17
18 # Ensure the top element of the stack is greater than the current character
19 # and the top element occurs later in the string as well
20 while stack and stack[-1] > char and last_occurrence[stack[-1]] > index:
21 # The top element can be removed and thus it is no longer visited.
22 visited.remove(stack.pop())
23
24 # Add the current character to the stack and mark it as visited
25 stack.append(char)
26 visited.add(char)
27
28 # Convert the stack to a string by joining the characters
29 return ''.join(stack)
30
1class Solution {
2 public String removeDuplicateLetters(String s) {
3 int stringLength = s.length();
4 int[] lastIndex = new int[26]; // to store the last index of each character
5
6 // Fill array with the last position of each character in the string
7 for (int i = 0; i < stringLength; ++i) {
8 lastIndex[s.charAt(i) - 'a'] = i;
9 }
10
11 Deque<Character> stack = new ArrayDeque<>(); // stack to hold the characters for result
12 int bitmap = 0; // to keep track of which characters are already in stack
13
14 // Iterate through the string characters
15 for (int i = 0; i < stringLength; ++i) {
16 char currentChar = s.charAt(i);
17 // Check if the current character is already in stack (bit is set)
18 if (((bitmap >> (currentChar - 'a')) & 1) == 1) {
19 continue; // Skip if character is already present
20 }
21
22 // Ensure characters in stack are in the correct order and remove any that aren't
23 while (!stack.isEmpty() && stack.peek() > currentChar && lastIndex[stack.peek() - 'a'] > i) {
24 bitmap ^= 1 << (stack.pop() - 'a'); // Set the bit to 0 for popped character
25 }
26
27 stack.push(currentChar); // Add current character to the stack
28 bitmap |= 1 << (currentChar - 'a'); // Set the bit to 1 for current character
29 }
30
31 StringBuilder resultBuilder = new StringBuilder();
32 // Build the result string from the characters in stack
33 for (char c : stack) {
34 resultBuilder.append(c);
35 }
36
37 // The order of characters in stack is reversed so we need to reverse the string
38 return resultBuilder.reverse().toString();
39 }
40}
41
1class Solution {
2public:
3 // Function to remove duplicate letters and return the smallest in lexicographical order string that contains every letter of s exactly once.
4 string removeDuplicateLetters(string s) {
5 int n = s.size();
6
7 // Initialize an array to store the index of the last occurrence of each character.
8 int lastIndex[26] = {0};
9 for (int i = 0; i < n; ++i) {
10 lastIndex[s[i] - 'a'] = i;
11 }
12
13 // String to store the answer.
14 string answer;
15
16 // This will serve as a bitmask to keep track of which characters have been added to the answer.
17 int usedMask = 0;
18
19 // Iterate through the characters of the string.
20 for (int i = 0; i < n; ++i) {
21 char currentChar = s[i];
22
23 // If the current character has already been used, skip it.
24 if ((usedMask >> (currentChar - 'a')) & 1) {
25 continue;
26 }
27
28 // While the answer is not empty, the last character in the answer is greater than the current character,
29 // and the last occurrence of the last character in the answer is after the current position,
30 // we remove the last character from the answer and update the mask.
31 while (!answer.empty() && answer.back() > currentChar && lastIndex[answer.back() - 'a'] > i) {
32 // Remove the last character from the used mask.
33 usedMask ^= 1 << (answer.back() - 'a');
34 // Remove the last character from the answer.
35 answer.pop_back();
36 }
37
38 // Append the current character to the answer.
39 answer.push_back(currentChar);
40 // Mark the current character as used in the mask.
41 usedMask |= 1 << (currentChar - 'a');
42 }
43
44 return answer;
45 }
46};
47
1// Function to remove duplicate letters and return the smallest
2// lexicographical order string that contains every letter of s exactly once.
3function removeDuplicateLetters(s: string): string {
4 const n: number = s.length;
5
6 // Initialize an array to store the index of the last occurrence of each character.
7 const lastIndex: number[] = new Array(26).fill(0);
8 for (let i = 0; i < n; ++i) {
9 lastIndex[s.charCodeAt(i) - 'a'.charCodeAt(0)] = i;
10 }
11
12 // String to store the answer.
13 let answer: string = '';
14
15 // This will serve as a bitmask to keep track of which characters have been added to the answer.
16 let usedMask: number = 0;
17
18 // Iterate through the characters of the string.
19 for (let i = 0; i < n; ++i) {
20 const currentChar: string = s[i];
21
22 // If the current character has already been used, skip it.
23 if ((usedMask >> (currentChar.charCodeAt(0) - 'a'.charCodeAt(0))) & 1) {
24 continue;
25 }
26
27 // While the answer is not empty, the last character in the answer is greater than the current character,
28 // and the last occurrence of the last character in the answer is after the current position,
29 // we remove the last character from the answer and update the mask.
30 while (answer && answer[answer.length - 1] > currentChar &&
31 lastIndex[answer.charCodeAt(answer.length - 1) - 'a'.charCodeAt(0)] > i) {
32 // Remove the last character from the used mask.
33 usedMask ^= 1 << (answer.charCodeAt(answer.length - 1) - 'a'.charCodeAt(0));
34 // Remove the last character from the answer.
35 answer = answer.slice(0, -1); // String slice operation to remove last character
36 }
37
38 // Append the current character to the answer.
39 answer += currentChar;
40 // Mark the current character as used in the mask.
41 usedMask |= 1 << (currentChar.charCodeAt(0) - 'a'.charCodeAt(0));
42 }
43
44 return answer;
45}
46
Time and Space Complexity
Time Complexity
The main loop in the given code iterates through each character of the input string s
. Operations inside this loop include checking if a character is in the visited set, which is an O(1)
operation, and comparing characters which is also O(1)
. Additionally, the while loop may result in popping elements from the stack. In the worst case, this popping can happen n
times throughout the entire for loop, but each element is popped only once. So, the amortized time complexity for popping is O(1). These operations inside the loop do not change the overall linear time traversal of the string. Hence, the time complexity of the code is O(n)
, where n
is the length of the string.
Space Complexity
We use a stack stk
, a set vis
, and a dictionary last
to store the characters. The size of the stack and the set will at most grow to the size of the unique characters in s
, which is limited by the number of distinct characters in the alphabet (at most 26 for English letters). Hence, we can consider this to be O(1)
space. The dictionary last
contains a key for every unique character in the string, so this is also O(1)
under the assumption of a fixed character set. Therefore, the overall space complexity is O(1)
because the space required does not grow with the size of the input string, but is rather constant due to the fixed character set.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following shows the order of node visit in a Breadth-first Search?
Recommended Readings
Stack Intro Imagine you have a pile of books on your desk If you want to add a new book you place it on top If you want to read a book you take it from the top And if you simply want to see which book is on the top you
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Monotonic Stack Deque Intro If you'd prefer a video div class responsive iframe iframe src https www youtube com embed Dq_ObZwTY_Q title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div The word monotonic means a list or
Got a question?ย Ask the Monster Assistantย anything you don't understand.
Still not clear? ย Submitย the part you don't understand to our editors. Or join ourย Discord and ask the community.