2351. First Letter to Appear Twice
Problem Description
The problem provides us with a string s
which is composed of lowercase English letters. Our task is to find the first letter that appears twice in the string. It's important to understand that by 'first', it means the letter whose second occurrence comes before the second occurrence of any other character that might also appear more than once. The string is guaranteed to have at least one character that meets this condition.
Intuition
To solve this problem efficiently, we consider each character in the string one by one and keep track of the ones we have already seen. The approach is to use a bitmask to keep track of the characters that have occurred once. Here's the intuition:
- We initialize an integer
mask
to 0 to use as our bitmask. - For each character
c
in the strings
, we convert it to an integer indexi
by subtracting the ASCII value of 'a' from the ASCII value ofc
. This maps 'a' to 0, 'b' to 1, and so on up to 'z'. - We then check if the ith bit in the
mask
is already set (which would mean we've seen this character before). This is done by shifting 1 left byi
positions, ANDing it withmask
, and checking if the result is not zero. - If we find that the
mask
already has the ith bit set, it means that this characterc
is the first character to appear twice, so we return it. - If not, we set the ith bit in the
mask
to indicate that we have seen the current characterc
for the first time.
By using the bitmask, we can quickly determine whether a character has appeared before with just bitwise operations that are very efficient.
Solution Approach
The solution approach uses bitwise operators to implement an efficient algorithm to find the first repeating character in the string s
. Let's dive into each part of the implementation:
-
Initializing the Bitmask: A variable
mask
is initialized to 0. Thismask
will be used to keep track of characters that have been seen once in the string. The ith bit inmask
will correspond to the ith character in the alphabet (where 'a' is 0, 'b' is 1, etc.). -
Iterating through the String: The code uses a
for
loop to iterate through each characterc
of the strings
. -
Mapping Character to Bit Position: For each character
c
, we calculate an indexi
that represents its position in the alphabet (i = ord(c) - ord('a')
). This indexi
is then used to check or set the corresponding bit in themask
. -
Checking for Repeats: To check if a character has appeared before, we shift 1 to the left by
i
positions (1 << i
) and perform a bitwise AND with themask
. If the result is nonzero (mask >> i & 1
), the character has been seen before, and it is the first character to appear twice. -
Setting Bits for Unseen Characters: If the character has not been seen before, we set its corresponding bit in the
mask
using a bitwise OR with1 << i
(mask |= 1 << i
). -
Returning the Result: Once we find a character whose bit is already set in the
mask
, we return that character since it is the first character to occur twice.
This algorithm efficiently solves the problem in O(n)
time complexity, where n
is the length of the string s
. Because we only go through the string once and the operations for each character are constant time. Moreover, it does not require additional data structures like hash maps or arrays to store counts or indexes of characters, thereby offering O(1)
space complexity, aside from the input string.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach using the string s = "abccba"
:
-
Initializing the Bitmask: Start with a
mask
equal to 0. This mask will be used to track each letter we encounter ins
. -
Iterating through the String: We go through each letter in
s
one-by-one:a. First Letter - 'a': For
c = 'a'
:i
is computed asord('a') - ord('a')
which is0
.- Check if
mask & (1 << 0)
is non-zero. It's not; it's zero sincemask
is currently 0. - Set the 0th bit of
mask
to denote that we've seen 'a':mask |= 1 << 0
. Nowmask = 1
.
b. Second Letter - 'b': For
c = 'b'
:i
is computed asord('b') - ord('a')
which is1
.- Check if
mask & (1 << 1)
is non-zero. It's not; hence, 'b' hasn't appeared before. - Set the 1st bit of
mask
:mask |= 1 << 1
. Nowmask = 3
.
c. Third Letter - 'c': For
c = 'c'
:i
is computed asord('c') - ord('a')
which is2
.- Check if
mask & (1 << 2)
is non-zero. It's not; hence, 'c' hasn't appeared before. - Set the 2nd bit of
mask
:mask |= 1 << 2
. Nowmask = 7
.
d. Fourth Letter - 'c' (repeat): For
c = 'c'
again:i
is computed asord('c') - ord('a')
again, which is2
.- Check if
mask & (1 << 2)
is non-zero. It is non-zero since the 2nd bit inmask
is already set. - We've found the first letter that appears twice in the string, so we return 'c'.
-
Returning the Result: The first repeating character is
'c'
, as its second occurrence is before the second occurrence of any other letter. Therefore,'c'
is the output.
Understand that every time we "set a bit" in the mask, we transform the mask to represent the letters we've seen. For 'a', the binary representation of mask
becomes 000001
, for 'b' it becomes 000011
, and for 'c' we have 000111
. When the second 'c' is encountered, the corresponding bit is already set, and because bitwise AND of mask
and 1 << 2
results in a non-zero value, we know that 'c' is a repeating character.
Solution Implementation
1class Solution:
2 def repeatedCharacter(self, s: str) -> str:
3 # Initialize a mask to keep track of characters encountered
4 encountered_chars_mask = 0
5
6 # Loop through each character in the string
7 for char in s:
8 # Calculate the position of the character in the alphabet
9 # 'a' maps to 0, 'b' maps to 1, ..., 'z' maps to 25
10 alphabet_pos = ord(char) - ord('a')
11
12 # Check if the bit at the position of the current character is already set
13 if encountered_chars_mask & (1 << alphabet_pos):
14 # If the bit is set, the current character has been encountered before
15 return char
16
17 # Set the bit at the position of the current character in the mask,
18 # indicating the character has now been encountered
19 encountered_chars_mask |= 1 << alphabet_pos
20
21 # If the function exits the loop without returning, no repeated character was found
22 # (Although given the context of this function, it seems that an assumption is made
23 # that there will always be at least one repeated character in the string.)
24
1class Solution {
2 // This method finds the first repeated character in a given string
3 public char repeatedCharacter(String s) {
4 // 'mask' is used to store the information about characters seen so far
5 int mask = 0;
6 // Loop through each character in the string
7 for (int i = 0; i < s.length(); ++i) {
8 // Get the current character
9 char c = s.charAt(i);
10 // Calculate the bit position for the current character
11 int charPosition = c - 'a';
12 // Check if the current character has already been seen
13 if ((mask & (1 << charPosition)) != 0) {
14 // If the character has been seen before, return it as the first repeated character
15 return c;
16 }
17 // Set the bit for the current character in the 'mask' to mark it as seen
18 mask |= 1 << charPosition;
19 }
20 // This line is never reached since the problem statement guarantees at least one repetition
21 return '\0'; // Placeholder for compile-time checks, it represents a null character
22 }
23}
24
1#include <string> // Include necessary header
2
3class Solution {
4public:
5 // Method to find the first recurring character in a string
6 char repeatedCharacter(string s) {
7 int charBitmask = 0; // Initialize a bitmask to keep track of characters seen
8
9 // Loop through the characters of the string
10 for (int i = 0; i < s.length(); ++i) {
11 int charIndex = s[i] - 'a'; // Calculate the index of the current character (0-25 for a-z)
12
13 // Check if this character has been seen before by checking the bitmask.
14 // If the corresponding bit is set, we've found a repeated character
15 if ((charBitmask >> charIndex) & 1) {
16 return s[i]; // Return the repeated character
17 }
18
19 // Set the bit corresponding to this character in the bitmask
20 // to mark that it has now been seen
21 charBitmask |= 1 << charIndex;
22 }
23
24 // If no repeated character is found by the end of the string
25 // (which shouldn't happen as per the method's contract),
26 // the loop would exit by an exception since the exit condition of the loop is never true.
27 // To avoid the potential of reaching the end of the function without a return value,
28 // you can return a placeholder here (e.g., 0) or handle the error according to specifications.
29 throw std::logic_error("No repeated characters in input string.");
30 }
31};
32
1/**
2 * Finds the first repeated character in a given string.
3 *
4 * @param {string} targetString - The string to search for repeated characters.
5 * @returns {string} - The first repeated character if found; otherwise, a space character.
6 */
7function repeatedCharacter(targetString: string): string {
8 // Initialize a bitmask to track the presence of characters.
9 let charPresenceMask = 0;
10
11 // Iterate through each character in the given string.
12 for (const char of targetString) {
13 // Calculate the bit position by getting ASCII value difference
14 // from the target char to the 'a' character.
15 const bitPosition = char.charCodeAt(0) - 'a'.charCodeAt(0);
16
17 // Check if the bit at the calculated position is already set in the mask.
18 // If it is, we have found a repeated character.
19 if (charPresenceMask & (1 << bitPosition)) {
20 return char; // Return the repeated character.
21 }
22
23 // Update the mask to include the current character.
24 charPresenceMask |= 1 << bitPosition;
25 }
26
27 // If no repeated character was found, return a space character.
28 return ' ';
29}
30
Time and Space Complexity
Time Complexity:
The time complexity of the code is O(n)
, where n
is the length of the input string s
. This is because the code iterates over each character of the string exactly once. Within the loop, the operations—calculating the index i
, shifting the bitmask, performing the bitwise AND and OR operations—are all constant time operations, i.e., O(1)
.
Space Complexity:
The space complexity of the code is O(1)
. This is because the space used by the variables mask
and i
, and any other temporary space, is constant and does not depend on the input size. The bitmask mask
is an integer value that only requires a fixed amount of space, and regardless of the size of the string, it does not require additional space as the input size grows.
Learn more about how to find time and space complexity quickly using problem constraints.
How does quick sort divide the problem into subproblems?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!