389. Find the Difference

EasyBit ManipulationHash TableStringSorting

Problem Description

You are given two strings s and t.

String t is created by taking string s, randomly shuffling all its characters, and then adding one extra letter at a random position in the shuffled result.

Your task is to find and return the single letter that was added to create string t.

For example:

If s = "abcd" and t = "abcde", the added letter is 'e'
If s = "a" and t = "aa", the added letter is 'a'
If s = "" and t = "y", the added letter is 'y'

The solution uses a counting approach: it counts the frequency of each character in string s using a Counter, then iterates through string t and decrements the count for each character encountered. When a character's count becomes negative, it means this character appears more times in t than in s, so it must be the added letter.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

Since string t is just string s with one extra character added, we know that t has exactly the same characters as s, except for one character that appears one more time in t.

Think of it like having two bags of letters. The second bag has everything from the first bag plus one extra letter. To find that extra letter, we can track how many of each letter we have in the first bag, then go through the second bag and remove letters one by one. The moment we try to remove a letter that we don't have (or have already used up), we've found our extra letter.

This naturally leads us to a counting strategy: if we count all characters in s and then subtract the counts as we see characters in t, the character that makes our count go negative must be the added one. This works because:

All original characters from s are present in t (just shuffled)
There's exactly one extra character in t
When we encounter this extra character, we'll either have no count for it (if it wasn't in s at all) or we'll have already used up all instances of it from s

The beauty of this approach is that we don't need to worry about the shuffling at all - we're just comparing character frequencies between the two strings.

Solution Approach

The implementation uses a hash table (Counter in Python) to track character frequencies:

Initialize the Counter: Create a Counter object from string s. This automatically counts the frequency of each character in s. For example, if s = "aabb", then cnt = {'a': 2, 'b': 2}.
Iterate through string t: Process each character c in string t one by one.
Decrement the count: For each character c encountered in t, subtract 1 from its count in cnt using cnt[c] -= 1. This effectively "removes" one instance of that character from our tracking.
Check for negative count: After decrementing, check if cnt[c] < 0. If the count becomes negative, it means:
- Either this character wasn't in s at all (Counter returns 0 for missing keys, so 0 - 1 = -1)
- Or we've already matched all instances of this character from s, and this is the extra occurrence
Return the extra character: As soon as we find a character with a negative count, we immediately return it as the answer.

The algorithm has a time complexity of O(n) where n is the length of string t (since we iterate through both strings once), and space complexity of O(1) since there are at most 26 lowercase English letters to track in the Counter.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with s = "abc" and t = "cbdaa".

Step 1: Initialize the Counter We create a Counter from string s = "abc":

cnt = {'a': 1, 'b': 1, 'c': 1}

Step 2-4: Process each character in t = "cbdaa"

Processing 'c' (index 0):

cnt['c'] -= 1 → cnt['c'] = 0
Check: Is cnt['c'] < 0? No (it's 0)
Continue

Processing 'b' (index 1):

cnt['b'] -= 1 → cnt['b'] = 0
Check: Is cnt['b'] < 0? No (it's 0)
Continue

Processing 'd' (index 2):

cnt['d'] -= 1 → cnt['d'] = -1 (since 'd' wasn't in original counter, it starts at 0)
Check: Is cnt['d'] < 0? Yes!
Return 'd' ✓

We found our answer! The character 'd' was added to create string t.

Let's verify: String s = "abc" contains the letters a, b, and c. String t = "cbdaa" contains all those letters (c, b, a) plus an extra 'd' and an extra 'a'. Wait, that seems wrong...

Let me reconsider. Actually, looking at t = "cbdaa", it has: c(1), b(1), d(1), a(2). String s = "abc" has: a(1), b(1), c(1). So the difference is that t has one extra 'a' and one 'd'. This means the problem statement indicates only ONE character is added total.

Let me redo with a correct example where only one character is added:

Let's walk through with s = "abc" and t = "abcd".

Step 1: Initialize the Counter We create a Counter from string s = "abc":

cnt = {'a': 1, 'b': 1, 'c': 1}

Step 2-4: Process each character in t = "abcd"

Processing 'a' (index 0):

cnt['a'] -= 1 → cnt['a'] = 0
Check: Is cnt['a'] < 0? No (it's 0)
Continue

Processing 'b' (index 1):

cnt['b'] -= 1 → cnt['b'] = 0
Check: Is cnt['b'] < 0? No (it's 0)
Continue

Processing 'c' (index 2):

cnt['c'] -= 1 → cnt['c'] = 0
Check: Is cnt['c'] < 0? No (it's 0)
Continue

Processing 'd' (index 3):

cnt['d'] -= 1 → cnt['d'] = -1 (since 'd' wasn't in original counter, it starts at 0)
Check: Is cnt['d'] < 0? Yes!
Return 'd' ✓

The algorithm correctly identifies 'd' as the added character.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def findTheDifference(self, s: str, t: str) -> str:
5        # Count the frequency of each character in string s
6        char_count = Counter(s)
7      
8        # Iterate through each character in string t
9        for char in t:
10            # Decrement the count for this character
11            char_count[char] -= 1
12          
13            # If count becomes negative, this is the extra character added to t
14            if char_count[char] < 0:
15                return char
16

1class Solution {
2    public char findTheDifference(String s, String t) {
3        // Create an array to count frequency of each letter (26 lowercase letters)
4        int[] charCount = new int[26];
5      
6        // Count the frequency of each character in string s
7        for (int i = 0; i < s.length(); i++) {
8            // Convert character to index (0-25) and increment count
9            charCount[s.charAt(i) - 'a']++;
10        }
11      
12        // Iterate through string t to find the extra character
13        for (int i = 0; i < t.length(); i++) {
14            // Decrement count for each character in t
15            charCount[t.charAt(i) - 'a']--;
16          
17            // If count becomes negative, this is the extra character
18            if (charCount[t.charAt(i) - 'a'] < 0) {
19                return t.charAt(i);
20            }
21        }
22      
23        // This line should never be reached given the problem constraints
24        return ' ';
25    }
26}
27

1class Solution {
2public:
3    char findTheDifference(string s, string t) {
4        // Initialize frequency array for 26 lowercase letters
5        int charFrequency[26] = {0};
6      
7        // Count frequency of each character in string s
8        for (char& ch : s) {
9            charFrequency[ch - 'a']++;
10        }
11      
12        // Decrement frequency for each character in string t
13        // If frequency becomes negative, this is the added character
14        for (char& ch : t) {
15            charFrequency[ch - 'a']--;
16            if (charFrequency[ch - 'a'] < 0) {
17                return ch;
18            }
19        }
20      
21        // This line should never be reached given the problem constraints
22        // (t is generated by adding one letter to s)
23        return ' ';
24    }
25};
26

1/**
2 * Finds the extra character that was added to string t
3 * @param s - The original string containing lowercase letters
4 * @param t - The shuffled string with one additional character
5 * @returns The extra character that appears in t but not in s
6 */
7function findTheDifference(s: string, t: string): string {
8    // Create an array to count character frequencies (26 letters in alphabet)
9    const characterCount: number[] = Array(26).fill(0);
10  
11    // Count each character in string s by incrementing the corresponding index
12    for (const char of s) {
13        const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
14        characterCount[index]++;
15    }
16  
17    // Decrement count for each character in string t
18    for (const char of t) {
19        const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
20        characterCount[index]--;
21    }
22  
23    // Find the character with negative count (appears more in t than in s)
24    for (let i = 0; i < characterCount.length; i++) {
25        if (characterCount[i] < 0) {
26            // Convert index back to character and return
27            return String.fromCharCode(i + 'a'.charCodeAt(0));
28        }
29    }
30  
31    // This point should never be reached given valid input
32    return '';
33}
34

Time and Space Complexity

The time complexity is O(n + m), where n is the length of string s and m is the length of string t. The algorithm first creates a Counter from string s which takes O(n) time, then iterates through string t which takes O(m) time. Since t has exactly one more character than s (i.e., m = n + 1), the overall time complexity simplifies to O(n).

The space complexity is O(|Σ|), where Σ represents the character set. The Counter stores at most all unique characters from string s. Since the problem deals with lowercase English letters only, |Σ| = 26, making the space complexity O(26) = O(1) constant space.

Common Pitfalls

1. Assuming the Extra Character is Always New

A common misconception is thinking the added character must be a letter that doesn't exist in string s. However, the extra character can be a duplicate of an existing character. For example, if s = "abc" and t = "abcc", the extra character is 'c', not a new letter.

Why this matters: If you try to find a character that exists in t but not in s, you'll miss cases where the extra character is a duplicate.

2. Using Direct Dictionary Access Without Counter

Some might attempt to use a regular dictionary and access dict[char] directly without initializing counts:

# Incorrect approach
char_count = {}
for char in s:
    char_count[char] = char_count.get(char, 0) + 1

for char in t:
    char_count[char] -= 1  # KeyError if char not in s!

Problem: This raises a KeyError when encountering a character in t that wasn't in s. While Counter handles missing keys gracefully (returning 0), a regular dict doesn't.

Solution: Either use Counter as shown in the original solution, or handle missing keys explicitly:

for char in t:
    char_count[char] = char_count.get(char, 0) - 1
    if char_count[char] < 0:
        return char

3. Overthinking with Sorting

Some might sort both strings and compare them character by character:

# Less efficient approach
s_sorted = sorted(s)
t_sorted = sorted(t)

Problem: While this works, sorting has O(n log n) time complexity, making it less efficient than the O(n) counting approach.

4. XOR Solution Confusion

An alternative O(1) space solution uses XOR properties, but it's easy to misunderstand:

# XOR approach
result = 0
for char in s:
    result ^= ord(char)
for char in t:
    result ^= ord(char)
return chr(result)

Pitfall: Forgetting that XOR works because identical values cancel out (a ^ a = 0), leaving only the extra character. This requires understanding bitwise operations and may be less intuitive than the counting approach.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.

Which of the following method should we use to solve this problem?

Topological Sort

Uniform Cost Search

Monotonic Stack

Memoization

I don't know