2053. Kth Distinct String in an Array
Problem Description
You are given an array of strings arr
and an integer k
. A distinct string is defined as a string that appears exactly once in the array (not duplicated).
Your task is to find and return the k
th distinct string from the array, considering the strings in the order they appear. If there are fewer than k
distinct strings in the array, return an empty string ""
.
For example, if arr = ["a", "b", "a", "c", "b", "d"]
and k = 2
:
- The distinct strings in order are:
"c"
(appears once) and"d"
(appears once) - The 1st distinct string is
"c"
- The 2nd distinct string is
"d"
- So the answer would be
"d"
The solution uses a hash table to count occurrences of each string using Counter(arr)
. It then iterates through the original array maintaining the order, checks if each string has a count of 1 (meaning it's distinct), and decrements k
each time a distinct string is found. When k
reaches 0, that string is the k
th distinct string and is returned. If the loop completes without finding k
distinct strings, an empty string is returned.
Intuition
The key insight is that we need to identify which strings are "distinct" (appear only once) while preserving their original order in the array. This immediately suggests a two-step approach:
- First, we need to know which strings qualify as distinct - this requires counting how many times each string appears
- Then, we need to find the
k
th such string in the original order
Why use a hash table? When we need to count occurrences of elements, a hash table (or Counter
in Python) is the natural choice because it gives us O(1)
lookup time for checking how many times a string appears.
The clever part is realizing we don't need to create a separate list of distinct strings. Instead, we can iterate through the original array again, which naturally preserves the order. As we encounter each string, we check if it's distinct (count equals 1), and if so, we decrement our counter k
. This way, when k
reaches 0, we've found exactly the k
th distinct string.
This approach avoids the need for extra space to store distinct strings separately and handles the ordering requirement elegantly. If we exhaust the array without k
reaching 0, it means there aren't enough distinct strings, so we return an empty string.
The time complexity is O(n)
for counting and O(n)
for finding the k
th distinct string, giving us O(n)
overall. The space complexity is O(n)
for the hash table in the worst case where all strings are unique.
Solution Approach
The implementation follows a straightforward counting and traversal strategy using a hash table.
Step 1: Count Occurrences
cnt = Counter(arr)
We use Python's Counter
from the collections module to create a hash table that maps each string to its frequency in the array. This takes O(n)
time where n
is the length of the array.
Step 2: Find the kth Distinct String
for s in arr: if cnt[s] == 1: k -= 1 if k == 0: return s
We iterate through the original array arr
to maintain the order of appearance. For each string s
:
- Check if
cnt[s] == 1
, which means the string is distinct (appears only once) - If it's distinct, decrement
k
by 1 - When
k
reaches 0, we've found thek
th distinct string and return it immediately
Step 3: Handle Edge Case
return ""
If we complete the loop without finding k
distinct strings (meaning there are fewer than k
distinct strings in the array), we return an empty string as required.
Example Walkthrough:
For arr = ["d", "b", "c", "b", "c", "a"]
and k = 2
:
cnt = {"d": 1, "b": 2, "c": 2, "a": 1}
- Traverse array:
"d"
: count is 1, sok
becomes 1"b"
: count is 2, skip"c"
: count is 2, skip"b"
: count is 2, skip"c"
: count is 2, skip"a"
: count is 1, sok
becomes 0, return"a"
The algorithm efficiently solves the problem in O(n)
time with O(n)
space complexity.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Example: arr = ["cat", "dog", "cat", "bird", "dog", "fish"]
and k = 2
Step 1: Count occurrences using Counter
cnt = {"cat": 2, "dog": 2, "bird": 1, "fish": 1}
- "cat" appears 2 times (not distinct)
- "dog" appears 2 times (not distinct)
- "bird" appears 1 time (distinct)
- "fish" appears 1 time (distinct)
Step 2: Traverse the original array to find the kth distinct string
Index | String | cnt[s] | Is Distinct? | k value | Action |
---|---|---|---|---|---|
0 | "cat" | 2 | No | 2 | Skip |
1 | "dog" | 2 | No | 2 | Skip |
2 | "cat" | 2 | No | 2 | Skip |
3 | "bird" | 1 | Yes | 2→1 | Decrement k |
4 | "dog" | 2 | No | 1 | Skip |
5 | "fish" | 1 | Yes | 1→0 | k reaches 0, return "fish" |
Result: The function returns "fish"
as it's the 2nd distinct string in order.
The key insight is that we maintain the original ordering by iterating through the array as it was given, only counting strings that appear exactly once. When we've seen exactly k
such strings, we return the current one.
Solution Implementation
1from typing import List
2from collections import Counter
3
4class Solution:
5 def kthDistinct(self, arr: List[str], k: int) -> str:
6 # Count the frequency of each string in the array
7 frequency_map = Counter(arr)
8
9 # Iterate through the array in original order
10 for string in arr:
11 # Check if the current string appears exactly once (is distinct)
12 if frequency_map[string] == 1:
13 # Decrement k for each distinct string found
14 k -= 1
15 # If we've found the k-th distinct string, return it
16 if k == 0:
17 return string
18
19 # If there are fewer than k distinct strings, return empty string
20 return ""
21
1class Solution {
2 /**
3 * Finds the kth distinct string that appears exactly once in the array.
4 * A distinct string is one that appears exactly once in the array.
5 *
6 * @param arr the input array of strings
7 * @param k the position (1-indexed) of the distinct string to return
8 * @return the kth distinct string, or empty string if there are fewer than k distinct strings
9 */
10 public String kthDistinct(String[] arr, int k) {
11 // Create a frequency map to count occurrences of each string
12 Map<String, Integer> frequencyMap = new HashMap<>();
13
14 // Count the frequency of each string in the array
15 for (String str : arr) {
16 frequencyMap.merge(str, 1, Integer::sum);
17 }
18
19 // Iterate through the array in original order to find the kth distinct string
20 for (String str : arr) {
21 // Check if current string appears exactly once (is distinct)
22 if (frequencyMap.get(str) == 1) {
23 // Decrement k and check if we've found the kth distinct string
24 k--;
25 if (k == 0) {
26 return str;
27 }
28 }
29 }
30
31 // Return empty string if there are fewer than k distinct strings
32 return "";
33 }
34}
35
1class Solution {
2public:
3 string kthDistinct(vector<string>& arr, int k) {
4 // Hash map to store the frequency of each string
5 unordered_map<string, int> frequencyMap;
6
7 // First pass: Count occurrences of each string
8 for (const auto& str : arr) {
9 ++frequencyMap[str];
10 }
11
12 // Second pass: Find the k-th distinct string (appears exactly once)
13 for (const auto& str : arr) {
14 // Check if current string is distinct (frequency == 1)
15 if (frequencyMap[str] == 1) {
16 // Decrement k and check if we've found the k-th distinct string
17 --k;
18 if (k == 0) {
19 return str;
20 }
21 }
22 }
23
24 // If there are fewer than k distinct strings, return empty string
25 return "";
26 }
27};
28
1/**
2 * Finds the kth distinct string in an array
3 * A distinct string is one that appears exactly once in the array
4 * @param arr - Array of strings to search through
5 * @param k - The position (1-indexed) of the distinct string to return
6 * @returns The kth distinct string, or empty string if there are fewer than k distinct strings
7 */
8function kthDistinct(arr: string[], k: number): string {
9 // Map to store the frequency count of each string
10 const frequencyMap = new Map<string, number>();
11
12 // First pass: Count occurrences of each string
13 for (const str of arr) {
14 frequencyMap.set(str, (frequencyMap.get(str) || 0) + 1);
15 }
16
17 // Second pass: Find the kth distinct string (appears exactly once)
18 for (const str of arr) {
19 // Check if current string appears exactly once
20 if (frequencyMap.get(str) === 1) {
21 // Decrement k and check if we've found the kth distinct string
22 k--;
23 if (k === 0) {
24 return str;
25 }
26 }
27 }
28
29 // Return empty string if there are fewer than k distinct strings
30 return '';
31}
32
Time and Space Complexity
The time complexity is O(n)
, where n
is the number of strings in the array arr
. This is because:
- Creating the Counter takes
O(n)
time to iterate through all elements - The second loop iterates through the array once in the worst case, which is
O(n)
- Overall:
O(n) + O(n) = O(n)
The space complexity is O(n)
, where n
is the number of strings in the array arr
. This is because:
- The Counter dictionary stores at most
n
unique strings as keys - Each string reference in the Counter points to the same string objects in the original array, so we're only storing references, not duplicating the strings themselves
Note: The reference answer suggests O(L)
where L
is the total length of all strings. This would be accurate if we consider the space needed to store the actual string data. However, in Python's implementation with Counter, we're storing references to existing strings rather than creating new copies. The time complexity could also be considered O(L)
if we account for string comparison operations during hashing, as comparing strings takes time proportional to their length. Under this interpretation:
- Time complexity:
O(L)
accounting for string comparison in hash operations - Space complexity:
O(L)
accounting for the total space occupied by all unique strings
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Losing Original Order When Using Only the Counter
A common mistake is trying to iterate through the Counter object directly instead of the original array:
# WRONG APPROACH
def kthDistinct(self, arr: List[str], k: int) -> str:
frequency_map = Counter(arr)
# This loses the original order!
for string, count in frequency_map.items():
if count == 1:
k -= 1
if k == 0:
return string
return ""
Why it's wrong: The Counter object doesn't preserve the insertion order of the original array. When you iterate through frequency_map.items()
, you might get strings in a different order than they appeared in the input array.
Solution: Always iterate through the original array arr
to maintain the order of appearance, and use the Counter only for frequency lookups.
Pitfall 2: Modifying k Without Creating a Copy
While the current solution correctly modifies the parameter k
directly, in some contexts this could be problematic if you need to preserve the original value:
# Potentially problematic if k is needed elsewhere
def kthDistinct(self, arr: List[str], k: int) -> str:
frequency_map = Counter(arr)
original_k = k # If you need the original value later
for string in arr:
if frequency_map[string] == 1:
k -= 1
if k == 0:
return string
return ""
Solution: If you need to preserve the original k
value for any reason (debugging, logging, etc.), create a copy before modifying it.
Pitfall 3: Not Handling Edge Cases Properly
Developers might forget to handle cases where k <= 0
or when the array is empty:
# MORE ROBUST VERSION
def kthDistinct(self, arr: List[str], k: int) -> str:
# Handle edge cases explicitly
if not arr or k <= 0:
return ""
frequency_map = Counter(arr)
for string in arr:
if frequency_map[string] == 1:
k -= 1
if k == 0:
return string
return ""
Why it matters: While the current solution works correctly even without explicit edge case handling (it will naturally return ""
for these cases), explicitly handling them makes the code more readable and can prevent unnecessary computation for invalid inputs.
Which of the following array represent a max heap?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!