Facebook Pixel

3076. Shortest Uncommon Substring in an Array

MediumTrieArrayHash TableString
Leetcode Link

Problem Description

You are given an array arr containing n non-empty strings. Your task is to find the shortest unique substring for each string in the array.

For each string arr[i], you need to find the shortest substring that:

  • Appears in arr[i]
  • Does NOT appear as a substring in any other string in the array

If multiple substrings of the same length satisfy these conditions, choose the lexicographically smallest one (the one that would come first in dictionary order).

If no such unique substring exists for a particular string (meaning all its substrings also appear in at least one other string), then the answer for that string should be an empty string "".

The function should return an array answer of size n, where answer[i] contains the shortest unique substring for arr[i].

Example scenario:

  • If arr = ["abc", "bcd", "abcd"]
  • For "abc": We need to find its shortest substring that doesn't appear in "bcd" or "abcd"
  • For "bcd": We need to find its shortest substring that doesn't appear in "abc" or "abcd"
  • For "abcd": We need to find its shortest substring that doesn't appear in "abc" or "bcd"

The algorithm checks substrings in increasing order of length (1, 2, 3, ...) and for same-length substrings, it checks them in lexicographical order to ensure we get the shortest and lexicographically smallest valid substring.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to find the shortest substring that is unique to each string. Since we want the shortest possible substring, we should start by checking substrings of length 1, then length 2, and so on. This greedy approach ensures we find the minimum length substring first.

For each string arr[i], we systematically generate all possible substrings starting from the shortest ones. The order matters here - we first try all substrings of length 1, then all of length 2, and continue until we find a valid unique substring or exhaust all possibilities.

When generating substrings of a fixed length j, we slide a window across the string from left to right. For a string of length m, there are m - j + 1 possible starting positions for a substring of length j. This sliding window approach ensures we check all possible substrings of each length.

To determine if a substring is unique to the current string, we need to verify it doesn't appear in any other string in the array. We can do this by checking each other string and seeing if our candidate substring exists within it. If the substring appears in even one other string, it's not unique and we continue searching.

The lexicographical ordering requirement is naturally handled by our left-to-right sliding window approach. When we have multiple valid substrings of the same length, we encounter them in lexicographical order due to how we iterate through starting positions from left to right.

Once we find the first valid unique substring for a string (which will be the shortest due to our length-based iteration order), we can immediately move on to the next string. If we've checked all possible substrings and found none that are unique, we assign an empty string as the answer for that position.

Learn more about Trie patterns.

Solution Approach

The solution uses a brute force enumeration approach with three nested loops to systematically check all possible substrings for each string in the array.

Main Algorithm Structure:

  1. Initialize the answer array: Create an array ans of empty strings with the same length as the input array. Each position will store the shortest unique substring for the corresponding input string.

  2. Iterate through each string: For each string arr[i] at index i:

    • Store its length as m = len(s)
    • We'll search for the shortest unique substring for this string
  3. Enumerate substring lengths: For each possible substring length j from 1 to m:

    • This ensures we check shorter substrings before longer ones
  4. Enumerate starting positions: For each starting position l from 0 to m - j:

    • Extract the substring: sub = s[l : l + j]
    • This sliding window approach generates all substrings of length j
  5. Check uniqueness and update answer:

    • First check if we should consider this substring: if not ans[i] or ans[i] > sub
      • not ans[i]: We haven't found any valid substring yet
      • ans[i] > sub: The current substring is lexicographically smaller than our current answer
    • Then verify the substring is unique: all(k == i or sub not in t for k, t in enumerate(arr))
      • For every other string in the array (where k != i), check that sub is not a substring of it
      • If sub doesn't appear in any other string, it's unique to arr[i]
    • If both conditions are met, update ans[i] = sub
  6. Early termination: After checking all substrings of length j, if we've found a valid answer (if ans[i]), we break out of the length loop. This optimization prevents checking longer substrings once we've found the shortest valid one.

Time Complexity Analysis:

  • For each string (n strings total)
  • We check substrings of each length (up to m)
  • For each length, we have O(m) starting positions
  • For each substring, we check against all n strings
  • Overall: O(nΒ² Γ— mΒ³) where n is the number of strings and m is the maximum string length

The algorithm guarantees finding the shortest, lexicographically smallest unique substring for each string, or returns an empty string if no unique substring exists.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with arr = ["cab", "ad", "bad", "c"].

Processing "cab" (index 0):

  • Length 1 substrings: "c", "a", "b"
    • Check "c": Appears in "c" (index 3) β†’ not unique
    • Check "a": Appears in "ad" (index 1) and "bad" (index 2) β†’ not unique
    • Check "b": Appears in "bad" (index 2) β†’ not unique
  • Length 2 substrings: "ca", "ab"
    • Check "ca": Not in "ad", "bad", or "c" β†’ unique!
    • Set ans[0] = "ca" and move to next string

Processing "ad" (index 1):

  • Length 1 substrings: "a", "d"
    • Check "a": Appears in "cab" (index 0) and "bad" (index 2) β†’ not unique
    • Check "d": Appears in "bad" (index 2) β†’ not unique
  • Length 2 substrings: "ad"
    • Check "ad": Not in "cab", but appears in "bad" (index 2) β†’ not unique
  • No unique substring found β†’ ans[1] = ""

Processing "bad" (index 2):

  • Length 1 substrings: "b", "a", "d"
    • Check "b": Appears in "cab" (index 0) β†’ not unique
    • Check "a": Appears in "cab" (index 0) and "ad" (index 1) β†’ not unique
    • Check "d": Appears in "ad" (index 1) β†’ not unique
  • Length 2 substrings: "ba", "ad"
    • Check "ba": Not in "cab", "ad", or "c" β†’ unique!
    • Set ans[2] = "ba" and move to next string

Processing "c" (index 3):

  • Length 1 substrings: "c"
    • Check "c": Appears in "cab" (index 0) β†’ not unique
  • No more substrings possible β†’ ans[3] = ""

Final Result: ["ca", "", "ba", ""]

The algorithm efficiently finds the shortest unique substring for each string. For "cab", it finds "ca" at length 2. For "bad", it finds "ba" at length 2. For "ad" and "c", no unique substrings exist since all their substrings appear in other strings.

Solution Implementation

1class Solution:
2    def shortestSubstrings(self, arr: List[str]) -> List[str]:
3        """
4        Find the lexicographically smallest substring for each string that doesn't appear in any other string.
5      
6        Args:
7            arr: List of strings to process
8          
9        Returns:
10            List of shortest unique substrings for each string
11        """
12        result = [""] * len(arr)
13      
14        # Process each string in the array
15        for index, current_string in enumerate(arr):
16            string_length = len(current_string)
17          
18            # Try all possible substring lengths from shortest to longest
19            for substring_length in range(1, string_length + 1):
20              
21                # Try all possible starting positions for current substring length
22                for start_position in range(string_length - substring_length + 1):
23                    # Extract substring from start_position with given length
24                    substring = current_string[start_position : start_position + substring_length]
25                  
26                    # Check if this substring is lexicographically smaller than current answer
27                    # (empty string is considered larger than any non-empty string)
28                    if not result[index] or result[index] > substring:
29                      
30                        # Verify this substring doesn't appear in any other string in the array
31                        is_unique = all(
32                            idx == index or substring not in other_string 
33                            for idx, other_string in enumerate(arr)
34                        )
35                      
36                        if is_unique:
37                            result[index] = substring
38              
39                # If we found a valid substring for this length, no need to check longer ones
40                if result[index]:
41                    break
42      
43        return result
44
1class Solution {
2    public String[] shortestSubstrings(String[] arr) {
3        int numStrings = arr.length;
4        String[] result = new String[numStrings];
5      
6        // Initialize all result strings as empty
7        Arrays.fill(result, "");
8      
9        // Process each string in the input array
10        for (int stringIndex = 0; stringIndex < numStrings; stringIndex++) {
11            String currentString = arr[stringIndex];
12            int currentStringLength = currentString.length();
13          
14            // Try substrings of increasing length (from 1 to full string length)
15            for (int substringLength = 1; substringLength <= currentStringLength && result[stringIndex].isEmpty(); substringLength++) {
16              
17                // Try all possible starting positions for current substring length
18                for (int startPos = 0; startPos <= currentStringLength - substringLength; startPos++) {
19                    String currentSubstring = currentString.substring(startPos, startPos + substringLength);
20                  
21                    // Check if this substring is lexicographically smaller than current result (or if result is empty)
22                    if (result[stringIndex].isEmpty() || currentSubstring.compareTo(result[stringIndex]) < 0) {
23                        boolean isUnique = true;
24                      
25                        // Check if this substring appears in any other string
26                        for (int otherStringIndex = 0; otherStringIndex < numStrings && isUnique; otherStringIndex++) {
27                            if (otherStringIndex != stringIndex && arr[otherStringIndex].contains(currentSubstring)) {
28                                isUnique = false;
29                            }
30                        }
31                      
32                        // If substring is unique to current string, update result
33                        if (isUnique) {
34                            result[stringIndex] = currentSubstring;
35                        }
36                    }
37                }
38            }
39        }
40      
41        return result;
42    }
43}
44
1class Solution {
2public:
3    vector<string> shortestSubstrings(vector<string>& arr) {
4        int arraySize = arr.size();
5        vector<string> result(arraySize);
6      
7        // Process each string in the array
8        for (int stringIndex = 0; stringIndex < arraySize; ++stringIndex) {
9            int currentStringLength = arr[stringIndex].size();
10          
11            // Try substrings of increasing length until we find a valid one
12            for (int substringLength = 1; substringLength <= currentStringLength && result[stringIndex].empty(); ++substringLength) {
13              
14                // Generate all substrings of current length
15                for (int startPos = 0; startPos <= currentStringLength - substringLength; ++startPos) {
16                    string currentSubstring = arr[stringIndex].substr(startPos, substringLength);
17                  
18                    // Update result if this substring is lexicographically smaller or result is empty
19                    if (result[stringIndex].empty() || currentSubstring < result[stringIndex]) {
20                        bool isUnique = true;
21                      
22                        // Check if this substring appears in any other string
23                        for (int otherStringIndex = 0; otherStringIndex < arraySize && isUnique; ++otherStringIndex) {
24                            if (otherStringIndex != stringIndex && 
25                                arr[otherStringIndex].find(currentSubstring) != string::npos) {
26                                isUnique = false;
27                            }
28                        }
29                      
30                        // If substring is unique to current string, update result
31                        if (isUnique) {
32                            result[stringIndex] = currentSubstring;
33                        }
34                    }
35                }
36            }
37        }
38      
39        return result;
40    }
41};
42
1/**
2 * Finds the shortest unique substring for each string in the array.
3 * A unique substring is one that appears in only one string from the array.
4 * If multiple shortest unique substrings exist, returns the lexicographically smallest one.
5 * 
6 * @param arr - Array of strings to process
7 * @returns Array of shortest unique substrings for each input string
8 */
9function shortestSubstrings(arr: string[]): string[] {
10    const stringCount: number = arr.length;
11    const result: string[] = Array(stringCount).fill('');
12  
13    // Process each string in the array
14    for (let stringIndex = 0; stringIndex < stringCount; ++stringIndex) {
15        const currentStringLength: number = arr[stringIndex].length;
16      
17        // Try substrings of increasing length (starting from length 1)
18        for (let substringLength = 1; substringLength <= currentStringLength && result[stringIndex] === ''; ++substringLength) {
19          
20            // Try all possible starting positions for current substring length
21            for (let startPosition = 0; startPosition <= currentStringLength - substringLength; ++startPosition) {
22                // Extract substring from current position
23                const currentSubstring: string = arr[stringIndex].slice(startPosition, startPosition + substringLength);
24              
25                // Check if this substring is lexicographically smaller than current result (or if result is empty)
26                if (result[stringIndex] === '' || currentSubstring.localeCompare(result[stringIndex]) < 0) {
27                    let isUniqueSubstring: boolean = true;
28                  
29                    // Check if this substring appears in any other string
30                    for (let otherStringIndex = 0; otherStringIndex < stringCount && isUniqueSubstring; ++otherStringIndex) {
31                        if (otherStringIndex !== stringIndex && arr[otherStringIndex].includes(currentSubstring)) {
32                            isUniqueSubstring = false;
33                        }
34                    }
35                  
36                    // If substring is unique to current string, update result
37                    if (isUniqueSubstring) {
38                        result[stringIndex] = currentSubstring;
39                    }
40                }
41            }
42        }
43    }
44  
45    return result;
46}
47

Time and Space Complexity

The time complexity is O(n^2 Γ— m^4), where n is the length of the array arr and m is the maximum length of a string in the array.

Breaking down the time complexity:

  • The outer loop iterates through each string in arr: O(n)
  • For each string s of length m, we iterate through all possible substring lengths from 1 to m: O(m)
  • For each substring length j, we iterate through all starting positions: O(m)
  • We extract a substring using slicing s[l : l + j]: O(m)
  • The comparison ans[i] > sub for lexicographic ordering: O(m)
  • The all() function checks every string in arr to see if sub is not present: O(n)
  • The substring containment check sub not in t for each string: O(m^2) (using naive string matching)

Combining these nested operations: O(n Γ— m Γ— m Γ— (m + m + n Γ— m^2)) = O(n Γ— m^2 Γ— n Γ— m^2) = O(n^2 Γ— m^4)

The space complexity is O(n Γ— m) for storing the answer array with n strings, each potentially of length up to m. Additionally, temporary space O(m) is used for substring creation during execution. Since O(n Γ— m) dominates, the overall space complexity is O(n Γ— m).

Note: The reference answer states space complexity as O(m), which appears to only consider the temporary space for substring operations and not the output array storage.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Substring Generation Order Leading to Wrong Lexicographical Selection

The Pitfall: A common mistake is generating substrings in the wrong order within the same length. The current implementation generates substrings from left to right (incrementing start positions), which naturally produces them in lexicographical order for the same length. However, developers often make mistakes by:

  • Using a different iteration pattern that doesn't guarantee lexicographical order
  • Attempting to sort substrings afterward, which adds unnecessary complexity
  • Checking all substrings of a given length and then selecting the minimum, which is inefficient

Example of the issue:

# WRONG: Collecting all substrings then finding minimum
for substring_length in range(1, string_length + 1):
    valid_substrings = []
    for start_position in range(string_length - substring_length + 1):
        substring = current_string[start_position : start_position + substring_length]
        if is_unique(substring):
            valid_substrings.append(substring)
    if valid_substrings:
        result[index] = min(valid_substrings)  # Extra work and memory
        break

The Solution: The correct approach (as shown in the provided code) is to:

  1. Generate substrings from left to right for each length
  2. Check and update immediately when finding a better (lexicographically smaller) unique substring
  3. Use the condition if not result[index] or result[index] > substring to ensure we only update when we find a better option

This guarantees that for substrings of the same length, we naturally encounter them in lexicographical order since:

  • "abc" starting at position 0 comes before "bcd" starting at position 1
  • We check and potentially update immediately, avoiding the need to store and sort

2. Forgetting Early Termination After Finding Valid Substring

The Pitfall: Without the break statement after finding a valid substring for a given length, the algorithm would continue checking longer substrings unnecessarily. This not only affects performance but could also lead to incorrect results if the logic for updating the answer isn't carefully managed.

Example of the issue:

# WRONG: Missing break statement
for substring_length in range(1, string_length + 1):
    for start_position in range(string_length - substring_length + 1):
        substring = current_string[start_position : start_position + substring_length]
        if not result[index] or result[index] > substring:
            if is_unique:
                result[index] = substring
    # Missing break here - will continue to check longer substrings

The Solution: Always include a break statement after the inner loop if a valid substring was found for the current length. This ensures we return the shortest possible unique substring and improves performance significantly.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which technique can we use to find the middle of a linked list?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More