3076. Shortest Uncommon Substring in an Array
Problem Description
You are provided with an array called arr
where each element is a non-empty string. The size of the array is denoted as n
. The goal is to identify for each element in arr
, the shortest substring that does not appear in any other strings within the array. If there are several such substrings, the lexicographically smallest one should be chosen. If no such unique substring exists for an element, an empty string should be the output for that element.
The result should be returned as an array answer
with the same size as arr
, where answer[i]
contains the shortest unique substring for arr[i]
.
To grasp what is being asked, one needs to understand basic string operations:
- A substring is a contiguous sequence of characters within a string.
- To be lexicographically smaller means to come first in alphabetical order; for example, 'abc' is lexicographically smaller than 'acd'.
Thus, the problem is essentially about finding a part of each string that is distinctive to that string alone when compared to the entire array of strings.
Intuition
The direct approach to this problem is to enumerate substrings of each string systematically and check their uniqueness. To achieve this, for every string arr[i]
in the input array, we generate all possible substrings beginning with the shortest length (1 character) and increasing in size. For each substring sub
of arr[i]
, we check whether sub
is contained in any other string in the array.
We have to do this methodically:
-
Fixed Length Enumeration: Start with the shortest possible substrings and increase their length incrementally. This ensures that the first unique substring we find will also be the shortest for
arr[i]
. -
Checking Uniqueness: For each potential substring, verify that it does not occur in any other strings except
arr[i]
. If it is unique, we immediately stop the search for this string and move to the next one since this is the shortest unique substring forarr[i]
. -
Lexicographical Order: By starting with the shortest substrings and examining each possible substring from left to right, we guarantee that the first match will be lexicographically the smallest since we are essentially exploring them in "dictionary" order.
Implementing this is quite straightforward, through nested loops, as reflected in the provided Python code. In the first loop, we go through each string arr[i]
, then we enumerate substring lengths j
, and for each length, we slide over the string with another loop starting at position l
to get sub
. With a few if-conditions and an all-encompassing loop, we ensure all conditions are met before setting the unique substring into our answer array.
It is worth noting that this approach is not the most efficient in terms of time complexity, especially for large arrays or strings. However, the problem statement seems to suggest that the array and strings are of reasonable length, allowing for a brute-force enumeration to be a viable solution.
Learn more about Trie patterns.
Solution Approach
The solution to this problem uses brute force enumeration, which means trying out all the possibilities systematically until the correct answer is found. In terms of algorithmic patterns, this is a straightforward approach without involving any advanced data structures or optimizations. The code executes a series of nested loops to examine each substring and verify its uniqueness among all strings in the input array.
Here's how the algorithm is implemented step by step:
-
Initialize an answer list
ans
with empty strings for each element inarr
. This list will eventually hold the shortest unique substrings for each string inarr
. -
Start with the outermost loop that goes through each string
s
inarr
using its indexi
. This allows us to keep track of the current string and its corresponding position in the answer array. -
For each string
s
, determine its lengthm
. Enumerate the lengthj
of all possible substrings starting from 1 up tom
inclusive. This is done using a loop for the length of the substring, ensuring we start with the shortest length and move to the longer ones. -
For each substring length
j
, use another loop to iterate over all possible starting positionsl
in strings
. A substringsub
is formed from strings
starting at positionl
and spanningj
characters. -
Within the innermost loop, check the following conditions for each
sub
:- If
sub
is lexigraphically smaller than the current answer fors
inans[i]
or ifans[i]
is still an empty string, proceed to the next condition. - Verify that
sub
is not a substring of any other stringt
inarr
, except the current strings
. Use the built-in Pythonnot in
operator to check ifsub
is not part of stringt
.
- If
-
If
sub
passes these conditions, it meanssub
is unique and lexicographically smaller than any previously considered valid substrings. Therefore, updateans[i]
with thissub
. -
Once a valid
sub
is found andans[i]
is updated, there is no need to consider longer substrings for the current strings
. So, break the loop that is iterating over the lengthj
to move to the next strings
inarr
. -
After all strings have been processed, the answer list
ans
is complete with the shortest unique substrings and is returned.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Consider the array arr = ["abc", "bca", "dac"]
.
-
Initialize an empty answer list
ans
with the same size asarr
. In this case,ans = ["", "", ""]
. -
Start an outer loop over the elements of
arr
. We first look atarr[0]
which is "abc". -
For string "abc", with length
m = 3
, consider all possible substrings. We start with lengthj = 1
. -
Investigate all starting positions
l
of substrings of "abc" with lengthj = 1
:- The first substring is "a". It appears in "dac" as well, so it is not unique.
- The next substring is "b", which also appears in "bca".
- The last substring is "c", which appears in "bca" as well.
-
Increase
j
to2
since no unique substring was found with length1
. Check all substrings withj = 2
:- Substring "ab" is not in "bca" or "dac", so it is unique.
- Even though we could check "bc", we found the shortest and lexicographically smallest unique substring "ab". Update
ans[0]
to "ab".
-
Move to the next string in
arr
, which is "bca". -
Repeat the same process: check all unique substrings of "bca" starting with
j = 1
.- "b" isn't unique as it appears in "abc"; same for "c".
- "a" does not appear in the other strings, so it is unique.
-
Update
ans[1]
to "a" since we found the shortest unique substring for the second string inarr
. -
Apply the same approach to "dac":
- "d" is unique since it doesn't appear in "abc" or "bca".
-
Now we can update
ans[2]
to "d".
The process is done, and we've populated our answer list: ans = ["ab", "a", "d"]
which contains, for each string in arr
, the shortest unique substring that doesn't appear in any other string in arr
.
This example has demonstrated the effectiveness of the brute force enumeration approach in finding the shortest unique substrings.
Solution Implementation
1from typing import List
2
3class Solution:
4 def shortestSubstrings(self, strings: List[str]) -> List[str]:
5 # Initialize the answer list with empty strings for each input string
6 shortest_substrings = [""] * len(strings)
7
8 # Iterate over each string and its index in the list
9 for index, string in enumerate(strings):
10 string_length = len(string)
11
12 # Iterate over all possible substring lengths
13 for length in range(1, string_length + 1):
14 # Iterate over each starting position for the substring
15 for start_pos in range(string_length - length + 1):
16 # Extract the substring
17 substring = string[start_pos : start_pos + length]
18
19 # Check if this substring is a valid candidate:
20 # - Either it's the first such substring found or
21 # - It's lexicographically smaller than the previously found substring
22 # - And it's not a substring of any of the other input strings
23 if not shortest_substrings[index] or shortest_substrings[index] > substring:
24 if all(substring not in other_string for other_index, other_string in enumerate(strings) if other_index != index):
25 shortest_substrings[index] = substring
26
27 # Move onto the next string if we found a valid substring
28 if shortest_substrings[index]:
29 break
30
31 # Return the list of shortest substrings
32 return shortest_substrings
33
1class Solution {
2
3 /**
4 * Finds the shortest substring for each string in an array
5 * such that the substring is not present in any other string in the array.
6 *
7 * @param strings An array of strings to find substrings from.
8 * @return An array containing the shortest unique substrings.
9 */
10 public String[] shortestSubstrings(String[] strings) {
11 int arrayLength = strings.length;
12 String[] result = new String[arrayLength];
13 // Initialize result array with empty strings
14 Arrays.fill(result, "");
15
16 // Iterate through each string in the array
17 for (int i = 0; i < arrayLength; ++i) {
18 int stringLength = strings[i].length();
19
20 // Try all possible substring lengths starting from 1
21 for (int len = 1; len <= stringLength && result[i].isEmpty(); ++len) {
22 // Try all possible starting positions for the substring of length 'len'
23 for (int start = 0; start <= stringLength - len; ++start) {
24 // Extract the substring
25 String substr = strings[i].substring(start, start + len);
26
27 // Check if we already found a shorter unique substring
28 if (result[i].isEmpty() || substr.compareTo(result[i]) < 0) {
29 boolean isUnique = true;
30
31 // Check if the substring exists in any other string
32 for (int k = 0; k < arrayLength && isUnique; ++k) {
33 if (k != i && strings[k].contains(substr)) {
34 isUnique = false;
35 }
36 }
37
38 // If the substring is unique, update the result
39 if (isUnique) {
40 result[i] = substr;
41 }
42 }
43 }
44 }
45 }
46 return result;
47 }
48}
49
1#include <vector>
2#include <string>
3
4class Solution {
5public:
6 vector<string> shortestSubstrings(vector<string>& strings) {
7 int numberOfStrings = strings.size(); // Get the number of strings in the array
8 vector<string> answers(numberOfStrings); // Initialize the answer vector with the same size
9
10 // Loop through each string in the array of strings
11 for (int i = 0; i < numberOfStrings; ++i) {
12 int stringLength = strings[i].size(); // Get the length of the current string
13
14 // Iterate over all possible substring lengths starting from 1 to the size of the string
15 for (int length = 1; length <= stringLength && answers[i].empty(); ++length) {
16
17 // Try all substrings of the current length
18 for (int startPosition = 0; startPosition <= stringLength - length; ++startPosition) {
19 string substring = strings[i].substr(startPosition, length); // Extract the substring
20
21 // If the answer for the current string is not found or the current substring is lexicographically smaller
22 if (answers[i].empty() || substring < answers[i]) {
23 bool isUnique = true; // Flag to check if the substring is unique among all strings
24
25 // Check if the substring appears in any other string
26 for (int k = 0; k < numberOfStrings && isUnique; ++k) {
27 if (k != i && strings[k].find(substring) != string::npos) {
28 isUnique = false; // The substring is not unique as it is found in another string
29 }
30 }
31
32 // If the substring is unique, update the answer for the current string
33 if (isUnique) {
34 answers[i] = substring;
35 }
36 }
37 }
38 }
39 }
40
41 return answers; // Return the vector containing the shortest unique substrings
42 }
43};
44
1function shortestSubstrings(arr: string[]): string[] {
2 const arrayLength: number = arr.length; // Length of the input array.
3 const result: string[] = Array(arrayLength).fill(''); // Placeholder for the shortest unique substrings.
4
5 // Iterate over each string in the input array.
6 for (let i = 0; i < arrayLength; ++i) {
7 const stringLength: number = arr[i].length; // Length of the current string.
8
9 // Iterate over all possible substring lengths.
10 for (let currentLength = 1; currentLength <= stringLength && result[i] === ''; ++currentLength) {
11 // Iterate over all possible starting indices for the current substring length.
12 for (let startIdx = 0; startIdx <= stringLength - currentLength; ++startIdx) {
13 const substring: string = arr[i].slice(startIdx, startIdx + currentLength); // Extract the substring.
14
15 // If we haven't found a substring yet or the current one is lexicographically smaller.
16 if (result[i] === '' || substring.localeCompare(result[i]) < 0) {
17 let isUnique: boolean = true; // Flag to check if the substring is unique.
18
19 // Check the uniqueness of the substring across all strings in the array.
20 for (let k = 0; k < arrayLength && isUnique; ++k) {
21 if (k !== i && arr[k].includes(substring)) {
22 isUnique = false; // Substring is not unique.
23 }
24 }
25
26 // If the substring is unique, set it as the result for the current string.
27 if (isUnique) {
28 result[i] = substring;
29 }
30 }
31 }
32 }
33 }
34
35 return result; // Return the array containing the shortest unique substrings.
36}
37
Time and Space Complexity
Time Complexity
The time complexity of the provided code can be elaborated as follows:
-
We have
n
strings in our arrayarr
, wheren
is the length of the array. -
For each string
s
of maximum lengthm
, we are looking at all possible substrings. The total number of such substrings that can be formed from a string of lengthm
ism * (m + 1) / 2
, which is derived from the sum of the series1 + 2 + 3 + ... + m
. -
For each substring
sub
, we check if it exists in any of the other strings withn - 1
comparisons and each comparison could take up tom
operations in the worst case. -
Combining the above points, we have
n
(for each string) timesm * (m + 1) / 2
(for all substrings in a particular string) timesn - 1
(for checking each substring against all other strings) timesm
(for the comparison operation), yielding a worst case complexity ofO(n * (m^2) * (n-1) * m) => O(n^2 * m^4)
.
Space Complexity
The space complexity can be determined as follows:
-
The code uses an extra list
ans
to store the shortest unique substrings, which will be the same size as the input arrayarr
. Hence, the space taken by this list isO(n)
. -
Within each loop iteration, a temporary substring
sub
is created, which, in the worst case, has the lengthm
. Since these strings are not stored together and only exist temporarily one at a time, the space for this isO(m)
. -
Considering the fact that the space required for storing the input array
arr
is not included in the space complexity of the algorithm (as it is given), the significant extra space used is for theans
list and the temporary substrings, leading to a total space complexity ofO(n + m)
, which isO(m)
since the problem constraints guaranteem <= 20
, and thusm
will not exceedn
.
Therefore, given that n
is the length of the input array of strings and m
is the maximum length of a single string within the array (m <= 20
), the time complexity of the code is O(n^2 * m^4)
and the space complexity is O(m)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following uses divide and conquer strategy?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!