1408. String Matching in an Array
Problem Description
You are given an array of strings called words
. Your task is to find all strings in this array that appear as a substring within at least one other string in the same array.
A substring is a contiguous sequence of characters within a string. For example, "abc" is a substring of "zabcd", but "acd" is not.
The goal is to return a list containing all strings from words
that can be found inside another string from the same array. The strings in your result can be returned in any order.
For example, if words = ["mass", "as", "hero", "superhero"]
:
- "as" is a substring of "mass"
- "hero" is a substring of "superhero"
- So the output would be
["as", "hero"]
The solution iterates through each word in the array and checks if it appears as a substring in any other word (excluding itself). If a word is found to be a substring of another word, it gets added to the answer list.
The implementation uses a nested approach where for each word at index i
, it checks all other words at different indices j
(where i != j
) to see if the current word appears within them using Python's in
operator for substring checking.
Intuition
The key insight here is that we need to identify which strings are "contained" within other strings. Since we're looking for substrings, the most straightforward approach is to check each string against all other strings in the array.
Think of it this way: for each word, we want to answer the question "Does this word appear inside any other word in the array?" If the answer is yes, then this word should be included in our result.
The natural approach that comes to mind is:
- Pick a word from the array
- Check if this word appears as a substring in any of the other words
- If it does, add it to our answer
Why does this work? Because we're essentially performing an exhaustive search - we're not missing any potential substring relationships by checking every possible pair.
The condition i != j
is crucial because we don't want to compare a word with itself (a word is always a substring of itself, but that's not what the problem is asking for).
Using Python's in
operator makes the substring check elegant and simple - s in t
directly tells us if string s
is a substring of string t
. The any()
function helps us determine if at least one other word contains our current word as a substring, which is exactly what we need to decide whether to include it in our answer.
This brute force approach works well for this problem because we need to examine all possible pairs anyway to ensure we don't miss any substring relationships.
Solution Approach
Following the brute force enumeration strategy, we iterate through each string in the words
array and check if it appears as a substring in any other string.
Here's the step-by-step implementation:
-
Initialize an empty result list: We create
ans = []
to store all strings that are substrings of other strings. -
Enumerate through all strings: We use
enumerate(words)
to get both the indexi
and the strings
for each word in the array. The index is crucial to avoid comparing a string with itself. -
Check substring relationship: For each string
s
at indexi
, we need to check if it appears in any other string in the array. This is accomplished using:any(i != j and s in t for j, t in enumerate(words))
This expression does the following:
- Iterates through all strings again with index
j
and stringt
- Checks two conditions:
i != j
: Ensures we don't compare the string with itselfs in t
: Checks if strings
is a substring of stringt
- The
any()
function returnsTrue
if at least one stringt
(wherej != i
) containss
as a substring
- Iterates through all strings again with index
-
Add to result: If the
any()
function returnsTrue
, meaning the current strings
is found as a substring in at least one other string, we append it to our answer list:ans.append(s)
-
Return the result: After checking all strings, we return the
ans
list containing all strings that are substrings of other strings.
The time complexity is O(n² × m)
where n
is the number of words and m
is the average length of the strings, since we check each pair of strings and the substring check takes O(m)
time. The space complexity is O(k)
where k
is the number of strings that are substrings of others.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with words = ["abc", "ab", "c", "abcd"]
.
Step 1: Initialize result list
ans = []
Step 2: Check first word "abc" (index 0)
- Compare with "ab" (index 1): Is "abc" in "ab"? No
- Compare with "c" (index 2): Is "abc" in "c"? No
- Compare with "abcd" (index 3): Is "abc" in "abcd"? Yes!
- Since we found a match, add "abc" to result:
ans = ["abc"]
Step 3: Check second word "ab" (index 1)
- Compare with "abc" (index 0): Is "ab" in "abc"? Yes!
- Since we found a match, add "ab" to result:
ans = ["abc", "ab"]
Step 4: Check third word "c" (index 2)
- Compare with "abc" (index 0): Is "c" in "abc"? Yes!
- Since we found a match, add "c" to result:
ans = ["abc", "ab", "c"]
Step 5: Check fourth word "abcd" (index 3)
- Compare with "abc" (index 0): Is "abcd" in "abc"? No
- Compare with "ab" (index 1): Is "abcd" in "ab"? No
- Compare with "c" (index 2): Is "abcd" in "c"? No
- No matches found, so "abcd" is not added to result
Final Result: ["abc", "ab", "c"]
All three strings "abc", "ab", and "c" appear as substrings in at least one other string in the array. The string "abcd" is the only one that doesn't appear as a substring in any other string.
Solution Implementation
1class Solution:
2 def stringMatching(self, words: List[str]) -> List[str]:
3 """
4 Find all strings in the array that are substrings of another string.
5
6 Args:
7 words: List of strings to check
8
9 Returns:
10 List of strings that are substrings of at least one other string in the array
11 """
12 result = []
13
14 # Iterate through each word with its index
15 for current_index, current_word in enumerate(words):
16 # Check if current word is a substring of any other word in the list
17 is_substring = False
18 for other_index, other_word in enumerate(words):
19 # Skip comparing the word with itself
20 if current_index != other_index:
21 # Check if current word is a substring of other word
22 if current_word in other_word:
23 is_substring = True
24 break
25
26 # If current word is a substring of at least one other word, add to result
27 if is_substring:
28 result.append(current_word)
29
30 return result
31
1class Solution {
2 /**
3 * Find all strings that are substrings of another string in the array.
4 *
5 * @param words Array of strings to check
6 * @return List of strings that appear as substrings in other strings
7 */
8 public List<String> stringMatching(String[] words) {
9 // Initialize result list to store substring matches
10 List<String> result = new ArrayList<>();
11
12 // Get the total number of words in the array
13 int wordCount = words.length;
14
15 // Check each word to see if it's a substring of any other word
16 for (int currentIndex = 0; currentIndex < wordCount; ++currentIndex) {
17 // Compare current word with all other words in the array
18 for (int compareIndex = 0; compareIndex < wordCount; ++compareIndex) {
19 // Skip if comparing the same word with itself
20 // Check if current word is a substring of the compared word
21 if (currentIndex != compareIndex && words[compareIndex].contains(words[currentIndex])) {
22 // Add to result if it's a substring of another word
23 result.add(words[currentIndex]);
24 // Break inner loop since we only need to find one match
25 break;
26 }
27 }
28 }
29
30 return result;
31 }
32}
33
1class Solution {
2public:
3 vector<string> stringMatching(vector<string>& words) {
4 // Result vector to store all substring words
5 vector<string> result;
6
7 // Get the total number of words in the input vector
8 int wordCount = words.size();
9
10 // Iterate through each word as a potential substring
11 for (int i = 0; i < wordCount; ++i) {
12 // Check if current word is a substring of any other word
13 for (int j = 0; j < wordCount; ++j) {
14 // Skip if comparing the same word with itself
15 // Check if words[i] is a substring of words[j]
16 if (i != j && words[j].find(words[i]) != string::npos) {
17 // Found words[i] as substring, add to result
18 result.push_back(words[i]);
19 // Break inner loop as we only need to add once
20 break;
21 }
22 }
23 }
24
25 return result;
26 }
27};
28
1/**
2 * Finds all strings in the array that are substrings of another string in the array
3 * @param words - Array of strings to check
4 * @returns Array of strings that are substrings of other strings in the input array
5 */
6function stringMatching(words: string[]): string[] {
7 // Initialize result array to store substring matches
8 const result: string[] = [];
9
10 // Get the total number of words
11 const wordCount: number = words.length;
12
13 // Iterate through each word as a potential substring
14 for (let i = 0; i < wordCount; ++i) {
15 // Check if current word is a substring of any other word
16 for (let j = 0; j < wordCount; ++j) {
17 // Skip if comparing the same word with itself
18 // Add to result if current word is found within another word
19 if (i !== j && words[j].includes(words[i])) {
20 result.push(words[i]);
21 // Break inner loop once a match is found to avoid duplicates
22 break;
23 }
24 }
25 }
26
27 return result;
28}
29
Time and Space Complexity
Time Complexity: O(n² × m)
where n
is the number of words and m
is the average length of the words.
The outer loop iterates through all n
words. For each word, the any()
function checks against all other words (up to n-1
comparisons). Each substring check using the in
operator takes O(m)
time in the worst case, where m
is the average length of the strings being compared. Therefore, the total time complexity is O(n × n × m) = O(n² × m)
.
Since the reference answer states O(n³)
, this suggests that n
represents the total length of all strings combined (or treats the length of individual strings as proportional to n
). In this interpretation, if we consider n
as the total character count across all words, then the number of words would be at most n
, and each substring operation would be O(n)
, giving us O(n × n × n) = O(n³)
.
Space Complexity: O(n)
where n
is either the number of words or the total length of the output list.
The space is primarily used by the ans
list to store the results. In the worst case, all words except one could be substrings of another word, so the output list could contain up to n-1
elements. The enumerate function and loop variables use O(1)
additional space. Therefore, the space complexity is O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Not Handling Duplicate Strings in the Array
One common pitfall is when the input array contains duplicate strings. If we have words = ["abc", "abc", "xyzabc"]
, a naive implementation might incorrectly identify "abc" as a substring of itself (the other "abc" in the array).
Why this happens: When checking if a string is a substring of others, we only verify i != j
to avoid self-comparison. However, if the same string appears multiple times at different indices, the condition i != j
would pass, and "abc" would match with its duplicate.
Solution: While the current implementation actually handles this correctly (since we want to find strings that appear as substrings in any other string, including duplicates), you should be aware that:
- If duplicates exist and a string only appears as a substring of its own duplicates, it will still be included in the result
- If you need to exclude such cases, you'd need to deduplicate the array first or add additional logic
2. Inefficient Repeated Substring Checks
The current implementation might check the same substring relationship multiple times unnecessarily, especially when dealing with larger arrays.
Why this happens: The nested loop structure checks every pair of strings, which can be redundant when we've already found that a string is a substring of another.
Solution: Use early termination with a break
statement (as shown in the provided code) or use a set to track already-found substrings:
def stringMatching(self, words: List[str]) -> List[str]:
result = set() # Use set to avoid duplicates
for i in range(len(words)):
for j in range(len(words)):
if i != j and words[i] in words[j]:
result.add(words[i])
break # Early termination once found
return list(result)
3. Not Considering Empty Strings or Single Character Edge Cases
The solution might behave unexpectedly with edge cases like empty strings ["", "abc", "def"]
or single characters ["a", "aa", "aaa"]
.
Why this happens: An empty string is technically a substring of every string, and single characters might have special substring relationships that aren't immediately obvious.
Solution: Add validation for edge cases if needed:
def stringMatching(self, words: List[str]) -> List[str]:
result = []
for i, current_word in enumerate(words):
# Skip empty strings if needed
if not current_word:
continue
for j, other_word in enumerate(words):
if i != j and current_word in other_word:
result.append(current_word)
break
return result
Consider the classic dynamic programming of longest increasing subsequence:
Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.
For example, the length of LIS for [50, 3, 10, 7, 40, 80]
is 4
and LIS is
[3, 7, 40, 80]
.
What is the recurrence relation?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!