1813. Sentence Similarity III
Problem Description
You are given two strings sentence1
and sentence2
, each representing a sentence composed of words. A sentence is a list of words that are separated by a single space with no leading or trailing spaces. Each word consists of only uppercase and lowercase English characters.
Two sentences are considered similar if you can insert an arbitrary sentence (possibly empty) inside one of these sentences to make them equal. The key requirement is that the inserted sentence must be separated from existing words by spaces.
For example:
sentence1 = "Hello Jane"
andsentence2 = "Hello my name is Jane"
are similar because you can insert"my name is"
between"Hello"
and"Jane"
in sentence1 to make them equal.sentence1 = "Frog cool"
andsentence2 = "Frogs are cool"
are not similar. Even though you could think of inserting"s are"
into sentence1, this would require modifying the word"Frog"
to"Frogs"
, which violates the rule that inserted text must be separated by spaces.
The task is to determine if sentence1
and sentence2
are similar according to this definition. Return true
if they are similar, otherwise return false
.
The solution uses a two-pointer approach. After splitting both sentences into word arrays, it checks how many words match from the beginning and from the end of both sentences. If the total number of matching words from both ends covers the entire shorter sentence, then the sentences are similar (since the remaining words in the longer sentence could be the "inserted" portion).
Intuition
The key insight is that if two sentences are similar, one sentence can be obtained from the other by inserting a continuous block of words at some position. This means the beginning and/or ending portions of both sentences must match.
Think about it this way: if we can insert words into the shorter sentence to make it equal to the longer sentence, then the shorter sentence must be a "subsequence" of the longer sentence where the matching parts are either at the beginning, at the end, or split between both ends.
For instance, if we have "A B C D E"
and "A B E"
, we can see that "A B"
matches from the start and "E"
matches from the end. The words "C D"
in the middle of the longer sentence are what we would need to insert into the shorter sentence between "B"
and "E"
.
This leads us to a two-pointer strategy:
- Count how many words match from the beginning of both sentences (let's call this
i
) - Count how many words match from the end of both sentences (let's call this
j
) - If
i + j >= n
(wheren
is the length of the shorter sentence), then the sentences are similar
Why does i + j >= n
work? Because if the sum of matching words from both ends covers or exceeds the total words in the shorter sentence, it means all words in the shorter sentence have a corresponding match in the longer sentence, with possibly some words in between that could be the "inserted" portion. The >=
handles the case where the matching portions might overlap, which would happen when the sentences are identical or when the insertion point is at the very beginning or end.
Learn more about Two Pointers patterns.
Solution Approach
The implementation follows a two-pointer approach to check if the sentences are similar:
Step 1: Parse and Prepare
- Split both sentences into word arrays using
split()
:words1
andwords2
- Get the lengths
m
andn
of the two word arrays - To simplify the logic, ensure
words1
is always the longer array by swapping if necessary (ifm < n
, swap the arrays and their lengths)
Step 2: Match from the Beginning
- Initialize pointer
i = 0
- While
i < n
andwords1[i] == words2[i]
, incrementi
- This counts how many words match from the start of both sentences
Step 3: Match from the End
- Initialize pointer
j = 0
- While
j < n
andwords1[m - 1 - j] == words2[n - 1 - j]
, incrementj
- This counts how many words match from the end of both sentences
- Note: We use
m - 1 - j
andn - 1 - j
to access elements from the end
Step 4: Check Similarity Condition
- Return
i + j >= n
- If the total number of matching words from both ends covers the entire shorter sentence, then the sentences are similar
- The
>=
handles cases where the matching portions might overlap (when sentences are identical or when the insertion is at the very beginning or end)
Example Walkthrough:
sentence1 = "Hello Jane"
,sentence2 = "Hello my name is Jane"
- After splitting:
words1 = ["Hello", "Jane"]
,words2 = ["Hello", "my", "name", "is", "Jane"]
- After ensuring longer array:
words1 = ["Hello", "my", "name", "is", "Jane"]
,words2 = ["Hello", "Jane"]
- Matching from start:
i = 1
(only "Hello" matches) - Matching from end:
j = 1
(only "Jane" matches) - Check:
i + j = 2 >= 2
(length of shorter sentence), so returntrue
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example to understand how the solution works.
Example:
sentence1 = "My name is Haley"
sentence2 = "My Haley"
Step 1: Parse and Prepare
- Split into words:
words1 = ["My", "name", "is", "Haley"]
(length m = 4)words2 = ["My", "Haley"]
(length n = 2)
- Since m > n, no swap needed (words1 remains the longer array)
Step 2: Match from the Beginning
- Start with
i = 0
- Compare
words1[0]
withwords2[0]
: "My" == "My" β - Increment i to 1
- Compare
words1[1]
withwords2[1]
: "name" != "Haley" β - Stop here, so
i = 1
Step 3: Match from the End
- Start with
j = 0
- Compare
words1[3]
withwords2[1]
: "Haley" == "Haley" β - Increment j to 1
- Compare
words1[2]
withwords2[0]
: "is" != "My" β - Stop here, so
j = 1
Step 4: Check Similarity
- Total matches:
i + j = 1 + 1 = 2
- Length of shorter sentence:
n = 2
- Since
2 >= 2
, returntrue
This makes sense because we can insert "name is" between "My" and "Haley" in sentence2 to get sentence1. The algorithm correctly identifies that all words in the shorter sentence have matching positions in the longer sentence.
Solution Implementation
1class Solution:
2 def areSentencesSimilar(self, sentence1: str, sentence2: str) -> bool:
3 """
4 Check if two sentences are similar by determining if one can be obtained
5 from the other by inserting a sentence in the middle.
6
7 Args:
8 sentence1: First sentence as a string
9 sentence2: Second sentence as a string
10
11 Returns:
12 True if sentences are similar, False otherwise
13 """
14 # Split sentences into word lists
15 words1 = sentence1.split()
16 words2 = sentence2.split()
17
18 # Get lengths of both word lists
19 len1 = len(words1)
20 len2 = len(words2)
21
22 # Ensure words1 is the longer list for easier comparison
23 if len1 < len2:
24 words1, words2 = words2, words1
25 len1, len2 = len2, len1
26
27 # Count matching words from the beginning (prefix)
28 prefix_match_count = 0
29 while prefix_match_count < len2 and words1[prefix_match_count] == words2[prefix_match_count]:
30 prefix_match_count += 1
31
32 # Count matching words from the end (suffix)
33 suffix_match_count = 0
34 while suffix_match_count < len2 and words1[len1 - 1 - suffix_match_count] == words2[len2 - 1 - suffix_match_count]:
35 suffix_match_count += 1
36
37 # Check if the total matches cover the entire shorter sentence
38 # This means the shorter sentence can be formed by removing middle part of longer sentence
39 return prefix_match_count + suffix_match_count >= len2
40
1class Solution {
2 public boolean areSentencesSimilar(String sentence1, String sentence2) {
3 // Split both sentences into word arrays
4 String[] wordsArray1 = sentence1.split(" ");
5 String[] wordsArray2 = sentence2.split(" ");
6
7 // Ensure wordsArray1 is the longer array for consistent comparison
8 if (wordsArray1.length < wordsArray2.length) {
9 String[] temp = wordsArray1;
10 wordsArray1 = wordsArray2;
11 wordsArray2 = temp;
12 }
13
14 // Store lengths for easier access
15 int longerLength = wordsArray1.length;
16 int shorterLength = wordsArray2.length;
17
18 // Count matching words from the beginning (prefix)
19 int prefixMatchCount = 0;
20 while (prefixMatchCount < shorterLength &&
21 wordsArray1[prefixMatchCount].equals(wordsArray2[prefixMatchCount])) {
22 prefixMatchCount++;
23 }
24
25 // Count matching words from the end (suffix)
26 int suffixMatchCount = 0;
27 while (suffixMatchCount < shorterLength &&
28 wordsArray1[longerLength - 1 - suffixMatchCount].equals(
29 wordsArray2[shorterLength - 1 - suffixMatchCount])) {
30 suffixMatchCount++;
31 }
32
33 // Check if the total matched words cover the entire shorter sentence
34 // This means a sentence can be inserted in the middle of the longer one
35 return prefixMatchCount + suffixMatchCount >= shorterLength;
36 }
37}
38
1class Solution {
2public:
3 bool areSentencesSimilar(string sentence1, string sentence2) {
4 // Split both sentences into word vectors
5 vector<string> words1 = splitString(sentence1, ' ');
6 vector<string> words2 = splitString(sentence2, ' ');
7
8 // Ensure words1 is the longer sentence for easier comparison
9 if (words1.size() < words2.size()) {
10 swap(words1, words2);
11 }
12
13 int longerSize = words1.size();
14 int shorterSize = words2.size();
15
16 // Count matching words from the beginning
17 int prefixMatchCount = 0;
18 while (prefixMatchCount < shorterSize &&
19 words1[prefixMatchCount] == words2[prefixMatchCount]) {
20 ++prefixMatchCount;
21 }
22
23 // Count matching words from the end
24 int suffixMatchCount = 0;
25 while (suffixMatchCount < shorterSize &&
26 words1[longerSize - 1 - suffixMatchCount] == words2[shorterSize - 1 - suffixMatchCount]) {
27 ++suffixMatchCount;
28 }
29
30 // Check if the shorter sentence can be formed by inserting a sentence
31 // in the middle of itself to get the longer sentence
32 // This happens when prefix + suffix matches cover all words in shorter sentence
33 return prefixMatchCount + suffixMatchCount >= shorterSize;
34 }
35
36private:
37 // Helper function to split a string by delimiter into a vector of words
38 vector<string> splitString(const string& str, char delimiter) {
39 stringstream stringStream(str);
40 string word;
41 vector<string> result;
42
43 // Extract words separated by the delimiter
44 while (getline(stringStream, word, delimiter)) {
45 result.emplace_back(word);
46 }
47
48 return result;
49 }
50};
51
1/**
2 * Determines if two sentences are similar by checking if one can be made equal to the other
3 * by inserting a sentence (possibly empty) in the middle.
4 *
5 * @param sentence1 - First sentence string
6 * @param sentence2 - Second sentence string
7 * @returns true if sentences are similar, false otherwise
8 */
9function areSentencesSimilar(sentence1: string, sentence2: string): boolean {
10 // Split sentences into word arrays
11 const words1: string[] = sentence1.split(' ');
12 const words2: string[] = sentence2.split(' ');
13
14 // Get the lengths of both word arrays
15 const length1: number = words1.length;
16 const length2: number = words2.length;
17
18 // Ensure words1 is the shorter array for consistent processing
19 if (length1 > length2) {
20 return areSentencesSimilar(sentence2, sentence1);
21 }
22
23 // Initialize pointers for matching from left and right
24 let leftMatchCount: number = 0;
25 let rightMatchCount: number = 0;
26
27 // Iterate through the longer sentence to find matching prefixes and suffixes
28 for (let i = 0; i < length2; i++) {
29 // Count matching words from the left side
30 if (leftMatchCount === i && words1[i] === words2[i]) {
31 leftMatchCount++;
32 }
33
34 // Count matching words from the right side
35 if (rightMatchCount === i && words2[length2 - i - 1] === words1[length1 - rightMatchCount - 1]) {
36 rightMatchCount++;
37 }
38 }
39
40 // Check if the total matched words cover the entire shorter sentence
41 return leftMatchCount + rightMatchCount >= length1;
42}
43
Time and Space Complexity
The time complexity is O(L)
, where L
is the sum of the lengths of the two sentences. This is because:
- Splitting both sentences takes
O(L)
time as we need to traverse through all characters to identify word boundaries - The two while loops together iterate through at most
n
words (the length of the shorter word list), and comparing words involves character-by-character comparison - In the worst case, we compare all characters in the sentences, which is bounded by
O(L)
The space complexity is O(L)
, where L
is the sum of the lengths of the two sentences. This is because:
- The
split()
operation creates two new listswords1
andwords2
that store all the words from both sentences - The total space needed to store these word lists is proportional to the total number of characters in both sentences
- The additional variables
m
,n
,i
, andj
only useO(1)
extra space
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Overcounting When Prefix and Suffix Overlap
The Problem:
When the prefix and suffix matching regions overlap, we might count the same words twice. For example, if sentence1 = "A B C"
and sentence2 = "A B C"
, the prefix match would count all 3 words, and the suffix match would also count all 3 words, giving us prefix_match_count + suffix_match_count = 6
, which is greater than the actual length of 3.
Why This Isn't Actually a Problem:
The current solution handles this correctly by checking prefix_match_count + suffix_match_count >= len2
. The >=
operator ensures that even if we overcount due to overlap, the condition still evaluates correctly. When sentences are identical or when the insertion point is at the very beginning or end, overlap is expected and handled properly.
Pitfall 2: Not Handling Empty Sentences
The Problem: If either sentence is empty or contains only whitespace, the split operation might produce unexpected results or the algorithm might not handle edge cases correctly.
Solution: Add explicit checks for empty or whitespace-only strings:
def areSentencesSimilar(self, sentence1: str, sentence2: str) -> bool:
# Handle edge cases with empty or whitespace strings
if not sentence1.strip() or not sentence2.strip():
return sentence1.strip() == sentence2.strip()
words1 = sentence1.split()
words2 = sentence2.split()
# ... rest of the code remains the same
Pitfall 3: Incorrect Index Calculation When Matching from End
The Problem:
A common mistake is using incorrect indices when matching from the end. For instance, using words1[len1 - j]
instead of words1[len1 - 1 - j]
would cause an off-by-one error, potentially leading to index out of bounds or incorrect comparisons.
Solution: Always remember that for 0-indexed arrays:
- The last element is at index
length - 1
- The j-th element from the end is at index
length - 1 - j
# Correct way to access from the end while suffix_match_count < len2 and words1[len1 - 1 - suffix_match_count] == words2[len2 - 1 - suffix_match_count]: suffix_match_count += 1
Pitfall 4: Not Ensuring Consistent Array Order
The Problem:
If you forget to swap the arrays when len1 < len2
, the algorithm logic breaks because it assumes words1
is always the longer (or equal length) array. This could lead to incorrect results or index out of bounds errors.
Solution: Always ensure the arrays are ordered consistently before processing:
# Essential swap to maintain the invariant that words1 is longer if len1 < len2: words1, words2 = words2, words1 len1, len2 = len2, len1
Pitfall 5: Misunderstanding the Problem Requirements
The Problem: A common misunderstanding is thinking that word modifications are allowed (like changing "Frog" to "Frogs"). The problem strictly requires that insertions must be complete words separated by spaces, not modifications to existing words.
Solution: The two-pointer approach naturally handles this correctly by only comparing complete words. No partial word matching or modification is performed - words must match exactly or they don't match at all.
Which data structure is used to implement priority queue?
Recommended Readings
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Donβt Miss This!