734. Sentence Similarity 🔒
Problem Description
You are given two sentences represented as arrays of words, and a list of similar word pairs. Your task is to determine if the two sentences are similar based on specific rules.
Input:
sentence1
: An array of strings representing the first sentencesentence2
: An array of strings representing the second sentencesimilarPairs
: An array of string pairs[xi, yi]
where each pair indicates that wordsxi
andyi
are similar
Rules for sentence similarity:
- Both sentences must have the same length (same number of words)
- For each position
i
, the wordssentence1[i]
andsentence2[i]
must be similar
Important notes about word similarity:
- A word is always similar to itself (e.g., "happy" is similar to "happy")
- If
[x, y]
is insimilarPairs
, thenx
is similar toy
ANDy
is similar tox
(symmetric relation) - Similarity is NOT transitive: if
a
is similar tob
, andb
is similar toc
, it does NOT meana
is similar toc
Example:
If sentence1 = ["I", "am", "happy", "with", "leetcode"]
and sentence2 = ["I", "am", "joyful", "with", "leetcode"]
, and similarPairs = [["happy", "joyful"]]
, then the sentences are similar because:
- They have the same length (5 words)
- Words at positions 0, 1, 3, 4 are identical
- Words at position 2 ("happy" and "joyful") are similar according to
similarPairs
Return true
if the sentences are similar according to these rules, false
otherwise.
Intuition
The key insight is that we need to check if corresponding words at each position in both sentences are either identical or form a similar pair.
Since we need to frequently check if two words form a similar pair, we should preprocess the similarPairs
into a data structure that allows fast lookups. A hash set is perfect for this - we can check if a pair exists in O(1)
time.
For the similarity check, we need to consider:
- Words can be identical - if
sentence1[i] == sentence2[i]
, they're automatically similar - Words can be explicitly similar - we need to check if the pair
(sentence1[i], sentence2[i])
exists in our similar pairs - Similarity is symmetric - if
(a, b)
is a similar pair, then(b, a)
is also similar. So we need to check both(x, y)
and(y, x)
in our set
The approach becomes straightforward:
- First, check if sentences have the same length - if not, they can't be similar
- Store all similar pairs in a set as tuples for
O(1)
lookup - For each position, check if the words are either:
- The same word (automatically similar), OR
- Form a similar pair in either direction
(x, y)
or(y, x)
- If any position fails both checks, the sentences aren't similar
The beauty of this solution is its simplicity - we're essentially just validating a series of conditions position by position, using a hash set to make the similarity lookups efficient. The time complexity is O(n + p)
where n
is the sentence length and p
is the number of similar pairs.
Solution Approach
The implementation follows a straightforward approach using a hash table (set) for efficient similarity lookups.
Step 1: Length Check
if len(sentence1) != len(sentence2):
return False
First, we verify that both sentences have the same length. If they don't, they cannot be similar by definition, so we immediately return False
.
Step 2: Build Similarity Set
s = {(x, y) for x, y in similarPairs}
We create a set s
containing all similar word pairs as tuples. Using a set gives us O(1)
average-case lookup time. We store each pair as (x, y)
exactly as given in similarPairs
. Note that we're only storing pairs in one direction here.
Step 3: Compare Words at Each Position
for x, y in zip(sentence1, sentence2):
if x != y and (x, y) not in s and (y, x) not in s:
return False
We iterate through both sentences simultaneously using zip(sentence1, sentence2)
. For each pair of words (x, y)
at the same position:
- First, we check if
x != y
(words are different) - If they are different, we need to verify they form a similar pair
- We check both
(x, y) not in s
and(y, x) not in s
to handle the symmetric nature of similarity - If the words are different AND neither
(x, y)
nor(y, x)
exists in our similarity set, the sentences are not similar, so we returnFalse
The condition effectively says: "If words are different and not similar in either direction, return False."
Step 4: Return True
return True
If we complete the iteration without finding any mismatched pairs, all corresponding words are either identical or similar, so we return True
.
Time Complexity: O(n + p)
where n
is the length of the sentences and p
is the number of similar pairs. We spend O(p)
building the set and O(n)
comparing the sentences.
Space Complexity: O(p)
for storing the similarity pairs in the set.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example to illustrate the solution approach:
Input:
sentence1 = ["great", "acting", "skills"]
sentence2 = ["fine", "drama", "talent"]
similarPairs = [["great", "fine"], ["drama", "acting"], ["skills", "talent"]]
Step 1: Length Check
len(sentence1) = 3
len(sentence2) = 3
- Lengths are equal ✓, continue to next step
Step 2: Build Similarity Set
Create set s
from similar pairs:
s = {("great", "fine"), ("drama", "acting"), ("skills", "talent")}
Step 3: Compare Words at Each Position
Position 0: Compare "great" and "fine"
- Are they equal? No (
"great" != "fine"
) - Check if
("great", "fine")
in s? Yes ✓ - Words are similar, continue
Position 1: Compare "acting" and "drama"
- Are they equal? No (
"acting" != "drama"
) - Check if
("acting", "drama")
in s? No - Check if
("drama", "acting")
in s? Yes ✓ (symmetric check) - Words are similar, continue
Position 2: Compare "skills" and "talent"
- Are they equal? No (
"skills" != "talent"
) - Check if
("skills", "talent")
in s? Yes ✓ - Words are similar, continue
Step 4: Return Result
All corresponding word pairs are similar, so return True
.
Counter-example to illustrate non-transitivity: If we had:
sentence1 = ["a", "b", "c"]
sentence2 = ["a", "d", "e"]
similarPairs = [["b", "d"], ["d", "e"]]
Even though "b" is similar to "d" and "d" is similar to "e", this doesn't make "b" similar to "e". At position 2, we'd compare "c" and "e" - they're not equal and ("c", "e")
is not in our similarity pairs, so the sentences would NOT be similar.
Solution Implementation
1class Solution:
2 def areSentencesSimilar(
3 self, sentence1: List[str], sentence2: List[str], similarPairs: List[List[str]]
4 ) -> bool:
5 # Check if sentences have different lengths - they cannot be similar
6 if len(sentence1) != len(sentence2):
7 return False
8
9 # Create a set of similar word pairs for O(1) lookup
10 # Store both (word1, word2) tuples from the similarPairs list
11 similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs}
12
13 # Compare corresponding words in both sentences
14 for word1, word2 in zip(sentence1, sentence2):
15 # Words are considered similar if:
16 # 1. They are identical, OR
17 # 2. The pair (word1, word2) exists in similar_pairs_set, OR
18 # 3. The pair (word2, word1) exists in similar_pairs_set (symmetric relation)
19 if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set:
20 return False
21
22 # All corresponding words are similar
23 return True
24
1class Solution {
2 /**
3 * Checks if two sentences are similar based on given similarity pairs.
4 * Two sentences are similar if:
5 * 1. They have the same length
6 * 2. Each word pair at the same position is either identical or exists in similarPairs
7 *
8 * @param sentence1 First sentence as array of words
9 * @param sentence2 Second sentence as array of words
10 * @param similarPairs List of word pairs that are considered similar
11 * @return true if sentences are similar, false otherwise
12 */
13 public boolean areSentencesSimilar(
14 String[] sentence1, String[] sentence2, List<List<String>> similarPairs) {
15
16 // Check if sentences have the same length
17 if (sentence1.length != sentence2.length) {
18 return false;
19 }
20
21 // Create a set of similar pairs for O(1) lookup
22 Set<List<String>> similarPairSet = new HashSet<>();
23 for (List<String> pair : similarPairs) {
24 similarPairSet.add(pair);
25 }
26
27 // Compare each word pair at the same position
28 for (int i = 0; i < sentence1.length; i++) {
29 String word1 = sentence1[i];
30 String word2 = sentence2[i];
31
32 // Check if words are identical or form a similar pair (in either direction)
33 boolean isIdentical = word1.equals(word2);
34 boolean isSimilarForward = similarPairSet.contains(List.of(word1, word2));
35 boolean isSimilarBackward = similarPairSet.contains(List.of(word2, word1));
36
37 if (!isIdentical && !isSimilarForward && !isSimilarBackward) {
38 return false;
39 }
40 }
41
42 return true;
43 }
44}
45
1class Solution {
2public:
3 bool areSentencesSimilar(vector<string>& sentence1, vector<string>& sentence2, vector<vector<string>>& similarPairs) {
4 // Two sentences can only be similar if they have the same length
5 if (sentence1.size() != sentence2.size()) {
6 return false;
7 }
8
9 // Create a hash set to store all similar word pairs for O(1) lookup
10 // Use "#" as delimiter to create unique keys for each pair
11 unordered_set<string> similarPairSet;
12
13 // Add both (word1, word2) and (word2, word1) to handle bidirectional similarity
14 for (const auto& pair : similarPairs) {
15 similarPairSet.insert(pair[0] + "#" + pair[1]);
16 similarPairSet.insert(pair[1] + "#" + pair[0]);
17 }
18
19 // Check each word pair at the same position in both sentences
20 for (int i = 0; i < sentence1.size(); ++i) {
21 // Words must either be identical or form a similar pair
22 if (sentence1[i] != sentence2[i] &&
23 !similarPairSet.count(sentence1[i] + "#" + sentence2[i])) {
24 return false;
25 }
26 }
27
28 // All word pairs are either identical or similar
29 return true;
30 }
31};
32
1/**
2 * Checks if two sentences are similar based on word pairs similarity rules.
3 * Two sentences are similar if they have the same length and each corresponding
4 * pair of words is either identical or forms a similar pair.
5 *
6 * @param sentence1 - First sentence represented as an array of words
7 * @param sentence2 - Second sentence represented as an array of words
8 * @param similarPairs - Array of word pairs that are considered similar
9 * @returns true if sentences are similar, false otherwise
10 */
11function areSentencesSimilar(
12 sentence1: string[],
13 sentence2: string[],
14 similarPairs: string[][],
15): boolean {
16 // Sentences must have the same length to be similar
17 if (sentence1.length !== sentence2.length) {
18 return false;
19 }
20
21 // Create a set to store similarity relationships for O(1) lookup
22 // Store both directions (x->y and y->x) since similarity is bidirectional
23 const similaritySet = new Set<string>();
24
25 // Populate the similarity set with both directions of each pair
26 for (const [word1, word2] of similarPairs) {
27 // Use '#' as delimiter to create unique keys for word pairs
28 similaritySet.add(`${word1}#${word2}`);
29 similaritySet.add(`${word2}#${word1}`);
30 }
31
32 // Check each corresponding word pair in both sentences
33 for (let index = 0; index < sentence1.length; index++) {
34 const currentWord1 = sentence1[index];
35 const currentWord2 = sentence2[index];
36
37 // Words must be either identical or form a similar pair
38 if (currentWord1 !== currentWord2 &&
39 !similaritySet.has(`${currentWord1}#${currentWord2}`)) {
40 return false;
41 }
42 }
43
44 // All word pairs are either identical or similar
45 return true;
46}
47
Time and Space Complexity
Time Complexity: O(L)
The algorithm consists of:
- Creating a set from
similarPairs
: This requires iterating through all pairs and hashing each string pair. If there arep
pairs and the average string length isl
, this takesO(p * l)
time for hashing operations. - Comparing sentences: We iterate through
min(len(sentence1), len(sentence2))
words. For each pair of words(x, y)
:- String comparison
x != y
takesO(l)
time - Set lookups
(x, y) not in s
and(y, x) not in s
each takeO(l)
time for hashing
- String comparison
Since the total length of all strings in the problem is L
, and we process each character a constant number of times (during hashing and comparisons), the overall time complexity is O(L)
.
Space Complexity: O(L)
The space is used for:
- The set
s
which stores all similar pairs. Each pair consists of two strings, and storing all pairs requires space proportional to the total length of all strings insimilarPairs
. - The hash table implementation of the set needs to store the actual string data.
Therefore, the space complexity is O(L)
, where L
is the sum of the lengths of all strings in the problem.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Assuming Transitivity of Similarity
One of the most common mistakes is treating similarity as a transitive relation. Consider this scenario:
Pitfall Example:
# INCORRECT approach - assuming transitivity
def areSentencesSimilar_wrong(sentence1, sentence2, similarPairs):
if len(sentence1) != len(sentence2):
return False
# Building a graph to find all "connected" similar words
graph = defaultdict(set)
for x, y in similarPairs:
graph[x].add(y)
graph[y].add(x)
# Wrong: Using DFS/BFS to find if words are connected through similarity chain
def are_connected(word1, word2, visited):
if word1 == word2:
return True
visited.add(word1)
for neighbor in graph[word1]:
if neighbor not in visited and are_connected(neighbor, word2, visited):
return True
return False
for word1, word2 in zip(sentence1, sentence2):
if not are_connected(word1, word2, set()):
return False
return True
Why it's wrong: If we have pairs [["a", "b"], ["b", "c"]]
, this approach would incorrectly conclude that "a" and "c" are similar, but according to the problem, similarity is NOT transitive.
Correct Solution: Only check direct similarity relationships:
# Check only direct pairs, not chains if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set: return False
2. Forgetting Symmetric Property
Another common mistake is only checking similarity in one direction.
Pitfall Example:
# INCORRECT - only checking one direction
similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs}
for word1, word2 in zip(sentence1, sentence2):
# Wrong: Only checking (word1, word2), not (word2, word1)
if word1 != word2 and (word1, word2) not in similar_pairs_set:
return False
Why it's wrong: If similarPairs = [["happy", "joyful"]]
and we're comparing "joyful" with "happy", we'd miss the similarity because we're only checking ("joyful", "happy")
but the set only contains ("happy", "joyful")
.
Correct Solution: Check both directions:
if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set: return False
Alternative Correct Solution: Add both directions to the set initially:
similar_pairs_set = set()
for word1, word2 in similarPairs:
similar_pairs_set.add((word1, word2))
similar_pairs_set.add((word2, word1)) # Add reverse pair
# Now we only need to check one direction
for word1, word2 in zip(sentence1, sentence2):
if word1 != word2 and (word1, word2) not in similar_pairs_set:
return False
3. Not Handling Self-Similarity Properly
Some implementations might overcomplicate the check for identical words.
Pitfall Example:
# Overcomplicated - adding self-pairs unnecessarily similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs} # Adding all words to be similar to themselves for word in sentence1 + sentence2: similar_pairs_set.add((word, word))
Why it's inefficient: This adds unnecessary overhead. The check word1 != word2
naturally handles self-similarity since identical words will pass this condition.
Correct Solution: Simply check equality first:
if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set: return False
This approach implicitly handles self-similarity because when word1 == word2
, the condition short-circuits and doesn't check the set at all.
Which of these pictures shows the visit order of a depth-first search?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!