Facebook Pixel

734. Sentence Similarity 🔒

EasyArrayHash TableString
Leetcode Link

Problem Description

You are given two sentences represented as arrays of words, and a list of similar word pairs. Your task is to determine if the two sentences are similar based on specific rules.

Input:

  • sentence1: An array of strings representing the first sentence
  • sentence2: An array of strings representing the second sentence
  • similarPairs: An array of string pairs [xi, yi] where each pair indicates that words xi and yi are similar

Rules for sentence similarity:

  1. Both sentences must have the same length (same number of words)
  2. For each position i, the words sentence1[i] and sentence2[i] must be similar

Important notes about word similarity:

  • A word is always similar to itself (e.g., "happy" is similar to "happy")
  • If [x, y] is in similarPairs, then x is similar to y AND y is similar to x (symmetric relation)
  • Similarity is NOT transitive: if a is similar to b, and b is similar to c, it does NOT mean a is similar to c

Example: If sentence1 = ["I", "am", "happy", "with", "leetcode"] and sentence2 = ["I", "am", "joyful", "with", "leetcode"], and similarPairs = [["happy", "joyful"]], then the sentences are similar because:

  • They have the same length (5 words)
  • Words at positions 0, 1, 3, 4 are identical
  • Words at position 2 ("happy" and "joyful") are similar according to similarPairs

Return true if the sentences are similar according to these rules, false otherwise.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to check if corresponding words at each position in both sentences are either identical or form a similar pair.

Since we need to frequently check if two words form a similar pair, we should preprocess the similarPairs into a data structure that allows fast lookups. A hash set is perfect for this - we can check if a pair exists in O(1) time.

For the similarity check, we need to consider:

  1. Words can be identical - if sentence1[i] == sentence2[i], they're automatically similar
  2. Words can be explicitly similar - we need to check if the pair (sentence1[i], sentence2[i]) exists in our similar pairs
  3. Similarity is symmetric - if (a, b) is a similar pair, then (b, a) is also similar. So we need to check both (x, y) and (y, x) in our set

The approach becomes straightforward:

  1. First, check if sentences have the same length - if not, they can't be similar
  2. Store all similar pairs in a set as tuples for O(1) lookup
  3. For each position, check if the words are either:
    • The same word (automatically similar), OR
    • Form a similar pair in either direction (x, y) or (y, x)
  4. If any position fails both checks, the sentences aren't similar

The beauty of this solution is its simplicity - we're essentially just validating a series of conditions position by position, using a hash set to make the similarity lookups efficient. The time complexity is O(n + p) where n is the sentence length and p is the number of similar pairs.

Solution Approach

The implementation follows a straightforward approach using a hash table (set) for efficient similarity lookups.

Step 1: Length Check

if len(sentence1) != len(sentence2):
    return False

First, we verify that both sentences have the same length. If they don't, they cannot be similar by definition, so we immediately return False.

Step 2: Build Similarity Set

s = {(x, y) for x, y in similarPairs}

We create a set s containing all similar word pairs as tuples. Using a set gives us O(1) average-case lookup time. We store each pair as (x, y) exactly as given in similarPairs. Note that we're only storing pairs in one direction here.

Step 3: Compare Words at Each Position

for x, y in zip(sentence1, sentence2):
    if x != y and (x, y) not in s and (y, x) not in s:
        return False

We iterate through both sentences simultaneously using zip(sentence1, sentence2). For each pair of words (x, y) at the same position:

  • First, we check if x != y (words are different)
  • If they are different, we need to verify they form a similar pair
  • We check both (x, y) not in s and (y, x) not in s to handle the symmetric nature of similarity
  • If the words are different AND neither (x, y) nor (y, x) exists in our similarity set, the sentences are not similar, so we return False

The condition effectively says: "If words are different and not similar in either direction, return False."

Step 4: Return True

return True

If we complete the iteration without finding any mismatched pairs, all corresponding words are either identical or similar, so we return True.

Time Complexity: O(n + p) where n is the length of the sentences and p is the number of similar pairs. We spend O(p) building the set and O(n) comparing the sentences.

Space Complexity: O(p) for storing the similarity pairs in the set.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example to illustrate the solution approach:

Input:

  • sentence1 = ["great", "acting", "skills"]
  • sentence2 = ["fine", "drama", "talent"]
  • similarPairs = [["great", "fine"], ["drama", "acting"], ["skills", "talent"]]

Step 1: Length Check

  • len(sentence1) = 3
  • len(sentence2) = 3
  • Lengths are equal ✓, continue to next step

Step 2: Build Similarity Set Create set s from similar pairs:

s = {("great", "fine"), ("drama", "acting"), ("skills", "talent")}

Step 3: Compare Words at Each Position

Position 0: Compare "great" and "fine"

  • Are they equal? No ("great" != "fine")
  • Check if ("great", "fine") in s? Yes
  • Words are similar, continue

Position 1: Compare "acting" and "drama"

  • Are they equal? No ("acting" != "drama")
  • Check if ("acting", "drama") in s? No
  • Check if ("drama", "acting") in s? Yes ✓ (symmetric check)
  • Words are similar, continue

Position 2: Compare "skills" and "talent"

  • Are they equal? No ("skills" != "talent")
  • Check if ("skills", "talent") in s? Yes
  • Words are similar, continue

Step 4: Return Result All corresponding word pairs are similar, so return True.


Counter-example to illustrate non-transitivity: If we had:

  • sentence1 = ["a", "b", "c"]
  • sentence2 = ["a", "d", "e"]
  • similarPairs = [["b", "d"], ["d", "e"]]

Even though "b" is similar to "d" and "d" is similar to "e", this doesn't make "b" similar to "e". At position 2, we'd compare "c" and "e" - they're not equal and ("c", "e") is not in our similarity pairs, so the sentences would NOT be similar.

Solution Implementation

1class Solution:
2    def areSentencesSimilar(
3        self, sentence1: List[str], sentence2: List[str], similarPairs: List[List[str]]
4    ) -> bool:
5        # Check if sentences have different lengths - they cannot be similar
6        if len(sentence1) != len(sentence2):
7            return False
8      
9        # Create a set of similar word pairs for O(1) lookup
10        # Store both (word1, word2) tuples from the similarPairs list
11        similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs}
12      
13        # Compare corresponding words in both sentences
14        for word1, word2 in zip(sentence1, sentence2):
15            # Words are considered similar if:
16            # 1. They are identical, OR
17            # 2. The pair (word1, word2) exists in similar_pairs_set, OR
18            # 3. The pair (word2, word1) exists in similar_pairs_set (symmetric relation)
19            if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set:
20                return False
21      
22        # All corresponding words are similar
23        return True
24
1class Solution {
2    /**
3     * Checks if two sentences are similar based on given similarity pairs.
4     * Two sentences are similar if:
5     * 1. They have the same length
6     * 2. Each word pair at the same position is either identical or exists in similarPairs
7     * 
8     * @param sentence1 First sentence as array of words
9     * @param sentence2 Second sentence as array of words
10     * @param similarPairs List of word pairs that are considered similar
11     * @return true if sentences are similar, false otherwise
12     */
13    public boolean areSentencesSimilar(
14            String[] sentence1, String[] sentence2, List<List<String>> similarPairs) {
15      
16        // Check if sentences have the same length
17        if (sentence1.length != sentence2.length) {
18            return false;
19        }
20      
21        // Create a set of similar pairs for O(1) lookup
22        Set<List<String>> similarPairSet = new HashSet<>();
23        for (List<String> pair : similarPairs) {
24            similarPairSet.add(pair);
25        }
26      
27        // Compare each word pair at the same position
28        for (int i = 0; i < sentence1.length; i++) {
29            String word1 = sentence1[i];
30            String word2 = sentence2[i];
31          
32            // Check if words are identical or form a similar pair (in either direction)
33            boolean isIdentical = word1.equals(word2);
34            boolean isSimilarForward = similarPairSet.contains(List.of(word1, word2));
35            boolean isSimilarBackward = similarPairSet.contains(List.of(word2, word1));
36          
37            if (!isIdentical && !isSimilarForward && !isSimilarBackward) {
38                return false;
39            }
40        }
41      
42        return true;
43    }
44}
45
1class Solution {
2public:
3    bool areSentencesSimilar(vector<string>& sentence1, vector<string>& sentence2, vector<vector<string>>& similarPairs) {
4        // Two sentences can only be similar if they have the same length
5        if (sentence1.size() != sentence2.size()) {
6            return false;
7        }
8      
9        // Create a hash set to store all similar word pairs for O(1) lookup
10        // Use "#" as delimiter to create unique keys for each pair
11        unordered_set<string> similarPairSet;
12      
13        // Add both (word1, word2) and (word2, word1) to handle bidirectional similarity
14        for (const auto& pair : similarPairs) {
15            similarPairSet.insert(pair[0] + "#" + pair[1]);
16            similarPairSet.insert(pair[1] + "#" + pair[0]);
17        }
18      
19        // Check each word pair at the same position in both sentences
20        for (int i = 0; i < sentence1.size(); ++i) {
21            // Words must either be identical or form a similar pair
22            if (sentence1[i] != sentence2[i] && 
23                !similarPairSet.count(sentence1[i] + "#" + sentence2[i])) {
24                return false;
25            }
26        }
27      
28        // All word pairs are either identical or similar
29        return true;
30    }
31};
32
1/**
2 * Checks if two sentences are similar based on word pairs similarity rules.
3 * Two sentences are similar if they have the same length and each corresponding
4 * pair of words is either identical or forms a similar pair.
5 * 
6 * @param sentence1 - First sentence represented as an array of words
7 * @param sentence2 - Second sentence represented as an array of words  
8 * @param similarPairs - Array of word pairs that are considered similar
9 * @returns true if sentences are similar, false otherwise
10 */
11function areSentencesSimilar(
12    sentence1: string[],
13    sentence2: string[],
14    similarPairs: string[][],
15): boolean {
16    // Sentences must have the same length to be similar
17    if (sentence1.length !== sentence2.length) {
18        return false;
19    }
20  
21    // Create a set to store similarity relationships for O(1) lookup
22    // Store both directions (x->y and y->x) since similarity is bidirectional
23    const similaritySet = new Set<string>();
24  
25    // Populate the similarity set with both directions of each pair
26    for (const [word1, word2] of similarPairs) {
27        // Use '#' as delimiter to create unique keys for word pairs
28        similaritySet.add(`${word1}#${word2}`);
29        similaritySet.add(`${word2}#${word1}`);
30    }
31  
32    // Check each corresponding word pair in both sentences
33    for (let index = 0; index < sentence1.length; index++) {
34        const currentWord1 = sentence1[index];
35        const currentWord2 = sentence2[index];
36      
37        // Words must be either identical or form a similar pair
38        if (currentWord1 !== currentWord2 && 
39            !similaritySet.has(`${currentWord1}#${currentWord2}`)) {
40            return false;
41        }
42    }
43  
44    // All word pairs are either identical or similar
45    return true;
46}
47

Time and Space Complexity

Time Complexity: O(L)

The algorithm consists of:

  1. Creating a set from similarPairs: This requires iterating through all pairs and hashing each string pair. If there are p pairs and the average string length is l, this takes O(p * l) time for hashing operations.
  2. Comparing sentences: We iterate through min(len(sentence1), len(sentence2)) words. For each pair of words (x, y):
    • String comparison x != y takes O(l) time
    • Set lookups (x, y) not in s and (y, x) not in s each take O(l) time for hashing

Since the total length of all strings in the problem is L, and we process each character a constant number of times (during hashing and comparisons), the overall time complexity is O(L).

Space Complexity: O(L)

The space is used for:

  1. The set s which stores all similar pairs. Each pair consists of two strings, and storing all pairs requires space proportional to the total length of all strings in similarPairs.
  2. The hash table implementation of the set needs to store the actual string data.

Therefore, the space complexity is O(L), where L is the sum of the lengths of all strings in the problem.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Assuming Transitivity of Similarity

One of the most common mistakes is treating similarity as a transitive relation. Consider this scenario:

Pitfall Example:

# INCORRECT approach - assuming transitivity
def areSentencesSimilar_wrong(sentence1, sentence2, similarPairs):
    if len(sentence1) != len(sentence2):
        return False
  
    # Building a graph to find all "connected" similar words
    graph = defaultdict(set)
    for x, y in similarPairs:
        graph[x].add(y)
        graph[y].add(x)
  
    # Wrong: Using DFS/BFS to find if words are connected through similarity chain
    def are_connected(word1, word2, visited):
        if word1 == word2:
            return True
        visited.add(word1)
        for neighbor in graph[word1]:
            if neighbor not in visited and are_connected(neighbor, word2, visited):
                return True
        return False
  
    for word1, word2 in zip(sentence1, sentence2):
        if not are_connected(word1, word2, set()):
            return False
    return True

Why it's wrong: If we have pairs [["a", "b"], ["b", "c"]], this approach would incorrectly conclude that "a" and "c" are similar, but according to the problem, similarity is NOT transitive.

Correct Solution: Only check direct similarity relationships:

# Check only direct pairs, not chains
if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set:
    return False

2. Forgetting Symmetric Property

Another common mistake is only checking similarity in one direction.

Pitfall Example:

# INCORRECT - only checking one direction
similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs}

for word1, word2 in zip(sentence1, sentence2):
    # Wrong: Only checking (word1, word2), not (word2, word1)
    if word1 != word2 and (word1, word2) not in similar_pairs_set:
        return False

Why it's wrong: If similarPairs = [["happy", "joyful"]] and we're comparing "joyful" with "happy", we'd miss the similarity because we're only checking ("joyful", "happy") but the set only contains ("happy", "joyful").

Correct Solution: Check both directions:

if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set:
    return False

Alternative Correct Solution: Add both directions to the set initially:

similar_pairs_set = set()
for word1, word2 in similarPairs:
    similar_pairs_set.add((word1, word2))
    similar_pairs_set.add((word2, word1))  # Add reverse pair

# Now we only need to check one direction
for word1, word2 in zip(sentence1, sentence2):
    if word1 != word2 and (word1, word2) not in similar_pairs_set:
        return False

3. Not Handling Self-Similarity Properly

Some implementations might overcomplicate the check for identical words.

Pitfall Example:

# Overcomplicated - adding self-pairs unnecessarily
similar_pairs_set = {(word1, word2) for word1, word2 in similarPairs}
# Adding all words to be similar to themselves
for word in sentence1 + sentence2:
    similar_pairs_set.add((word, word))

Why it's inefficient: This adds unnecessary overhead. The check word1 != word2 naturally handles self-similarity since identical words will pass this condition.

Correct Solution: Simply check equality first:

if word1 != word2 and (word1, word2) not in similar_pairs_set and (word2, word1) not in similar_pairs_set:
    return False

This approach implicitly handles self-similarity because when word1 == word2, the condition short-circuits and doesn't check the set at all.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which of these pictures shows the visit order of a depth-first search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More