Facebook Pixel

1061. Lexicographically Smallest Equivalent String

Problem Description

You are given two strings s1 and s2 of equal length, and another string baseStr. The characters at the same positions in s1 and s2 define equivalence relationships between characters.

For instance, if s1 = "abc" and s2 = "cde", then:

  • s1[0] = 'a' is equivalent to s2[0] = 'c' (so 'a' == 'c')
  • s1[1] = 'b' is equivalent to s2[1] = 'd' (so 'b' == 'd')
  • s1[2] = 'c' is equivalent to s2[2] = 'e' (so 'c' == 'e')

These equivalence relationships follow three fundamental properties:

  • Reflexivity: Every character is equivalent to itself ('a' == 'a')
  • Symmetry: If 'a' == 'b', then 'b' == 'a'
  • Transitivity: If 'a' == 'b' and 'b' == 'c', then 'a' == 'c'

Due to transitivity, equivalence relationships can form groups. In the example above, since 'a' == 'c' and 'c' == 'e', we can deduce that 'a' == 'e'. This means characters 'a', 'c', and 'e' all belong to the same equivalence group.

Your task is to transform baseStr by replacing each character with the lexicographically smallest character from its equivalence group.

For example, with s1 = "abc", s2 = "cde", and baseStr = "eed":

  • 'e' belongs to the group {'a', 'c', 'e'}, so it gets replaced with 'a' (the smallest)
  • 'e' gets replaced with 'a' again
  • 'd' belongs to the group {'b', 'd'}, so it gets replaced with 'b' (the smallest)
  • The result is "aab"

The solution uses Union Find to efficiently group equivalent characters and always maintains the lexicographically smallest character as the representative of each group. When processing baseStr, each character is replaced with its group's representative (smallest) character.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that this problem is about grouping characters based on equivalence relationships. When we see that characters can be equivalent to each other with transitivity properties, it naturally points to a graph connectivity problem where characters are nodes and equivalence relationships are edges.

Consider what happens when we establish equivalences: if 'a' == 'c' and 'c' == 'e', then all three characters 'a', 'c', and 'e' must be in the same group. This is exactly what happens in connected components of a graph - all nodes that can reach each other belong to the same component.

Union Find (Disjoint Set Union) is the perfect data structure for this because:

  1. It efficiently groups elements that are related
  2. It can quickly tell us which group any element belongs to
  3. It handles the transitivity property automatically

The clever part is how we handle the "lexicographically smallest" requirement. Instead of finding all members of each group and then selecting the smallest one, we can optimize by always making the smaller character the representative (parent) during union operations. When we union two groups, we compare their current representatives and always make the smaller one the parent of the larger one.

For implementation, we can map each lowercase letter to indices 0-25. The parent array p initially has each element pointing to itself (p[i] = i). As we process the equivalence pairs from s1 and s2, we union the corresponding character groups. The find operation with path compression ensures efficient lookups, and our custom union logic (always making the smaller index the parent) guarantees that the root of each group is always the lexicographically smallest character in that group.

Finally, for each character in baseStr, we simply find its group representative (which is guaranteed to be the smallest character in its equivalence group) and use that in our result string.

Learn more about Union Find patterns.

Solution Approach

The solution implements Union Find with a specific optimization to maintain lexicographical order. Let's walk through the implementation step by step:

1. Initialize the Union Find Structure

p = list(range(26))

We create a parent array where each index represents a lowercase letter (0 for 'a', 1 for 'b', etc.). Initially, each character is its own parent, meaning each character forms its own group.

2. Build Equivalence Groups

for a, b in zip(s1, s2):
    x, y = ord(a) - ord("a"), ord(b) - ord("a")
    px, py = find(x), find(y)
    if px < py:
        p[py] = px
    else:
        p[px] = py

For each pair of equivalent characters from s1 and s2:

  • Convert characters to indices ('a' → 0, 'b' → 1, etc.)
  • Find the root representatives of both characters using the find function
  • Union the two groups by making the smaller root the parent of the larger root. This ensures the smallest character in any group is always the root.

3. Find Operation with Path Compression

def find(x: int) -> int:
    if p[x] != x:
        p[x] = find(p[x])
    return p[x]

This recursive function finds the root of a character's group. Path compression (p[x] = find(p[x])) optimizes future lookups by directly connecting each node to its root during the search.

4. Transform the Base String

return "".join(chr(find(ord(c) - ord("a")) + ord("a")) for c in baseStr)

For each character in baseStr:

  • Convert the character to its index
  • Find its group's root (which is the smallest character in the group)
  • Convert the root index back to a character
  • Join all transformed characters to form the result

Time Complexity: O(n × α(26)) where n is the length of s1 (or s2), and α is the inverse Ackermann function. Since we're dealing with at most 26 characters, this is effectively O(n).

Space Complexity: O(1) since the parent array has a fixed size of 26 regardless of input size.

The beauty of this approach is that by always making the smaller-indexed character the parent during union operations, we guarantee that find(x) always returns the lexicographically smallest character in the equivalence group of character x.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example with s1 = "abc", s2 = "cde", and baseStr = "eed".

Step 1: Initialize Union Find

p = [0, 1, 2, 3, 4, ..., 25]

Each index represents a letter ('a'=0, 'b'=1, 'c'=2, 'd'=3, 'e'=4, etc.). Initially, each character is its own parent.

Step 2: Process equivalence pairs from s1 and s2

First pair: 'a' and 'c' (s1[0]='a', s2[0]='c')

  • Convert to indices: 'a'=0, 'c'=2
  • Find roots: find(0)=0, find(2)=2
  • Since 0 < 2, make p[2]=0
  • Updated p: [0, 1, 0, 3, 4, ...] (now 'c' points to 'a')

Second pair: 'b' and 'd' (s1[1]='b', s2[1]='d')

  • Convert to indices: 'b'=1, 'd'=3
  • Find roots: find(1)=1, find(3)=3
  • Since 1 < 3, make p[3]=1
  • Updated p: [0, 1, 0, 1, 4, ...] (now 'd' points to 'b')

Third pair: 'c' and 'e' (s1[2]='c', s2[2]='e')

  • Convert to indices: 'c'=2, 'e'=4
  • Find roots: find(2)=0 (follows 2→0), find(4)=4
  • Since 0 < 4, make p[4]=0
  • Updated p: [0, 1, 0, 1, 0, ...] (now 'e' points to 'a')

Step 3: Transform baseStr = "eed"

Transform 'e':

  • Index of 'e' = 4
  • find(4) = 0 (follows 4→0)
  • Character at index 0 = 'a'

Transform 'e' (second occurrence):

  • Index of 'e' = 4
  • find(4) = 0
  • Character at index 0 = 'a'

Transform 'd':

  • Index of 'd' = 3
  • find(3) = 1 (follows 3→1)
  • Character at index 1 = 'b'

Final Result: "aab"

The equivalence groups formed are:

  • Group 1: {'a', 'c', 'e'} with representative 'a'
  • Group 2: {'b', 'd'} with representative 'b'

Each character in baseStr gets replaced with its group's smallest (representative) character.

Solution Implementation

1class Solution:
2    def smallestEquivalentString(self, s1: str, s2: str, baseStr: str) -> str:
3        """
4        Find the lexicographically smallest equivalent string using Union-Find.
5      
6        Given equivalence relationships between characters in s1 and s2,
7        return the lexicographically smallest equivalent string for baseStr.
8      
9        Args:
10            s1: First string defining equivalence relationships
11            s2: Second string defining equivalence relationships (s1[i] ~ s2[i])
12            baseStr: The base string to transform
13          
14        Returns:
15            The lexicographically smallest equivalent string
16        """
17      
18        def find(char_index: int) -> int:
19            """
20            Find the root parent of a character using path compression.
21          
22            Args:
23                char_index: Index of the character (0-25 for 'a'-'z')
24              
25            Returns:
26                The root parent index of the character
27            """
28            if parent[char_index] != char_index:
29                # Path compression: directly connect to root
30                parent[char_index] = find(parent[char_index])
31            return parent[char_index]
32      
33        # Initialize parent array where each character is its own parent
34        # Index 0 represents 'a', 1 represents 'b', etc.
35        parent = list(range(26))
36      
37        # Build equivalence relationships from s1 and s2
38        for char1, char2 in zip(s1, s2):
39            # Convert characters to indices (0-25)
40            index1 = ord(char1) - ord('a')
41            index2 = ord(char2) - ord('a')
42          
43            # Find root parents of both characters
44            root1 = find(index1)
45            root2 = find(index2)
46          
47            # Union by rank: always attach larger root to smaller root
48            # This ensures lexicographically smallest parent
49            if root1 < root2:
50                parent[root2] = root1
51            else:
52                parent[root1] = root2
53      
54        # Transform baseStr to its lexicographically smallest equivalent
55        result_chars = []
56        for char in baseStr:
57            # Find the root parent (smallest equivalent character)
58            char_index = ord(char) - ord('a')
59            smallest_index = find(char_index)
60            # Convert back to character and add to result
61            result_chars.append(chr(smallest_index + ord('a')))
62      
63        return ''.join(result_chars)
64
1class Solution {
2    // Parent array for Union-Find data structure
3    // Each index represents a character (0-25 for 'a'-'z')
4    private final int[] parent = new int[26];
5
6    public String smallestEquivalentString(String s1, String s2, String baseStr) {
7        // Initialize Union-Find: each character is its own parent initially
8        for (int i = 0; i < parent.length; i++) {
9            parent[i] = i;
10        }
11      
12        // Build equivalence relationships between characters
13        // Characters at the same position in s1 and s2 are equivalent
14        for (int i = 0; i < s1.length(); i++) {
15            // Convert characters to indices (0-25)
16            int charIndex1 = s1.charAt(i) - 'a';
17            int charIndex2 = s2.charAt(i) - 'a';
18          
19            // Find the root parents of both characters
20            int root1 = find(charIndex1);
21            int root2 = find(charIndex2);
22          
23            // Union operation: always attach the larger root to the smaller root
24            // This ensures the lexicographically smallest character becomes the root
25            if (root1 < root2) {
26                parent[root2] = root1;
27            } else {
28                parent[root1] = root2;
29            }
30        }
31      
32        // Build the result string by replacing each character with its smallest equivalent
33        char[] resultChars = baseStr.toCharArray();
34        for (int i = 0; i < resultChars.length; i++) {
35            // Find the root parent (smallest equivalent) for each character
36            int charIndex = resultChars[i] - 'a';
37            int smallestEquivalent = find(charIndex);
38            resultChars[i] = (char) ('a' + smallestEquivalent);
39        }
40      
41        return String.valueOf(resultChars);
42    }
43
44    /**
45     * Find operation with path compression
46     * Returns the root parent of the given character index
47     * @param charIndex the index of the character (0-25)
48     * @return the root parent index
49     */
50    private int find(int charIndex) {
51        // Path compression: directly connect to root parent for optimization
52        if (parent[charIndex] != charIndex) {
53            parent[charIndex] = find(parent[charIndex]);
54        }
55        return parent[charIndex];
56    }
57}
58
1class Solution {
2public:
3    string smallestEquivalentString(string s1, string s2, string baseStr) {
4        // Initialize parent array for Union-Find (Disjoint Set Union)
5        // Each character initially points to itself as parent
6        vector<int> parent(26);
7        iota(parent.begin(), parent.end(), 0);
8      
9        // Find function with path compression
10        // Recursively finds the root parent of a character and compresses the path
11        auto findRoot = [&](this auto&& findRoot, int x) -> int {
12            if (parent[x] != x) {
13                // Path compression: directly connect x to its root parent
14                parent[x] = findRoot(parent[x]);
15            }
16            return parent[x];
17        };
18      
19        // Build equivalence groups by processing each pair of equivalent characters
20        for (int i = 0; i < s1.length(); ++i) {
21            // Convert characters to indices (0-25)
22            int charIndex1 = s1[i] - 'a';
23            int charIndex2 = s2[i] - 'a';
24          
25            // Find root parents of both characters
26            int root1 = findRoot(charIndex1);
27            int root2 = findRoot(charIndex2);
28          
29            // Union operation: always make the lexicographically smaller root as parent
30            // This ensures the smallest character becomes the representative of the group
31            if (root1 < root2) {
32                parent[root2] = root1;
33            } else {
34                parent[root1] = root2;
35            }
36        }
37      
38        // Build result string by replacing each character with its smallest equivalent
39        string result;
40        for (char ch : baseStr) {
41            // Find the root parent (smallest equivalent) of current character
42            int smallestEquivalent = findRoot(ch - 'a');
43            // Convert index back to character and append to result
44            result.push_back('a' + smallestEquivalent);
45        }
46      
47        return result;
48    }
49};
50
1/**
2 * Finds the lexicographically smallest equivalent string using Union-Find algorithm
3 * @param s1 - First string defining character equivalences
4 * @param s2 - Second string defining character equivalences (same length as s1)
5 * @param baseStr - The string to transform using the equivalence rules
6 * @returns The lexicographically smallest equivalent string
7 */
8function smallestEquivalentString(s1: string, s2: string, baseStr: string): string {
9    // Initialize parent array for Union-Find structure
10    // Each character initially points to itself (0-25 representing 'a'-'z')
11    const parent: number[] = Array.from({ length: 26 }, (_, index) => index);
12
13    /**
14     * Find operation with path compression for Union-Find
15     * @param x - The character index to find root for
16     * @returns The root parent of the character
17     */
18    const find = (x: number): number => {
19        // Path compression: directly connect x to its root
20        if (parent[x] !== x) {
21            parent[x] = find(parent[x]);
22        }
23        return parent[x];
24    };
25
26    // Process each pair of equivalent characters from s1 and s2
27    for (let i = 0; i < s1.length; i++) {
28        // Convert characters to indices (0-25)
29        const charIndex1: number = s1.charCodeAt(i) - 'a'.charCodeAt(0);
30        const charIndex2: number = s2.charCodeAt(i) - 'a'.charCodeAt(0);
31      
32        // Find roots of both characters
33        const root1: number = find(charIndex1);
34        const root2: number = find(charIndex2);
35      
36        // Union operation: always attach larger root to smaller root
37        // This ensures lexicographically smallest representative
38        if (root1 < root2) {
39            parent[root2] = root1;
40        } else {
41            parent[root1] = root2;
42        }
43    }
44
45    // Build result string by finding smallest equivalent for each character
46    const resultChars: string[] = [];
47    for (let i = 0; i < baseStr.length; i++) {
48        // Convert character to index
49        const charIndex: number = baseStr.charCodeAt(i) - 'a'.charCodeAt(0);
50        // Find root (smallest equivalent) and convert back to character
51        const smallestEquivalent: string = String.fromCharCode('a'.charCodeAt(0) + find(charIndex));
52        resultChars.push(smallestEquivalent);
53    }
54  
55    // Join all characters to form the final string
56    return resultChars.join('');
57}
58

Time and Space Complexity

The time complexity is O((n + m) × α(26)) which simplifies to O(n + m) where n is the length of strings s1 and s2, m is the length of baseStr, and α(26) is the inverse Ackermann function which is effectively constant for practical values.

The reference answer states O((n + m) × log|Σ|) where |Σ| = 26, giving O((n + m) × log 26). However, this appears to be an overestimate. The code uses Union-Find with path compression but without union by rank. Each find operation with path compression has amortized time complexity of O(α(26)), where α is the inverse Ackermann function, which is effectively O(1) for all practical purposes.

Breaking down the operations:

  • The initialization of p takes O(26) = O(1) time
  • The loop over zip(s1, s2) runs n times, and each iteration performs:
    • Two find operations: O(α(26)) each
    • One union operation (updating parent): O(1)
    • Total per iteration: O(α(26))
  • The final string construction iterates through baseStr of length m, performing one find operation per character: O(m × α(26))

Therefore, the overall time complexity is O(n × α(26) + m × α(26)) = O((n + m) × α(26)) ≈ O(n + m).

The space complexity is O(26) = O(1) or O(|Σ|) where |Σ| = 26 is the size of the character set. The algorithm uses a fixed-size array p of length 26 to store the parent pointers for the Union-Find structure, regardless of the input size.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Union Operation - Not Maintaining Lexicographical Order

The Pitfall: A common mistake is to perform the union operation without considering which character should be the parent. Some implementations might randomly assign one as the parent of the other:

# INCORRECT - doesn't guarantee smallest character as root
def union(x, y):
    px, py = find(x), find(y)
    parent[px] = py  # Always making py the parent - WRONG!

This breaks the invariant that the root should always be the lexicographically smallest character in the group.

Why It Fails: Consider s1 = "ab", s2 = "ba", baseStr = "b":

  • If we union 'a' and 'b' incorrectly (making 'b' the parent of 'a')
  • When we look up 'b', we get 'b' instead of 'a'
  • Result: "b" instead of the correct "a"

The Solution: Always compare the roots and make the smaller one the parent:

# CORRECT - ensures smallest character is always the root
px, py = find(x), find(y)
if px < py:
    parent[py] = px  # Smaller becomes parent
else:
    parent[px] = py  # Smaller becomes parent

2. Forgetting Path Compression

The Pitfall: Implementing find without path compression:

# INEFFICIENT - no path compression
def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

Why It's Problematic: Without path compression, chains of parent pointers can become long, leading to O(n) find operations in the worst case. While this won't cause incorrect results, it significantly impacts performance.

The Solution: Always implement path compression to flatten the tree structure:

# EFFICIENT - with path compression
def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])  # Path compression
    return parent[x]

3. Not Handling Self-Loops Properly

The Pitfall: Some might think they need special handling when a character appears in both s1 and s2 at the same position:

# UNNECESSARY - adding special case
for char1, char2 in zip(s1, s2):
    if char1 == char2:
        continue  # Skip if same - UNNECESSARY!
    # ... union logic

Why It's Wrong: The Union-Find structure naturally handles self-loops. When char1 == char2, finding and unioning the same element is a no-op and doesn't break anything. Adding special cases makes the code more complex without benefit.

The Solution: Let the Union-Find handle it naturally - no special cases needed. The algorithm works correctly whether characters are the same or different.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.

Which of the following method should we use to solve this problem?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More