Facebook Pixel

2157. Groups of Strings

HardBit ManipulationUnion FindString
Leetcode Link

Problem Description

You have an array of strings where each string contains only lowercase English letters, and no letter appears more than once in any single string.

Two strings are considered connected if you can transform the set of letters from one string to the other by performing exactly one of these operations:

  • Add one letter to the set
  • Delete one letter from the set
  • Replace one letter with any other letter (including the same letter)

Your task is to group these strings into connected components where:

  • Each string in a group is connected to at least one other string in the same group, OR
  • The group contains only a single string
  • No string in one group can be connected to any string in a different group

You need to return an array with two values:

  1. The total number of groups formed
  2. The size of the largest group

Example to understand connections:

  • "abc" and "ab" are connected (delete 'c')
  • "abc" and "abcd" are connected (add 'd')
  • "abc" and "abd" are connected (replace 'c' with 'd')

The solution uses bit manipulation to represent each string as a bitmask (where bit i represents whether character 'a' + i is present) and Union-Find to group connected strings. For each string, it checks all possible add/delete/replace operations by manipulating bits and unions the connected representations.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing this as a graph connectivity problem where strings are nodes and "connected" relationships are edges. We need to find connected components in this graph.

Since each string contains unique letters only, we can think of each string as a set of characters. This naturally leads to using bit manipulation - we can represent each string as a bitmask where bit position i indicates whether character 'a' + i is present. For example, "abc" becomes 0...0111 (bits 0, 1, 2 are set).

Why is this representation powerful? Because the three connection operations translate elegantly to bit operations:

  • Adding a letter: Set a bit that was 0 → use OR operation with (1 << i)
  • Deleting a letter: Clear a bit that was 1 → use XOR operation with (1 << i)
  • Replacing a letter: Clear one bit and set another → combine delete and add operations

Instead of checking connections between all pairs of original strings (which would be expensive), we can:

  1. Convert each string to its bitmask representation
  2. For each bitmask, generate all possible connected bitmasks by:
    • Toggling each bit (handles add/delete operations)
    • For each set bit, trying to replace it with every unset bit

This transforms the problem into finding connected components among bitmasks, which is perfectly suited for Union-Find (Disjoint Set Union). We union bitmasks that are connected, and the Union-Find structure naturally tracks:

  • The number of disjoint sets (number of groups)
  • The size of each set (to find the maximum group size)

The brilliance is that identical strings automatically map to the same bitmask, so they're inherently in the same group without extra work. The Union-Find efficiently handles the merging of groups as we discover connections.

Learn more about Union Find patterns.

Solution Approach

The solution uses Union-Find (Disjoint Set Union) data structure with bit manipulation to efficiently group connected strings.

Step 1: Initialize Data Structures

  • p: Dictionary to store parent pointers for Union-Find
  • size: Counter to track the size of each group
  • n: Number of groups (initially equals number of words)
  • mx: Maximum group size

Step 2: Convert Strings to Bitmasks For each word, create a bitmask representation:

x = 0
for c in word:
    x |= 1 << (ord(c) - ord('a'))
  • Each bit position represents a letter ('a' at bit 0, 'b' at bit 1, etc.)
  • Set bit i if character 'a' + i is present
  • Store in Union-Find: p[x] = x (initially, each bitmask is its own parent)
  • Track size: size[x] += 1
  • If multiple words map to same bitmask, decrease group count: n -= 1

Step 3: Connect Related Bitmasks For each bitmask x, generate and union with all connected bitmasks:

  1. Handle Add/Delete Operations:

    for i in range(26):
        union(x, x ^ (1 << i))
    • x ^ (1 << i) toggles bit i (adds letter if absent, deletes if present)
  2. Handle Replace Operations:

    if (x >> i) & 1:  # if bit i is set
        for j in range(26):
            if ((x >> j) & 1) == 0:  # if bit j is not set
                union(x, x ^ (1 << i) | (1 << j))
    • For each set bit i, try replacing with every unset bit j
    • x ^ (1 << i) removes letter at position i
    • | (1 << j) adds letter at position j

Step 4: Union-Find Operations

The find function with path compression:

def find(x):
    if p[x] != x:
        p[x] = find(p[x])  # Path compression
    return p[x]

The union function merges groups:

def union(a, b):
    if b not in p:  # Skip if bitmask b doesn't exist
        return
    pa, pb = find(a), find(b)
    if pa == pb:  # Already in same group
        return
    p[pa] = pb  # Merge groups
    size[pb] += size[pa]  # Update size
    mx = max(mx, size[pb])  # Track maximum
    n -= 1  # Decrease number of groups

Step 5: Return Results

  • n: Final number of groups after all unions
  • mx: Maximum group size encountered

The time complexity is O(N × 26 × 26) where N is the number of unique bitmasks, and space complexity is O(N) for storing the Union-Find structure.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with words = ["a", "b", "ab", "abc"]

Step 1: Convert to Bitmasks

  • "a" → bitmask = 001 (bit 0 set)
  • "b" → bitmask = 010 (bit 1 set)
  • "ab" → bitmask = 011 (bits 0,1 set)
  • "abc" → bitmask = 111 (bits 0,1,2 set)

Initial state: 4 separate groups, each of size 1.

Step 2: Process Connections for Each Bitmask

For bitmask 001 ("a"):

  • Add/Delete: Try 001 ^ (1<<0) = 000 (delete 'a'), 001 ^ (1<<1) = 011 (add 'b')
  • Union 001 with 011 (exists!) → Groups: {001, 011}, {010}, {111}
  • Replace: Can replace 'a' with any other letter, but none of those bitmasks exist

For bitmask 010 ("b"):

  • Add/Delete: Try 010 ^ (1<<0) = 011 (add 'a'), 010 ^ (1<<1) = 000 (delete 'b')
  • Union 010 with 011 (exists!) → Groups: {001, 011, 010}, {111}

For bitmask 011 ("ab"):

  • Add/Delete: Try 011 ^ (1<<2) = 111 (add 'c')
  • Union 011 with 111 (exists!) → Groups: {001, 011, 010, 111}
  • All 4 strings are now in one connected group!

For bitmask 111 ("abc"):

  • Already connected to the main group through previous operations

Final Result:

  • Number of groups: 1
  • Largest group size: 4

The key insight: "a" connects to "ab" (add 'b'), "b" connects to "ab" (add 'a'), and "ab" connects to "abc" (add 'c'), forming one large connected component.

Solution Implementation

1class Solution:
2    def groupStrings(self, words: List[str]) -> List[int]:
3        def find(node):
4            """Find the root parent of a node with path compression."""
5            if parent[node] != node:
6                parent[node] = find(parent[node])  # Path compression
7            return parent[node]
8
9        def union(node_a, node_b):
10            """Union two nodes if node_b exists in the parent dictionary."""
11            nonlocal max_group_size, num_groups
12          
13            # Check if node_b exists in our set
14            if node_b not in parent:
15                return
16          
17            # Find roots of both nodes
18            root_a = find(node_a)
19            root_b = find(node_b)
20          
21            # Already in the same group
22            if root_a == root_b:
23                return
24          
25            # Merge groups: make root_b the parent of root_a
26            parent[root_a] = root_b
27            group_size[root_b] += group_size[root_a]
28          
29            # Update maximum group size and decrease group count
30            max_group_size = max(max_group_size, group_size[root_b])
31            num_groups -= 1
32
33        # Initialize Union-Find structure
34        parent = {}  # Maps bitmask to its parent
35        group_size = Counter()  # Tracks size of each group
36        num_groups = len(words)  # Initially each word is its own group
37        max_group_size = 0
38      
39        # Convert each word to bitmask representation
40        for word in words:
41            bitmask = 0
42            for char in word:
43                bitmask |= 1 << (ord(char) - ord('a'))
44          
45            # Initialize parent and size for this bitmask
46            parent[bitmask] = bitmask
47            group_size[bitmask] += 1
48            max_group_size = max(max_group_size, group_size[bitmask])
49          
50            # If multiple words map to same bitmask, they're already grouped
51            if group_size[bitmask] > 1:
52                num_groups -= 1
53      
54        # Try all possible transformations for each unique bitmask
55        for bitmask in list(parent.keys()):
56            # Try adding/removing each letter (single bit flip)
57            for bit_pos in range(26):
58                union(bitmask, bitmask ^ (1 << bit_pos))
59              
60                # If this bit is set (letter exists), try replacing it
61                if (bitmask >> bit_pos) & 1:
62                    # Replace with any other letter not in the word
63                    for new_bit_pos in range(26):
64                        if ((bitmask >> new_bit_pos) & 1) == 0:
65                            # Remove old letter, add new letter
66                            new_bitmask = bitmask ^ (1 << bit_pos) | (1 << new_bit_pos)
67                            union(bitmask, new_bitmask)
68      
69        return [num_groups, max_group_size]
70
1class Solution {
2    // Union-Find parent mapping: key -> parent
3    private Map<Integer, Integer> parent;
4    // Size of each component (number of words in the group)
5    private Map<Integer, Integer> componentSize;
6    // Maximum size among all components
7    private int maxGroupSize;
8    // Number of connected components (groups)
9    private int numComponents;
10
11    public int[] groupStrings(String[] words) {
12        // Initialize data structures
13        parent = new HashMap<>();
14        componentSize = new HashMap<>();
15        numComponents = words.length;
16        maxGroupSize = 0;
17      
18        // Process each word and convert to bitmask representation
19        for (String word : words) {
20            int bitmask = 0;
21            // Convert word to bitmask where bit i represents presence of character ('a' + i)
22            for (char c : word.toCharArray()) {
23                bitmask |= 1 << (c - 'a');
24            }
25          
26            // Initialize Union-Find for this bitmask
27            parent.put(bitmask, bitmask);
28            // Increment count for duplicate words (same bitmask)
29            componentSize.put(bitmask, componentSize.getOrDefault(bitmask, 0) + 1);
30            maxGroupSize = Math.max(maxGroupSize, componentSize.get(bitmask));
31          
32            // If this is a duplicate word, decrease component count
33            if (componentSize.get(bitmask) > 1) {
34                numComponents--;
35            }
36        }
37      
38        // Connect words that can be transformed into each other
39        for (int currentMask : parent.keySet()) {
40            // Try all possible single character additions/deletions
41            for (int i = 0; i < 26; i++) {
42                // Connect with word formed by toggling bit i (add/delete character)
43                union(currentMask, currentMask ^ (1 << i));
44              
45                // If character i exists in current word
46                if (((currentMask >> i) & 1) != 0) {
47                    // Try replacing character i with each possible character j
48                    for (int j = 0; j < 26; j++) {
49                        // If character j doesn't exist in current word
50                        if (((currentMask >> j) & 1) == 0) {
51                            // Connect with word formed by replacing i with j
52                            union(currentMask, currentMask ^ (1 << i) | (1 << j));
53                        }
54                    }
55                }
56            }
57        }
58      
59        // Return [number of groups, size of largest group]
60        return new int[] {numComponents, maxGroupSize};
61    }
62
63    /**
64     * Find operation with path compression for Union-Find
65     * @param x the element to find root for
66     * @return root of the component containing x
67     */
68    private int find(int x) {
69        if (parent.get(x) != x) {
70            // Path compression: make x point directly to root
71            parent.put(x, find(parent.get(x)));
72        }
73        return parent.get(x);
74    }
75
76    /**
77     * Union operation to merge two components
78     * @param a first element
79     * @param b second element
80     */
81    private void union(int a, int b) {
82        // If b doesn't exist in our map, nothing to union
83        if (!parent.containsKey(b)) {
84            return;
85        }
86      
87        // Find roots of both components
88        int rootA = find(a);
89        int rootB = find(b);
90      
91        // Already in same component
92        if (rootA == rootB) {
93            return;
94        }
95      
96        // Merge components: make rootA point to rootB
97        parent.put(rootA, rootB);
98        // Update size of merged component
99        componentSize.put(rootB, componentSize.get(rootB) + componentSize.get(rootA));
100        // Update maximum group size if needed
101        maxGroupSize = Math.max(maxGroupSize, componentSize.get(rootB));
102        // Decrease number of components after merging
103        numComponents--;
104    }
105}
106
1class Solution {
2public:
3    int maxGroupSize;
4    int numGroups;
5
6    vector<int> groupStrings(vector<string>& words) {
7        // Map from bitmask to parent in Union-Find
8        unordered_map<int, int> parent;
9        // Map from bitmask to size of the group
10        unordered_map<int, int> groupSize;
11      
12        maxGroupSize = 0;
13        numGroups = words.size();
14      
15        // Convert each word to a bitmask representation
16        // Each bit represents whether a character exists in the word
17        for (auto& word : words) {
18            int bitmask = 0;
19            for (auto& ch : word) {
20                bitmask |= 1 << (ch - 'a');
21            }
22          
23            // Initialize Union-Find structure
24            parent[bitmask] = bitmask;
25            ++groupSize[bitmask];
26            maxGroupSize = max(maxGroupSize, groupSize[bitmask]);
27          
28            // If duplicate bitmask found, decrease number of groups
29            if (groupSize[bitmask] > 1) {
30                --numGroups;
31            }
32        }
33      
34        // Try to connect groups based on allowed operations
35        for (auto& [currentMask, _] : parent) {
36            // Try all possible single-bit operations
37            for (int bitPos = 0; bitPos < 26; ++bitPos) {
38                // Operation 1: Add or remove a character (flip single bit)
39                unite(currentMask, currentMask ^ (1 << bitPos), parent, groupSize);
40              
41                // Operation 2: Replace a character (remove one, add another)
42                // Only if current bit is set (character exists)
43                if ((currentMask >> bitPos) & 1) {
44                    for (int newBitPos = 0; newBitPos < 26; ++newBitPos) {
45                        // Only replace with a character that doesn't exist
46                        if (((currentMask >> newBitPos) & 1) == 0) {
47                            int newMask = currentMask ^ (1 << bitPos) | (1 << newBitPos);
48                            unite(currentMask, newMask, parent, groupSize);
49                        }
50                    }
51                }
52            }
53        }
54      
55        return {numGroups, maxGroupSize};
56    }
57
58private:
59    // Find operation with path compression for Union-Find
60    int find(int mask, unordered_map<int, int>& parent) {
61        if (parent[mask] != mask) {
62            parent[mask] = find(parent[mask], parent);
63        }
64        return parent[mask];
65    }
66
67    // Unite two groups if both exist
68    void unite(int maskA, int maskB, unordered_map<int, int>& parent, 
69                unordered_map<int, int>& groupSize) {
70        // Check if maskB exists in our map
71        if (!parent.count(maskB)) {
72            return;
73        }
74      
75        // Find roots of both masks
76        int rootA = find(maskA, parent);
77        int rootB = find(maskB, parent);
78      
79        // Already in the same group
80        if (rootA == rootB) {
81            return;
82        }
83      
84        // Merge groups: make rootB the parent of rootA
85        parent[rootA] = rootB;
86        groupSize[rootB] += groupSize[rootA];
87      
88        // Update maximum group size
89        maxGroupSize = max(maxGroupSize, groupSize[rootB]);
90      
91        // Decrease number of groups after merging
92        --numGroups;
93    }
94};
95
1// Global variables to track the maximum group size and number of groups
2let maxGroupSize: number;
3let numGroups: number;
4
5function groupStrings(words: string[]): number[] {
6    // Map from bitmask to parent in Union-Find data structure
7    const parent: Map<number, number> = new Map();
8    // Map from bitmask to size of the group
9    const groupSize: Map<number, number> = new Map();
10  
11    maxGroupSize = 0;
12    numGroups = words.length;
13  
14    // Convert each word to a bitmask representation
15    // Each bit represents whether a character exists in the word
16    for (const word of words) {
17        let bitmask = 0;
18        for (const ch of word) {
19            bitmask |= 1 << (ch.charCodeAt(0) - 'a'.charCodeAt(0));
20        }
21      
22        // Initialize Union-Find structure
23        parent.set(bitmask, bitmask);
24        const currentSize = (groupSize.get(bitmask) || 0) + 1;
25        groupSize.set(bitmask, currentSize);
26        maxGroupSize = Math.max(maxGroupSize, currentSize);
27      
28        // If duplicate bitmask found, decrease number of groups
29        if (currentSize > 1) {
30            numGroups--;
31        }
32    }
33  
34    // Try to connect groups based on allowed operations
35    for (const [currentMask, _] of parent) {
36        // Try all possible single-bit operations
37        for (let bitPos = 0; bitPos < 26; bitPos++) {
38            // Operation 1: Add or remove a character (flip single bit)
39            unite(currentMask, currentMask ^ (1 << bitPos), parent, groupSize);
40          
41            // Operation 2: Replace a character (remove one, add another)
42            // Only if current bit is set (character exists)
43            if ((currentMask >> bitPos) & 1) {
44                for (let newBitPos = 0; newBitPos < 26; newBitPos++) {
45                    // Only replace with a character that doesn't exist
46                    if (((currentMask >> newBitPos) & 1) === 0) {
47                        const newMask = currentMask ^ (1 << bitPos) | (1 << newBitPos);
48                        unite(currentMask, newMask, parent, groupSize);
49                    }
50                }
51            }
52        }
53    }
54  
55    return [numGroups, maxGroupSize];
56}
57
58// Find operation with path compression for Union-Find
59function find(mask: number, parent: Map<number, number>): number {
60    const currentParent = parent.get(mask);
61    if (currentParent !== mask) {
62        parent.set(mask, find(currentParent!, parent));
63    }
64    return parent.get(mask)!;
65}
66
67// Unite two groups if both exist
68function unite(
69    maskA: number, 
70    maskB: number, 
71    parent: Map<number, number>, 
72    groupSize: Map<number, number>
73): void {
74    // Check if maskB exists in our map
75    if (!parent.has(maskB)) {
76        return;
77    }
78  
79    // Find roots of both masks
80    const rootA = find(maskA, parent);
81    const rootB = find(maskB, parent);
82  
83    // Already in the same group
84    if (rootA === rootB) {
85        return;
86    }
87  
88    // Merge groups: make rootB the parent of rootA
89    parent.set(rootA, rootB);
90    const newSize = groupSize.get(rootB)! + groupSize.get(rootA)!;
91    groupSize.set(rootB, newSize);
92  
93    // Update maximum group size
94    maxGroupSize = Math.max(maxGroupSize, newSize);
95  
96    // Decrease number of groups after merging
97    numGroups--;
98}
99

Time and Space Complexity

Time Complexity: O(n * 26 * 26 * α(n)) where n is the number of unique bitmask representations and α(n) is the inverse Ackermann function.

  • Converting each word to a bitmask takes O(L) where L is the length of the word. For all words, this is O(N * L) where N is the total number of words.
  • The main loop iterates through each unique bitmask x in the dictionary p.
  • For each bitmask x, we iterate through 26 possible bit positions (first inner loop).
  • For each set bit at position i, we iterate through another 26 positions (second inner loop) to try replacement operations.
  • Each union operation involves find operations which take O(α(n)) amortized time with path compression.
  • In the worst case, we have n unique bitmasks, and for each we perform up to 26 * 26 union operations.
  • Total: O(N * L + n * 26 * 26 * α(n)). Since n ≤ N and typically L is small (bounded by 26), the dominant term is O(n * 26 * 26 * α(n)).

Space Complexity: O(n)

  • The dictionary p stores at most n unique bitmask representations as parent pointers.
  • The size Counter also stores at most n entries.
  • The recursive call stack for find can go up to O(log n) depth in the average case with path compression, but this doesn't dominate the space complexity.
  • Additional variables use O(1) space.
  • Total space complexity is O(n) where n is the number of unique bitmask representations of the input words.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting to Handle Duplicate Words

Pitfall: When multiple words map to the same bitmask (e.g., "abc", "bca", "cab"), they need to be counted as already connected. Failing to decrease the group count for duplicates will result in an incorrect number of groups.

Solution: Track when multiple words map to the same bitmask and decrease num_groups accordingly:

if group_size[bitmask] > 1:
    num_groups -= 1  # These words are already in the same group

2. Incorrect Replace Operation Implementation

Pitfall: A common mistake is trying to replace any letter with any other letter, including replacing a letter with itself or with a letter already in the word. This creates invalid transformations.

Solution: Only replace existing letters with letters NOT in the word:

if (bitmask >> bit_pos) & 1:  # Letter exists at bit_pos
    for new_bit_pos in range(26):
        if ((bitmask >> new_bit_pos) & 1) == 0:  # Letter doesn't exist at new_bit_pos
            new_bitmask = bitmask ^ (1 << bit_pos) | (1 << new_bit_pos)

3. Not Checking if Target Bitmask Exists Before Union

Pitfall: Attempting to union with bitmasks that don't exist in the parent dictionary (no word maps to that bitmask) will cause KeyError or create phantom groups.

Solution: Always check existence before union:

def union(node_a, node_b):
    if node_b not in parent:  # Critical check
        return
    # ... rest of union logic

4. Using Mutable Dictionary Keys During Iteration

Pitfall: Modifying the parent dictionary while iterating over parent.keys() can cause runtime errors or skip elements.

Solution: Convert keys to a list before iteration:

for bitmask in list(parent.keys()):  # Create a copy of keys
    # Safe to modify parent dictionary now

5. Inefficient Path Compression

Pitfall: Not implementing path compression in the find operation leads to degraded performance with deep trees, potentially causing TLE on large inputs.

Solution: Always implement path compression:

def find(node):
    if parent[node] != node:
        parent[node] = find(parent[node])  # Compress the path
    return parent[node]
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement recursion?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More