249. Group Shifted Strings

MediumArrayHash TableString
Leetcode Link

Problem Description

The given problem involves identifying sequences of strings that are formed by shifting characters. A string shift is where each letter of the string is moved to its next successive character in the alphabet. The task is to group all strings from a provided array that are part of the same shifting sequence. A shifting sequence is a series of strings where each string can be transformed into the next by performing this described shift operation. It's important to note that when a shift operation is applied to 'z', it wraps around to 'a'. The output can be in any order, and the aim is to group strings in the array which belong to the same shifting sequence.

For example, the string "abc" when shifted yields "bcd", and subsequently "cde", all the way up to "xyz". These shifted versions all belong to the same shifting sequence started by "abc". If "abc" and "bcd" are both in the input array, they should be grouped together in the output.

Intuition

To create a solution for grouping strings into shifting sequences, we need a way to determine if two strings belong to the same sequence. The intuition here is to "normalize" each string by shifting it back by the same amount so we can compare them. The amount to shift back by is determined by the difference between the ASCII code of the first character in the string and the ASCII code for 'a'. By doing this for every string, strings that belong to the same shifting sequence will end up looking the same after normalization.

This normalization process transforms each string into a "canonical form". For example, both "abc" and "bcd" would be normalized to "abc". These normalized strings act as keys in a dictionary (using a defaultdict for convenience) where each key maps to a list of original strings sharing the same normalized form.

Here’s the process in detail:

  1. Go through each string in the input array.
  2. Calculate the difference in ASCII values between the first character of the string and 'a'.
  3. Shift all characters in the string back by this calculated difference.
    • If during shifting any character goes before 'a', wrap it around by adding 26 (total number of letters in the English alphabet).
  4. Join the shifted characters to get the normalized string.
  5. Add the original string to the list in the dictionary where the key is the normalized string.
  6. Once all strings are processed, extract all values from the dictionary. These are the groups of strings that belong to the same shifting sequence.

The provided code implements this approach, resulting in a dictionary where each entry is a list of strings that are in the same shifting sequence. These lists of strings are the solution we are looking for.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the following array represent a max heap?

Solution Approach

The implementation of the solution involves a key algorithmic pattern known as hashing. Originally, all strings placed into a shifting sequence are hashed based on their normalized forms. The central data structure used for this is a defaultdict from Python's collections module, which simplifies the process of appending to lists within a dictionary without having to check if the key is already present.

Here's a step-by-step breakdown of the code implementation:

  1. Initialize a defaultdict of lists:

    1mp = defaultdict(list)

    This creates a mp dictionary where each value is initialized as an empty list that keys, representing the normalized form of strings, will point to.

  2. Loop through each string in the provided list strings:

    1for s in strings:

    Each string s from the input list needs to be normalized.

  3. Calculate the difference between the ASCII value of the first character of s and 'a' to determine the amount to shift:

    1diff = ord(s[0]) - ord('a')

    This difference diff is used to shift the characters back.

  4. Create a new list t that will hold the shifted characters.

  5. Loop through the characters of the string s, shifting each by diff:

    1for c in s:
    2    d = ord(c) - diff

    Here, each character c from string s is shifted back by the previously calculated diff.

  6. Check if the shifted ASCII value goes below 'a'. If it does, wrap it around by adding 26 to get the correct character:

    1if d < ord('a'):
    2    d += 26

    This check makes sure the shifting maintains the correct alphabetical order.

  7. Add the shifted character to the list t:

    1t.append(chr(d))

    By converting the ASCII value back to the character using chr, we get the normalized character.

  8. Join all the normalized characters in t to get the normalized form of the string k:

    1k = ''.join(t)

    This string k can now be used as a key in our dictionary.

  9. Append the original string s to the list in mp corresponding to the key k:

    1mp[k].append(s)

    Strings with the same normalized form k will be grouped together.

  10. Finally, we obtain the groups of strings as lists of the dictionary's values:

    1return list(mp.values())

    This line returns all lists from mp, which contains the groups of strings according to their shifting sequences.

By using a combination of hashing with normalization and a defaultdict, the solution efficiently groups the strings according to their belonging shifting sequence while maintaining an O(n * m) time complexity, where n is the number of strings and m is the length of the longest string.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which algorithm should you use to find a node that is close to the root of the tree?

Example Walkthrough

Let's assume we have an array of strings: ["abc", "bcd", "acef", "xyz", "az", "ba", "a", "z"]. According to our problem, strings that are a part of the same shifting sequence should be grouped together.

We start by initializing our defaultdict of lists, which will hold our groups:

1mp = defaultdict(list)

Next, we iterate over the given array of strings and normalize each string to find the shift sequence it belongs to:

  1. For the first string "abc", we calculate the shift necessary to normalize it. The ASCII difference between 'a' of "abc" and 'a' is 0. Thus, after normalization, "abc" remains "abc". We append "abc" to the list in our dictionary under the key "abc".
1mp["abc"].append("abc")
  1. For the second string "bcd", the shift from 'b' to 'a' is 1. Shifting each letter back by 1, "bcd" becomes "abc". This string is grouped under the key "abc".
1mp["abc"].append("bcd")
  1. Moving on to "acef", the shift from 'a' to 'a' is again 0, so it's already normalized. We add "acef" to a new group:
1mp["acef"].append("acef")
  1. The fourth string "xyz" has a shift of 23 from 'x' to 'a'. Shifting back each letter by 23, we wrap around the alphabet and "xyz" becomes "abc". We add "xyz" to the "abc" group:
1mp["abc"].append("xyz")
  1. For the string "az", there is no uniform shift that can change 'a' to 'z' in the forward direction. But for our normalization, we only look at the first character. Since the first character is 'a', the string "az" stays as is. A new group is created:
1mp["az"].append("az")
  1. "ba" is a bit more complex. The shift from 'b' to 'a' is 1. Shifting back the first 'b' gives us 'a', but for 'a', we need to wrap it around the alphabet which makes it 'z', resulting in the normalized form "az". Thus it's grouped with "az":
1mp["az"].append("ba")
  1. The string "a" has no shift needed and is already normalized. It starts a new group on its own:
1mp["a"].append("a")
  1. Finally, "z" needs to be shifted by 25 to become "a", starting a new group:
1mp["a"].append("z")

At the end of the iteration, we have the following groups:

1{
2    "abc": ["abc", "bcd", "xyz"],
3    "acef": ["acef"],
4    "az": ["az", "ba"],
5    "a": ["a", "z"]
6}

Returning all the values from our mp dictionary gives us the required grouped strings according to their shifting sequences:

1return [["abc", "bcd", "xyz"], ["acef"], ["az", "ba"], ["a", "z"]]

By following this approach, we make sure to normalize each string so it can be compared with others, and by using hashing with a defaultdict, we efficiently group all strings into their respective shifting sequences.

Solution Implementation

1from collections import defaultdict
2
3class Solution:
4    def groupStrings(self, strings):
5        # A mapping from normalized string to the list of strings that match the pattern
6        normalized_to_group = defaultdict(list)
7      
8        # Iterate over every string in the input list
9        for s in strings:
10            # List to store normalized characters
11            normalized_chars = []
12
13            # Calculate the difference between the first character and 'a'
14            shift = ord(s[0]) - ord('a')
15
16            # Normalize each character in the string
17            for char in s:
18                normalized_char_code = ord(char) - shift
19
20                # Ensure the normalized character code wraps around if it goes below 'a'
21                if normalized_char_code < ord('a'):
22                    normalized_char_code += 26
23
24                # Add the normalized character to the list
25                normalized_chars.append(chr(normalized_char_code))
26
27            # Create the normalized string from the list of normalized characters
28            normalized_string = ''.join(normalized_chars)
29
30            # Add the original string to the group corresponding to the normalized string
31            normalized_to_group[normalized_string].append(s)
32
33        # Return all groups as a list of lists
34        return list(normalized_to_group.values())
35
1class Solution {
2
3    // This function groups the strings which can be obtained by shifting every character of any string in the group by the same number of positions
4    public List<List<String>> groupStrings(String[] strings) {
5        // HashMap to store groups of shifted strings
6        Map<String, List<String>> groupedStringsMap = new HashMap<>();
7      
8        // Loop through each string in the array
9        for (String s : strings) {
10            // Find the shift required to bring the first character back to 'a'
11            int shift = s.charAt(0) - 'a';
12            // Convert the current string into character array for manipulation
13            char[] shiftedChars = s.toCharArray();
14          
15            // Shift every character in the array such that the first character becomes 'a'
16            for (int i = 0; i < shiftedChars.length; ++i) {
17                // Calculate the shifted character considering the circular nature of the alphabet
18                char shifted = (char) (shiftedChars[i] - shift);
19                // If shifting left goes below 'a', wrap around from 'z'
20                if (shifted < 'a') {
21                    shifted += 26;
22                }
23                // Update the character in the array
24                shiftedChars[i] = shifted;
25            }
26          
27            // Convert the normalized character array to a string which can be used as a key in the map
28            String normalizedKey = new String(shiftedChars);
29            // Add the original string to the list corresponding to the normalized key
30            groupedStringsMap.computeIfAbsent(normalizedKey, k -> new ArrayList<>()).add(s);
31        }
32      
33        // Convert the values of the hashMap into a list and return
34        return new ArrayList<>(groupedStringsMap.values());
35    }
36}
37
1#include <vector>
2#include <string>
3#include <unordered_map>
4#include <iostream>
5using namespace std;
6
7class Solution {
8public:
9    vector<vector<string>> groupStrings(vector<string>& strings) {
10        // Map to hold the grouped strings by normalized form
11        unordered_map<string, vector<string>> groupedStringsMap;
12
13        // Iterate over each string to normalize and group
14        for (auto& str : strings) {
15            // Calculate the offset to normalize the string
16            int offset = str[0] - 'a';
17            string normalizedStr = str;
18
19            // Normalize each character in the string
20            for (int i = 0; i < normalizedStr.size(); ++i) {
21                char normalizedChar = normalizedStr[i] - offset;
22                // Wrap around the alphabet if the normalized character is before 'a'
23                if (normalizedChar < 'a') {
24                    normalizedChar += 26;
25                }
26                normalizedStr[i] = normalizedChar;
27            }
28
29            // For debugging: print the normalized string
30            cout << normalizedStr << endl;
31
32            // Group the original string with its normalized form
33            groupedStringsMap[normalizedStr].push_back(str);
34        }
35
36        // Placeholder for the result
37        vector<vector<string>> result;
38
39        // Build the result from the map
40        for (auto& entry : groupedStringsMap) {
41            result.push_back(entry.second);
42        }
43
44        return result;
45    }
46};
47
1// An array of strings representing the input to be grouped
2const strings: string[] = []; // Populate this array with the desired strings to be grouped
3
4// A function that calculates the difference needed to shift a character to 'a'
5function calculateOffset(char: string): number {
6    return char.charCodeAt(0) - 'a'.charCodeAt(0);
7}
8
9// A function that normalizes a string based on its first character
10function normalizeString(str: string): string {
11    let offset = calculateOffset(str[0]);
12    return str.split('').map(char => {
13        let normalizedCharCode = char.charCodeAt(0) - offset;
14        if (normalizedCharCode < 'a'.charCodeAt(0)) {
15            normalizedCharCode += 26;
16        }
17        return String.fromCharCode(normalizedCharCode);
18    }).join('');
19}
20
21// A function that groups the strings into vectors by their normalized forms
22function groupStrings(strings: string[]): string[][] {
23    const groupedStringsMap: { [key: string]: string[] } = {};
24
25    // Iterate over each string to normalize and group
26    strings.forEach(str => {
27        const normalizedStr = normalizeString(str);
28
29        // Debugging: console logging the normalized string
30        console.log(normalizedStr);
31
32        // If the normalized string is not yet a key in the map, initialize it with an empty array
33        groupedStringsMap[normalizedStr] = groupedStringsMap[normalizedStr] || [];
34        // Add the original string to the array corresponding to the normalized string
35        groupedStringsMap[normalizedStr].push(str);
36    });
37
38    // Extract the grouped strings into a result array
39    const result = Object.values(groupedStringsMap);
40
41    return result;
42}
43
44// Example usage:
45// let inputStrings = ['abc', 'bcd', 'acef', 'xyz', 'az', 'ba', 'a', 'z'];
46// let grouped = groupStrings(inputStrings);
47// console.log(grouped);
48
Not Sure What to Study? Take the 2-min Quiz:

How many ways can you arrange the three letters A, B and C?

Time and Space Complexity

Time Complexity

The time complexity of the given code involves iterating over each string in the input list and then iterating over each character in the string.

  • Let n be the total number of strings in the input list strings.
  • Let k be the average length of the strings.

For each string, we iterate through each character to compute a normalized representation by adjusting the character's ASCII values. This operation takes O(k) time, as we have k characters in an average string. Doing this for all strings results in O(nk) time complexity for this step.

After we have the normalized strings, we insert them into a hashmap (mp). The insertion into the hashmap will typically be O(1) on average for each key-value pair, assuming that hash collisions are handled efficiently. We are inserting n strings, so in total, this also takes O(n) time.

Therefore, the total time complexity is O(nk) + O(n). Since O(nk) will be the dominating term when k is significant, we can consider the overall time complexity of the algorithm to be O(nk).

Space Complexity

For space complexity, we are using a hashmap to store the groups of strings. In the worst case, if all strings are distinct after normalization, the hashmap will have n entries with a single string in each. Therefore, the space required for the hashmap is O(n).

Each string is being normalized, which may require up to k space for each string. However, since this space is reused for each string, we don't count it n times.

Hence, the total space complexity is O(n) for the hashmap storage. This is the dominating term in the space complexity calculation irrespective of the average length k of the strings.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which algorithm should you use to find a node that is close to the root of the tree?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫