Facebook Pixel

467. Unique Substrings in Wraparound String

Problem Description

We have an infinite string base that consists of the alphabet "abcdefghijklmnopqrstuvwxyz" repeated infinitely. So base looks like: "...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd....".

Given a string s, we need to find how many unique non-empty substrings of s can be found in the infinite string base.

For example, if s = "zab", the substrings are:

  • Single characters: "z", "a", "b" - all present in base
  • Two characters: "za", "ab" - both present in base (note that "za" appears when wrapping from 'z' to 'a')
  • Three characters: "zab" - present in base

The key insight is that valid substrings in base must be consecutive letters in alphabetical order, with the special case that 'z' can be followed by 'a' (wrapping around).

The solution uses dynamic programming to track the longest valid substring ending with each character. For each character in s, if it follows the previous character in alphabetical order (or wraps from 'z' to 'a'), we extend the current consecutive sequence. Otherwise, we start a new sequence of length 1. We keep track of the maximum length substring ending with each letter to avoid counting duplicates.

The final answer is the sum of the maximum lengths for all distinct ending characters, as this represents all unique substrings that can be found in base.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that any valid substring in base must consist of consecutive letters in alphabetical order (with wrapping from 'z' to 'a'). This means we're looking for substrings where each character is exactly one position after the previous character in the alphabet.

Let's think about how to count unique substrings efficiently. If we have a valid consecutive sequence like "abc", it contains the substrings "a", "b", "c", "ab", "bc", and "abc". Notice that all substrings ending with 'c' are: "c", "bc", "abc" - exactly 3 substrings, which equals the length of the consecutive sequence ending at 'c'.

This leads to an important insight: instead of tracking all possible substrings, we can track the longest consecutive sequence ending with each letter. Why the longest? Because if we have "abc" somewhere and "bc" somewhere else, the longer sequence "abc" already accounts for all substrings that "bc" would contribute (both "bc" and "c").

For each character in the alphabet, we only need to remember the maximum length of a consecutive sequence ending with that character. This avoids counting duplicates - if we see 'c' multiple times in different contexts, we only count the substrings from the longest sequence ending with 'c'.

The algorithm naturally follows: as we scan through the string, we maintain the current consecutive sequence length k. When we see a character that continues the sequence (differs from the previous by 1 in the circular alphabet), we extend k. Otherwise, we start fresh with k = 1. We then update our record for the maximum sequence length ending with the current character.

The final count is the sum of all maximum lengths, as each maximum length for character c represents the number of unique substrings ending with c.

Solution Approach

We implement the solution using dynamic programming with a hash map to track the maximum length of consecutive sequences ending with each character.

Data Structure:

  • We use a dictionary f (implemented as defaultdict(int)) where f[c] stores the maximum length of a consecutive substring ending with character c.
  • A variable k tracks the current consecutive sequence length.

Algorithm Steps:

  1. Initialize tracking variables:

    • Create f as a defaultdict to store maximum lengths for each character
    • Set k = 0 to track the current consecutive sequence length
  2. Iterate through the string: For each character c at position i:

    • Check if it continues the sequence:
      • Calculate (ord(c) - ord(s[i - 1])) % 26
      • If this equals 1, the current character follows the previous one in the circular alphabet
      • For example: 'b' after 'a' gives (98 - 97) % 26 = 1
      • Special case: 'a' after 'z' gives (97 - 122) % 26 = 1
    • Update the consecutive length k:
      • If consecutive: increment k by 1
      • Otherwise: reset k to 1 (start a new sequence)
    • Update the maximum for this character:
      • Set f[c] = max(f[c], k)
      • This ensures we keep only the longest sequence ending with c
  3. Calculate the result:

    • Return sum(f.values())
    • Each value in f represents the count of unique substrings ending with that character

Why this works:

  • A consecutive sequence of length n ending with character c contributes exactly n unique substrings (all suffixes ending with c)
  • By keeping only the maximum length for each character, we avoid counting duplicate substrings
  • The modulo 26 operation handles the wraparound from 'z' to 'a' elegantly

Time Complexity: O(n) where n is the length of string s
Space Complexity: O(1) as the dictionary can have at most 26 entries

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the example s = "zabc" to illustrate the solution approach.

Initial Setup:

  • f = {} (empty dictionary to track max lengths)
  • k = 0 (current consecutive sequence length)

Step 1: Process 'z' (index 0)

  • First character, so we start a new sequence
  • k = 1
  • f['z'] = 1
  • State: f = {'z': 1}

Step 2: Process 'a' (index 1)

  • Check if 'a' follows 'z': (ord('a') - ord('z')) % 26 = (97 - 122) % 26 = -25 % 26 = 1
  • It's consecutive! So k = k + 1 = 2
  • f['a'] = max(f.get('a', 0), 2) = 2
  • State: f = {'z': 1, 'a': 2}
  • This means we found substrings "a" and "za" ending with 'a'

Step 3: Process 'b' (index 2)

  • Check if 'b' follows 'a': (ord('b') - ord('a')) % 26 = (98 - 97) % 26 = 1
  • It's consecutive! So k = k + 1 = 3
  • f['b'] = max(f.get('b', 0), 3) = 3
  • State: f = {'z': 1, 'a': 2, 'b': 3}
  • This means we found substrings "b", "ab", and "zab" ending with 'b'

Step 4: Process 'c' (index 3)

  • Check if 'c' follows 'b': (ord('c') - ord('b')) % 26 = (99 - 98) % 26 = 1
  • It's consecutive! So k = k + 1 = 4
  • f['c'] = max(f.get('c', 0), 4) = 4
  • Final state: f = {'z': 1, 'a': 2, 'b': 3, 'c': 4}
  • This means we found substrings "c", "bc", "abc", and "zabc" ending with 'c'

Calculate Result:

  • Sum all values: 1 + 2 + 3 + 4 = 10

Unique substrings found:

  1. Ending with 'z': "z" (1 substring)
  2. Ending with 'a': "a", "za" (2 substrings)
  3. Ending with 'b': "b", "ab", "zab" (3 substrings)
  4. Ending with 'c': "c", "bc", "abc", "zabc" (4 substrings)

Total: 10 unique substrings that exist in the infinite base string.

Solution Implementation

1from collections import defaultdict
2
3class Solution:
4    def findSubstringInWraproundString(self, s: str) -> int:
5        # Dictionary to store the maximum length of substring ending with each character
6        max_length_ending_with = defaultdict(int)
7      
8        # Current consecutive substring length
9        consecutive_length = 0
10      
11        for index, char in enumerate(s):
12            # Check if current character follows the previous one in alphabetical order
13            # (considering wraparound: 'z' -> 'a')
14            if index > 0 and (ord(char) - ord(s[index - 1])) % 26 == 1:
15                # Extend the current consecutive substring
16                consecutive_length += 1
17            else:
18                # Start a new consecutive substring
19                consecutive_length = 1
20          
21            # Update the maximum length of substring ending with current character
22            # This avoids counting duplicate substrings
23            max_length_ending_with[char] = max(max_length_ending_with[char], consecutive_length)
24      
25        # Sum all maximum lengths to get total unique substrings
26        return sum(max_length_ending_with.values())
27
1class Solution {
2    public int findSubstringInWraproundString(String s) {
3        // Array to store the maximum length of valid substring ending with each letter
4        // maxLengthEndingWith[0] corresponds to 'a', maxLengthEndingWith[1] to 'b', etc.
5        int[] maxLengthEndingWith = new int[26];
6      
7        int stringLength = s.length();
8      
9        // Iterate through the string to find all valid substrings
10        for (int currentIndex = 0, currentSubstringLength = 0; currentIndex < stringLength; currentIndex++) {
11            // Check if current character continues the sequence from previous character
12            // Valid sequences: a->b, b->c, ..., z->a (wraparound)
13            if (currentIndex > 0 && isConsecutiveInWraparound(s.charAt(currentIndex - 1), s.charAt(currentIndex))) {
14                // Extend the current valid substring
15                currentSubstringLength++;
16            } else {
17                // Start a new valid substring with length 1
18                currentSubstringLength = 1;
19            }
20          
21            // Update the maximum length of valid substring ending with current character
22            // We only keep the maximum because longer substrings contain all shorter ones
23            char currentChar = s.charAt(currentIndex);
24            int charIndex = currentChar - 'a';
25            maxLengthEndingWith[charIndex] = Math.max(maxLengthEndingWith[charIndex], currentSubstringLength);
26        }
27      
28        // Sum all maximum lengths to get total count of unique valid substrings
29        return Arrays.stream(maxLengthEndingWith).sum();
30    }
31  
32    /**
33     * Helper method to check if two characters are consecutive in wraparound order
34     * @param prevChar The previous character
35     * @param currChar The current character
36     * @return true if currChar follows prevChar in wraparound order
37     */
38    private boolean isConsecutiveInWraparound(char prevChar, char currChar) {
39        // Calculate difference with wraparound handling
40        // Examples: 'b' - 'a' = 1, 'a' - 'z' = -25, but (-25 + 26) % 26 = 1
41        return (currChar - prevChar + 26) % 26 == 1;
42    }
43}
44
1class Solution {
2public:
3    int findSubstringInWraproundString(string s) {
4        // Array to store the maximum length of valid substring ending with each letter
5        // maxLength[i] represents the max length of substring ending with character ('a' + i)
6        int maxLength[26] = {0};
7      
8        int stringLength = s.length();
9      
10        // Iterate through the string
11        for (int i = 0, currentLength = 0; i < stringLength; ++i) {
12            // Check if current character follows the previous one in circular alphabet order
13            // (s[i] - s[i-1] + 26) % 26 == 1 handles both normal succession (b after a)
14            // and wraparound case (a after z)
15            if (i > 0 && (s[i] - s[i - 1] + 26) % 26 == 1) {
16                // Extend the current valid substring
17                ++currentLength;
18            } else {
19                // Start a new valid substring
20                currentLength = 1;
21            }
22          
23            // Update the maximum length for substrings ending with current character
24            int charIndex = s[i] - 'a';
25            maxLength[charIndex] = max(maxLength[charIndex], currentLength);
26        }
27      
28        // Sum up all maximum lengths to get total unique substrings
29        // Each maxLength[i] represents unique substrings ending with that character
30        return accumulate(begin(maxLength), end(maxLength), 0);
31    }
32};
33
1/**
2 * Finds the number of unique non-empty substrings of s that are present in the infinite wraparound string.
3 * The wraparound string is "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz..."
4 * 
5 * @param s - The input string containing only lowercase English letters
6 * @returns The count of unique substrings of s found in the wraparound string
7 */
8function findSubstringInWraproundString(s: string): number {
9    // Helper function to convert a character to its index (0-25)
10    const getCharIndex = (char: string): number => {
11        return char.charCodeAt(0) - 97;
12    };
13  
14    // Array to store the maximum length of substring ending with each letter
15    // maxLengthEndingWith[i] represents the max length of valid substring ending with character (i + 'a')
16    const maxLengthEndingWith: number[] = Array(26).fill(0);
17  
18    const stringLength: number = s.length;
19  
20    // Iterate through the string to find all valid substrings
21    for (let i = 0, currentLength = 0; i < stringLength; ++i) {
22        const currentCharIndex: number = getCharIndex(s[i]);
23      
24        // Check if current character continues the sequence from previous character
25        // In wraparound string, each character follows the previous one (z -> a is also valid)
26        if (i > 0 && (currentCharIndex - getCharIndex(s[i - 1]) + 26) % 26 === 1) {
27            // Extend the current valid substring length
28            ++currentLength;
29        } else {
30            // Start a new valid substring
31            currentLength = 1;
32        }
33      
34        // Update the maximum length for substrings ending with current character
35        maxLengthEndingWith[currentCharIndex] = Math.max(
36            maxLengthEndingWith[currentCharIndex], 
37            currentLength
38        );
39    }
40  
41    // Sum up all maximum lengths to get total unique substrings
42    return maxLengthEndingWith.reduce((total, length) => total + length, 0);
43}
44

Time and Space Complexity

The time complexity is O(n), where n is the length of the string s. The algorithm iterates through the string exactly once using a single for loop with enumerate, performing constant-time operations for each character (checking conditions, updating variables, and dictionary operations).

The space complexity is O(|Σ|), where Σ is the character set. In this case, since we're dealing with lowercase English letters, |Σ| = 26. The defaultdict f stores at most one entry for each unique character in the alphabet, regardless of the input string length. Therefore, the space used by the dictionary is bounded by the size of the alphabet rather than the input size.

Common Pitfalls

Pitfall 1: Incorrect Wraparound Check

Issue: A common mistake is incorrectly checking if characters are consecutive, especially for the wraparound case from 'z' to 'a'.

Incorrect approach:

# Wrong: This doesn't handle wraparound correctly
if ord(char) - ord(s[index - 1]) == 1:
    consecutive_length += 1

Why it fails: When transitioning from 'z' to 'a', ord('a') - ord('z') = -25, not 1. This would miss valid consecutive sequences like "zab".

Correct approach:

# Use modulo 26 to handle wraparound
if (ord(char) - ord(s[index - 1])) % 26 == 1:
    consecutive_length += 1

Pitfall 2: Not Tracking Maximum Length Per Character

Issue: Simply counting all substrings without tracking the maximum length for each ending character leads to counting duplicates.

Incorrect approach:

# Wrong: This counts duplicates
total_count = 0
for i in range(len(s)):
    if consecutive:
        consecutive_length += 1
    else:
        consecutive_length = 1
    total_count += consecutive_length  # Adds all substrings

Why it fails: If "abc" appears twice in the string, this would count "a", "ab", "abc" twice, leading to incorrect results.

Correct approach:

# Keep only the maximum length for each ending character
max_length_ending_with[char] = max(max_length_ending_with[char], consecutive_length)
# Then sum all maximum values
return sum(max_length_ending_with.values())

Pitfall 3: Starting Index Edge Case

Issue: Forgetting to handle the first character properly or checking index bounds incorrectly.

Incorrect approach:

# Wrong: Doesn't handle i=0 case
for i in range(len(s)):
    if (ord(s[i]) - ord(s[i-1])) % 26 == 1:  # Error when i=0
        consecutive_length += 1

Correct approach:

# Check index > 0 before comparing with previous character
for index, char in enumerate(s):
    if index > 0 and (ord(char) - ord(s[index - 1])) % 26 == 1:
        consecutive_length += 1
    else:
        consecutive_length = 1

Pitfall 4: Understanding What Constitutes a Valid Substring

Issue: Misunderstanding that only consecutively increasing alphabetical sequences (with wraparound) are valid in the infinite base string.

Example: The substring "ca" is NOT valid in base because 'c' doesn't immediately follow 'a' in the alphabet. Only patterns like "abc", "xyz", "zabc" are valid.

Key insight: A substring is valid if and only if each character follows the previous one in alphabetical order (considering 'z' wraps to 'a').

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

What does the following code do?

1def f(arr1, arr2):
2  i, j = 0, 0
3  new_arr = []
4  while i < len(arr1) and j < len(arr2):
5      if arr1[i] < arr2[j]:
6          new_arr.append(arr1[i])
7          i += 1
8      else:
9          new_arr.append(arr2[j])
10          j += 1
11  new_arr.extend(arr1[i:])
12  new_arr.extend(arr2[j:])
13  return new_arr
14
1public static List<Integer> f(int[] arr1, int[] arr2) {
2  int i = 0, j = 0;
3  List<Integer> newArr = new ArrayList<>();
4
5  while (i < arr1.length && j < arr2.length) {
6      if (arr1[i] < arr2[j]) {
7          newArr.add(arr1[i]);
8          i++;
9      } else {
10          newArr.add(arr2[j]);
11          j++;
12      }
13  }
14
15  while (i < arr1.length) {
16      newArr.add(arr1[i]);
17      i++;
18  }
19
20  while (j < arr2.length) {
21      newArr.add(arr2[j]);
22      j++;
23  }
24
25  return newArr;
26}
27
1function f(arr1, arr2) {
2  let i = 0, j = 0;
3  let newArr = [];
4  
5  while (i < arr1.length && j < arr2.length) {
6      if (arr1[i] < arr2[j]) {
7          newArr.push(arr1[i]);
8          i++;
9      } else {
10          newArr.push(arr2[j]);
11          j++;
12      }
13  }
14  
15  while (i < arr1.length) {
16      newArr.push(arr1[i]);
17      i++;
18  }
19  
20  while (j < arr2.length) {
21      newArr.push(arr2[j]);
22      j++;
23  }
24  
25  return newArr;
26}
27

Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More