467. Unique Substrings in Wraparound String
Problem Description
The given problem is to count the number of unique non-empty substrings of a string s
that can also be found in the infinite base
string, which is essentially the English lowercase alphabet letters cycled indefinitely. The string s
is a finite string, possibly containing repeated characters.
The task is to figure out how many unique substrings from s
fit consecutively into base
. The challenge here is figuring out an efficient way to calculate this without the need to generate all possible substrings of s
and check them against base
, which would be computationally expensive.
Intuition
The intuition behind the solution leverages the properties of the base
string. Since base
is comprised of the English alphabet in order and repeated indefinitely, any substring following the pattern of contiguous letters ('abc', 'cde', etc.) will be found in base
. Furthermore, if we have found a substring that fits into base
, any shorter substrings starting from the same character will also fit.
The solution uses dynamic programming to keep track of the maximum length for substrings starting with each letter of the alphabet found in s
. The intuition is that if you can form a substring up to a certain length starting with a particular character (like 'a'), then you can also form all smaller substrings that start with that character.
Here's our approach in steps:
-
Create a list
dp
of 26 elements representing each letter of the alphabet to store the maximum substring length starting with that letter found ins
. -
Go through each character of string
s
and determine if it is part of a substring that follows the contiguous pattern. We do this by checking if the current and previous characters are adjacent inbase
. -
If adjacent, increment our running length
k
. If not, resetk
to one, because the current character does not continue from the previous character. -
Determine the list index corresponding to the current character and update
dp[idx]
with the maximum of its current value ork
. -
After processing the entire string, the sum of the values in
dp
is the total number of unique substrings found inbase
.
This approach prevents checking each possible substring and efficiently aggregates the counts by keeping track of the longest contiguous substring ending at each character.
Learn more about Dynamic Programming patterns.
Solution Approach
The solution uses dynamic programming, which is a method for solving complex problems by breaking them down into simpler subproblems. It utilizes additional storage to save the result of subproblems to avoid redundant calculations. Here’s how the solution approach is implemented:
-
Initial DP Array: We create a DP (dynamic programming) array
dp
with 26 elements corresponding to the letters of the alphabet set to 0. This array will store the maximum length of the unique substrings ending with the respective alphabet character. -
Iterating over
p
: We iterate over the stringp
, checking each character. -
Checking Continuity: For each character
c
inp
, we check if it forms a continuous substring with the previous character. This is done by checking the difference in ASCII values — specifically, whether(ord(c) - ord(p[i - 1])) % 26 == 1
. This takes care of the wraparound from 'z' to 'a' by using the modulus operation. -
Updating Length of Substring
k
: If the condition is true, it means the current characterc
continues the substring, and we increase our substring length counterk
. If not, we resetk
to 1 since the continuity is broken, andc
itself is a valid substring of length 1. -
Updating DP Array: We calculate the index of the current character in the alphabet
idx = ord(c) - ord('a')
and update thedp
array at this index. We setdp[idx]
to be the maximum of its current value ork
. This step ensures we're keeping track of the longest unique substring ending with each letter without explicitly creating all substring combinations. -
Result by Summing DP values: After completing the iteration over
p
, every index of thedp
array contains the maximum length of substrings ending with the respective character that can be found inbase
. Summing up all these maximum lengths will give us the number of unique non-empty substrings ofs
inbase
.
This algorithm uses the DP array as a way to eliminate redundant checks and store only the necessary information. By keeping track of the lengths of contiguous substrings in this manner, we avoid having to store or iterate over each of the potentially very many substrings explicitly.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let’s walk through an example to illustrate the solution approach.
Suppose our string s
is "abcdbef"
.
Following the steps of the solution approach:
-
Initial DP Array: We initiate a DP array
dp
with 26 elements set to 0. The array indices represent letters'a'
to'z'
. -
Iterating over
s
: We begin iterating over strings
. -
Checking Continuity: As we check each character, we look for continuity from its predecessor. Since the first character 'a' has no predecessor, we skip to the second character 'b'. The ASCII difference from 'a' to 'b' is 1, thus we continue to the third character, 'c', and the same applies. However, when we get to the fourth character 'd', 'c' and 'd' are continuous, but for the fifth character 'b', 'd' and 'b' are not adjacent characters in the English alphabet.
-
Updating Length of Substring
k
: As we continue iterating:- 'a' starts a new substring with
k=1
. - 'b' is contiguous with 'a', so
k=2
. - 'c' is contiguous with 'b', so
k=3
. - 'd' is contiguous with 'c', so
k=4
. - 'b' breaks the pattern, resetting
k=1
. - 'e' is not contiguous with 'b',
k=1
. - 'f' is contiguous with 'e', so
k=2
.
- 'a' starts a new substring with
-
Updating DP Array: Throughout the iteration, we update our
dp
. Here's howdp
changes:- After 'a':
dp[0] = max(dp[0], 1) => 1
- After 'b':
dp[1] = max(dp[1], 2) => 2
- After 'c':
dp[2] = max(dp[2], 3) => 3
- After 'd':
dp[3] = max(dp[3], 4) => 4
- After reset at 'b':
dp[1] = max(dp[1], 1) => 2
(no change because 'b' already had 2 from 'ab') - After reset at 'e':
dp[4] = max(dp[4], 1) => 1
- After 'f':
dp[5] = max(dp[5], 2) => 2
- After 'a':
-
Result by Summing DP Values: Summing up the values in
dp
, we get the total count of unique non-empty substrings ofs
that can be found inbase
:1 + 2 + 3 + 4 + 1 + 2 = 13
So, our string s
"abcdbef" has 13 unique non-empty substrings that fit consecutively into the infinite base string of the English alphabet cycled indefinitely.
Solution Implementation
1class Solution:
2 def findSubstringInWraproundString(self, p: str) -> int:
3 # Initialize an array to keep track of the max length of substrings that end with each letter of the alphabet.
4 max_length_end_with = [0] * 26
5
6 # Initialize a variable to keep track of the current substring length.
7 current_length = 0
8
9 # Iterate through the string, character by character.
10 for i in range(len(p)):
11 # Check if the current character and the one before it are consecutive in the 'z-a' wraparound string.
12 # The condition checks if they are in alphabetical order or if it's a 'z' followed by 'a'.
13 if i > 0 and (ord(p[i]) - ord(p[i - 1])) % 26 == 1:
14 # If consecutive, increment the current length.
15 current_length += 1
16 else:
17 # Otherwise, reset the current length to 1.
18 current_length = 1
19
20 # Calculate the index in the alphabet for the current character.
21 index = ord(p[i]) - ord('a')
22 # Update the max length for this character if the current length is greater.
23 max_length_end_with[index] = max(max_length_end_with[index], current_length)
24
25 # Return the sum of max lengths, which gives the total number of distinct substrings.
26 return sum(max_length_end_with)
27
1class Solution {
2 public int findSubstringInWraproundString(String p) {
3 // dp array to store the maximum length substring ending with each alphabet
4 int[] maxSubstringLengths = new int[26];
5 int currentLength = 0; // Length of current substring
6
7 // Iterate through the string
8 for (int i = 0; i < p.length(); ++i) {
9 char currentChar = p.charAt(i);
10
11 // If the current and previous characters are consecutive in the wraparound string
12 if (i > 0 && (currentChar - p.charAt(i - 1) + 26) % 26 == 1) {
13 // Increment length of current substring
14 currentLength++;
15 } else {
16 // Restart length if not consecutive
17 currentLength = 1;
18 }
19
20 // Find the index in the alphabet for the current character
21 int index = currentChar - 'a';
22
23 // Update the maximum length for the particular character if it's greater than the previous value
24 maxSubstringLengths[index] = Math.max(maxSubstringLengths[index], currentLength);
25 }
26
27 int totalCount = 0; // Holds the total count of unique substrings
28
29 // Sum up all the maximum lengths as each length contributes to the distinct substrings
30 for (int maxLength : maxSubstringLengths) {
31 totalCount += maxLength;
32 }
33
34 return totalCount; // Return total count of all unique substrings
35 }
36}
37
1class Solution {
2public:
3 int findSubstringInWraproundString(string p) {
4 // dp array to keep track of the max length of unique substrings ending with each letter.
5 vector<int> maxLengthEndingWith(26, 0);
6
7 // 'k' will be used to store the length of the current valid substring.
8 int currentLength = 0;
9
10 // Loop through all characters in string 'p'.
11 for (int i = 0; i < p.size(); ++i) {
12 char currentChar = p[i];
13
14 // Check if the current character forms a consecutive sequence with the previous character.
15 if (i > 0 && (currentChar - p[i - 1] + 26) % 26 == 1) {
16 // If consecutive, increment the length of the current valid substring.
17 ++currentLength;
18 } else {
19 // If not consecutive, start a new substring of length 1.
20 currentLength = 1;
21 }
22
23 // Update the maximum length of substrings that end with the current character.
24 int index = currentChar - 'a';
25 maxLengthEndingWith[index] = max(maxLengthEndingWith[index], currentLength);
26 }
27
28 // Compute the result by summing the max lengths of unique substrings ending with each letter.
29 int result = 0;
30 for (int length : maxLengthEndingWith) result += length;
31
32 // Return the total number of all unique substrings.
33 return result;
34 }
35};
36
1function findSubstringInWraproundString(p: string): number {
2 // Total length of the input string 'p'
3 const length = p.length;
4 // Initialize an array to store the maximum length of unique substrings that end with each alphabet letter
5 const maxSubstringLengthByCh = new Array(26).fill(0);
6 // Current substring length
7 let currentLength = 1;
8
9 // Set the maximum length for the first character
10 maxSubstringLengthByCh[p.charCodeAt(0) - 'a'.charCodeAt(0)] = 1;
11
12 // Iterate through the string starting from the second character
13 for (let i = 1; i < length; i++) {
14 // Check if the current and the previous characters are consecutive in the wraparound string
15 if ((p.charCodeAt(i) - p.charCodeAt(i - 1) + 26) % 26 === 1) {
16 // If they are consecutive, increment the length of the substring that includes the current character
17 currentLength++;
18 } else {
19 // If not, reset the current substring length to 1
20 currentLength = 1;
21 }
22 // Determine the index for the current character in the alphabet array
23 const index = p.charCodeAt(i) - 'a'.charCodeAt(0);
24 // Update the maximum substring length for the current character if necessary
25 maxSubstringLengthByCh[index] = Math.max(maxSubstringLengthByCh[index], currentLength);
26 }
27
28 // Compute the sum of all maximum substring lengths to get the final count of distinct non-empty substrings
29 return maxSubstringLengthByCh.reduce((sum, value) => sum + value);
30}
31
Time and Space Complexity
Time Complexity
The given code iterates through each character of the string p
only once with a constant time operation for each character including arithmetic operations and a maximum function. This results in a time complexity of O(n)
, where n
is the length of the string p
.
Space Complexity
The space complexity is determined by the additional space used besides the input itself. Here, a constant size array dp
of size 26 is used, which does not grow with the size of the input string p
. Therefore, the space complexity is O(1)
, since the space used is constant and does not depend on the input size.
Learn more about how to find time and space complexity quickly using problem constraints.
In a binary min heap, the minimum element can be found in:
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!