2156. Find Substring With Given Hash Value
Problem Description
In this problem, we're given a string s
and four integers named power
, modulo
, k
, and hashValue
. The objective is to find a specific substring of s
of length k
whose hash, calculated by a given hash function, equals hashValue
. The hash function multiplies the value of each character in the substring (decided by its position in the alphabet where 'a' = 1, 'b' = 2, ..., 'z' = 26) by power
raised to the index of the character, sums these products, and then takes the modulo with modulo
.
More formally, for a substring of s
starting at index i
and ending at index i + k - 1
, where 0 ≤ i ≤ len(s) - k
, the hash is defined as:
(hash(s[i]) * power^0 + hash(s[i + 1]) * power^1 + ... + hash(s[i + k - 1]) * power^(k-1)) mod modulo
We must find the first such substring from the left that matches the given hashValue
.
Intuition
The problem can be solved by using a sliding window technique that moves across the string from right to left. Since a straightforward approach would lead to a time complexity of O(n * k) which could be prohibitive for large strings, we need to devise a more efficient way to update the hash value as we slide the window. The algorithm uses the property that the hash of a window (substring) depends on the character leaving the window and the character entering it.
The solution relies on calculating the hash backwards, starting from the end of the string. As we move the window to the left by one character, we update the calculated hash by subtracting the value of the character that is leaving the window, multiplying the entire hash by power
(to account for the positional shifts), and adding the value of the new character entering the window on the left. We take the modulo after each operation to ensure the result stays within bounds considering the modulo
parameter.
Calculating power^k
for each new window would be computationally expensive, so we compute it only once at the beginning, and as we move the window, we use this precomputed value to subtract the influence of the character that left the window.
The implementation uses BigInt for arithmetic operations, to handle the case where the numbers get very large. Each time when a new hash matches the given hashValue
, we update the position of the answer. Once we've scanned through the string, we extract the substring from the correct position and return it. If by any chance we don't find any such substring, which shouldn't happen according to the problem statement, we'd return an empty string.
Learn more about Sliding Window patterns.
Solution Approach
The solution approach involves the use of the sliding window technique along with modular arithmetic principles to efficiently calculate and update the hash values of substrings. Let's dissect the solution code to understand its implementation:
-
The
power
andmodulo
parameters are converted toBigInt
, as we will be dealing with potentially large integers due to the hash calculations and thepower
operations. -
The
pk
variable is initialized to1n
, which representspower^k
inBigInt
form. This will be used later to discount the character value that is leaving the window as we slide. -
The
ac
variable holds the current hash value of the windowed substring as it slides. It is alsoBigInt
. -
We first compute the hash value of the last
k
characters of the strings
, since we are dealing with a reverse iteration. We multiplyac
bypower
and add the character value of the current character using thegetCode
function, then take the remainder after dividing bymodulo
. -
If the hash value matches
hashValue
at this initial position, we mark this position asans
. It is the starting index of a substring with the desired hash value. -
Then, for each iteration moving the window one step to the left, we perform the following operations:
- We calculate
pre
, which is the modular value that needs to be removed from the current hash. It represents the influence of the character that is about to leave the window. - Then we update
ac
by removingpre
, and adding the new character's value, after which we multiply bypower
. We also need to addmodulo
before taking the remainder to ensure we are dealing with non-negative numbers.
- We calculate
-
If the newly calculated
ac
equalshashValue
, we updateans
with the current starting index of the windowed substring. Since we are iterating from right to left, the first match we find will be the first from the left in the strings
. -
The
getCode
helper function calculates the character value based on the alphabet index:str.charCodeAt(index) - 'a'.charCodeAt(0) + 1
.
After the loop is finished, ans
will hold the starting index of the correct substring or -1
if no such substring is found (as per the problem constraints, an answer always exists). We return the substring using s.substring(ans, ans + k)
if a match is found; otherwise, we return an empty string.
The elegance of this approach is in the sliding window and the way we update the hash value in constant time, avoiding recomputation from scratch for each new window. This optimizes the solution and allows it to run efficiently even for large inputs.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach:
Suppose we have a string s = "abcde"
, and the integers power = 2
, modulo = 100
, k = 2
, and hashValue = 9
.
Here, the character values based on their position in the alphabet are 'a' = 1, 'b' = 2, 'c' = 3, 'd' = 4, and 'e' = 5.
The hash of a substring is calculated as follows:
(hash(s[i]) * power^0 + hash(s[i + 1]) * power^1 + ... + hash(s[i + k - 1]) * power^(k-1)) mod modulo
We want to find a substring of length k = 2
whose hash value is 9
.
-
Convert
power
andmodulo
into BigInt:power = 2n
andmodulo = 100n
. -
The initial value of
pk
(which representspower^k
) will be2^2 = 4
. Sopk = 4n
. -
Calculate the hash value
ac
of the last 2 characters 'de':ac = (4 * power^0 + 5 * power^1) mod modulo = (4 + 5 * 2) mod 100 = (4 + 10) mod 100 = 14 mod 100 = 14
. Since14
is not equal tohashValue
, we proceed. -
Now slide the window to the left to compute the hash for the substring 'cd':
pre = hash(s[i + k - 1]) * pk mod modulo = 4 * 4 mod 100 = 16 mod 100 = 16
.- Update
ac
:ac = (ac * power - pre + hash(s[i])) mod modulo = (14 * 2 - 16 + 3) mod 100 = (28 - 16 + 3) mod 100 = 15 mod 100 = 15
. Again,15
is not equal tohashValue
, so we continue to move the window.
-
Next, for 'bc':
pre = hash(s[i + k - 1]) * pk mod modulo = 3 * 4 mod 100 = 12 mod 100 = 12
.- Update
ac
:ac = (ac * power - pre + hash(s[i])) mod modulo = (15 * 2 - 12 + 2) mod 100 = (30 - 12 + 2) mod 100 = 20 mod 100 = 20
.20
does not matchhashValue
.
-
Lastly, check 'ab':
pre = hash(s[i + k - 1]) * pk mod modulo = 2 * 4 mod 100 = 8 mod 100 = 8
.- Update
ac
:ac = (ac * power - pre + hash(s[i])) mod modulo = (20 * 2 - 8 + 1) mod 100 = (40 - 8 + 1) mod 100 = 33 mod 100 = 33
.33
does not matchhashValue
.
In this case, no substring of length 2
with the hash value of 9
can be found using the hash function described. However, this is purely an illustrative example, and according to the problem constraints, there will always be a solution.
Here, the power of the described method lies in its constant-time updates of the hash value for each sliding step, rather than recalculating from scratch. Each step involves just a few arithmetic operations, regardless of k
, which makes the approach efficient for larger data.
Solution Implementation
1def sub_str_hash(s, power, modulo, k, hash_value):
2 """
3 This function searches for a substring in a given string 's' where the hash value of the
4 substring matches 'hash_value'. It uses a sliding window and rolling hash technique to
5 compute the hash of substrings in an efficient manner.
6
7 :param s: The string to search within.
8 :param power: The base value for hashing.
9 :param modulo: The modulo value for hashing to prevent overflow.
10 :param k: The length of the substring to find.
11 :param hash_value: The target hash value.
12 :return: The first substring that matches the hash_value, or an empty string if not found.
13 """
14 power_bigint = pow(power, 1, modulo) # Power without 'n', in Python we use pow with 3 arguments to manage modulo
15 hash_value_bigint = hash_value # No need for conversion, in Python all integers are of arbitrary size
16 n = len(s)
17 power_k = 1
18 rolling_hash = 0
19
20 # Prepare the hash for the last 'k' characters of the string 's'
21 for i in range(n - 1, n - k - 1, -1): # Use range for reverse counting
22 rolling_hash = (rolling_hash * power_bigint + get_char_value(s, i)) % modulo
23 power_k = (power_k * power_bigint) % modulo
24
25 found_position = -1
26 # If the rolling hash of the last 'k' characters matches 'hash_value', store this position
27 if rolling_hash == hash_value_bigint:
28 found_position = n - k
29
30 # Slide the window, now processing from right to left
31 for i in range(n - k - 1, -1, -1):
32 pre_value = (get_char_value(s, i + k) * power_k) % modulo
33 rolling_hash = (rolling_hash * power_bigint + get_char_value(s, i) - pre_value + modulo) % modulo
34 if rolling_hash == hash_value_bigint:
35 found_position = i
36
37 if found_position == -1:
38 return ''
39 else:
40 return s[found_position: found_position + k]
41
42def get_char_value(str, index):
43 """
44 This function computes the hash code of a character at a specific index in a string.
45
46 :param str: The string which contains the character.
47 :param index: The index of the character.
48 :return: The hash code of the character.
49 """
50 return ord(str[index]) - ord('a') + 1
51
52# Example usage:
53print(sub_str_hash("abcdef", 31, 100, 3, 53)) # Example output: "abc"
54
1public class SubStringHasher {
2
3 /**
4 * Converts a character to its corresponding hash value.
5 *
6 * @param ch - The character to be converted.
7 * @return - The hash code of the character.
8 */
9 private static long getCharValue(char ch) {
10 return ch - 'a' + 1;
11 }
12
13 /**
14 * Finds the substring of a given string 's' where the hash value of the substring
15 * matches 'hashValue', using a sliding window and rolling hash technique.
16 *
17 * @param s - The string to search within.
18 * @param power - The base value for hashing.
19 * @param modulo - The modulo value for hashing to avoid overflow.
20 * @param k - The length of the substring to find.
21 * @param hashValue - The target hash value.
22 * @return - The first substring matching the hashValue, or an empty string if not found.
23 */
24 public String subStrHash(String s, int power, int modulo, int k, int hashValue) {
25 long powerLong = power;
26 long moduloLong = modulo;
27 long hashValueLong = hashValue;
28 int n = s.length();
29
30 long powerK = 1;
31 long rollingHash = 0;
32
33 // Prepare the initial value of powerK to be power^k mod modulo
34 for (int i = 0; i < k; i++) {
35 powerK = (powerK * powerLong) % moduloLong;
36 }
37
38 // Calculate the hash of the last k characters of string 's'
39 for (int i = n - 1; i >= n - k; i--) {
40 rollingHash = (rollingHash * powerLong + getCharValue(s.charAt(i))) % moduloLong;
41 }
42
43 // Start as "not found"
44 int foundPosition = -1;
45
46 // Check if the initial hash already matches the target hash
47 if (rollingHash == hashValueLong) {
48 foundPosition = n - k;
49 }
50
51 // Slide the window from the end to the beginning of the string 's'
52 for (int i = n - k - 1; i >= 0; i--) {
53 // Remove the value of the previous character and add the next character's value to update rolling hash
54 rollingHash =
55 ((rollingHash - getCharValue(s.charAt(i + k)) * powerK % moduloLong + moduloLong) * powerLong
56 + getCharValue(s.charAt(i))) % moduloLong;
57
58 // If a matching hash is found, record the position
59 if (rollingHash == hashValueLong) {
60 foundPosition = i;
61 }
62 }
63
64 // Return the found substring or an empty string if no match is found
65 return foundPosition != -1 ? s.substring(foundPosition, foundPosition + k) : "";
66 }
67}
68
1#include <string>
2
3/**
4 * Calculates the rolling hash value of a substring.
5 * Uses a reverse sliding window and rolling hash technique to efficiently calculate the hash of substrings.
6 *
7 * @param s The string to search within.
8 * @param power The base value for hashing.
9 * @param modulo The modulo value for hashing to prevent overflow.
10 * @param k The length of the substring to find.
11 * @param hash_value The target hash value.
12 * @return The first substring matching the hash value or an empty string if not found.
13 */
14std::string subStrHash(const std::string& s, int power, int modulo, int k, int hash_value) {
15 long long power_k = 1;
16 long long rolling_hash = 0;
17 int n = s.length();
18
19 // Calculate the power of base raised to the length of the substring (k)
20 for (int i = 0; i < k; ++i) {
21 power_k = (power_k * power) % modulo;
22 }
23
24 // Calculate the hash of the last k characters of the string using the reverse sliding window technique
25 // Start from the end and work backwards
26 for (int i = n - 1; i >= n - k; --i) {
27 rolling_hash = (rolling_hash * power + getCharValue(s[i])) % modulo;
28 }
29
30 int found_position = -1;
31 // Check if the hash matches the target hash value
32 if (rolling_hash == hash_value) {
33 found_position = n - k;
34 }
35
36 // Slide the window from the end towards the beginning of the string 's'
37 for (int i = n - k - 1; i >= 0; --i) {
38 // Roll the hash one position backward and include the new character at the beginning while excluding the last character
39 rolling_hash = (rolling_hash * power - getCharValue(s[i + k]) * power_k % modulo + getCharValue(s[i])) % modulo;
40 // Since modulo operation can yield negative values, ensure the hash value is positive
41 if (rolling_hash < 0) {
42 rolling_hash += modulo;
43 }
44 // If a matching hash is found, update the found position
45 if (rolling_hash == hash_value) {
46 found_position = i;
47 }
48 }
49
50 // Return the matched substring or an empty string if no match is found
51 return found_position == -1 ? "" : s.substr(found_position, k);
52}
53
54/**
55 * Converts a character to an integer value based on its alphabet position.
56 *
57 * @param c The character to be converted.
58 * @return The integer value of the character.
59 */
60long long getCharValue(char c) {
61 // Convert character to its position in the alphabet (1-based indexing)
62 return c - 'a' + 1;
63}
64
1/**
2 * Finds the substring of a given string 's', where the hash value of the substring matches 'hashValue'.
3 * Uses a sliding window and rolling hash technique to efficiently calculate the hash of substrings.
4 *
5 * @param {string} s - The string to search within.
6 * @param {number} power - The base value for hashing.
7 * @param {number} modulo - The modulo value for hashing to avoid overflow.
8 * @param {number} k - The length of the substring to find.
9 * @param {number} hashValue - The target hash value.
10 * @returns {string} - The first substring matching the hashValue, or an empty string if not found.
11 */
12function subStrHash(s: string, power: number, modulo: number, k: number, hashValue: number): string {
13 const powerBigInt: bigint = BigInt(power);
14 const moduloBigInt: bigint = BigInt(modulo);
15 const hashValueBigInt: bigint = BigInt(hashValue);
16 const n: number = s.length;
17 let powerK: bigint = 1n;
18 let rollingHash: bigint = 0n;
19 // Reverse sliding window to calculate the hash of the substring
20 for (let i: number = n - 1; i > n - 1 - k; i--) {
21 rollingHash = (rollingHash * powerBigInt + getCharValue(s, i)) % moduloBigInt;
22 powerK = (powerK * powerBigInt) % moduloBigInt;
23 }
24 let foundPosition: number = -1;
25 // Check if the last k characters match the hashValue
26 if (rollingHash == hashValueBigInt) {
27 foundPosition = n - k;
28 }
29 // Slide the window from the end to the beginning of string 's'
30 for (let i: number = n - k - 1; i >= 0; i--) {
31 const preValue: bigint = (getCharValue(s, i + k) * powerK) % moduloBigInt;
32 rollingHash = (rollingHash * powerBigInt + getCharValue(s, i) - preValue + moduloBigInt) % moduloBigInt;
33 // If a matching hash is found, update the found position
34 if (rollingHash == hashValueBigInt) {
35 foundPosition = i;
36 }
37 }
38 // Return the found substring or an empty string if no match is found
39 return foundPosition === -1 ? '' : s.substring(foundPosition, foundPosition + k);
40}
41
42/**
43 * Gets the hash code of a character at a specific index in a string.
44 *
45 * @param {string} str - The string containing the character.
46 * @param {number} index - The index of the character in the string.
47 * @returns {bigint} - The hash code as a bigint.
48 */
49function getCharValue(str: string, index: number): bigint {
50 return BigInt(str.charCodeAt(index) - 'a'.charCodeAt(0) + 1);
51}
52
Time and Space Complexity
Time Complexity
The time complexity of the function subStrHash
is O(n * k). Here's the breakdown:
- The first for loop runs for
k
iterations (wherek
is the length of substring we're looking for). Inside this loop, operations are constant time (with the exception of BigInt operations, which we'll assume to be O(1) for large numbers that can fit in the memory). This loop contributes O(k). - The second for loop runs for
n - k
iterations wheren
is the length of the strings
. The operations inside this loop are also constant time, thus this loop contributes O(n - k). - Therefore, when you add both the contributions together, the worst case time complexity is O(n).
Space Complexity
The space complexity of the function subStrHash
is O(1). Here's why:
- No additional data structures that grow with the input size are used. Only a fixed number of BigInt variables (
ac
,pk
,power
,modulo
,hashValue
) and integers (n
,ans
,i
) are used. - Since the algorithm doesn't utilize any space relative to the input size (ignoring the space taken by input arguments), the space complexity is constant.
Learn more about how to find time and space complexity quickly using problem constraints.
Depth first search is equivalent to which of the tree traversal order?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!