3006. Find Beautiful Indices in the Given Array I

MediumTwo PointersStringBinary SearchString MatchingHash FunctionRolling Hash
Leetcode Link

Problem Description

The problem provides a string s and two more strings a and b, along with an integer k. We need to find all beautiful indices in s. An index i is defined as beautiful if:

  • It is possible to select a substring from s starting at i which is equal to string a.
  • There must exist another index j within s such that j is the start of a substring equal to string b, and the absolute difference between i and j is less than or equal to k.

After finding all the beautiful indices, the problem requires us to return them in sorted order.

Intuition

To solve this problem efficiently, we can use the Knuth-Morris-Pratt (KMP) string matching algorithm. The key intuition is to find all the occurrences of strings a and b as substrings within s. The KMP algorithm does this efficiently by precomputing a prefix function which helps in avoiding unnecessary comparisons when a mismatch occurs.

Here's the high-level intuition for the solution:

  1. Precompute the prefix functions for both a and b using KMP algorithm.
  2. Find all the start indices within s where a and b occur by using the previously computed prefix functions.
  3. Now we have arrays with start indices for occurrences of a and b, respectively.
  4. Iterate through all the start indices of a, and for each index, find a matching index in the b indices array that satisfies the beautiful index property.

While checking for the beautiful index property, we aim to find an index j such that s[j..(j + b.length - 1)] == b and the absolute difference between i and j (|i - j|) is less than or equal to k. If such an index is found, i is a beautiful index.

The challenge here is to perform these checks efficiently. The sorted nature of indices obtained from KMP search allows using two pointers or a binary search method to quickly find the right pair (i, j) that satisfies the condition without having to compare each possible pair.

In short, the solution arrives at the answer by:

  1. Utilizing the KMP algorithm to find occurrences because it's efficient for string matching.
  2. Carefully iterating through the occurrences to find pairs that satisfy the given conditions, using a simple while loop and comparison logic.

Learn more about Two Pointers and Binary Search patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which data structure is used to implement recursion?

Solution Approach

The solution follows a two-step approach. The first step is to find all occurrences of a and b within s using the KMP string matching algorithm. The second step is to parse through these occurrences to determine the beautiful indices.

Step 1: KMP String Matching

The KMP algorithm is used here because of its efficiency in searching for occurrences of a pattern within a string. The algorithm avoids re-inspecting characters by using a prefix function to store how much the pattern can be shifted when a mismatch occurs.

The build_prefix_function function computes this prefix function which is an array where the value at each index i is the length of the longest proper suffix which is also a proper prefix for the substring pattern[0..i].

1def build_prefix_function(pattern):
2    prefix_function = [0] * len(pattern)
3    j = 0
4    for i in range(1, len(pattern)):
5        while j > 0 and pattern[i] != pattern[j]:
6            j = prefix_function[j - 1]
7        if pattern[i] == pattern[j]:
8            j += 1
9        prefix_function[i] = j
10    return prefix_function

The kmp_search function then uses the prefix function to efficiently match the pattern (a or b) within the text (s). If a character does not match, the function uses the prefix function to skip some comparisons.

1def kmp_search(pattern, text, prefix_function):
2    occurrences = []
3    j = 0
4    for i in range(len(text)):
5        while j > 0 and text[i] != pattern[j]:
6            j = prefix_function[j - 1]
7        if text[i] == pattern[j]:
8            j += 1
9        if j == len(pattern):
10            occurrences.append(i - j + 1)
11            j = prefix_function[j - 1]
12    return occurrences

Step 2: Finding the Beautiful Indices

Once we have the occurrences of a and b, we try to find all indices i for a where there is a corresponding index j for b that satisfies |i - j| <= k. We iterate over the indices and, for each i, find a compatible j. The algorithm uses two pointers i and j, one for each list of occurrences (for a and b), to find pairs that fulfill the condition.

1resa = kmp_search(a, s, prefix_a)
2resb = kmp_search(b, s, prefix_b)
3
4res = []
5i = 0
6j = 0
7while i < len(resa):
8    while j < len(resb):
9        if abs(resb[j] - resa[i]) <= k:
10            res.append(resa[i])
11            break
12        elif j + 1 < len(resb) and abs(resb[j + 1] - resa[i]) < abs(resb[j] - resa[i]):
13            j += 1
14        else:
15            break
16    i += 1

Notice that we only update j if doing so brings it closer to the current i. This is an optimization to reduce the number of comparisons that need to be made, since both lists are sorted.

Finally, the result res contains all beautiful indices, which the code then returns. The result is inherently sorted, as the KMP algorithm finds occurrences in order and we iterate through 'resa' in sequence.

1return res
Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Suppose k is a very large integer(2^64). Which of the following is the largest as n grows to infinity?

Example Walkthrough

Let's walk through a small example to illustrate the solution approach. Consider the following input: s = "abcxabcdabxabcdabcdabcy", a = "abcdabcy", b = "abcy", and k = 10.

Step 1: Building Prefix Functions

First, we compute the prefix functions for strings a and b using the build_prefix_function.

  • For a = "abcdabcy", the prefix function will be [0, 0, 0, 0, 1, 2, 3, 0].
  • For b = "abcy", the prefix function will be [0, 0, 0, 0].

Step 2: KMP Search for Occurrences in s

Next, we use the computed prefix functions to find occurrences of a and b within s using kmp_search.

  • The kmp_search for a in s will find that a fully matches s starting at index 15.
  • The kmp_search for b in s will return matches at indices 0 and 15.

Step 3: Finding Beautiful Indices

Finally, we want to find beautiful indices for a. We must ensure there exists an index j starting at b, such that the absolute difference between any beautiful index i for a and j is <= k.

We iterate through occurrences of a:

  • At i = 15 (where a is found within s), we check against the occurrences of b.
  • For b at index 0, we have |15 - 0| = 15 which is > k.
  • For b at index 15, we have |15 - 15| = 0 which is <= k, so index 15 for a is beautiful.

Since we find that index 15 for a has a matching b within k distance, we add 15 to the list of beautiful indices.

Final Output

Our output is [15], as that's the only beautiful index found through this process. The output is sorted as the algorithm checks each index sequentially and s doesn't have any other occurrence of a that would produce a beautiful index.

Solution Implementation

1from typing import List
2
3class Solution:
4    def beautifulIndices(self, text: str, pattern_a: str, pattern_b: str, k: int) -> List[int]:
5        def build_prefix_array(pattern):
6            # Initialize the prefix function array with zeros
7            prefix_arr = [0] * len(pattern)
8            # Iterate through the pattern to compute the prefix array
9            j = 0
10            for i in range(1, len(pattern)):
11                while j > 0 and pattern[i] != pattern[j]:
12                    j = prefix_arr[j - 1]
13                if pattern[i] == pattern[j]:
14                    j += 1
15                prefix_arr[i] = j
16            return prefix_arr
17
18        def kmp_search(pattern, text, prefix_arr):
19            # List to record the start indices of pattern occurrences in text
20            occurrences = []
21            # Initialize the pattern index j
22            j = 0
23            for i in range(len(text)):
24                while j > 0 and text[i] != pattern[j]:
25                    j = prefix_arr[j - 1]
26                if text[i] == pattern[j]:
27                    j += 1
28                if j == len(pattern):
29                    # Full pattern found, append starting index to occurrences
30                    occurrences.append(i - j + 1)
31                    # Update j based on prefix array
32                    j = prefix_arr[j - 1]
33            return occurrences
34
35        # Generate prefix arrays for both patterns
36        prefix_a = build_prefix_array(pattern_a)
37        prefix_b = build_prefix_array(pattern_b)
38
39        # Find all occurrences of pattern_a in text
40        occurrences_a = kmp_search(pattern_a, text, prefix_a)
41        # Find all occurrences of pattern_b in text
42        occurrences_b = kmp_search(pattern_b, text, prefix_b)
43
44        # Prepare the list to hold the indices of "beautiful" occurrences
45        result_indices = []
46        # Print occurrences for debugging purposes (you may remove this in production code)
47        print(occurrences_a, occurrences_b)
48      
49        # Two pointers to traverse the separate occurrence lists
50        a_index = 0
51        b_index = 0
52      
53        # Traverse through all occurrences of pattern_a
54        while a_index < len(occurrences_a):
55            # Compare with occurrences of pattern_b
56            while b_index < len(occurrences_b):
57                # Check if the index difference is within the range `k`
58                if abs(occurrences_b[b_index] - occurrences_a[a_index]) <= k:
59                    result_indices.append(occurrences_a[a_index])
60                    break  # Found a valid "beautiful" index, move to the next a_index
61                elif (b_index + 1 < len(occurrences_b)
62                      and abs(occurrences_b[b_index + 1] - occurrences_a[a_index]) 
63                      < abs(occurrences_b[b_index] - occurrences_a[a_index])):
64                    b_index += 1
65                else:
66                    break
67            a_index += 1
68
69        return result_indices
70
1import java.util.ArrayList;
2import java.util.List;
3
4public class Solution {
5
6    // Computes the Longest Prefix Suffix (LPS) array used in KMP algorithm
7    public void computeLPS(String pattern, int[] lps) {
8        int lengthPattern = pattern.length();
9        int length = 0;
10
11        lps[0] = 0; // LPS of the first character is always 0
12
13        int index = 1;
14        while (index < lengthPattern) {
15            if (pattern.charAt(index) == pattern.charAt(length)) {
16                length++;
17                lps[index] = length;
18                index++;
19            } else {
20                if (length != 0) {
21                    length = lps[length - 1];
22                } else {
23                    lps[index] = 0;
24                    index++;
25                }
26            }
27        }
28    }
29
30    // KMP algorithm to find occurrences of a pattern in a given text
31    public List<Integer> KMP_codestorywithMIK(String pat, String txt) {
32        int lengthTxt = txt.length();
33        int lengthPat = pat.length();
34        List<Integer> result = new ArrayList<>();
35
36        int[] lps = new int[lengthPat];
37        computeLPS(pat, lps);
38
39        int indexTxt = 0; // Index for text
40        int indexPat = 0; // Index for pattern
41
42        while (indexTxt < lengthTxt) {
43            if (pat.charAt(indexPat) == txt.charAt(indexTxt)) {
44                indexTxt++;
45                indexPat++;
46            }
47
48            if (indexPat == lengthPat) {
49                result.add(indexTxt - indexPat);
50                indexPat = lps[indexPat - 1];
51            } else if (indexTxt < lengthTxt && pat.charAt(indexPat) != txt.charAt(indexTxt)) {
52                if (indexPat != 0) {
53                    indexPat = lps[indexPat - 1];
54                } else {
55                    indexTxt++;
56                }
57            }
58        }
59
60        return result;
61    }
62
63    // Utility method to perform binary search to find the lower bound of the target
64    private int lowerBound(List<Integer> list, int target) {
65        int left = 0;
66        int right = list.size() - 1;
67        int result = list.size();
68
69        while (left <= right) {
70            int mid = left + (right - left) / 2;
71
72            if (list.get(mid) >= target) {
73                result = mid;
74                right = mid - 1;
75            } else {
76                left = mid + 1;
77            }
78        }
79
80        return result;
81    }
82
83    // Finds all beautiful indices based on the specified conditions.
84    public List<Integer> beautifulIndices(String s, String a, String b, int k) {
85        int lengthS = s.length();
86
87        // KMP to find all occurrences of string a in s
88        List<Integer> iIndices = KMP_codestorywithMIK(a, s);
89        // KMP to find all occurrences of string b in s
90        List<Integer> jIndices = KMP_codestorywithMIK(b, s);
91
92        List<Integer> result = new ArrayList<>();
93
94        // Iterate through all occurrences of a
95        for (int i : iIndices) {
96
97            // Calculate the valid range keeping in bounds
98            int leftLimit = Math.max(0, i - k);
99            int rightLimit = Math.min(lengthS - 1, i + k);
100
101            // Find the lower bound index for the occurrence of string b
102            int lowerBoundIndex = lowerBound(jIndices, leftLimit);
103
104            // If found index is within the right limit, add it to the result
105            if (lowerBoundIndex < jIndices.size() && jIndices.get(lowerBoundIndex) <= rightLimit) {
106                result.add(i);
107            }
108        }
109
110        return result;
111    }
112}
113
1#include <vector>
2#include <string>
3
4using std::vector;
5using std::string;
6
7class Solution {
8public:
9    // Method to find all 'beautiful' indices where patterns A and B occur within 'k' distance
10    vector<int> beautifulIndices(const string& s, const string& patternA, const string& patternB, int k) {
11        // Find all occurrences of patternA and patternB in s using KMP algorithm
12        vector<int> beautifulIndicesA = kmpSearch(s, patternA);
13        vector<int> beautifulIndicesB = kmpSearch(s, patternB);
14
15        // Sort the indices for pattern B to allow for binary search
16        std::sort(beautifulIndicesB.begin(), beautifulIndicesB.end());
17
18        vector<int> result; // Holds the indices considered 'beautiful'
19        for (int indexA : beautifulIndicesA) {
20            // Find range of beautifulIndicesB within distance 'k' of indexA
21            auto leftIt = std::lower_bound(beautifulIndicesB.begin(), beautifulIndicesB.end(), indexA - k);
22            auto rightIt = std::lower_bound(beautifulIndicesB.begin(), beautifulIndicesB.end(), indexA + k + patternB.length());
23
24            for (int indexB = leftIt - beautifulIndicesB.begin(); indexB < rightIt - beautifulIndicesB.begin(); indexB++) {
25                // If the distance between occurrences of pattern A and B is <= k, add to result
26                if (abs(beautifulIndicesB[indexB] - indexA) <= k) {
27                    result.push_back(indexA);
28                    break;
29                }
30            }
31        }
32
33        return result;
34    }
35
36private:
37    // KMP (Knuth-Morris-Pratt) search algorithm to find all occurrences of a pattern in a text
38    vector<int> kmpSearch(const string& text, const string& pattern) {
39        vector<int> indices; // Stores start indices of each occurrence of pattern in text
40        vector<int> pi = computePrefixFunction(pattern);
41
42        int q = 0; // Number of characters matched
43        for (int i = 0; i < text.length(); i++) { // Loop over characters in text
44            while (q > 0 && pattern[q] != text[i]) {
45                q = pi[q - 1]; // Use the precomputed pi vector to skip characters
46            }
47            if (pattern[q] == text[i]) {
48                q++; // Next character matched, increment the count
49            }
50            if (q == pattern.length()) {
51                indices.push_back(i - q + 1); // Pattern found; record index
52                q = pi[q - 1]; // Prepare q for next potential match
53            }
54        }
55
56        return indices;
57    }
58
59    // Compute the prefix function for KMP algorithm - this helps to determine
60    // the next state of the search (where to start matching after a non-match)
61    vector<int> computePrefixFunction(const string& pattern) {
62        int m = pattern.length();
63        vector<int> pi(m, 0);
64        int k = 0; // The number of characters matched
65        for (int q = 1; q < m; q++) { // Loop over the pattern's characters
66            while (k > 0 && pattern[k] != pattern[q]) {
67                k = pi[k - 1]; // Use the precomputed pi vector to skip characters
68            }
69            if (pattern[k] == pattern[q]) {
70                k++; // Next character matched, increment the count
71            }
72            pi[q] = k; // pi[q] is now complete
73        }
74        return pi;
75    }
76};
77
1function kmpSearch(text: string, pattern: string): number[] {
2    const indices: number[] = []; // Stores start indices of each occurrence of pattern in text
3    const pi = computePrefixFunction(pattern);
4
5    let matchedCharsCount = 0; // Number of characters matched
6    for (let i = 0; i < text.length; i++) { // Loop over characters in text
7        while (matchedCharsCount > 0 && pattern[matchedCharsCount] !== text[i]) {
8            matchedCharsCount = pi[matchedCharsCount - 1]; // Use the precomputed pi to skip characters
9        }
10        if (pattern[matchedCharsCount] === text[i]) {
11            matchedCharsCount++; // Next character matched, increment the count
12        }
13        if (matchedCharsCount === pattern.length) {
14            indices.push(i - matchedCharsCount + 1); // Pattern found; record index
15            matchedCharsCount = pi[matchedCharsCount - 1]; // Prepare for next potential match
16        }
17    }
18
19    return indices;
20}
21
22function computePrefixFunction(pattern: string): number[] {
23    const m = pattern.length;
24    const pi = new Array(m).fill(0);
25    let k = 0; // The number of characters matched
26    for (let q = 1; q < m; q++) { // Loop over the pattern's characters
27        while (k > 0 && pattern[k] !== pattern[q]) {
28            k = pi[k - 1]; // Use the precomputed pi to skip characters
29        }
30        if (pattern[k] === pattern[q]) {
31            k++; // Next character matched, increment the count
32        }
33        pi[q] = k; // pi[q] is now complete
34    }
35    return pi;
36}
37
38function beautifulIndices(s: string, patternA: string, patternB: string, k: number): number[] {
39    const indicesA = kmpSearch(s, patternA); // Find all occurrences of patternA in s
40    const indicesB = kmpSearch(s, patternB); // Find all occurrences of patternB in s
41
42    // Sort the indices for pattern B to enable efficient searching
43    indicesB.sort((a, b) => a - b);
44
45    const result: number[] = []; // Holds the indices considered 'beautiful'
46    for (const indexA of indicesA) {
47        // Find range of indicesB within distance 'k' of indexA
48        const leftIndex = lowerBound(indicesB, indexA - k);
49        const rightIndex = lowerBound(indicesB, indexA + patternB.length + k);
50
51        for (let i = leftIndex; i < rightIndex; i++) {
52            // If the distance between occurrences of pattern A and B is <= k, add to the result
53            if (Math.abs(indicesB[i] - indexA) <= k) {
54                result.push(indexA);
55                // Once an indexA is considered 'beautiful' with respect to an indexB, move on to the next indexA
56                break;
57            }
58        }
59    }
60
61    return result;
62}
63
64// Defines a lower bound function similar to std::lower_bound in C++
65function lowerBound(arr: number[], value: number): number {
66    let low = 0;
67    let high = arr.length;
68
69    while (low < high) {
70        const mid = Math.floor((low + high) / 2);
71        if (value <= arr[mid]) {
72            high = mid;
73        } else {
74            low = mid + 1;
75        }
76    }
77    return low; // Returns the index of the first element greater than or equal to 'value'
78}
79
Not Sure What to Study? Take the 2-min Quiz:

Which of the following array represent a max heap?

Time and Space Complexity

Time Complexity

Let n be the length of string s, and m be the length of strings a and b. The time complexity can be analyzed as follows:

  1. The build_prefix_function function has a time complexity of O(m) as it goes through the length of either string a or b once to build the prefix function (also known as the failure function) that is later used in KMP pattern searching.

  2. The kmp_search function operates by going through the entire text s and applying the prefix function to find occurrences of a or b. It has a time complexity of O(n) because the inner while loop does not backtrack more than it progresses.

  3. We call kmp_search twice, once for a and once for b. Therefore, we have 2 * O(n) which is still O(n).

  4. In the worst case, we can have O(n/m) occurrences for either a or b; the two while loops at the end will then have a complexity of O((n/m) * (n/m)) because each element in resa can be compared with each element in resb in the worst case scenario.

Combining all parts, the overall time complexity becomes O(m) + O(n) + O((n/m) * (n/m)). Typically, m will be much smaller than n, and thus the dominating term will be O(n) + O((n/m) * (n/m)).

Space Complexity

The space complexity can be analyzed as follows:

  1. The build_prefix_function function creates a list of size m for the prefix table of both a and b, hence 2 * O(m) which simplifies to O(m).

  2. The kmp_search function returns a list of the occurrences of a and b within s which could in the worst case be O(n/m) occurrences each. Thus, the space needed to store these occurrences would be 2 * O(n/m).

  3. The list res collects the beautiful indices. In the worst case, this might include all occurrences from resa which would be O(n/m).

Combining all parts, the overall space complexity is O(m) + 2 * O(n/m) + O(n/m), which simplifies to O(m) + O(n/m). In the scenario where m is much smaller than n, the dominating term will be O(n/m).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

How would you design a stack which has a function min that returns the minimum element in the stack, in addition to push and pop? All push, pop, min should have running time O(1).


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫