3006. Find Beautiful Indices in the Given Array I
Problem Description
The problem provides a string s
and two more strings a
and b
, along with an integer k
. We need to find all beautiful indices in s
. An index i
is defined as beautiful if:
- It is possible to select a substring from
s
starting ati
which is equal to stringa
. - There must exist another index
j
withins
such thatj
is the start of a substring equal to stringb
, and the absolute difference betweeni
andj
is less than or equal tok
.
After finding all the beautiful indices, the problem requires us to return them in sorted order.
Intuition
To solve this problem efficiently, we can use the Knuth-Morris-Pratt (KMP) string matching algorithm. The key intuition is to find all the occurrences of strings a
and b
as substrings within s
. The KMP algorithm does this efficiently by precomputing a prefix function which helps in avoiding unnecessary comparisons when a mismatch occurs.
Here's the high-level intuition for the solution:
- Precompute the prefix functions for both
a
andb
using KMP algorithm. - Find all the start indices within
s
wherea
andb
occur by using the previously computed prefix functions. - Now we have arrays with start indices for occurrences of
a
andb
, respectively. - Iterate through all the start indices of
a
, and for each index, find a matching index in theb
indices array that satisfies the beautiful index property.
While checking for the beautiful index property, we aim to find an index j
such that s[j..(j + b.length - 1)] == b
and the absolute difference between i
and j
(|i - j|
) is less than or equal to k
. If such an index is found, i
is a beautiful index.
The challenge here is to perform these checks efficiently. The sorted nature of indices obtained from KMP search allows using two pointers or a binary search method to quickly find the right pair (i, j) that satisfies the condition without having to compare each possible pair.
In short, the solution arrives at the answer by:
- Utilizing the KMP algorithm to find occurrences because it's efficient for string matching.
- Carefully iterating through the occurrences to find pairs that satisfy the given conditions, using a simple while loop and comparison logic.
Learn more about Two Pointers and Binary Search patterns.
Solution Approach
The solution follows a two-step approach. The first step is to find all occurrences of a
and b
within s
using the KMP string matching algorithm. The second step is to parse through these occurrences to determine the beautiful indices.
Step 1: KMP String Matching
The KMP algorithm is used here because of its efficiency in searching for occurrences of a pattern within a string. The algorithm avoids re-inspecting characters by using a prefix function to store how much the pattern can be shifted when a mismatch occurs.
The build_prefix_function
function computes this prefix function which is an array where the value at each index i
is the length of the longest proper suffix which is also a proper prefix for the substring pattern[0..i]
.
def build_prefix_function(pattern):
prefix_function = [0] * len(pattern)
j = 0
for i in range(1, len(pattern)):
while j > 0 and pattern[i] != pattern[j]:
j = prefix_function[j - 1]
if pattern[i] == pattern[j]:
j += 1
prefix_function[i] = j
return prefix_function
The kmp_search
function then uses the prefix function to efficiently match the pattern (a
or b
) within the text (s
). If a character does not match, the function uses the prefix function to skip some comparisons.
def kmp_search(pattern, text, prefix_function):
occurrences = []
j = 0
for i in range(len(text)):
while j > 0 and text[i] != pattern[j]:
j = prefix_function[j - 1]
if text[i] == pattern[j]:
j += 1
if j == len(pattern):
occurrences.append(i - j + 1)
j = prefix_function[j - 1]
return occurrences
Step 2: Finding the Beautiful Indices
Once we have the occurrences of a
and b
, we try to find all indices i
for a
where there is a corresponding index j
for b
that satisfies |i - j| <= k
. We iterate over the indices and, for each i
, find a compatible j
. The algorithm uses two pointers i
and j
, one for each list of occurrences (for a
and b
), to find pairs that fulfill the condition.
resa = kmp_search(a, s, prefix_a)
resb = kmp_search(b, s, prefix_b)
res = []
i = 0
j = 0
while i < len(resa):
while j < len(resb):
if abs(resb[j] - resa[i]) <= k:
res.append(resa[i])
break
elif j + 1 < len(resb) and abs(resb[j + 1] - resa[i]) < abs(resb[j] - resa[i]):
j += 1
else:
break
i += 1
Notice that we only update j
if doing so brings it closer to the current i
. This is an optimization to reduce the number of comparisons that need to be made, since both lists are sorted.
Finally, the result res
contains all beautiful indices, which the code then returns. The result is inherently sorted, as the KMP algorithm finds occurrences in order and we iterate through 'resa' in sequence.
return res
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach. Consider the following input: s = "abcxabcdabxabcdabcdabcy"
, a = "abcdabcy"
, b = "abcy"
, and k = 10
.
Step 1: Building Prefix Functions
First, we compute the prefix functions for strings a
and b
using the build_prefix_function
.
- For
a = "abcdabcy"
, the prefix function will be[0, 0, 0, 0, 1, 2, 3, 0]
. - For
b = "abcy"
, the prefix function will be[0, 0, 0, 0]
.
Step 2: KMP Search for Occurrences in s
Next, we use the computed prefix functions to find occurrences of a
and b
within s
using kmp_search
.
- The
kmp_search
fora
ins
will find thata
fully matchess
starting at index15
. - The
kmp_search
forb
ins
will return matches at indices0
and15
.
Step 3: Finding Beautiful Indices
Finally, we want to find beautiful indices for a
. We must ensure there exists an index j
starting at b
, such that the absolute difference between any beautiful index i
for a
and j
is <= k
.
We iterate through occurrences of a
:
- At
i = 15
(wherea
is found withins
), we check against the occurrences ofb
. - For
b
at index0
, we have|15 - 0| = 15
which is> k
. - For
b
at index15
, we have|15 - 15| = 0
which is<= k
, so index15
fora
is beautiful.
Since we find that index 15
for a
has a matching b
within k
distance, we add 15
to the list of beautiful indices.
Final Output
Our output is [15]
, as that's the only beautiful index found through this process. The output is sorted as the algorithm checks each index sequentially and s
doesn't have any other occurrence of a
that would produce a beautiful index.
Solution Implementation
1from typing import List
2
3class Solution:
4 def beautifulIndices(self, text: str, pattern_a: str, pattern_b: str, k: int) -> List[int]:
5 def build_prefix_array(pattern):
6 # Initialize the prefix function array with zeros
7 prefix_arr = [0] * len(pattern)
8 # Iterate through the pattern to compute the prefix array
9 j = 0
10 for i in range(1, len(pattern)):
11 while j > 0 and pattern[i] != pattern[j]:
12 j = prefix_arr[j - 1]
13 if pattern[i] == pattern[j]:
14 j += 1
15 prefix_arr[i] = j
16 return prefix_arr
17
18 def kmp_search(pattern, text, prefix_arr):
19 # List to record the start indices of pattern occurrences in text
20 occurrences = []
21 # Initialize the pattern index j
22 j = 0
23 for i in range(len(text)):
24 while j > 0 and text[i] != pattern[j]:
25 j = prefix_arr[j - 1]
26 if text[i] == pattern[j]:
27 j += 1
28 if j == len(pattern):
29 # Full pattern found, append starting index to occurrences
30 occurrences.append(i - j + 1)
31 # Update j based on prefix array
32 j = prefix_arr[j - 1]
33 return occurrences
34
35 # Generate prefix arrays for both patterns
36 prefix_a = build_prefix_array(pattern_a)
37 prefix_b = build_prefix_array(pattern_b)
38
39 # Find all occurrences of pattern_a in text
40 occurrences_a = kmp_search(pattern_a, text, prefix_a)
41 # Find all occurrences of pattern_b in text
42 occurrences_b = kmp_search(pattern_b, text, prefix_b)
43
44 # Prepare the list to hold the indices of "beautiful" occurrences
45 result_indices = []
46 # Print occurrences for debugging purposes (you may remove this in production code)
47 print(occurrences_a, occurrences_b)
48
49 # Two pointers to traverse the separate occurrence lists
50 a_index = 0
51 b_index = 0
52
53 # Traverse through all occurrences of pattern_a
54 while a_index < len(occurrences_a):
55 # Compare with occurrences of pattern_b
56 while b_index < len(occurrences_b):
57 # Check if the index difference is within the range `k`
58 if abs(occurrences_b[b_index] - occurrences_a[a_index]) <= k:
59 result_indices.append(occurrences_a[a_index])
60 break # Found a valid "beautiful" index, move to the next a_index
61 elif (b_index + 1 < len(occurrences_b)
62 and abs(occurrences_b[b_index + 1] - occurrences_a[a_index])
63 < abs(occurrences_b[b_index] - occurrences_a[a_index])):
64 b_index += 1
65 else:
66 break
67 a_index += 1
68
69 return result_indices
70
1import java.util.ArrayList;
2import java.util.List;
3
4public class Solution {
5
6 // Computes the Longest Prefix Suffix (LPS) array used in KMP algorithm
7 public void computeLPS(String pattern, int[] lps) {
8 int lengthPattern = pattern.length();
9 int length = 0;
10
11 lps[0] = 0; // LPS of the first character is always 0
12
13 int index = 1;
14 while (index < lengthPattern) {
15 if (pattern.charAt(index) == pattern.charAt(length)) {
16 length++;
17 lps[index] = length;
18 index++;
19 } else {
20 if (length != 0) {
21 length = lps[length - 1];
22 } else {
23 lps[index] = 0;
24 index++;
25 }
26 }
27 }
28 }
29
30 // KMP algorithm to find occurrences of a pattern in a given text
31 public List<Integer> KMP_codestorywithMIK(String pat, String txt) {
32 int lengthTxt = txt.length();
33 int lengthPat = pat.length();
34 List<Integer> result = new ArrayList<>();
35
36 int[] lps = new int[lengthPat];
37 computeLPS(pat, lps);
38
39 int indexTxt = 0; // Index for text
40 int indexPat = 0; // Index for pattern
41
42 while (indexTxt < lengthTxt) {
43 if (pat.charAt(indexPat) == txt.charAt(indexTxt)) {
44 indexTxt++;
45 indexPat++;
46 }
47
48 if (indexPat == lengthPat) {
49 result.add(indexTxt - indexPat);
50 indexPat = lps[indexPat - 1];
51 } else if (indexTxt < lengthTxt && pat.charAt(indexPat) != txt.charAt(indexTxt)) {
52 if (indexPat != 0) {
53 indexPat = lps[indexPat - 1];
54 } else {
55 indexTxt++;
56 }
57 }
58 }
59
60 return result;
61 }
62
63 // Utility method to perform binary search to find the lower bound of the target
64 private int lowerBound(List<Integer> list, int target) {
65 int left = 0;
66 int right = list.size() - 1;
67 int result = list.size();
68
69 while (left <= right) {
70 int mid = left + (right - left) / 2;
71
72 if (list.get(mid) >= target) {
73 result = mid;
74 right = mid - 1;
75 } else {
76 left = mid + 1;
77 }
78 }
79
80 return result;
81 }
82
83 // Finds all beautiful indices based on the specified conditions.
84 public List<Integer> beautifulIndices(String s, String a, String b, int k) {
85 int lengthS = s.length();
86
87 // KMP to find all occurrences of string a in s
88 List<Integer> iIndices = KMP_codestorywithMIK(a, s);
89 // KMP to find all occurrences of string b in s
90 List<Integer> jIndices = KMP_codestorywithMIK(b, s);
91
92 List<Integer> result = new ArrayList<>();
93
94 // Iterate through all occurrences of a
95 for (int i : iIndices) {
96
97 // Calculate the valid range keeping in bounds
98 int leftLimit = Math.max(0, i - k);
99 int rightLimit = Math.min(lengthS - 1, i + k);
100
101 // Find the lower bound index for the occurrence of string b
102 int lowerBoundIndex = lowerBound(jIndices, leftLimit);
103
104 // If found index is within the right limit, add it to the result
105 if (lowerBoundIndex < jIndices.size() && jIndices.get(lowerBoundIndex) <= rightLimit) {
106 result.add(i);
107 }
108 }
109
110 return result;
111 }
112}
113
1#include <vector>
2#include <string>
3
4using std::vector;
5using std::string;
6
7class Solution {
8public:
9 // Method to find all 'beautiful' indices where patterns A and B occur within 'k' distance
10 vector<int> beautifulIndices(const string& s, const string& patternA, const string& patternB, int k) {
11 // Find all occurrences of patternA and patternB in s using KMP algorithm
12 vector<int> beautifulIndicesA = kmpSearch(s, patternA);
13 vector<int> beautifulIndicesB = kmpSearch(s, patternB);
14
15 // Sort the indices for pattern B to allow for binary search
16 std::sort(beautifulIndicesB.begin(), beautifulIndicesB.end());
17
18 vector<int> result; // Holds the indices considered 'beautiful'
19 for (int indexA : beautifulIndicesA) {
20 // Find range of beautifulIndicesB within distance 'k' of indexA
21 auto leftIt = std::lower_bound(beautifulIndicesB.begin(), beautifulIndicesB.end(), indexA - k);
22 auto rightIt = std::lower_bound(beautifulIndicesB.begin(), beautifulIndicesB.end(), indexA + k + patternB.length());
23
24 for (int indexB = leftIt - beautifulIndicesB.begin(); indexB < rightIt - beautifulIndicesB.begin(); indexB++) {
25 // If the distance between occurrences of pattern A and B is <= k, add to result
26 if (abs(beautifulIndicesB[indexB] - indexA) <= k) {
27 result.push_back(indexA);
28 break;
29 }
30 }
31 }
32
33 return result;
34 }
35
36private:
37 // KMP (Knuth-Morris-Pratt) search algorithm to find all occurrences of a pattern in a text
38 vector<int> kmpSearch(const string& text, const string& pattern) {
39 vector<int> indices; // Stores start indices of each occurrence of pattern in text
40 vector<int> pi = computePrefixFunction(pattern);
41
42 int q = 0; // Number of characters matched
43 for (int i = 0; i < text.length(); i++) { // Loop over characters in text
44 while (q > 0 && pattern[q] != text[i]) {
45 q = pi[q - 1]; // Use the precomputed pi vector to skip characters
46 }
47 if (pattern[q] == text[i]) {
48 q++; // Next character matched, increment the count
49 }
50 if (q == pattern.length()) {
51 indices.push_back(i - q + 1); // Pattern found; record index
52 q = pi[q - 1]; // Prepare q for next potential match
53 }
54 }
55
56 return indices;
57 }
58
59 // Compute the prefix function for KMP algorithm - this helps to determine
60 // the next state of the search (where to start matching after a non-match)
61 vector<int> computePrefixFunction(const string& pattern) {
62 int m = pattern.length();
63 vector<int> pi(m, 0);
64 int k = 0; // The number of characters matched
65 for (int q = 1; q < m; q++) { // Loop over the pattern's characters
66 while (k > 0 && pattern[k] != pattern[q]) {
67 k = pi[k - 1]; // Use the precomputed pi vector to skip characters
68 }
69 if (pattern[k] == pattern[q]) {
70 k++; // Next character matched, increment the count
71 }
72 pi[q] = k; // pi[q] is now complete
73 }
74 return pi;
75 }
76};
77
1function kmpSearch(text: string, pattern: string): number[] {
2 const indices: number[] = []; // Stores start indices of each occurrence of pattern in text
3 const pi = computePrefixFunction(pattern);
4
5 let matchedCharsCount = 0; // Number of characters matched
6 for (let i = 0; i < text.length; i++) { // Loop over characters in text
7 while (matchedCharsCount > 0 && pattern[matchedCharsCount] !== text[i]) {
8 matchedCharsCount = pi[matchedCharsCount - 1]; // Use the precomputed pi to skip characters
9 }
10 if (pattern[matchedCharsCount] === text[i]) {
11 matchedCharsCount++; // Next character matched, increment the count
12 }
13 if (matchedCharsCount === pattern.length) {
14 indices.push(i - matchedCharsCount + 1); // Pattern found; record index
15 matchedCharsCount = pi[matchedCharsCount - 1]; // Prepare for next potential match
16 }
17 }
18
19 return indices;
20}
21
22function computePrefixFunction(pattern: string): number[] {
23 const m = pattern.length;
24 const pi = new Array(m).fill(0);
25 let k = 0; // The number of characters matched
26 for (let q = 1; q < m; q++) { // Loop over the pattern's characters
27 while (k > 0 && pattern[k] !== pattern[q]) {
28 k = pi[k - 1]; // Use the precomputed pi to skip characters
29 }
30 if (pattern[k] === pattern[q]) {
31 k++; // Next character matched, increment the count
32 }
33 pi[q] = k; // pi[q] is now complete
34 }
35 return pi;
36}
37
38function beautifulIndices(s: string, patternA: string, patternB: string, k: number): number[] {
39 const indicesA = kmpSearch(s, patternA); // Find all occurrences of patternA in s
40 const indicesB = kmpSearch(s, patternB); // Find all occurrences of patternB in s
41
42 // Sort the indices for pattern B to enable efficient searching
43 indicesB.sort((a, b) => a - b);
44
45 const result: number[] = []; // Holds the indices considered 'beautiful'
46 for (const indexA of indicesA) {
47 // Find range of indicesB within distance 'k' of indexA
48 const leftIndex = lowerBound(indicesB, indexA - k);
49 const rightIndex = lowerBound(indicesB, indexA + patternB.length + k);
50
51 for (let i = leftIndex; i < rightIndex; i++) {
52 // If the distance between occurrences of pattern A and B is <= k, add to the result
53 if (Math.abs(indicesB[i] - indexA) <= k) {
54 result.push(indexA);
55 // Once an indexA is considered 'beautiful' with respect to an indexB, move on to the next indexA
56 break;
57 }
58 }
59 }
60
61 return result;
62}
63
64// Defines a lower bound function similar to std::lower_bound in C++
65function lowerBound(arr: number[], value: number): number {
66 let low = 0;
67 let high = arr.length;
68
69 while (low < high) {
70 const mid = Math.floor((low + high) / 2);
71 if (value <= arr[mid]) {
72 high = mid;
73 } else {
74 low = mid + 1;
75 }
76 }
77 return low; // Returns the index of the first element greater than or equal to 'value'
78}
79
Time and Space Complexity
Time Complexity
Let n
be the length of string s
, and m
be the length of strings a
and b
. The time complexity can be analyzed as follows:
-
The
build_prefix_function
function has a time complexity ofO(m)
as it goes through the length of either stringa
orb
once to build the prefix function (also known as the failure function) that is later used in KMP pattern searching. -
The
kmp_search
function operates by going through the entire texts
and applying the prefix function to find occurrences ofa
orb
. It has a time complexity ofO(n)
because the inner while loop does not backtrack more than it progresses. -
We call
kmp_search
twice, once fora
and once forb
. Therefore, we have2 * O(n)
which is stillO(n)
. -
In the worst case, we can have
O(n/m)
occurrences for eithera
orb
; the two while loops at the end will then have a complexity ofO((n/m) * (n/m))
because each element inresa
can be compared with each element inresb
in the worst case scenario.
Combining all parts, the overall time complexity becomes O(m) + O(n) + O((n/m) * (n/m))
. Typically, m
will be much smaller than n
, and thus the dominating term will be O(n) + O((n/m) * (n/m))
.
Space Complexity
The space complexity can be analyzed as follows:
-
The
build_prefix_function
function creates a list of sizem
for the prefix table of botha
andb
, hence2 * O(m)
which simplifies toO(m)
. -
The
kmp_search
function returns a list of the occurrences ofa
andb
withins
which could in the worst case beO(n/m)
occurrences each. Thus, the space needed to store these occurrences would be2 * O(n/m)
. -
The list
res
collects the beautiful indices. In the worst case, this might include all occurrences fromresa
which would beO(n/m)
.
Combining all parts, the overall space complexity is O(m) + 2 * O(n/m) + O(n/m)
, which simplifies to O(m) + O(n/m)
. In the scenario where m
is much smaller than n
, the dominating term will be O(n/m)
.
Learn more about how to find time and space complexity quickly using problem constraints.
A heap is a ...?
Recommended Readings
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Want a Structured Path to Master System Design Too? Don’t Miss This!