3008. Find Beautiful Indices in the Given Array II
Problem Description
You are given four inputs: a 0-indexed string s
, and strings a
, b
, as well as an integer k
. The task is to find all the "beautiful" indices in the string s
. An index i
is considered "beautiful" if it satisfies the following conditions:
- It's within the bounds of
s
length when considering stringa
, meaning:0 <= i <= s.length - a.length
. - The substring of
s
starting at indexi
and having the length ofa
is equal toa
. Formally,s[i..(i + a.length - 1)] == a
. - There exists an index
j
(for the stringb
) within the bounds ofs
, which means:0 <= j <= s.length - b.length
. - The substring of
s
starting at indexj
and having the length ofb
is equal tob
. Formally,s[j..(j + b.length - 1)] == b
. - The absolute difference between
i
andj
is less than or equal tok
, which can be expressed as|j - i| <= k
.
The output should be an array of these "beautiful" indices, sorted from the smallest index to the largest.
Intuition
To find the beautiful indices efficiently, we can use the Knuth-Morris-Pratt (KMP) algorithm, which is a string searching (or string matching) algorithm that seeks the occurrences of a "word" W
within a main "text string" S
by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters.
Here's the intuition broken down:
- The code defines two functions
build_prefix_function
andkmp_search
that are classic implementations of the KMP algorithm. The prefix function is used to avoid re-scanning the texts
by pre-computing the longest prefix which is also a suffix, for substrings in pattern stringsa
andb
. - After getting prefix arrays for
a
andb
, thekmp_search
function is used to find all occurrences ofa
andb
inside the texts
. - Once we have occurrences of
a
andb
ins
, the algorithm then checks which index ofa
can be paired with an index ofb
such that they meet the criteria of being beautiful - their position is not farther thank
apart.
The solution involves iterating through each occurrence of a
and trying to find the nearest occurrence of b
that satisfies the distance constraint. It keeps track of the current positions while scanning through both sets of occurrences to avoid unnecessary comparisons and thus optimize the search.
Learn more about Two Pointers and Binary Search patterns.
Solution Approach
To address the problem, we use the Knuth-Morris-Pratt (KMP) algorithm due to its efficiency in string searching over a naïve approach which can be significantly slower when dealing with large strings. The provided solution comprises two main parts: building the prefix functions for strings a
and b
, and searching for occurrences of a
and b
using the KMP search algorithm within the main string s
. Let's walk through the implementation of the solution:
-
Building the Prefix Functions:
- The
build_prefix_function
function takes a pattern (eithera
orb
) and precomputes an array that contains the lengths of the longest proper prefix that is also a suffix for every prefix of the pattern. - This array is essential to optimize the KMP search by avoiding the repetition of the comparisons already done.
- The
-
KMP Search:
- Once the prefix functions are prepared, the
kmp_search
function scans the strings
for the occurrences of patternsa
andb
. - When scanning
s
, if a mismatch is found, the function uses the prefix function to skip characters in the text that are known to match the pattern up to a certain point, effectively reducing the number of comparisons. - Every time a complete match of the pattern is found in
s
, the start index is stored in the occurrences list.
- Once the prefix functions are prepared, the
-
Finding Beautiful Indices:
- With the indices of occurrences of
a
andb
in hands, we iterate through each index in the occurrences ofa
, while simultaneously iterating through the occurrences ofb
to check for the distance constraint. - A while loop is used to find the nearest occurrence of
b
for each occurrence ofa
. If the condition|resb[j] - resa[i]| <= k
is satisfied, then the indexresa[i]
is a beautiful index, and it is appended to the result list, after which we break to move on to the next index inresa
. - We maintain two pointers,
i
andj
, that move through the listsresa
andresb
respectively, seeking to find the smallestj
such that the distance betweena
andb
is withink
. - To ensure that we don't miss any closer b occurrences, we check if moving
j
forward gets us closer to the currenti
, and if not, we have found the nearestb
for the currenti
and move on.
- With the indices of occurrences of
Finally, we return the list res
which contains all the beautiful indices sorted from smallest to largest as by their natural order of finding during the scan.
This algorithm is efficient because it uses the precomputed information in the prefix arrays to avoid redundant scanning and comparisons, enabling a linear-time complexity for the search process, which is significantly faster than quadratic time complexity for the brute force method.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Assume we have a string s = "abcxabcdabxabcdabcdabcy"
, pattern a = "abcdabcy"
, and pattern b = "abc"
with a distance constraint k = 10
.
-
Building the Prefix Functions:
- We first build the prefix function for
a = "abcdabcy"
. This results in a prefix array[0, 0, 0, 0, 1, 2, 3, 0]
, indicating where we can safely resume the comparison if a mismatch happens. - Next, we build the prefix function for
b = "abc"
, leading to[0, 0, 0]
, meaning there is no proper prefix inb
that matches a suffix in the case of a mismatch.
- We first build the prefix function for
-
KMP Search:
- Using the
kmp_search
function, we search for occurrences ofa
withins
. The only match is found wheni = 15
, which is stored inresa
. - Searching for
b
ins
results in matches ati = 0
,i = 6
, andi = 12
, which are recorded inresb
.
- Using the
-
Finding Beautiful Indices:
- With
resa = [15]
andresb = [0, 6, 12]
, we begin by comparing each index inresa
with indices inresb
to find any that meet the distance constraint. - Starting with the first and only index in
resa
(i = 15
), we check the indices inresb
. The indexj = 12
fromresb
matches the condition|15 - 12| <= 10
. Thus indexi = 15
fromresa
is considered "beautiful". - The condition is satisfied for
j = 12
, and it is not necessary to check forj = 6
orj = 0
since these indices would not provide a smaller absolute difference withi = 15
within the constraintk
.
- With
The result list res
will thus contain the single index [15]
, indicating the position of the beautiful index in s
with respect to a
.
This example demonstrates how the KMP algorithm efficiently searches for occurrences of the patterns and how we only need to compare the indices of these occurrences to find the beautiful indices without re-scanning large parts of the string s
.
Solution Implementation
1from typing import List
2
3class Solution:
4 def beautifulIndices(self, s: str, a: str, b: str, k: int) -> List[int]:
5 # Helper function to build the prefix function for a given pattern,
6 # used by the Knuth-Morris-Pratt (KMP) algorithm.
7 def build_prefix_function(pattern: str) -> List[int]:
8 prefix_function = [0] * len(pattern)
9 j = 0
10 # Loop through pattern to calculate the longest prefix that is also a suffix.
11 for i in range(1, len(pattern)):
12 while j > 0 and pattern[i] != pattern[j]:
13 j = prefix_function[j - 1]
14 if pattern[i] == pattern[j]:
15 j += 1
16 prefix_function[i] = j
17 return prefix_function
18
19 # KMP search algorithm to find all occurrences of a pattern in a text.
20 def kmp_search(pattern: str, text: str, prefix_function: List[int]) -> List[int]:
21 occurrences = []
22 j = 0
23 # Loop through text to find all matches of the pattern.
24 for i in range(len(text)):
25 while j > 0 and text[i] != pattern[j]:
26 j = prefix_function[j - 1]
27 if text[i] == pattern[j]:
28 j += 1
29 if j == len(pattern):
30 occurrences.append(i - j + 1)
31 j = prefix_function[j - 1]
32 return occurrences
33
34 # Build prefix functions for both patterns a and b.
35 prefix_a = build_prefix_function(a)
36 prefix_b = build_prefix_function(b)
37
38 # Use KMP search to find all occurrences of a and b in string s.
39 matches_a = kmp_search(a, s, prefix_a)
40 matches_b = kmp_search(b, s, prefix_b)
41
42 # List to hold all beautiful indices.
43 beautiful_indices = []
44
45 # Initialize pointers for iterating through matches of a and b.
46 i, j = 0, 0
47
48 # Process all matches of pattern a.
49 while i < len(matches_a):
50 while j < len(matches_b):
51 # Check if the current match of b is within k distance from match a.
52 if abs(matches_b[j] - matches_a[i]) <= k:
53 beautiful_indices.append(matches_a[i])
54 break
55 # If not, and there is another b ahead that's closer to match a,
56 # move to the next match of b.
57 elif j + 1 < len(matches_b) and abs(matches_b[j + 1] - matches_a[i]) < abs(matches_b[j] - matches_a[i]):
58 j += 1
59 else:
60 break
61 i += 1
62
63 # Return the list of beautiful indices.
64 return beautiful_indices
65
1import java.util.ArrayList;
2import java.util.List;
3
4public class Solution {
5
6 // Computes the Longest Prefix Suffix (LPS) array used in the KMP algorithm
7 public void computeLPS(String pattern, int[] lps) {
8 int lengthOfPattern = pattern.length();
9 int length = 0; // length of the previous longest prefix suffix
10
11 lps[0] = 0; // LPS[0] is always 0 because a single character doesn't have a proper prefix and suffix
12
13 int index = 1;
14 while (index < lengthOfPattern) {
15 // check if pattern continued from previous lps
16 if (pattern.charAt(index) == pattern.charAt(length)) {
17 length++;
18 lps[index] = length;
19 index++;
20 } else { // (pattern[i] != pattern[len])
21 // fallback to previous longest proper prefix and suffix
22 if (length != 0) {
23 length = lps[length - 1];
24 } else { // if no proper prefix and suffix, set to 0
25 lps[index] = 0;
26 index++;
27 }
28 }
29 }
30 }
31
32 // KMP algorithm to find all occurrences of a pattern in a text
33 public List<Integer> KMP_codestorywithMIK(String pattern, String text) {
34 int lengthOfText = text.length();
35 int lengthOfPattern = pattern.length();
36 List<Integer> result = new ArrayList<>();
37
38 int[] lps = new int[lengthOfPattern];
39 computeLPS(pattern, lps);
40
41 int indexOfText = 0; // Index for the text
42 int indexOfPattern = 0; // Index for the pattern
43
44 while (indexOfText < lengthOfText) {
45 if (pattern.charAt(indexOfPattern) == text.charAt(indexOfText)) {
46 indexOfText++;
47 indexOfPattern++;
48 }
49
50 // Check if we have found a pattern match
51 if (indexOfPattern == lengthOfPattern) {
52 // Pattern found at index (indexOfText - indexOfPattern); add to result
53 result.add(indexOfText - indexOfPattern);
54 indexOfPattern = lps[indexOfPattern - 1];
55 }
56 // Mismatch after `indexOfPattern` matches
57 else if (indexOfText < lengthOfText && pattern.charAt(indexOfPattern) != text.charAt(indexOfText)) {
58 // Do not match lps[0..lps[indexOfPattern-1]] characters,
59 // they will match anyway
60 if (indexOfPattern != 0) {
61 indexOfPattern = lps[indexOfPattern - 1];
62 } else {
63 indexOfText++;
64 }
65 }
66 }
67
68 return result;
69 }
70
71 // Implements the Lower Bound function to find the smallest index of an element greater or equal to 'target'
72 private int lowerBound(List<Integer> list, int target) {
73 int left = 0, right = list.size() - 1, result = list.size();
74
75 while (left <= right) {
76 int mid = left + (right - left) / 2;
77
78 if (list.get(mid) >= target) {
79 result = mid;
80 right = mid - 1;
81 } else {
82 left = mid + 1;
83 }
84 }
85
86 return result;
87 }
88
89 // Finds indices such that substring starting at those indices and pattern a match,
90 // and there exists a substring within the k-neighborhood that matches pattern b
91 public List<Integer> beautifulIndices(String s, String a, String b, int k) {
92 int lengthOfString = s.length();
93
94 // Get all the starting indices of pattern a in string s
95 List<Integer> aIndices = KMP_codestorywithMIK(a, s);
96 // Get all the starting indices of pattern b in string s
97 List<Integer> bIndices = KMP_codestorywithMIK(b, s);
98
99 List<Integer> result = new ArrayList<>();
100
101 // Loop through all the starting indices of pattern a
102 for (int indexA : aIndices) {
103
104 // Define the boundaries of the k-neighborhood
105 int leftLimit = Math.max(0, indexA - k);
106 int rightLimit = Math.min(lengthOfString - 1, indexA + k);
107
108 // Find the first index in bIndices which is greater or equal to leftLimit
109 int lowerBoundIndex = lowerBound(bIndices, leftLimit);
110
111 // Check if within bounds and add to result if conditions are met
112 if (lowerBoundIndex < bIndices.size() && bIndices.get(lowerBoundIndex) <= rightLimit) {
113 result.add(indexA);
114 }
115 }
116
117 return result;
118 }
119}
120
1#include <vector>
2#include <string>
3#include <algorithm>
4
5class Solution {
6public:
7 // Finds and returns a vector containing the indices of the 's' where
8 // 'patternA' and 'patternB' occur, respecting the condition that any index in
9 // 'patternB' should not be at a distance of more than 'k' from 'patternA'.
10 std::vector<int> beautifulIndices(const std::string& s, const std::string& pattern_a, const std::string& pattern_b, int k) {
11 // Find all occurrences of patternA in s
12 std::vector<int> beautiful_indices_a = kmpSearch(s, pattern_a);
13 // Find all occurrences of patternB in s
14 std::vector<int> beautiful_indices_b = kmpSearch(s, pattern_b);
15
16 // Sort indices of patternB's occurrences for binary search
17 std::sort(beautiful_indices_b.begin(), beautiful_indices_b.end());
18
19 std::vector<int> result;
20 // Iterate over each index where patternA is found
21 for (int index_a : beautiful_indices_a) {
22 // Find the lower bound for valid indices of patternB
23 int left = std::lower_bound(beautiful_indices_b.begin(), beautiful_indices_b.end(), index_a - k) - beautiful_indices_b.begin();
24 // Find the upper bound for valid indices of patternB
25 int right = std::lower_bound(beautiful_indices_b.begin(), beautiful_indices_b.end(), index_a + k + pattern_b.length()) - beautiful_indices_b.begin();
26
27 // Check within the valid range for a beautiful index and add it to the result
28 for (int index_b = left; index_b < right; index_b++) {
29 if (std::abs(beautiful_indices_b[index_b] - index_a) <= k) {
30 result.push_back(index_a);
31 break;
32 }
33 }
34 }
35
36 return result;
37 }
38
39private:
40 // Knuth-Morris-Pratt (KMP) Algorithm for String Matching
41 std::vector<int> kmpSearch(const std::string& text, const std::string& pattern) {
42 std::vector<int> indices;
43 std::vector<int> prefix_function = computePrefixFunction(pattern);
44
45 int matched_characters = 0;
46 // Iterate through text to find all occurrences of the pattern
47 for (int i = 0; i < text.length(); i++) {
48 while (matched_characters > 0 && pattern[matched_characters] != text[i]) {
49 matched_characters = prefix_function[matched_characters - 1];
50 }
51 if (pattern[matched_characters] == text[i]) {
52 matched_characters++;
53 }
54 if (matched_characters == pattern.length()) {
55 // Pattern found, push starting index to the list
56 indices.push_back(i - matched_characters + 1);
57 matched_characters = prefix_function[matched_characters - 1];
58 }
59 }
60
61 return indices;
62 }
63
64 // Preprocessing function to calculate the longest prefix suffix (LPS) values
65 std::vector<int> computePrefixFunction(const std::string& pattern) {
66 int m = pattern.length();
67 std::vector<int> prefix_function(m, 0);
68 int length_of_longest_prefix_suffix = 0;
69 // Applying KMP preprocessing to calculate LPS values
70 for (int i = 1; i < m; i++) {
71 while (length_of_longest_prefix_suffix > 0 && pattern[length_of_longest_prefix_suffix] != pattern[i]) {
72 length_of_longest_prefix_suffix = prefix_function[length_of_longest_prefix_suffix - 1];
73 }
74 if (pattern[length_of_longest_prefix_suffix] == pattern[i]) {
75 length_of_longest_prefix_suffix++;
76 }
77 prefix_function[i] = length_of_longest_prefix_suffix;
78 }
79 return prefix_function;
80 }
81};
82
1function beautifulIndices(s: string, patternA: string, patternB: string, k: number): number[] {
2 // Find all occurrences of patternA in s using KMP algorithm
3 let indicesOfA: number[] = kmpSearch(s, patternA);
4 // Find all occurrences of patternB in s using KMP algorithm
5 let indicesOfB: number[] = kmpSearch(s, patternB);
6
7 // Sort indices of patternB's occurrences for binary search
8 indicesOfB.sort((a, b) => a - b);
9
10 let result: number[] = [];
11 // Iterate over each index where patternA is found
12 for (let indexA of indicesOfA) {
13 // Find the lower bound for valid indices of patternB
14 let left = lowerBound(indicesOfB, indexA - k);
15 // Find the upper bound for valid indices of patternB
16 let right = lowerBound(indicesOfB, indexA + k + patternB.length);
17
18 // Check within the valid range for a beautiful index and add it to the result
19 for (let i = left; i < right; i++) {
20 if (Math.abs(indicesOfB[i] - indexA) <= k) {
21 result.push(indexA);
22 break;
23 }
24 }
25 }
26
27 return result;
28}
29
30// Implementation of KMP search algorithm
31function kmpSearch(text: string, pattern: string): number[] {
32 let indices: number[] = [];
33 let prefixFunction: number[] = computePrefixFunction(pattern);
34
35 let matchedChars = 0;
36 // Iterate through text to find all occurrences of the pattern
37 for (let i = 0; i < text.length; i++) {
38 while (matchedChars > 0 && pattern[matchedChars] !== text[i]) {
39 matchedChars = prefixFunction[matchedChars - 1];
40 }
41 if (pattern[matchedChars] === text[i]) {
42 matchedChars++;
43 }
44 if (matchedChars === pattern.length) {
45 // Pattern found, push starting index to the list
46 indices.push(i - matchedChars + 1);
47 matchedChars = prefixFunction[matchedChars - 1];
48 }
49 }
50
51 return indices;
52}
53
54// Preprocessing function to calculate the prefix function (longest prefix suffix array)
55function computePrefixFunction(pattern: string): number[] {
56 let m: number = pattern.length;
57 let prefixFunction: number[] = new Array(m).fill(0);
58 let len: number = 0; // length of the longest prefix suffix
59 // Applying KMP preprocessing to calculate prefix function values
60 for (let i = 1; i < m; i++) {
61 while (len > 0 && pattern[len] !== pattern[i]) {
62 len = prefixFunction[len - 1];
63 }
64 if (pattern[len] === pattern[i]) {
65 len++;
66 }
67 prefixFunction[i] = len;
68 }
69 return prefixFunction;
70}
71
72// Helper function to find the lower bound of a value in a sorted array
73function lowerBound(arr: number[], value: number): number {
74 let low: number = 0;
75 let high: number = arr.length;
76 while (low < high) {
77 let mid: number = Math.floor((low + high) / 2);
78 if (arr[mid] < value) {
79 low = mid + 1;
80 } else {
81 high = mid;
82 }
83 }
84 return low;
85}
86
Time and Space Complexity
The given Python code implements the Knuth-Morris-Pratt (KMP) string-searching algorithm to find 'beautiful' indices in a string based on two patterns a
and b
, and a distance constraint k
.
Time Complexity
-
Building Prefix Functions: The
build_prefix_function
function constructs the prefix table for patterna
and patternb
. The complexity of building the prefix function for a pattern of lengthm
isO(m)
. Since we build it for both patternsa
andb
, the total time for this step isO(len(a) + len(b))
. -
KMP Search: The
kmp_search
function uses the prefix table to search the occurrences of patternsa
andb
in strings
. The KMP search runs inO(n + m)
for a pattern of lengthm
and text of lengthn
. Since we search for both patternsa
andb
separately in strings
, the complexity isO(len(s) + len(a)) + O(len(s) + len(b))
, which simplifies toO(2*len(s) + len(a) + len(b))
. -
Finding Beautiful Indices: In the worst case, every index can be a match, leading to
O(len(s))
matches for both patterns. Iterating through all matches ofa
and checking for the condition with matches ofb
has the worst-case complexity ofO(len(s)^2)
since for each index inresa
, we potentially compare it to all indices inresb
.
Combining all the steps, the overall worst-case time complexity is O(len(a) + len(b) + 2*len(s) + len(s)^2)
. Since O(len(s)^2)
is the dominant term, we can simplify the total time complexity to O(len(s)^2)
.
Space Complexity
-
Prefix Functions: The space used to store the prefix tables of patterns
a
andb
is proportional to the lengths ofa
andb
, making the space complexity for this stepO(len(a) + len(b))
. -
Occurrences: The space used to store the occurrences of
a
andb
ins
can be up toO(len(s))
for bothresa
andresb
in the worst case where each character ofs
starts a match. -
Final Result: We store the final result in
res
, which in the worst case can also be of sizeO(len(s))
.
Hence, the total space complexity of the algorithm is O(len(a) + len(b) + 3*len(s))
. Simplifying, we get O(len(a) + len(b) + len(s))
, with O(len(s))
being the dominant term if len(s)
is much longer than patterns a
and b
.
Learn more about how to find time and space complexity quickly using problem constraints.
A heap is a ...?
Recommended Readings
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Want a Structured Path to Master System Design Too? Don’t Miss This!