438. Find All Anagrams in a String
Problem Description
The task is to find all the starting indices of substrings in string s
that are anagrams of string p
. An anagram is a rearrangement of letters in a word to form a new word, using all the original letters exactly once. The two input strings, s
and p
, contain only lowercase English letters. The output should be a list of indices, and these indices don't need to be in any specific order.
For example, if s
is "cbaebabacd" and p
is "abc", then the function should return [0, 6]
, since the substrings starting at indices 0 ("cba") and 6 ("bac") are anagrams of "abc".
Intuition
The intuition behind the solution is to use a sliding window approach combined with character counting. The idea is to maintain a window of length equal to the length of p
and slide it across s
, while keeping track of the frequency of characters within the window and comparing it to the frequency of characters in p
. If at any point the frequency counts match, we've found an anagram and can record the starting index of the window.
To implement this efficiently, we use two Counter
objects (from Python's collections module) to store the frequency counts for p
and for the current window in s
. The key steps in the algorithm are as follows:
- If
s
is shorter thanp
, immediately return an empty list as no anagrams could exist. - Create a
Counter
for stringp
which holds the count of each character. - Also, create a
Counter
for the first window of sizelen(p) - 1
ins
. - Iterate through
s
starting at the index where the first full window would begin. For each indexi
:- Increment the count of character
s[i]
in the current window counter. - Check if the current window's character counts match those of
p
. If they do, append the starting index of the window(i - len(p) + 1)
to the result list. - Decrement the count of character
s[i - len(p) + 1]
in the current window counter, effectively sliding the window one position to the right.
- Increment the count of character
- After the loop, return the list of starting indices, which contains all the starting points of anagram substrings within
s
.
Learn more about Sliding Window patterns.
Solution Approach
The solution approach utilizes a sliding window algorithm combined with character frequency counting.
Sliding Window Algorithm
The sliding window algorithm is quite efficient for problems that require checking or computing something among all substrings of a given size within a larger string. This technique involves maintaining a window that slides across the data to examine different subsections of it.
In this problem, we slide a window of length len(p)
across the string s
and at each step, we compare the frequency of characters within the window to the frequency of characters in p
.
Character Frequency Counting
To track the characters' frequencies efficiently, we use Python's Counter
class from the collections module. A Counter
is a dictionary subclass that allows us to count hashable objects. The elements are stored as dictionary keys, and the counts of the objects are stored as the values.
Here's how the code implements the above techniques:
- We create a
Counter
calledcnt1
for the stringp
, which will remain constant throughout the algorithm. - Initialize a second
Counter
calledcnt2
for the firstlen(p) - 1
characters of strings
. We start with one less thanlen(p)
because we will be adding one character to the counter in the main loop, effectively scanning windows of sizelen(p)
.
cnt1 = Counter(p)
cnt2 = Counter(s[: len(p) - 1])
-
The main loop starts from index
len(p) - 1
and iterates until the end of the strings
. In each iteration, we:- Include the next character in the current window's count by incrementing
cnt2[s[i]]
. - Check if
cnt1
andcnt2
are equal, which means we found an anagram. If they are equal, we appendi - len(p) + 1
to our answer listans
. - Before moving to the next iteration, we decrement the count of the character at the start of the current window since the window will shift to the right in the next step.
- Include the next character in the current window's count by incrementing
for i in range(len(p) - 1, len(s)):
cnt2[s[i]] += 1
if cnt1 == cnt2:
ans.append(i - len(p) + 1)
cnt2[s[i - len(p) + 1]] -= 1
Notice how the algorithm elegantly maintains the window size by incrementing the end character count and decrementing the start character count in each iteration, making the time complexity for this operation O(1), assuming that the inner workings of the Counter object and comparing two Counters take constant time on average.
By employing the sliding window technique and frequency counting, we avoid the need to re-compute the frequencies from scratch for each window, thereby achieving an efficient solution.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate how the sliding window approach and character frequency counting is applied in the implementation.
Suppose our string s
is "abab" and our pattern string p
is "ab". We want to find all starting indices of substrings in s
that are anagrams of p
.
Following the steps described in the solution approach:
- Since
s
is not shorter thanp
, we proceed with creating aCounter
object forp
.
cnt1 = Counter("ab") # This will hold {'a': 1, 'b': 1}
- Next, initialize a
Counter
object for the firstlen(p) - 1
characters ofs
. In our case, the initial window will be of size 1 (2-1).
cnt2 = Counter(s[:1]) # This will hold {'a': 1}
-
Now, we will iterate from index 1 to the length of
s
.-
For index 1, we include character 'b' in
cnt2
and increment its count. -
We compare
cnt1
withcnt2
. Since at this pointcnt1
is {'a': 1, 'b': 1} andcnt2
is {'a': 1, 'b': 1}, they are equal, which indicates that we have found an anagram starting at index 0. -
The list
ans
will now include this index:
-
ans = [0]
- Before moving to the next index, we decrement the count of the character at the start of the current window (which is 'a') in `cnt2`.
4. At index 2, we include character 'a' in cnt2
then repeat the steps. cnt1
and cnt2
will still be equal, so we add index 1 to ans
.
ans = [0, 1]
- Decrement the count of 'b' in `cnt2` before the next iteration.
5. At index 3, we include character 'b' in cnt2
. The Counters are still equal, add index 2 to ans
.
ans = [0, 1, 2]
- Finally, decrement the count of 'a' in `cnt2` as the loop is completed.
6. Since we have no more characters in s
to iterate through, we have completed our search. The list ans
now contains all the starting indices of the anagrams of p
in s
.
The final output for our example s = "abab"
and p = "ab"
is [0, 1, 2]
, indicating that anagrams of p
start at indices 0, 1, and 2 in s
.
Solution Implementation
1from collections import Counter
2from typing import List
3
4class Solution:
5 def findAnagrams(self, s: str, p: str) -> List[int]:
6 # Length of the string 's' and the pattern 'p'
7 s_length, p_length = len(s), len(p)
8 # List to store the starting indices of the anagrams of 'p' in 's'
9 anagram_indices = []
10
11 # Early return if 's' is shorter than 'p', as no anagrams would fit
12 if s_length < p_length:
13 return anagram_indices
14
15 # Count the occurrences of characters in 'p'
16 p_char_count = Counter(p)
17 # Count the occurrences of characters in the first window of 's'
18 # This window has the size one less than the size of 'p'
19 s_window_count = Counter(s[:p_length - 1])
20
21 # Iterate over 's', starting from index where first full window can begin
22 for i in range(p_length - 1, s_length):
23 # Add the current character to the window's character count
24 s_window_count[s[i]] += 1
25
26 # If the character counts match, it's an anagram; record the start index
27 if p_char_count == s_window_count:
28 anagram_indices.append(i - p_length + 1)
29
30 # Move the window to the right: subtract the count of the oldest character
31 s_window_count[s[i - p_length + 1]] -= 1
32 # Remove the character count from the dict if it drops to zero
33 if s_window_count[s[i - p_length + 1]] == 0:
34 del s_window_count[s[i - p_length + 1]]
35
36 # Return the list of starting indices of anagrams
37 return anagram_indices
38
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4
5class Solution {
6 public List<Integer> findAnagrams(String s, String p) {
7 // Initialize the length of both strings
8 int stringLength = s.length(), patternLength = p.length();
9 List<Integer> anagramsStartIndices = new ArrayList<>();
10
11 // If the pattern is longer than the string, no anagrams can be found
12 if (stringLength < patternLength) {
13 return anagramsStartIndices;
14 }
15
16 // Count the frequency of characters in the pattern
17 int[] patternCount = new int[26];
18 for (int i = 0; i < patternLength; ++i) {
19 patternCount[p.charAt(i) - 'a']++;
20 }
21
22 // Initialize character frequency counter for window in 's'
23 int[] windowCount = new int[26];
24 // Initialize the window with the first (patternLength - 1) characters
25 for (int i = 0; i < patternLength - 1; ++i) {
26 windowCount[s.charAt(i) - 'a']++;
27 }
28
29 // Slide the window across the string 's' one character at a time
30 for (int i = patternLength - 1; i < stringLength; ++i) {
31 // Add the new character to the window
32 windowCount[s.charAt(i) - 'a']++;
33
34 // If the window matches the pattern counts, record the start index
35 if (Arrays.equals(patternCount, windowCount)) {
36 anagramsStartIndices.add(i - patternLength + 1);
37 }
38
39 // Remove the leftmost (oldest) character from the window
40 windowCount[s.charAt(i - patternLength + 1) - 'a']--;
41 }
42
43 return anagramsStartIndices;
44 }
45}
46
1#include <vector>
2#include <string>
3
4class Solution {
5public:
6 // Function to find all the start indices of p's anagrams in s.
7 std::vector<int> findAnagrams(std::string s, std::string p) {
8 int sLength = s.size(), pLength = p.size();
9 std::vector<int> startingIndices; // Holds the starting indices of anagrams.
10
11 // Base case: if the length of string s is smaller than the length of string p,
12 // no anagrams of p can be found in s.
13 if (sLength < pLength) {
14 return startingIndices;
15 }
16
17 // Counters for the characters in p and the current window in s.
18 std::vector<int> pCharCount(26, 0); // Assuming only lowercase English letters.
19 std::vector<int> windowCharCount(26, 0);
20
21 // Count the occurrences of each character in p.
22 for (char c : p) {
23 ++pCharCount[c - 'a']; // Increment the count for this character.
24 }
25
26 // Initialize windowCharCount with the first window in s.
27 for (int i = 0; i < pLength - 1; ++i) {
28 ++windowCharCount[s[i] - 'a'];
29 }
30
31 // Slide the window over string s and compare with character counts of p.
32 for (int i = pLength - 1; i < sLength; ++i) {
33 ++windowCharCount[s[i] - 'a']; // Add the current character to the window count.
34
35 // If the window has the same character counts as p, it's an anagram starting at (i - pLength + 1).
36 if (pCharCount == windowCharCount) {
37 startingIndices.push_back(i - pLength + 1);
38 }
39
40 // Move the window forward by one: decrease the count of the character leaving the window.
41 --windowCharCount[s[i - pLength + 1] - 'a'];
42 }
43
44 return startingIndices; // Return the collected starting indices of anagrams.
45 }
46};
47
1function findAnagrams(source: string, pattern: string): number[] {
2 const sourceLength = source.length;
3 const patternLength = pattern.length;
4 const resultIndices: number[] = [];
5
6 // If the pattern is longer than the source string, no anagrams are possible
7 if (sourceLength < patternLength) {
8 return resultIndices;
9 }
10
11 // Character count arrays for the pattern and the current window in the source string
12 const patternCount: number[] = new Array(26).fill(0);
13 const windowCount: number[] = new Array(26).fill(0);
14
15 // Helper function to convert a character to its corresponding index (0-25)
16 const charToIndex = (c: string) => c.charCodeAt(0) - 'a'.charCodeAt(0);
17
18 // Initialize the character count for the pattern
19 for (const char of pattern) {
20 patternCount[charToIndex(char)]++;
21 }
22
23 // Initialize the character count for the first window in the source string
24 for (const char of source.slice(0, patternLength - 1)) {
25 windowCount[charToIndex(char)]++;
26 }
27
28 // Slide the window over the source string and compare counts of characters
29 for (let i = patternLength - 1; i < sourceLength; ++i) {
30 // Include the current character into the window count
31 windowCount[charToIndex(source[i])]++;
32
33 // Compare the pattern character count array with the current window count array
34 if (arrayEqual(patternCount, windowCount)) {
35 // If they match, add the starting index to the result array
36 resultIndices.push(i - patternLength + 1);
37 }
38
39 // Remove the first character of the current window from the window count
40 windowCount[charToIndex(source[i - patternLength + 1])]--;
41 }
42
43 // Helper function to check if two arrays are equal
44 function arrayEqual(arr1: number[], arr2: number[]): boolean {
45 for (let i = 0; i < arr1.length; i++) {
46 if (arr1[i] !== arr2[i]) {
47 return false;
48 }
49 }
50 return true;
51 }
52
53 return resultIndices;
54}
55
Time and Space Complexity
The time complexity of the given code snippet is O(m + n)
, where m
is the length of string s
and n
is the length of string p
. Here's a breakdown of the complexity:
- Initializing
cnt1
andcnt2
takesO(n)
time because we are counting characters in strings of lengthn
andn - 1
, respectively. - The
for
loop runs fromn - 1
tom
, resulting inO(m - n + 1)
iterations. - Inside the loop, we increment the count of
s[i]
inO(1)
time, comparecnt1
andcnt2
inO(1)
time assuming the alphabet is fixed and thus the Counter objects have a constant number of keys, and decrement the count of a character inO(1)
time. - Therefore, loop operations have a complexity of
O(m - n + 1)
, and setting up the Counter objects hasO(n)
, resulting in a total time complexity ofO(n + m - n + 1)
which simplifies toO(m + n)
.
The space complexity is O(1)
or O(k)
where k
is the size of the character set (e.g., the alphabet size), assuming the alphabet is fixed and does not grow with n
or m
. This is because the space that cnt1
and cnt2
take up does not depend on the length of s
or p
but only on the number of distinct characters, which is constant for a fixed alphabet.
Learn more about how to find time and space complexity quickly using problem constraints.
A heap is a ...?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!