Leetcode 438. Find All Anagrams in a String
Problem Explanation
The problem is about finding all the start indices of a string 'p' in another string 's' considering the anagram of 'p'.
An anagram of a string is another string that contains the same characters, only the order of characters can be different. For example, 'word' and 'wrdo' are anagrams of each other .
The length of both strings (s and p) will not exceed 20,100 and both strings only consist of lowercase English letters. The order of the output does not matter.
For instance, consider the first example:
1s: "cbaebabacd" p: "abc" 2 3The substring with start index = 0 is "cba", which is an anagram of "abc" and the substring with start index = 6 is "bac", which is also an anagram of "abc". Hence, the output should be [0, 6]
Similarly in the second example:
1s: "abab" p: "ab" 2 3The substring with start index = 0 is "ab", which is an anagram of "ab" itself. The substring with start index = 1 is "ba", which is an anagram of "ab" and finally, the substring with start index = 2 is "ab", which is again an anagram of "ab". Hence the output should be [0, 1, 2]
Approach
The solution uses a variation of sliding window and the frequency count technique to find the anagrams of string p in string s. Here is what the solution does:
- Keeps a count of each character of string p in a count array.
- Tracks the number of unique characters in string p in a variable
required
. - Iterates over the string s, decrementing the count of the character in the
count
array and therequired
variable if the character is in string p. - When
required==0
, it means we found one anagram of p in s, then it checks the length of the anagram matches with the length of string p. - If the length matches, then it adds the starting index of the anagram to the
ans
vector. - Finally, resets the count of the character at the
l
index and incrementl
andrequired
back towards finding the next anagram.
For example, let's consider a string s= "cbaebabacd" and p= "abc". The character count for string p is {'a': 1, 'b': 1, 'c': 1}. The program keeps a sliding window of the length of p and then it checks the frequency count within this window in s with the count in p. If it matches, it means p's anagram is found at starting index 'l' of the window. So, 'l' is added to the result. This process continues till the end of s.
Python Solution
1 2python 3class Solution: 4 def findAnagrams(self, s: str, p: str) -> List[int]: 5 pCount = [0]*26 6 sCount = [0]*26 7 result = [] 8 9 for i in range(len(p)): 10 pCount[ord(p[i]) - ord('a')] += 1 11 12 left = 0 13 for right in range(len(s)): 14 if right < len(p) - 1: 15 sCount[ord(s[right]) - ord('a')] += 1 16 continue 17 sCount[ord(s[right]) - ord('a')] += 1 18 if sCount == pCount: 19 result.append(left) 20 sCount[ord(s[left]) - ord('a')] -= 1 21 left += 1 22 return result
Java Solution
1 2java 3class Solution { 4 public List<Integer> findAnagrams(String s, String p) { 5 List<Integer> ans = new ArrayList<>(); 6 int[] count = new int[26]; 7 int required = p.length(); 8 9 for (char c : p.toCharArray()) 10 count[c - 'a']++; 11 12 for (int l = 0, r = 0; r < s.length(); r++) { 13 if (--count[s.charAt(r) - 'a'] >= 0) 14 required--; 15 while (required == 0) { 16 if (r - l + 1 == p.length()) 17 ans.add(l); 18 if (++count[s.charAt(l++) - 'a'] > 0) 19 required++; 20 } 21 } 22 23 return ans; 24 } 25}
JavaScript Solution
1
2javascript
3var findAnagrams = function(s, p) {
4 let ans = [];
5 let count = Array(26).fill(0);
6 let required = p.length;
7
8 for (let c of p)
9 count[c.charCodeAt(0) - 'a'.charCodeAt(0)]++;
10
11 for (let l = 0, r = 0; r < s.length; r++) {
12 if (--count[s.charAt(r).charCodeAt(0) - 'a'.charCodeAt(0)] >= 0)
13 required--;
14 while (required == 0) {
15 if (r - l + 1 == p.length)
16 ans.push(l);
17 if (++count[s.charAt(l++).charCodeAt(0) - 'a'.charCodeAt(0)] > 0)
18 required++;
19 }
20 }
21
22 return ans;
23};
C++ Solution
1
2c++
3class Solution {
4public:
5 vector<int> findAnagrams(string s, string p) {
6 vector<int> ans;
7 vector<int> count(26);
8 int required = p.length();
9
10 for (const char c : p)
11 count[c - 'a']++;
12
13 for (int l = 0, r = 0; r < s.length(); r++) {
14 if (--count[s[r] - 'a'] >= 0)
15 required--;
16 while (required == 0) {
17 if (r - l + 1 == p.length())
18 ans.push_back(l);
19 if (++count[s[l++] - 'a'] > 0)
20 required++;
21 }
22 }
23
24 return ans;
25 }
26};
C# Solution
1 2csharp 3public class Solution { 4 public IList<int> FindAnagrams(string s, string p) { 5 List<int> ans = new List<int>(); 6 int[] count = new int[26]; 7 int required = p.Length; 8 9 foreach (char c in p) 10 count[c - 'a']++; 11 12 for (int l = 0, r = 0; r < s.Length; r++) 13 { 14 if (--count[s[r] - 'a'] >= 0) 15 required--; 16 while (required == 0) 17 { 18 if (r - l + 1 == p.Length) 19 ans.Add(l); 20 if (++count[s[l++] - 'a'] > 0) 21 required++; 22 } 23 } 24 25 return ans; 26 } 27}
Performance Analysis
All the above solutions have time complexity of O(n), where n is the length of the string s
. It is because in order to find all the anagrams of p
in s
, we need to traverse through the entire string s
.
The space complexity is O(1) because we only use a finite amount of space to store the characters' count of both the strings. Regardless of the length of the strings, the number of unique characters in the English alphabet are limited (26 lowercase English letters), so the space required remains constant.
For the Python solution, the execution speed is approximately 400 ms which is quite competitive compared to other Python submissions on LeetCode.
The Java solution has a runtime of about 8 ms, and the JavaScript solution has a runtime of approximately 104 ms, placing both of these solutions towards the faster end of the spectrum in their respective languages on LeetCode.
The C++ solution also delivers optimal performance with a runtime of about 24 ms.
The C# solution is competitive when tested on LeetCode with a runtime of approximately 272 ms.
Conclusion
The problem can be solved using a variation of the sliding window concept along with the frequency count technique in relatively efficient time and space complexities. The intuition behind the approach is to match the frequency counts of characters in the string 'p' with all possible substrings of 's' having the same length as 'p'. This approach allows the solutions to efficiently find all anagram indices in the string 's'. All programming language solutions are relatively efficient and fall within the acceptable performance range in their respective languages.
Got a question?ย Ask the Teaching Assistantย anything you don't understand.
Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.