1461. Check If a String Contains All Binary Codes of Size K
Problem Description
You are given a binary string s
(containing only '0's and '1's) and an integer k
. Your task is to determine whether every possible binary code of length k
exists as a substring within s
.
A binary code of length k
means any string consisting of exactly k
characters where each character is either '0' or '1'. For example, if k = 2
, the possible binary codes are: "00", "01", "10", and "11". There are exactly 2^k
different binary codes of length k
.
The function should return true
if all 2^k
possible binary codes of length k
can be found as substrings in s
, and false
otherwise.
For instance, if s = "00110110"
and k = 2
, we need to check if all 4 binary codes ("00", "01", "10", "11") appear as substrings in s
. Since we can find "00" at position 0, "01" at position 1, "11" at position 2, and "10" at position 4, the answer would be true
.
Intuition
The key insight is that we need to check if all 2^k
possible binary codes appear as substrings in s
. Instead of generating all possible binary codes and checking each one individually, we can approach this problem from the opposite direction - collect all unique substrings of length k
from s
and check if we've found them all.
First, consider a necessary condition: if string s
has length n
, we can extract at most n - k + 1
substrings of length k
(starting from positions 0, 1, 2, ..., up to n - k
). For all binary codes to exist in s
, we need at least 2^k
different substrings. Therefore, if n - k + 1 < 2^k
, it's impossible to have all binary codes, and we can immediately return false
.
Once we've verified this basic condition, we can extract all substrings of length k
from s
. Since we only care about unique binary codes, we use a set to automatically handle duplicates. We slide a window of size k
through the string s
, adding each substring to our set.
Finally, if the size of our set equals 2^k
, it means we've found exactly 2^k
different substrings of length k
, which must be all possible binary codes (since there are exactly 2^k
possible binary codes of length k
). If the set size is less than 2^k
, some binary codes are missing.
This approach is efficient because we only make one pass through the string and let the set data structure handle uniqueness for us, avoiding the need to explicitly generate and check all 2^k
possible binary codes.
Solution Approach
The implementation follows a straightforward approach using a hash table (set) to track unique substrings:
-
Initial Check for Feasibility:
- Calculate
m = 1 << k
, which equals2^k
using bit shifting (left shift byk
positions) - Check if
n - k + 1 < m
, wheren
is the length of strings
- If this condition is true, return
false
immediately since there aren't enough positions ins
to contain all possible binary codes
- Calculate
-
Extract All Substrings of Length k:
- Use Python's set comprehension to create a set
ss
containing all substrings of lengthk
- The comprehension
{s[i : i + k] for i in range(n - k + 1)}
iterates through each valid starting positioni
from 0 ton - k
- For each position
i
, extract the substrings[i : i + k]
- The set automatically handles duplicates, keeping only unique substrings
- Use Python's set comprehension to create a set
-
Verify Completeness:
- Compare the size of set
ss
withm
(which equals2^k
) - If
len(ss) == m
, all possible binary codes are present, returntrue
- Otherwise, some binary codes are missing, return
false
- Compare the size of set
The time complexity is O((n - k + 1) * k)
where we extract n - k + 1
substrings, each of length k
. The space complexity is O(2^k * k)
in the worst case, as we might store up to 2^k
unique strings, each of length k
.
This solution leverages the mathematical property that if we have exactly 2^k
different binary strings of length k
, they must be all possible binary codes, eliminating the need to generate and check each code individually.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with s = "00110110"
and k = 2
.
Step 1: Initial Feasibility Check
- Calculate
m = 1 << 2 = 4
(there are 4 possible binary codes of length 2: "00", "01", "10", "11") - String length
n = 8
- Check if
n - k + 1 < m
: Is8 - 2 + 1 < 4
? Is7 < 4
? No, so we can proceed.
Step 2: Extract All Substrings of Length k We'll slide a window of size 2 through the string and collect unique substrings:
Position 0: s[0:2] = "00" Position 1: s[1:3] = "01" Position 2: s[2:4] = "11" Position 3: s[3:5] = "10" Position 4: s[4:6] = "01" (duplicate, already in set) Position 5: s[5:7] = "11" (duplicate, already in set) Position 6: s[6:8] = "10" (duplicate, already in set)
Our set ss
contains: {"00", "01", "11", "10"}
Step 3: Verify Completeness
- Size of set
ss
= 4 - Required number
m
= 4 - Since
len(ss) == m
, we've found all 4 possible binary codes of length 2
Result: Return true
The key insight here is that we don't need to explicitly check if "00", "01", "10", and "11" are all present. Since there are exactly 4 possible binary codes of length 2, and we found exactly 4 unique substrings of length 2, they must be the complete set of all possible binary codes.
Solution Implementation
1class Solution:
2 def hasAllCodes(self, s: str, k: int) -> bool:
3 """
4 Check if the string contains all possible binary codes of length k.
5
6 Args:
7 s: A binary string containing only '0' and '1'
8 k: The length of binary codes to check
9
10 Returns:
11 True if all 2^k possible binary codes of length k exist as substrings in s
12 """
13 string_length = len(s)
14 total_possible_codes = 1 << k # Calculate 2^k using bit shift
15
16 # Early termination: if there aren't enough positions to generate all codes
17 # We need at least 2^k different starting positions for substrings
18 if string_length - k + 1 < total_possible_codes:
19 return False
20
21 # Generate all unique substrings of length k using set comprehension
22 unique_substrings = {s[i:i + k] for i in range(string_length - k + 1)}
23
24 # Check if we found all possible binary codes
25 return len(unique_substrings) == total_possible_codes
26
1class Solution {
2 /**
3 * Checks if a binary string contains all possible binary codes of length k.
4 *
5 * @param s The input binary string
6 * @param k The length of binary codes to check
7 * @return true if all possible k-length binary codes appear as substrings, false otherwise
8 */
9 public boolean hasAllCodes(String s, int k) {
10 int stringLength = s.length();
11 // Calculate total number of possible k-bit binary codes (2^k)
12 int totalPossibleCodes = 1 << k;
13
14 // Early termination: if string doesn't have enough substrings, return false
15 // Number of k-length substrings = stringLength - k + 1
16 if (stringLength - k + 1 < totalPossibleCodes) {
17 return false;
18 }
19
20 // Use HashSet to store unique k-length substrings
21 Set<String> uniqueSubstrings = new HashSet<>();
22
23 // Iterate through all possible k-length substrings
24 for (int i = 0; i <= stringLength - k; i++) {
25 // Extract substring from index i to i+k (exclusive)
26 String currentSubstring = s.substring(i, i + k);
27 uniqueSubstrings.add(currentSubstring);
28 }
29
30 // Check if we found all possible k-bit binary codes
31 return uniqueSubstrings.size() == totalPossibleCodes;
32 }
33}
34
1class Solution {
2public:
3 bool hasAllCodes(string s, int k) {
4 int stringLength = s.size();
5 int totalPossibleCodes = 1 << k; // 2^k possible binary codes of length k
6
7 // Early termination: if string doesn't have enough substrings, return false
8 if (stringLength - k + 1 < totalPossibleCodes) {
9 return false;
10 }
11
12 // Set to store unique binary substrings of length k
13 unordered_set<string> uniqueSubstrings;
14
15 // Iterate through all possible substrings of length k
16 for (int i = 0; i + k <= stringLength; ++i) {
17 // Extract substring of length k starting at index i and add to set
18 uniqueSubstrings.insert(move(s.substr(i, k)));
19 }
20
21 // Check if we found all possible binary codes
22 return uniqueSubstrings.size() == totalPossibleCodes;
23 }
24};
25
1/**
2 * Checks if a binary string contains all possible binary codes of length k
3 * @param s - The input binary string
4 * @param k - The length of binary codes to check
5 * @returns true if all 2^k possible binary codes of length k exist as substrings in s
6 */
7function hasAllCodes(s: string, k: number): boolean {
8 const stringLength = s.length;
9 const totalPossibleCodes = 1 << k; // 2^k possible binary codes of length k
10
11 // Early termination: if string is too short to contain all possible codes
12 if (stringLength - k + 1 < totalPossibleCodes) {
13 return false;
14 }
15
16 // Use a Set to store unique substrings of length k
17 const uniqueSubstrings = new Set<string>();
18
19 // Iterate through all possible substrings of length k
20 for (let i = 0; i + k <= stringLength; ++i) {
21 // Extract substring of length k starting at position i
22 const substring = s.slice(i, i + k);
23 uniqueSubstrings.add(substring);
24 }
25
26 // Check if we found all possible binary codes
27 return uniqueSubstrings.size === totalPossibleCodes;
28}
29
Time and Space Complexity
Time Complexity: O(n × k)
The code creates a set of all substrings of length k
from the string s
. The set comprehension {s[i : i + k] for i in range(n - k + 1)}
iterates through n - k + 1
positions in the string. For each position i
, it extracts a substring s[i : i + k]
, which takes O(k)
time to create. Additionally, computing the hash of each substring for set insertion also takes O(k)
time. Therefore, the total time complexity is O((n - k + 1) × k)
, which simplifies to O(n × k)
.
Space Complexity: O(n × k)
or O(min(2^k × k, n × k))
The space complexity is dominated by the set ss
that stores all unique substrings of length k
. In the worst case, all n - k + 1
substrings are unique, and each substring requires O(k)
space to store. This gives us O((n - k + 1) × k)
= O(n × k)
space complexity. However, there's an upper bound: there can be at most 2^k
unique binary strings of length k
, so the actual space complexity is O(min(2^k × k, n × k))
.
Note: The reference answer states space complexity as O(n)
, which appears to be considering only the number of substrings stored (n - k + 1
at most) without accounting for the space needed to store each substring of length k
. The more precise analysis should include the factor k
for storing the actual substring content.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Inefficient String Slicing for Large k Values
The current solution uses string slicing s[i:i + k]
which creates a new string object for each substring. When k
is large, this becomes inefficient both in terms of time and space, as Python needs to allocate memory and copy characters for each substring.
Solution: Use a rolling hash or sliding window approach to avoid creating substring objects:
class Solution:
def hasAllCodes(self, s: str, k: int) -> bool:
if len(s) < k:
return False
total_possible_codes = 1 << k
if len(s) - k + 1 < total_possible_codes:
return False
# Use rolling hash approach
seen = set()
hash_val = 0
# Build initial hash for first k characters
for i in range(k):
hash_val = (hash_val << 1) | int(s[i])
seen.add(hash_val)
# Roll the hash window
mask = (1 << k) - 1 # Mask to keep only k bits
for i in range(k, len(s)):
hash_val = ((hash_val << 1) | int(s[i])) & mask
seen.add(hash_val)
return len(seen) == total_possible_codes
2. Memory Issues When k is Large
When k
approaches 20 or higher, storing 2^k
strings becomes problematic. For example, with k = 20
, you'd need to potentially store over 1 million unique strings in memory.
Solution: Check if the theoretical requirement is even feasible before processing:
class Solution:
def hasAllCodes(self, s: str, k: int) -> bool:
# Add memory constraint check
if k > 20: # 2^20 = 1,048,576 codes
# For very large k, the string would need to be extremely long
# to contain all codes, making it impractical
return len(s) >= (1 << k) + k - 1
# Continue with normal processing for reasonable k values
# ... rest of the code
3. Not Handling Edge Cases Properly
The original code might not handle certain edge cases correctly:
- When
k = 0
: Should returntrue
(only one possible code: empty string) - When
s
is empty butk > 0
: Should returnfalse
- When
k > len(s)
: Should returnfalse
Solution: Add explicit edge case handling:
class Solution:
def hasAllCodes(self, s: str, k: int) -> bool:
# Edge case handling
if k == 0:
return True
if not s or k > len(s):
return False
n = len(s)
total_codes = 1 << k
# Mathematical impossibility check
if n - k + 1 < total_codes:
return False
# Continue with main logic
unique_substrings = {s[i:i + k] for i in range(n - k + 1)}
return len(unique_substrings) == total_codes
4. Integer Overflow in Other Languages
While Python handles large integers gracefully, implementing this in languages like Java or C++ could cause integer overflow when calculating 2^k
for large k
values.
Solution for other languages: Use appropriate data types or add overflow checks:
// Java example
public boolean hasAllCodes(String s, int k) {
if (k > 30) return false; // Prevent overflow for int type
int totalCodes = 1 << k;
// Check for overflow
if (totalCodes < 0) return false;
// Rest of the implementation
}
Which data structure is used to implement priority queue?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!