Facebook Pixel

1461. Check If a String Contains All Binary Codes of Size K

MediumBit ManipulationHash TableStringHash FunctionRolling Hash
Leetcode Link

Problem Description

You are given a binary string s (containing only '0's and '1's) and an integer k. Your task is to determine whether every possible binary code of length k exists as a substring within s.

A binary code of length k means any string consisting of exactly k characters where each character is either '0' or '1'. For example, if k = 2, the possible binary codes are: "00", "01", "10", and "11". There are exactly 2^k different binary codes of length k.

The function should return true if all 2^k possible binary codes of length k can be found as substrings in s, and false otherwise.

For instance, if s = "00110110" and k = 2, we need to check if all 4 binary codes ("00", "01", "10", "11") appear as substrings in s. Since we can find "00" at position 0, "01" at position 1, "11" at position 2, and "10" at position 4, the answer would be true.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to check if all 2^k possible binary codes appear as substrings in s. Instead of generating all possible binary codes and checking each one individually, we can approach this problem from the opposite direction - collect all unique substrings of length k from s and check if we've found them all.

First, consider a necessary condition: if string s has length n, we can extract at most n - k + 1 substrings of length k (starting from positions 0, 1, 2, ..., up to n - k). For all binary codes to exist in s, we need at least 2^k different substrings. Therefore, if n - k + 1 < 2^k, it's impossible to have all binary codes, and we can immediately return false.

Once we've verified this basic condition, we can extract all substrings of length k from s. Since we only care about unique binary codes, we use a set to automatically handle duplicates. We slide a window of size k through the string s, adding each substring to our set.

Finally, if the size of our set equals 2^k, it means we've found exactly 2^k different substrings of length k, which must be all possible binary codes (since there are exactly 2^k possible binary codes of length k). If the set size is less than 2^k, some binary codes are missing.

This approach is efficient because we only make one pass through the string and let the set data structure handle uniqueness for us, avoiding the need to explicitly generate and check all 2^k possible binary codes.

Solution Approach

The implementation follows a straightforward approach using a hash table (set) to track unique substrings:

  1. Initial Check for Feasibility:

    • Calculate m = 1 << k, which equals 2^k using bit shifting (left shift by k positions)
    • Check if n - k + 1 < m, where n is the length of string s
    • If this condition is true, return false immediately since there aren't enough positions in s to contain all possible binary codes
  2. Extract All Substrings of Length k:

    • Use Python's set comprehension to create a set ss containing all substrings of length k
    • The comprehension {s[i : i + k] for i in range(n - k + 1)} iterates through each valid starting position i from 0 to n - k
    • For each position i, extract the substring s[i : i + k]
    • The set automatically handles duplicates, keeping only unique substrings
  3. Verify Completeness:

    • Compare the size of set ss with m (which equals 2^k)
    • If len(ss) == m, all possible binary codes are present, return true
    • Otherwise, some binary codes are missing, return false

The time complexity is O((n - k + 1) * k) where we extract n - k + 1 substrings, each of length k. The space complexity is O(2^k * k) in the worst case, as we might store up to 2^k unique strings, each of length k.

This solution leverages the mathematical property that if we have exactly 2^k different binary strings of length k, they must be all possible binary codes, eliminating the need to generate and check each code individually.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with s = "00110110" and k = 2.

Step 1: Initial Feasibility Check

  • Calculate m = 1 << 2 = 4 (there are 4 possible binary codes of length 2: "00", "01", "10", "11")
  • String length n = 8
  • Check if n - k + 1 < m: Is 8 - 2 + 1 < 4? Is 7 < 4? No, so we can proceed.

Step 2: Extract All Substrings of Length k We'll slide a window of size 2 through the string and collect unique substrings:

Position 0: s[0:2] = "00"
Position 1: s[1:3] = "01" 
Position 2: s[2:4] = "11"
Position 3: s[3:5] = "10"
Position 4: s[4:6] = "01" (duplicate, already in set)
Position 5: s[5:7] = "11" (duplicate, already in set)
Position 6: s[6:8] = "10" (duplicate, already in set)

Our set ss contains: {"00", "01", "11", "10"}

Step 3: Verify Completeness

  • Size of set ss = 4
  • Required number m = 4
  • Since len(ss) == m, we've found all 4 possible binary codes of length 2

Result: Return true

The key insight here is that we don't need to explicitly check if "00", "01", "10", and "11" are all present. Since there are exactly 4 possible binary codes of length 2, and we found exactly 4 unique substrings of length 2, they must be the complete set of all possible binary codes.

Solution Implementation

1class Solution:
2    def hasAllCodes(self, s: str, k: int) -> bool:
3        """
4        Check if the string contains all possible binary codes of length k.
5      
6        Args:
7            s: A binary string containing only '0' and '1'
8            k: The length of binary codes to check
9          
10        Returns:
11            True if all 2^k possible binary codes of length k exist as substrings in s
12        """
13        string_length = len(s)
14        total_possible_codes = 1 << k  # Calculate 2^k using bit shift
15      
16        # Early termination: if there aren't enough positions to generate all codes
17        # We need at least 2^k different starting positions for substrings
18        if string_length - k + 1 < total_possible_codes:
19            return False
20      
21        # Generate all unique substrings of length k using set comprehension
22        unique_substrings = {s[i:i + k] for i in range(string_length - k + 1)}
23      
24        # Check if we found all possible binary codes
25        return len(unique_substrings) == total_possible_codes
26
1class Solution {
2    /**
3     * Checks if a binary string contains all possible binary codes of length k.
4     * 
5     * @param s The input binary string
6     * @param k The length of binary codes to check
7     * @return true if all possible k-length binary codes appear as substrings, false otherwise
8     */
9    public boolean hasAllCodes(String s, int k) {
10        int stringLength = s.length();
11        // Calculate total number of possible k-bit binary codes (2^k)
12        int totalPossibleCodes = 1 << k;
13      
14        // Early termination: if string doesn't have enough substrings, return false
15        // Number of k-length substrings = stringLength - k + 1
16        if (stringLength - k + 1 < totalPossibleCodes) {
17            return false;
18        }
19      
20        // Use HashSet to store unique k-length substrings
21        Set<String> uniqueSubstrings = new HashSet<>();
22      
23        // Iterate through all possible k-length substrings
24        for (int i = 0; i <= stringLength - k; i++) {
25            // Extract substring from index i to i+k (exclusive)
26            String currentSubstring = s.substring(i, i + k);
27            uniqueSubstrings.add(currentSubstring);
28        }
29      
30        // Check if we found all possible k-bit binary codes
31        return uniqueSubstrings.size() == totalPossibleCodes;
32    }
33}
34
1class Solution {
2public:
3    bool hasAllCodes(string s, int k) {
4        int stringLength = s.size();
5        int totalPossibleCodes = 1 << k;  // 2^k possible binary codes of length k
6      
7        // Early termination: if string doesn't have enough substrings, return false
8        if (stringLength - k + 1 < totalPossibleCodes) {
9            return false;
10        }
11      
12        // Set to store unique binary substrings of length k
13        unordered_set<string> uniqueSubstrings;
14      
15        // Iterate through all possible substrings of length k
16        for (int i = 0; i + k <= stringLength; ++i) {
17            // Extract substring of length k starting at index i and add to set
18            uniqueSubstrings.insert(move(s.substr(i, k)));
19        }
20      
21        // Check if we found all possible binary codes
22        return uniqueSubstrings.size() == totalPossibleCodes;
23    }
24};
25
1/**
2 * Checks if a binary string contains all possible binary codes of length k
3 * @param s - The input binary string
4 * @param k - The length of binary codes to check
5 * @returns true if all 2^k possible binary codes of length k exist as substrings in s
6 */
7function hasAllCodes(s: string, k: number): boolean {
8    const stringLength = s.length;
9    const totalPossibleCodes = 1 << k; // 2^k possible binary codes of length k
10  
11    // Early termination: if string is too short to contain all possible codes
12    if (stringLength - k + 1 < totalPossibleCodes) {
13        return false;
14    }
15  
16    // Use a Set to store unique substrings of length k
17    const uniqueSubstrings = new Set<string>();
18  
19    // Iterate through all possible substrings of length k
20    for (let i = 0; i + k <= stringLength; ++i) {
21        // Extract substring of length k starting at position i
22        const substring = s.slice(i, i + k);
23        uniqueSubstrings.add(substring);
24    }
25  
26    // Check if we found all possible binary codes
27    return uniqueSubstrings.size === totalPossibleCodes;
28}
29

Time and Space Complexity

Time Complexity: O(n × k)

The code creates a set of all substrings of length k from the string s. The set comprehension {s[i : i + k] for i in range(n - k + 1)} iterates through n - k + 1 positions in the string. For each position i, it extracts a substring s[i : i + k], which takes O(k) time to create. Additionally, computing the hash of each substring for set insertion also takes O(k) time. Therefore, the total time complexity is O((n - k + 1) × k), which simplifies to O(n × k).

Space Complexity: O(n × k) or O(min(2^k × k, n × k))

The space complexity is dominated by the set ss that stores all unique substrings of length k. In the worst case, all n - k + 1 substrings are unique, and each substring requires O(k) space to store. This gives us O((n - k + 1) × k) = O(n × k) space complexity. However, there's an upper bound: there can be at most 2^k unique binary strings of length k, so the actual space complexity is O(min(2^k × k, n × k)).

Note: The reference answer states space complexity as O(n), which appears to be considering only the number of substrings stored (n - k + 1 at most) without accounting for the space needed to store each substring of length k. The more precise analysis should include the factor k for storing the actual substring content.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Inefficient String Slicing for Large k Values

The current solution uses string slicing s[i:i + k] which creates a new string object for each substring. When k is large, this becomes inefficient both in terms of time and space, as Python needs to allocate memory and copy characters for each substring.

Solution: Use a rolling hash or sliding window approach to avoid creating substring objects:

class Solution:
    def hasAllCodes(self, s: str, k: int) -> bool:
        if len(s) < k:
            return False
      
        total_possible_codes = 1 << k
        if len(s) - k + 1 < total_possible_codes:
            return False
      
        # Use rolling hash approach
        seen = set()
        hash_val = 0
      
        # Build initial hash for first k characters
        for i in range(k):
            hash_val = (hash_val << 1) | int(s[i])
      
        seen.add(hash_val)
      
        # Roll the hash window
        mask = (1 << k) - 1  # Mask to keep only k bits
        for i in range(k, len(s)):
            hash_val = ((hash_val << 1) | int(s[i])) & mask
            seen.add(hash_val)
      
        return len(seen) == total_possible_codes

2. Memory Issues When k is Large

When k approaches 20 or higher, storing 2^k strings becomes problematic. For example, with k = 20, you'd need to potentially store over 1 million unique strings in memory.

Solution: Check if the theoretical requirement is even feasible before processing:

class Solution:
    def hasAllCodes(self, s: str, k: int) -> bool:
        # Add memory constraint check
        if k > 20:  # 2^20 = 1,048,576 codes
            # For very large k, the string would need to be extremely long
            # to contain all codes, making it impractical
            return len(s) >= (1 << k) + k - 1
      
        # Continue with normal processing for reasonable k values
        # ... rest of the code

3. Not Handling Edge Cases Properly

The original code might not handle certain edge cases correctly:

  • When k = 0: Should return true (only one possible code: empty string)
  • When s is empty but k > 0: Should return false
  • When k > len(s): Should return false

Solution: Add explicit edge case handling:

class Solution:
    def hasAllCodes(self, s: str, k: int) -> bool:
        # Edge case handling
        if k == 0:
            return True
        if not s or k > len(s):
            return False
      
        n = len(s)
        total_codes = 1 << k
      
        # Mathematical impossibility check
        if n - k + 1 < total_codes:
            return False
      
        # Continue with main logic
        unique_substrings = {s[i:i + k] for i in range(n - k + 1)}
        return len(unique_substrings) == total_codes

4. Integer Overflow in Other Languages

While Python handles large integers gracefully, implementing this in languages like Java or C++ could cause integer overflow when calculating 2^k for large k values.

Solution for other languages: Use appropriate data types or add overflow checks:

// Java example
public boolean hasAllCodes(String s, int k) {
    if (k > 30) return false;  // Prevent overflow for int type
  
    int totalCodes = 1 << k;
    // Check for overflow
    if (totalCodes < 0) return false;
  
    // Rest of the implementation
}
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement priority queue?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More