Facebook Pixel

2261. K Divisible Elements Subarrays

MediumTrieArrayHash TableEnumerationHash FunctionRolling Hash
Leetcode Link

Problem Description

You are given an integer array nums and two integers k and p. Your task is to find the number of distinct subarrays where each subarray contains at most k elements that are divisible by p.

A subarray is a contiguous sequence of elements from the array (cannot be empty).

Two subarrays are considered distinct if:

  • They have different lengths, OR
  • There exists at least one index i where the elements at position i differ between the two subarrays

For example, if nums = [1, 2, 3], the subarrays [1, 2] starting at index 0 and [1, 2] starting at index 0 are the same (not distinct), but [1, 2] and [2, 3] are distinct.

The solution uses a double hashing technique to identify unique subarrays. It iterates through all possible subarrays by fixing a starting position i and extending to position j. For each subarray, it:

  1. Counts how many elements are divisible by p (using cnt)
  2. Stops extending if the count exceeds k
  3. Computes two hash values (h1 and h2) using different base values and modulo operations to uniquely identify each subarray
  4. Combines both hash values into a single identifier and stores it in a set
  5. Returns the size of the set, which represents the count of distinct subarrays

The double hashing approach with two different base values (131 and 13331) and two different modulo values helps minimize hash collisions and ensures accurate identification of distinct subarrays.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to find all valid subarrays (those with at most k elements divisible by p) and then count only the distinct ones. The challenge lies in efficiently identifying which subarrays are distinct.

A straightforward approach would be to generate all valid subarrays and store them directly (like storing the actual subarray elements). However, comparing entire subarrays for uniqueness would be inefficient, especially for large arrays.

This leads us to think about hashing - we can represent each subarray with a unique numerical fingerprint. If two subarrays have the same content in the same order, they'll produce the same hash value. By storing these hash values in a set, we automatically handle the distinctness requirement since sets don't allow duplicates.

The natural question is: how do we generate a hash for a subarray? We can treat the subarray as a sequence and compute a rolling hash. For a subarray [a, b, c], we can compute the hash as a * base^2 + b * base + c, similar to how we represent numbers in different bases.

Why double hashing? Single hashing might lead to collisions where different subarrays accidentally produce the same hash value. By using two different hash functions (with different bases 131 and 13331, and different modulos 10^9 + 7 and 10^9 + 9), the probability of collision becomes extremely small. We combine both hash values (h1 << 32 | h2) to create a unique identifier for each subarray.

The enumeration strategy is straightforward: for each starting position i, we extend the subarray by incrementing j. We maintain a count of elements divisible by p as we extend. If this count exceeds k, we stop extending from this starting position and move to the next one. This ensures we only consider valid subarrays while avoiding unnecessary computation.

Learn more about Trie patterns.

Solution Approach

The solution uses enumeration with double hashing to identify and count distinct subarrays that satisfy the constraint.

Implementation Steps:

  1. Initialize a set to store unique subarray hashes:

    s = set()

    This set will automatically handle duplicates for us.

  2. Set up hashing parameters:

    • Two base values: base1 = 131 and base2 = 13331
    • Two modulo values: mod1 = 10^9 + 7 and mod2 = 10^9 + 9

    These are carefully chosen prime numbers to minimize hash collisions.

  3. Enumerate all possible subarrays:

    • Outer loop: Fix the left endpoint i from 0 to n-1
    • Inner loop: Extend the right endpoint j from i to n-1
  4. For each starting position i, initialize:

    h1 = h2 = cnt = 0
    • h1 and h2: Rolling hash values for the two hash functions
    • cnt: Counter for elements divisible by p
  5. Extend the subarray by adding element at position j:

    • Update the divisibility counter:

      cnt += nums[j] % p == 0

      This increments cnt by 1 if nums[j] is divisible by p, otherwise by 0.

    • Check constraint violation:

      if cnt > k:
          break

      If we exceed k divisible elements, stop extending from this starting position.

    • Update rolling hashes:

      h1 = (h1 * base1 + nums[j]) % mod1
      h2 = (h2 * base2 + nums[j]) % mod2

      This implements polynomial rolling hash: for subarray [a, b, c], the hash becomes ((a * base + b) * base + c) % mod.

  6. Combine and store the hash values:

    s.add(h1 << 32 | h2)
    • h1 << 32 shifts h1 left by 32 bits
    • | h2 combines it with h2 using bitwise OR
    • This creates a unique 64-bit identifier from two 32-bit hashes
  7. Return the count of distinct subarrays:

    return len(s)

Time Complexity: O(nΒ²) where n is the length of the array, as we examine all possible subarrays.

Space Complexity: O(nΒ²) in the worst case, as we might store up to n(n+1)/2 distinct subarray hashes in the set.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Example: nums = [2, 3, 3, 2, 2], k = 2, p = 2

We need to find distinct subarrays with at most 2 elements divisible by 2.

Step-by-step process:

Let's trace through the algorithm with simplified hash values (using base=10 for clarity, though actual implementation uses 131 and 13331):

Starting from index i=0:

  • j=0: subarray=[2], cnt=1 (2 is divisible by 2), h1=2, valid βœ“
  • j=1: subarray=[2,3], cnt=1, h1=2*10+3=23, valid βœ“
  • j=2: subarray=[2,3,3], cnt=1, h1=23*10+3=233, valid βœ“
  • j=3: subarray=[2,3,3,2], cnt=2, h1=233*10+2=2332, valid βœ“
  • j=4: subarray=[2,3,3,2,2], cnt=3 > k=2, STOP

Starting from index i=1:

  • j=1: subarray=[3], cnt=0, h1=3, valid βœ“
  • j=2: subarray=[3,3], cnt=0, h1=3*10+3=33, valid βœ“
  • j=3: subarray=[3,3,2], cnt=1, h1=33*10+2=332, valid βœ“
  • j=4: subarray=[3,3,2,2], cnt=2, h1=332*10+2=3322, valid βœ“

Starting from index i=2:

  • j=2: subarray=[3], cnt=0, h1=3, valid βœ“ (duplicate hash with [3] from i=1)
  • j=3: subarray=[3,2], cnt=1, h1=3*10+2=32, valid βœ“
  • j=4: subarray=[3,2,2], cnt=2, h1=32*10+2=322, valid βœ“

Starting from index i=3:

  • j=3: subarray=[2], cnt=1, h1=2, valid βœ“ (duplicate hash with [2] from i=0)
  • j=4: subarray=[2,2], cnt=2, h1=2*10+2=22, valid βœ“

Starting from index i=4:

  • j=4: subarray=[2], cnt=1, h1=2, valid βœ“ (duplicate hash with [2] from i=0 and i=3)

Collecting unique hashes: The set would contain these unique hash values (simplified):

  • {2, 23, 233, 2332, 3, 33, 332, 3322, 32, 322, 22}

Note that:

  • [2] appears 3 times (at indices 0, 3, 4) but only counted once
  • [3] appears 2 times (at indices 1, 2) but only counted once

Final answer: 11 distinct subarrays

The actual implementation uses double hashing with two different bases and modulos to ensure we correctly identify distinct subarrays even when dealing with large numbers and complex patterns.

Solution Implementation

1class Solution:
2    def countDistinct(self, nums: List[int], k: int, p: int) -> int:
3        """
4        Count the number of distinct subarrays where at most k elements are divisible by p.
5        Uses rolling hash with double hashing to identify unique subarrays.
6      
7        Args:
8            nums: Input array of integers
9            k: Maximum number of elements divisible by p allowed in a subarray
10            p: Divisor to check divisibility against
11          
12        Returns:
13            Number of distinct subarrays meeting the criteria
14        """
15        unique_subarrays = set()
16        n = len(nums)
17      
18        # Double hashing parameters to minimize collision probability
19        BASE_1, BASE_2 = 131, 13331
20        MOD_1, MOD_2 = 10**9 + 7, 10**9 + 9
21      
22        # Iterate through all possible starting positions
23        for start_idx in range(n):
24            hash_1 = 0
25            hash_2 = 0
26            divisible_count = 0
27          
28            # Extend the subarray from start_idx to the right
29            for end_idx in range(start_idx, n):
30                # Count elements divisible by p
31                if nums[end_idx] % p == 0:
32                    divisible_count += 1
33              
34                # Stop if we exceed k divisible elements
35                if divisible_count > k:
36                    break
37              
38                # Update rolling hashes for the current subarray
39                hash_1 = (hash_1 * BASE_1 + nums[end_idx]) % MOD_1
40                hash_2 = (hash_2 * BASE_2 + nums[end_idx]) % MOD_2
41              
42                # Combine two hashes into a single unique identifier
43                # Using bit shift to pack two 32-bit values into one
44                combined_hash = (hash_1 << 32) | hash_2
45                unique_subarrays.add(combined_hash)
46      
47        return len(unique_subarrays)
48
1class Solution {
2    public int countDistinct(int[] nums, int k, int p) {
3        // Use a HashSet to store unique hash values of subarrays
4        Set<Long> uniqueSubarrays = new HashSet<>();
5        int arrayLength = nums.length;
6      
7        // Define base values for rolling hash computation
8        int hashBase1 = 131;
9        int hashBase2 = 13331;
10      
11        // Define modulo values to prevent integer overflow
12        int modulo1 = (int) 1e9 + 7;
13        int modulo2 = (int) 1e9 + 9;
14      
15        // Iterate through all possible starting positions of subarrays
16        for (int startIndex = 0; startIndex < arrayLength; ++startIndex) {
17            // Initialize rolling hash values for two different hash functions
18            long hashValue1 = 0;
19            long hashValue2 = 0;
20          
21            // Counter for elements divisible by p in current subarray
22            int divisibleCount = 0;
23          
24            // Extend the subarray from startIndex to endIndex
25            for (int endIndex = startIndex; endIndex < arrayLength; ++endIndex) {
26                // Check if current element is divisible by p
27                divisibleCount += (nums[endIndex] % p == 0) ? 1 : 0;
28              
29                // If we exceed k divisible elements, stop extending this subarray
30                if (divisibleCount > k) {
31                    break;
32                }
33              
34                // Update rolling hash values using polynomial rolling hash technique
35                hashValue1 = (hashValue1 * hashBase1 + nums[endIndex]) % modulo1;
36                hashValue2 = (hashValue2 * hashBase2 + nums[endIndex]) % modulo2;
37              
38                // Combine two hash values into a single long value to minimize collisions
39                // Left shift hashValue1 by 32 bits and combine with hashValue2 using bitwise OR
40                uniqueSubarrays.add((hashValue1 << 32) | hashValue2);
41            }
42        }
43      
44        // Return the count of distinct subarrays
45        return uniqueSubarrays.size();
46    }
47}
48
1class Solution {
2public:
3    int countDistinct(vector<int>& nums, int k, int p) {
4        // Set to store unique hash values representing distinct subarrays
5        unordered_set<long long> uniqueSubarrays;
6        int n = nums.size();
7      
8        // Hash parameters for double hashing to minimize collisions
9        const int BASE1 = 131;
10        const int BASE2 = 13331;
11        const int MOD1 = 1e9 + 7;
12        const int MOD2 = 1e9 + 9;
13      
14        // Iterate through all possible starting positions
15        for (int startIdx = 0; startIdx < n; ++startIdx) {
16            long long hash1 = 0;  // First hash value using BASE1 and MOD1
17            long long hash2 = 0;  // Second hash value using BASE2 and MOD2
18            int divisibleCount = 0;  // Count of elements divisible by p
19          
20            // Extend subarray from startIdx to endIdx
21            for (int endIdx = startIdx; endIdx < n; ++endIdx) {
22                // Count elements divisible by p
23                divisibleCount += (nums[endIdx] % p == 0) ? 1 : 0;
24              
25                // Stop if we exceed k divisible elements
26                if (divisibleCount > k) {
27                    break;
28                }
29              
30                // Update rolling hash values for current subarray
31                hash1 = (hash1 * BASE1 + nums[endIdx]) % MOD1;
32                hash2 = (hash2 * BASE2 + nums[endIdx]) % MOD2;
33              
34                // Combine two hash values into a single 64-bit value
35                // Left shift hash1 by 32 bits and OR with hash2
36                long long combinedHash = (hash1 << 32) | hash2;
37                uniqueSubarrays.insert(combinedHash);
38            }
39        }
40      
41        // Return count of distinct subarrays
42        return uniqueSubarrays.size();
43    }
44};
45
1/**
2 * Counts the number of distinct subarrays where at most k elements are divisible by p
3 * @param nums - The input array of numbers
4 * @param k - Maximum number of elements divisible by p allowed in a subarray
5 * @param p - The divisor to check divisibility against
6 * @returns The count of distinct subarrays meeting the criteria
7 */
8function countDistinct(nums: number[], k: number, p: number): number {
9    // Set to store unique hash values representing distinct subarrays
10    const uniqueSubarrays = new Set<bigint>();
11  
12    // Rolling hash parameters - using two different bases and moduli to minimize collisions
13    const hashBase1 = 131;
14    const hashBase2 = 13331;
15    const modulus1 = 1000000007;
16    const modulus2 = 1000000009;
17  
18    // Iterate through all possible starting positions for subarrays
19    for (let startIndex = 0; startIndex < nums.length; startIndex++) {
20        // Initialize hash values and counter for elements divisible by p
21        let hashValue1 = 0;
22        let hashValue2 = 0;
23        let divisibleCount = 0;
24      
25        // Extend the subarray from the current starting position
26        for (let endIndex = startIndex; endIndex < nums.length; endIndex++) {
27            // Check if current element is divisible by p
28            if (nums[endIndex] % p === 0) {
29                divisibleCount++;
30                // Stop extending if we exceed k divisible elements
31                if (divisibleCount > k) {
32                    break;
33                }
34            }
35          
36            // Update rolling hash values for both hash functions
37            hashValue1 = (hashValue1 * hashBase1 + nums[endIndex]) % modulus1;
38            hashValue2 = (hashValue2 * hashBase2 + nums[endIndex]) % modulus2;
39          
40            // Combine both hash values into a single unique identifier using bit operations
41            // Left shift hashValue1 by 32 bits and OR with hashValue2
42            const combinedHash = (BigInt(hashValue1) << 32n) | BigInt(hashValue2);
43            uniqueSubarrays.add(combinedHash);
44        }
45    }
46  
47    // Return the number of unique subarrays found
48    return uniqueSubarrays.size;
49}
50

Time and Space Complexity

Time Complexity: O(n^2)

The code uses two nested loops:

  • The outer loop runs from i = 0 to i = n-1, iterating n times
  • The inner loop runs from j = i to at most j = n-1 for each i
  • In the worst case, when cnt <= k for all subarrays, the inner loop runs n-i times for each i
  • Total iterations: n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 = O(n^2)
  • Each iteration performs constant time operations: modulo check, hash calculations, and set insertion
  • Therefore, the overall time complexity is O(n^2)

Space Complexity: O(n^2)

  • The set s stores unique hash values representing distinct subarrays
  • In the worst case, all possible subarrays are distinct and satisfy the condition cnt <= k
  • The number of possible subarrays is n(n+1)/2 = O(n^2)
  • Each hash value stored in the set takes O(1) space (a 64-bit integer created by h1 << 32 | h2)
  • Other variables (h1, h2, cnt, i, j) use O(1) space
  • Therefore, the overall space complexity is O(n^2)

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Hash Collision Leading to Undercounting

Problem: Even with double hashing, there's still a possibility of hash collisions where two different subarrays produce the same combined hash value. This would cause the solution to undercount distinct subarrays.

Example Scenario:

# Two different subarrays might theoretically produce the same hash
subarray1 = [1, 2, 3]
subarray2 = [4, 5, 6]  # Different elements but same hash (rare but possible)

Solution: Instead of relying solely on hashing, use tuples of the actual subarray elements as set keys:

class Solution:
    def countDistinct(self, nums: List[int], k: int, p: int) -> int:
        unique_subarrays = set()
        n = len(nums)
      
        for start_idx in range(n):
            divisible_count = 0
          
            for end_idx in range(start_idx, n):
                if nums[end_idx] % p == 0:
                    divisible_count += 1
              
                if divisible_count > k:
                    break
              
                # Store the actual subarray as a tuple (immutable and hashable)
                subarray = tuple(nums[start_idx:end_idx + 1])
                unique_subarrays.add(subarray)
      
        return len(unique_subarrays)

2. Integer Overflow in Hash Calculation

Problem: In languages without automatic big integer handling, the expression (hash_1 << 32) | hash_2 might overflow if hash_1 is large, especially in 32-bit systems or languages with fixed integer sizes.

Solution: Use string concatenation or a tuple to combine hashes:

# Option 1: Use tuple as the key
combined_hash = (hash_1, hash_2)
unique_subarrays.add(combined_hash)

# Option 2: Use string concatenation (less efficient but safer)
combined_hash = f"{hash_1}_{hash_2}"
unique_subarrays.add(combined_hash)

3. Incorrect Divisibility Check with Negative Numbers

Problem: If the array contains negative numbers, the modulo operation behavior might be unexpected in some programming languages. In Python, -5 % 3 = 1, but in some languages it might be -2.

Solution: Use absolute value for divisibility checking if negative numbers are possible:

if abs(nums[end_idx]) % p == 0:
    divisible_count += 1

4. Memory Inefficiency with Large Arrays

Problem: For very large arrays with many distinct subarrays, storing all hash values might consume excessive memory, potentially causing memory errors.

Solution: If memory is a concern and approximate answers are acceptable, consider:

  • Using a bloom filter for approximate membership testing
  • Processing in batches and clearing the set periodically
  • Using a more memory-efficient data structure
# Example: Process in chunks if array is very large
def countDistinctOptimized(self, nums: List[int], k: int, p: int) -> int:
    if len(nums) > 10000:  # Threshold for optimization
        # Use tuple approach for smaller memory footprint
        # Or implement streaming/batching logic
        pass
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which of the tree traversal order can be used to obtain elements in a binary search tree in sorted order?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More