Facebook Pixel

2080. Range Frequency Queries

Problem Description

You need to design a data structure that can efficiently find how many times a specific value appears in a given subarray.

The task is to implement a class called RangeFreqQuery with the following functionality:

  1. Constructor RangeFreqQuery(int[] arr): Initialize the data structure with a given integer array arr (0-indexed).

  2. Query Method query(int left, int right, int value): Return the number of times value appears in the subarray from index left to index right (inclusive).

For example, if you have an array [1, 2, 2, 3, 2, 4] and you query for the value 2 in the range [1, 4], the subarray would be [2, 2, 3, 2], and the frequency of 2 would be 3.

The solution uses a preprocessing approach where during initialization, it creates a hash table that maps each unique value to a list of indices where that value appears in the original array. When a query is made, it uses binary search on the pre-stored index list to quickly find how many indices fall within the given range [left, right]. Specifically, it finds the leftmost position where an index is at least left and the leftmost position where an index is greater than right, then calculates the difference to get the count.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The naive approach would be to iterate through the subarray [left, right] for each query and count occurrences of the target value. However, this would take O(n) time per query, which becomes inefficient when we have many queries.

The key insight is that we can preprocess the array to make queries faster. Think about it this way: if we know all the positions where a particular value appears in the array, we can answer any range query about that value quickly.

For instance, if value 2 appears at indices [1, 3, 5, 7, 9], and we want to know how many times it appears in range [2, 8], we just need to count how many of these indices fall within [2, 8]. The indices [3, 5, 7] are within the range, so the answer is 3.

Since the indices for each value are naturally sorted (we traverse the array from left to right during preprocessing), we can use binary search to efficiently find:

  • The first index in our list that is >= left (lower boundary)
  • The first index in our list that is > right (upper boundary)

The difference between these two positions gives us the count of indices within the range, which equals the frequency of the value in that subarray.

This transforms our problem from "count occurrences in a range" to "count how many pre-stored indices fall within a range", which can be solved in O(log m) time where m is the number of occurrences of the value in the entire array.

Learn more about Segment Tree and Binary Search patterns.

Solution Approach

The solution uses a Hash Table + Binary Search approach to efficiently answer range frequency queries.

Data Structure Setup

We use a hash table g (implemented as a defaultdict(list)) where:

  • Key: Each unique value in the array
  • Value: A list of indices where this value appears

During initialization in the constructor:

def __init__(self, arr: List[int]):
    self.g = defaultdict(list)
    for i, x in enumerate(arr):
        self.g[x].append(i)

We iterate through the array once, and for each element x at index i, we append the index i to the list self.g[x]. Since we traverse from left to right, the indices in each list are automatically sorted in ascending order.

Query Implementation

For the query(left, right, value) method:

def query(self, left: int, right: int, value: int) -> int:
    idx = self.g[value]
    l = bisect_left(idx, left)
    r = bisect_left(idx, right + 1)
    return r - l
  1. Get the index list: We retrieve idx = self.g[value], which contains all indices where value appears. If the value doesn't exist, idx will be an empty list.

  2. Find the left boundary: l = bisect_left(idx, left) finds the leftmost position in idx where we could insert left to maintain sorted order. This gives us the index of the first occurrence that is >= left.

  3. Find the right boundary: r = bisect_left(idx, right + 1) finds the leftmost position where we could insert right + 1. This gives us the index of the first occurrence that is > right.

  4. Calculate frequency: The difference r - l gives us the count of indices that fall within the range [left, right].

Time Complexity

  • Initialization: O(n) where n is the length of the array
  • Query: O(log m) where m is the number of occurrences of the queried value
  • Space Complexity: O(n) to store all indices in the hash table

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example to understand how the solution works.

Given array: [12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56]

Step 1: Initialization (Building the Hash Table)

We traverse the array and build our hash table g that maps each value to its list of indices:

Value 12: appears at indices [0, 9]
Value 33: appears at indices [1, 7]
Value 4:  appears at indices [2]
Value 56: appears at indices [3, 11]
Value 22: appears at indices [4, 8]
Value 2:  appears at indices [5]
Value 34: appears at indices [6, 10]

Our hash table g looks like:

g = {
    12: [0, 9],
    33: [1, 7],
    4:  [2],
    56: [3, 11],
    22: [4, 8],
    2:  [5],
    34: [6, 10]
}

Step 2: Query Example - query(left=1, right=8, value=33)

We want to count how many times 33 appears in the subarray from index 1 to 8 (inclusive).

  1. Retrieve index list: idx = g[33] = [1, 7]

    • These are all positions where 33 appears in the original array
  2. Find left boundary: l = bisect_left([1, 7], 1) = 0

    • We're looking for where to insert 1 in the list [1, 7]
    • Since 1 already exists at position 0, bisect_left returns 0
    • This means the first occurrence of 33 that is >= 1 is at position 0 in our index list
  3. Find right boundary: r = bisect_left([1, 7], 9) = 2

    • We're looking for where to insert 9 (which is right + 1 = 8 + 1)
    • 9 would go after 7, at position 2
    • This means there are 2 indices that are <= 8
  4. Calculate result: r - l = 2 - 0 = 2

    • Both indices [1, 7] fall within the range [1, 8]
    • Therefore, 33 appears 2 times in the subarray

Step 3: Another Query Example - query(left=5, right=10, value=34)

  1. Retrieve index list: idx = g[34] = [6, 10]

  2. Find left boundary: l = bisect_left([6, 10], 5) = 0

    • 5 would be inserted at position 0 (before 6)
    • The first index >= 5 is at position 0 in our list
  3. Find right boundary: r = bisect_left([6, 10], 11) = 2

    • 11 would be inserted at position 2 (after 10)
    • Both indices are <= 10
  4. Calculate result: r - l = 2 - 0 = 2

    • Both indices [6, 10] fall within the range [5, 10]
    • Therefore, 34 appears 2 times in the subarray

Edge Case - Query for non-existent value: query(left=2, right=6, value=99)

  1. Retrieve index list: idx = g[99] = [] (empty list, as 99 doesn't exist)
  2. Find boundaries: l = 0, r = 0 (both binary searches on empty list return 0)
  3. Calculate result: r - l = 0 - 0 = 0 (correctly returns 0)

This approach efficiently answers each query in O(log m) time by leveraging preprocessed data and binary search, rather than scanning through the entire range for each query.

Solution Implementation

1from collections import defaultdict
2from bisect import bisect_left
3from typing import List
4
5
6class RangeFreqQuery:
7    """
8    A data structure that efficiently answers frequency queries for a given value
9    within a specified range of indices in an array.
10    """
11
12    def __init__(self, arr: List[int]):
13        """
14        Initialize the RangeFreqQuery object with the given array.
15        Pre-processes the array by storing all indices for each unique value.
16      
17        Args:
18            arr: The input array of integers
19      
20        Time Complexity: O(n) where n is the length of the array
21        Space Complexity: O(n) for storing the index mappings
22        """
23        # Dictionary mapping each value to a list of indices where it appears
24        self.value_to_indices = defaultdict(list)
25      
26        # Store the index of each occurrence of every value
27        for index, value in enumerate(arr):
28            self.value_to_indices[value].append(index)
29
30    def query(self, left: int, right: int, value: int) -> int:
31        """
32        Query the frequency of a specific value within the range [left, right].
33      
34        Args:
35            left: The left boundary of the range (inclusive)
36            right: The right boundary of the range (inclusive)
37            value: The value whose frequency we want to find
38      
39        Returns:
40            The frequency of 'value' in the range [left, right]
41      
42        Time Complexity: O(log m) where m is the number of occurrences of 'value'
43        """
44        # Get the list of indices where 'value' appears
45        indices = self.value_to_indices[value]
46      
47        # Find the leftmost index >= left using binary search
48        left_position = bisect_left(indices, left)
49      
50        # Find the leftmost index > right using binary search
51        # We use right + 1 to include right in our range
52        right_position = bisect_left(indices, right + 1)
53      
54        # The frequency is the number of indices in the range [left, right]
55        return right_position - left_position
56
57
58# Your RangeFreqQuery object will be instantiated and called as such:
59# obj = RangeFreqQuery(arr)
60# param_1 = obj.query(left, right, value)
61
1class RangeFreqQuery {
2    // Map to store value -> list of indices where the value appears
3    private Map<Integer, List<Integer>> valueToIndicesMap = new HashMap<>();
4
5    /**
6     * Constructor that preprocesses the array
7     * Creates a mapping of each value to all indices where it appears
8     * @param arr The input array to preprocess
9     */
10    public RangeFreqQuery(int[] arr) {
11        // Iterate through the array and record each value's positions
12        for (int index = 0; index < arr.length; index++) {
13            // If the value doesn't exist in map, create a new list for it
14            // Then add the current index to the list
15            valueToIndicesMap.computeIfAbsent(arr[index], key -> new ArrayList<>()).add(index);
16        }
17    }
18
19    /**
20     * Query the frequency of a value within the range [left, right]
21     * @param left The left boundary of the range (inclusive)
22     * @param right The right boundary of the range (inclusive)
23     * @param value The value to count within the range
24     * @return The frequency of the value in the given range
25     */
26    public int query(int left, int right, int value) {
27        // If the value doesn't exist in the array, return 0
28        if (!valueToIndicesMap.containsKey(value)) {
29            return 0;
30        }
31      
32        // Get the list of indices where the value appears
33        List<Integer> indicesList = valueToIndicesMap.get(value);
34      
35        // Find the leftmost position where index >= left using binary search
36        int leftPosition = Collections.binarySearch(indicesList, left);
37        // If exact match not found, binarySearch returns -(insertion point) - 1
38        // Convert it to the insertion point (first index >= left)
39        leftPosition = leftPosition < 0 ? -leftPosition - 1 : leftPosition;
40      
41        // Find the leftmost position where index > right using binary search
42        // We search for right + 1 to find the first index beyond our range
43        int rightPosition = Collections.binarySearch(indicesList, right + 1);
44        // Convert negative result to insertion point
45        rightPosition = rightPosition < 0 ? -rightPosition - 1 : rightPosition;
46      
47        // The count is the difference between the two positions
48        // This gives us the number of indices in range [left, right]
49        return rightPosition - leftPosition;
50    }
51}
52
53/**
54 * Your RangeFreqQuery object will be instantiated and called as such:
55 * RangeFreqQuery obj = new RangeFreqQuery(arr);
56 * int param_1 = obj.query(left,right,value);
57 */
58
1class RangeFreqQuery {
2private:
3    // Map from value to list of indices where the value appears in the original array
4    unordered_map<int, vector<int>> valueToIndices;
5  
6public:
7    /**
8     * Constructor: Preprocesses the array to build an index map
9     * @param arr: The input array to process
10     * Time Complexity: O(n) where n is the length of arr
11     * Space Complexity: O(n) to store all indices
12     */
13    RangeFreqQuery(vector<int>& arr) {
14        // Build the index map: for each value, store all positions where it appears
15        for (int i = 0; i < arr.size(); ++i) {
16            valueToIndices[arr[i]].push_back(i);
17        }
18    }
19  
20    /**
21     * Query the frequency of a value in the range [left, right]
22     * @param left: The left boundary of the range (inclusive)
23     * @param right: The right boundary of the range (inclusive)
24     * @param value: The value to count in the given range
25     * @return: The frequency of the value in the range [left, right]
26     * Time Complexity: O(log m) where m is the frequency of the value in the entire array
27     */
28    int query(int left, int right, int value) {
29        // Check if the value exists in the array
30        if (!valueToIndices.contains(value)) {
31            return 0;
32        }
33      
34        // Get the list of indices where the value appears
35        const auto& indices = valueToIndices[value];
36      
37        // Find the first index >= left using binary search
38        auto leftIterator = lower_bound(indices.begin(), indices.end(), left);
39      
40        // Find the first index > right (i.e., >= right + 1) using binary search
41        auto rightIterator = lower_bound(indices.begin(), indices.end(), right + 1);
42      
43        // The count is the difference between the two iterators
44        return rightIterator - leftIterator;
45    }
46};
47
48/**
49 * Your RangeFreqQuery object will be instantiated and called as such:
50 * RangeFreqQuery* obj = new RangeFreqQuery(arr);
51 * int param_1 = obj->query(left,right,value);
52 */
53
1// Map to store value -> array of indices where the value appears
2let valueToIndicesMap: Map<number, number[]> = new Map();
3
4/**
5 * Initializes the data structure with the given array
6 * Creates a mapping of each unique value to all indices where it appears
7 * @param arr - The input array to process
8 */
9function initializeRangeFreqQuery(arr: number[]): void {
10    // Clear any existing data
11    valueToIndicesMap = new Map();
12  
13    // Iterate through the array and record indices for each value
14    for (let index = 0; index < arr.length; index++) {
15        const currentValue = arr[index];
16      
17        // Initialize array for this value if it doesn't exist
18        if (!valueToIndicesMap.has(currentValue)) {
19            valueToIndicesMap.set(currentValue, []);
20        }
21      
22        // Add current index to the list of indices for this value
23        valueToIndicesMap.get(currentValue)!.push(index);
24    }
25}
26
27/**
28 * Performs binary search to find the insertion position of target in a sorted array
29 * @param sortedArray - The sorted array to search in
30 * @param target - The target value to find insertion position for
31 * @returns The index where target should be inserted to maintain sorted order
32 */
33function binarySearchInsertPosition(sortedArray: number[], target: number): number {
34    let leftBound = 0;
35    let rightBound = sortedArray.length;
36  
37    while (leftBound < rightBound) {
38        const mid = Math.floor((leftBound + rightBound) / 2);
39      
40        if (sortedArray[mid] < target) {
41            leftBound = mid + 1;
42        } else {
43            rightBound = mid;
44        }
45    }
46  
47    return leftBound;
48}
49
50/**
51 * Queries the frequency of a specific value within a given range
52 * @param left - The left boundary of the range (inclusive)
53 * @param right - The right boundary of the range (inclusive)
54 * @param value - The value to count occurrences of
55 * @returns The number of times the value appears in the range [left, right]
56 */
57function query(left: number, right: number, value: number): number {
58    // Get the array of indices where the value appears
59    const indicesArray = valueToIndicesMap.get(value);
60  
61    // If value doesn't exist in the original array, return 0
62    if (!indicesArray) {
63        return 0;
64    }
65  
66    // Find the first index >= left using binary search
67    const leftInsertPosition = binarySearchInsertPosition(indicesArray, left);
68  
69    // Find the first index > right using binary search
70    const rightInsertPosition = binarySearchInsertPosition(indicesArray, right + 1);
71  
72    // The count is the difference between the two positions
73    return rightInsertPosition - leftInsertPosition;
74}
75

Time and Space Complexity

Time Complexity:

  • Constructor (__init__): O(n) where n is the length of the input array. The constructor iterates through each element in the array exactly once, and for each element, it appends the index to the corresponding list in the dictionary. The append operation is O(1) amortized, so the total time is O(n).

  • Query function (query): O(log m) where m is the number of occurrences of the queried value in the array. The function performs two binary searches using bisect_left on the list of indices for the given value. Each binary search takes O(log m) time. Since we perform two binary searches, the total time is O(log m) + O(log m) = O(log m). In the worst case where all elements in the array are the same value, m = n, so the worst-case time complexity is O(log n).

Space Complexity:

  • O(n) where n is the length of the input array. The dictionary self.g stores a list of indices for each unique value in the array. In total, we store exactly n indices across all lists in the dictionary (since each index from 0 to n-1 appears exactly once). Therefore, the space complexity is O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Off-by-One Error in Right Boundary Calculation

The Pitfall: A common mistake is using bisect_right(indices, right) instead of bisect_left(indices, right + 1) for finding the right boundary. While both might seem to work, they can produce different results in edge cases.

# Incorrect approach that might seem intuitive
def query(self, left: int, right: int, value: int) -> int:
    indices = self.value_to_indices[value]
    left_position = bisect_left(indices, left)
    right_position = bisect_right(indices, right)  # Wrong!
    return right_position - left_position

Why it's wrong:

  • bisect_right(indices, right) finds the position after all occurrences of right
  • bisect_left(indices, right + 1) finds the position before all occurrences of values > right
  • These are equivalent only when right is in the indices list, but differ when it's not

Example demonstrating the issue:

arr = [1, 2, 3, 2, 5, 2]
# indices for value 2: [1, 3, 5]
# Query for value=2 in range [0, 4]

# Using bisect_right(indices, 4): returns 2 (correct by coincidence)
# Using bisect_left(indices, 5): returns 2 (correct)

# But if we query range [0, 3]:
# Using bisect_right(indices, 3): returns 2 (correct)
# Using bisect_left(indices, 4): returns 2 (correct)

# However, if indices were [1, 3, 4, 5] and we query [0, 3]:
# Using bisect_right(indices, 3): returns 2 (wrong - misses index 3)
# Using bisect_left(indices, 4): returns 2 (correct)

2. Not Handling Non-Existent Values

The Pitfall: Forgetting that the queried value might not exist in the original array. Since we use defaultdict(list), this is handled automatically, but with a regular dictionary, this would cause a KeyError.

# Problematic if using regular dict instead of defaultdict
def __init__(self, arr: List[int]):
    self.value_to_indices = {}  # Regular dict
    for index, value in enumerate(arr):
        if value not in self.value_to_indices:
            self.value_to_indices[value] = []
        self.value_to_indices[value].append(index)

def query(self, left: int, right: int, value: int) -> int:
    # This will raise KeyError if value doesn't exist!
    indices = self.value_to_indices[value]  
    # ...

Solution: Either use defaultdict(list) as in the original solution, or add a check:

def query(self, left: int, right: int, value: int) -> int:
    if value not in self.value_to_indices:
        return 0
    indices = self.value_to_indices[value]
    left_position = bisect_left(indices, left)
    right_position = bisect_left(indices, right + 1)
    return right_position - left_position

3. Misunderstanding Binary Search Return Values

The Pitfall: Assuming that bisect_left returns -1 or None when the element isn't found (like some search functions do). In reality, it returns the insertion position that would maintain sorted order.

# Incorrect assumption about bisect_left behavior
def query(self, left: int, right: int, value: int) -> int:
    indices = self.value_to_indices[value]
    left_position = bisect_left(indices, left)
  
    # Wrong: checking if element was "found"
    if left_position == -1 or left_position >= len(indices):
        return 0
  
    # This check is unnecessary and incorrect

Why it works correctly without checks:

  • If left is greater than all indices, bisect_left returns len(indices)
  • If right + 1 is greater than all indices, bisect_left returns len(indices)
  • The subtraction right_position - left_position naturally gives 0 when no indices fall in the range
  • The algorithm handles all edge cases automatically through the mathematical properties of the insertion positions
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of the tree traversal order can be used to obtain elements in a binary search tree in sorted order?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More