1157. Online Majority Element In Subarray

HardDesignBinary Indexed TreeSegment TreeArrayBinary Search
Leetcode Link

Problem Description

The task is to design a data structure that can quickly find the majority element of any subarray within a given array. A majority element in this context is defined as an element that appears at least threshold times within the subarray. This data structure should be able to handle a large number of queries efficiently.

The class MajorityChecker is to be implemented with the following methods:

  • MajorityChecker(int[] arr) - Initializes the MajorityChecker class with the array arr.
  • int query(int left, int right, int threshold) - Returns the element within the subarray arr[left...right] that occurs at least threshold times. If no such element exists, the method should return -1.

These operations need to be optimized for speed because a straightforward implementation that simply counts occurrences of elements within the subarray on each query would result in poor performance, especially when there's a large number of queries or the subarray sizes are large.

Intuition

The solution needs to employ efficient data structures and algorithms to handle frequent and potentially large-scale queries. The majority of the heavy-lifting is done using a Segment Tree—a powerful data structure that allows us to store information about subarrays in a tree format where each node represents information about a segment (or subarray) of the array.

For the purpose of this problem, each node in the Segment Tree contains the following information:

  • The range of indices (l and r) it represents.
  • The element that appears most frequently within that range (x).
  • The count of how many times that element appears (cnt).

When constructing the Segment Tree, we merge the information of the two children nodes to calculate the information for their parent by following the majority voting algorithm concept, which is, if both halves have the same majority, it will also be the majority of the whole range. If the halves have different majorities, the global majority will be the one with the higher count, and the count will be the difference of counts from both halves.

The query function of the Segment Tree performs the same merge operation recursively to find the majority element and its count within any queried subarray.

To ensure that the queried element indeed meets the threshold constraint, we perform a binary search on the original array to quickly count the actual occurrences of the element between a specific range using a dictionary that stores indices for each element. If the actual count meets or exceeds the threshold, we return the element; otherwise, we return -1.

This approach significantly reduces the time complexity of the problem, allowing us to quickly answer queries after the initial Segment Tree is built.

Learn more about Segment Tree and Binary Search patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the following is a min heap?

Solution Approach

The solution applies two critical data structures: a Segment Tree to efficiently manage subarray queries and a hashmap (dictionary in Python) to record the indices of each element. Additionally, the solution uses a binary search to quickly determine whether a candidate element meets the threshold.

Segment Tree

The SegmentTree class is custom-built to handle queries for the majority element of subarrays. It is initialized with the input array nums, and builds the tree using a recursive build method:

  • The build method initializes each node of the Segment Tree to hold information about which part of the array it represents. Each leaf node represents a single-element subarray and hence, will have a count (cnt) of 1. Internal nodes represent larger subarrays obtained by merging two child subarrays.

  • The query method of the SegmentTree class takes in three parameters: the current node u, and the left and right bounds of the query l and r. It uses these bounds to navigate the tree and find or calculate the majority element and its count for the given subarray. It recursively queries the left and right children to get their majority element and count, then merges the results.

  • The pushup method defines how to merge children nodes to form parent nodes during the build process. If the two children have the same majority element, then it is also the parent's majority element, and its count is the sum of the children's counts. If they have different majority elements, the element with the higher count becomes the parent's majority element, and the count is the difference between the two counts.

Hashmap and Binary Search

A dictionary self.d is used to map each distinct element in the array to the list of indices where it occurs ("inverted index"). This is used later in the query method to quickly find the real occurrence count of candidate majority elements within any subarray.

In the query method of the MajorityChecker class, the binary search function bisect_left from Python's bisect module is employed to efficiently find the number of times the candidate majority element appears within the given query bounds:

  • After obtaining a candidate majority element from the Segment Tree, bisect_left is used twice: once to find the first occurrence of the element at or after left, and once to find the first occurrence at or after right + 1. The difference between these positions gives us the true occurrence count in the desired range.

  • Finally, if the count is greater than or equal to the threshold, we return the element; otherwise, we return -1, indicating that no such majority element meeting the threshold exists in the subarray.

By combining these techniques, the solution is able to efficiently process queries for the majority element in subarrays of the input array, satisfying the problem's constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

A heap is a ...?

Example Walkthrough

Let's walk through a small example to illustrate the solution approach described above. Consider the following array:

1arr = [1, 1, 2, 2, 1, 2, 2]

Let's initialize our MajorityChecker class with this array:

1majorityChecker = MajorityChecker(arr)

The Segment Tree will be built using this array, and the dictionary for binary searches will be populated with indices of each element:

1Segment Tree (conceptual):
2            [0, 6]
3           /      \
4     [0, 3]        [4, 6]
5     /    \        /     \
6[0, 1]   [2, 3] [4, 5]   [6]
7
8Dictionary:
9{1: [0, 1, 4], 2: [2, 3, 5, 6]}

Now, let's perform a query on the subarray from index 1 to 5 with a threshold of 3. We use the query method:

1element = majorityChecker.query(1, 5, 3)

To answer this query, the MajorityChecker does the following:

  1. Invokes query method on the Segment Tree and navigates the nodes corresponding to the range [1, 5].

  2. It finds candidate majority elements based on the subarrays covered by this range. In this case, it may find that 2 is a candidate because it appears as a majority in subarrays [2, 3] and [4, 5].

  3. Next, the code uses the dictionary {1: [0, 1, 4], 2: [2, 3, 5, 6]} to quickly access how often 2 appears between indices 1 and 5.

    • First, bisect_left finds the first occurrence of 2 at or after index 1, which is 2.
    • Then, bisect_left finds the first occurrence of 2 at or after index 6 which is 6.
    • The difference of these indices tells us that 2 appears 3 times in the subrange [1, 5].
  4. Since this count equals the threshold of 3, the query will return 2. If the count was less than 3, the function would return -1, indicating no element meets the threshold.

Thus, for this example query, the result would be:

1element = majorityChecker.query(1, 5, 3)
2print(element)  # Output: 2

By utilizing the Segment Tree for fast range queries and the hashmap for efficient exact counts within the range, the MajorityChecker class quickly identifies the majority element or reports its absence if the threshold is not met.

Solution Implementation

1from collections import defaultdict
2from bisect import bisect_left
3
4
5class Node:
6    def __init__(self):
7        self.left_child = self.right_child = 0  # L and R represent the range of segment
8        self.value = self.count = 0  # Value stores the majority element and count stores its frequency
9
10
11class SegmentTree:
12    def __init__(self, nums):
13        self.nums = nums  # Original list of numbers
14        n = len(nums)
15        self.tree_nodes = [Node() for _ in range(n << 2)]  # Segment tree nodes
16        self.build(1, 1, n)
17
18    # Function to build the segment tree recursively
19    def build(self, node, left, right):
20        self.tree_nodes[node].left_child, self.tree_nodes[node].right_child = left, right
21        if left == right:
22            # Leaf node assignment where the value is the number itself and the count is 1
23            self.tree_nodes[node].value = self.nums[left - 1]
24            self.tree_nodes[node].count = 1
25            return
26        mid = (left + right) >> 1
27        self.build(node << 1, left, mid)  # Build left subtree
28        self.build(node << 1 | 1, mid + 1, right)  # Build right subtree
29        self.push_up(node)
30
31    # Function to query the segment tree for the majority element
32    def query(self, node, left, right):
33        if self.tree_nodes[node].left_child >= left and self.tree_nodes[node].right_child <= right:
34            # If the current node segment is within the query range, return its value and count.
35            return self.tree_nodes[node].value, self.tree_nodes[node].count
36        mid = (self.tree_nodes[node].left_child + self.tree_nodes[node].right_child) >> 1
37        if right <= mid:
38            return self.query(node << 1, left, right)  # Query left subtree
39        if left > mid:
40            return self.query(node << 1 | 1, left, right)  # Query right subtree
41        # If the query range spans both subtrees, query both and combine the results
42        value_left, count_left = self.query(node << 1, left, right)
43        value_right, count_right = self.query(node << 1 | 1, left, right)
44        if value_left == value_right:
45            return value_left, count_left + count_right
46        if count_left >= count_right:
47            return value_left, count_left - count_right
48        else:
49            return value_right, count_right - count_left
50
51    # Function to update the internal nodes of the segment tree after recursive build
52    def push_up(self, node):
53        left = node << 1
54        right = node << 1 | 1
55        if self.tree_nodes[left].value == self.tree_nodes[right].value:
56            self.tree_nodes[node].value = self.tree_nodes[left].value
57            self.tree_nodes[node].count = self.tree_nodes[left].count + self.tree_nodes[right].count
58        elif self.tree_nodes[left].count >= self.tree_nodes[right].count:
59            self.tree_nodes[node].value = self.tree_nodes[left].value
60            self.tree_nodes[node].count = self.tree_nodes[left].count - self.tree_nodes[right].count
61        else:
62            self.tree_nodes[node].value = self.tree_nodes[right].value
63            self.tree_nodes[node].count = self.tree_nodes[right].count - self.tree_nodes[left].count
64
65
66class MajorityChecker:
67    def __init__(self, arr):
68        self.tree = SegmentTree(arr)  # Instantiate the segment tree with the given array
69        self.index_dict = defaultdict(list)
70        for index, value in enumerate(arr):
71            self.index_dict[value].append(index)
72
73    # Function to query if there's a majority element within a specific range that exceeds the threshold
74    def query(self, left: int, right: int, threshold: int) -> int:
75        majority_value, _ = self.tree.query(1, left + 1, right + 1)
76        left_index = bisect_left(self.index_dict[majority_value], left)
77        right_index = bisect_left(self.index_dict[majority_value], right + 1)
78        # Check if the count of the majority element exceeds the threshold
79        if right_index - left_index >= threshold:
80            return majority_value
81        return -1  # Return -1 if no majority element meets the threshold
82
83
84# Example of how to use the MajorityChecker:
85# obj = MajorityChecker(arr)
86# result = obj.query(left, right, threshold)
87
1import java.util.*;
2
3class Node {
4    int left, right; // Boundaries of the segment
5    int value, count; // Value count in the segment
6
7    // Node constructor
8    public Node() {
9        left = 0;
10        right = 0;
11        value = 0;
12        count = 0;
13    }
14}
15
16class SegmentTree {
17    private Node[] tree; // Array representing the segment tree
18    private int[] nums; // Original input array
19
20    // SegmentTree constructor
21    public SegmentTree(int[] nums) {
22        this.nums = nums;
23        int n = nums.length;
24      
25        // Initialize the tree array with empty nodes
26        tree = new Node[n * 4]; 
27        for (int i = 0; i < tree.length; ++i) {
28            tree[i] = new Node();
29        }
30        build(1, 1, n);
31    }
32
33    // Builds the segment tree from the input array
34    private void build(int u, int l, int r) {
35        tree[u].left = l;
36        tree[u].right = r;
37        if (l == r) { // Base case: leaf node
38            tree[u].value = nums[l - 1];
39            tree[u].count = 1;
40            return;
41        }
42        int mid = (l + r) >> 1; // Mid-point of the segment
43        build(u * 2, l, mid); // Recursively build the left subtree
44        build(u * 2 + 1, mid + 1, r); // Recursively build the right subtree
45        pushUp(u); // Merge results from children nodes
46    }
47
48    // Merge operation, to be used after each recursive call
49    private void pushUp(int u) {
50        if (tree[u * 2].value == tree[u * 2 + 1].value) {
51            tree[u].value = tree[u * 2].value;
52            tree[u].count = tree[u * 2].count + tree[u * 2 + 1].count;
53        } else if (tree[u * 2].count >= tree[u * 2 + 1].count) {
54            tree[u].value = tree[u * 2].value;
55            tree[u].count = tree[u * 2].count - tree[u * 2 + 1].count;
56        } else {
57            tree[u].value = tree[u * 2 + 1].value;
58            tree[u].count = tree[u * 2 + 1].count - tree[u * 2].count;
59        }
60    }
61
62    // Queries the segment tree for the majority element between indices l and r
63    public int[] query(int u, int l, int r) {
64        if (tree[u].left >= l && tree[u].right <= r) {
65            return new int[]{tree[u].value, tree[u].count};
66        }
67        int mid = (tree[u].left + tree[u].right) >> 1;
68        if (r <= mid) {
69            return query(u * 2, l, r);
70        }
71        if (l > mid) {
72            return query(u * 2 + 1, l, r);
73        }
74        int[] left = query(u * 2, l, r);
75        int[] right = query(u * 2 + 1, l, r);
76        if (left[0] == right[0]) {
77            left[1] += right[1];
78        } else if (left[1] >= right[1]) {
79            left[1] -= right[1];
80        } else {
81            right[1] -= left[1];
82            left = right;
83        }
84        return left;
85    }
86}
87
88class MajorityChecker {
89    private SegmentTree tree; // SegmentTree instance
90    private Map<Integer, List<Integer>> indexMap = new HashMap<>(); // Map of value to list of indices
91
92    // MajorityChecker constructor
93    public MajorityChecker(int[] arr) {
94        tree = new SegmentTree(arr);
95        for (int i = 0; i < arr.length; ++i) {
96            // Fill the indexMap with the occurrences of each value
97            indexMap.computeIfAbsent(arr[i], k -> new ArrayList<>()).add(i);
98        }
99    }
100
101    // Answers queries for majority elements
102    public int query(int left, int right, int threshold) {
103        int x = tree.query(1, left + 1, right + 1)[0];
104        List<Integer> indices = indexMap.get(x);
105        int l = search(indices, left);
106        int r = search(indices, right + 1);
107        if (r - l >= threshold) {
108            return x; // x has at least the threshold number of occurrences
109        }
110        return -1; // No value passes the threshold
111    }
112
113    // Binary search to find the index of the element in the sorted list
114    private int search(List<Integer> arr, int x) {
115        int left = 0, right = arr.size();
116        while (left < right) {
117            int mid = (left + right) >> 1;
118            if (arr.get(mid) >= x) {
119                right = mid;
120            } else {
121                left = mid + 1;
122            }
123        }
124        return left;
125    }
126}
127
1#include <vector>
2#include <unordered_map>
3#include <algorithm>
4using namespace std;
5
6// Definition for the node structure used within the segment tree.
7class Node {
8public:
9    int left = 0, right = 0; // Range boundaries of the node
10    int value = 0, count = 0; // Value and count for the majority element in the range
11};
12
13using Pair = pair<int, int>;
14
15// Definition of the segment tree class to manage range queries efficiently.
16class SegmentTree {
17public:
18    // Constructor initializes the segment tree with input numbers.
19    SegmentTree(vector<int>& nums) {
20        this->nums = nums;
21        int n = nums.size();
22        tr.resize(4 * n); // Typically 4*n is a safe size for the segment tree array
23        for (int i = 0; i < tr.size(); ++i) {
24            tr[i] = new Node();
25        }
26        buildTree(1, 1, n);
27    }
28
29    // Query the range for the majority element and its count.
30    Pair query(int idx, int l, int r) {
31        Node* node = tr[idx];
32        if (node->left >= l && node->right <= r) {
33            return {node->value, node->count};
34        }
35        int mid = (node->left + node->right) >> 1;
36        // If entire range is on the left side of the tree.
37        if (r <= mid) {
38            return query(idx << 1, l, r);
39        }
40        // If entire range is on the right side of the tree.
41        if (l > mid) {
42            return query((idx << 1) | 1, l, r);
43        }
44        // If the range is split between the left and right children.
45        Pair leftQueryResult = query(idx << 1, l, r);
46        Pair rightQueryResult = query((idx << 1) | 1, l, r);
47        return mergeQueryResults(leftQueryResult, rightQueryResult);
48    }
49
50private:
51    vector<Node*> tr; // Array to store the segment tree nodes
52    vector<int> nums; // Array of numbers to build the segment tree
53
54    // Recursive function to build the segment tree.
55    void buildTree(int idx, int l, int r) {
56        Node* node = tr[idx];
57        node->left = l;
58        node->right = r;
59        if (l == r) {
60            node->value = nums[l - 1];
61            node->count = 1;
62            return;
63        }
64        int mid = (l + r) >> 1;
65        buildTree(idx << 1, l, mid);
66        buildTree((idx << 1) | 1, mid + 1, r);
67        pushUp(idx);
68    }
69
70    // Helper function to update information for the parent based on its children.
71    void pushUp(int idx) {
72        Node* leftChild = tr[idx << 1];
73        Node* rightChild = tr[(idx << 1) | 1];
74        Node* parent = tr[idx];
75        parent->value = (leftChild->count >= rightChild->count) ? leftChild->value : rightChild->value;
76        parent->count = abs(leftChild->count - rightChild->count);
77    }
78
79    // Function to merge the results of two queries - left and right child queries.
80    Pair mergeQueryResults(const Pair &leftResult, const Pair &rightResult) {
81        if (leftResult.first == rightResult.first) {
82            return {leftResult.first, leftResult.second + rightResult.second};
83        }
84        return (leftResult.second >= rightResult.second) ?
85               Pair{leftResult.first, leftResult.second - rightResult.second} :
86               Pair{rightResult.first, rightResult.second - leftResult.second};
87    }
88};
89
90// Class to check for majority element inside a range with more than threshold occurrences.
91class MajorityChecker {
92public:
93    // Constructor that initializes the majority checker with the array of elements.
94    MajorityChecker(vector<int>& arr) {
95        tree = new SegmentTree(arr);
96        for (int i = 0; i < arr.size(); ++i) {
97            elementsPositionsMap[arr[i]].push_back(i);
98        }
99    }
100
101    // Queries for the majority element in the range [left, right] with a minimum threshold.
102    int query(int left, int right, int threshold) {
103        int candidate = tree->query(1, left + 1, right + 1).first;
104        vector<int>& positions = elementsPositionsMap[candidate];
105        auto leftBound = lower_bound(positions.begin(), positions.end(), left);
106        auto rightBound = lower_bound(positions.begin(), positions.end(), right + 1);
107        if (rightBound - leftBound >= threshold) {
108            return candidate;
109        }
110        return -1; // Return -1 if no such element exists.
111    }
112
113private:
114    unordered_map<int, vector<int>> elementsPositionsMap; // Maps each element to its positions in the array.
115    SegmentTree* tree; // Segment tree to facilitate range queries.
116};
117
118/**
119 * Your MajorityChecker object will be instantiated and called as such:
120 * MajorityChecker* obj = new MajorityChecker(arr);
121 * int param_1 = obj->query(left,right,threshold);
122 */
123
1// Import necessary features from the standard TypeScript library.
2import { lowerBound } from './std'; // Assume lowerBound is implemented somewhere as it's not native to JS/TS
3
4// Node structure used within the segment tree (array structure).
5type Node = {
6    left: number;
7    right: number;
8    value: number;
9    count: number;
10};
11
12type Pair = [number, number];
13
14// Segment Tree data to simulate the functionality of segment tree nodes
15let tr: Node[];
16
17// Input array of numbers for which the segment tree is being built
18let nums: number[];
19
20// Function to initialize the segment tree
21function createSegmentTree(inputNums: number[]): void {
22    nums = inputNums;
23    const n = nums.length;
24    tr = new Array(4 * n).fill(null).map(() => ({ left: 0, right: 0, value: 0, count: 0 }));
25    buildTree(1, 1, n);
26}
27
28// Function to query the segment tree
29function querySegmentTree(idx: number, l: number, r: number): Pair {
30    let node = tr[idx];
31    if (node.left >= l && node.right <= r) {
32        return [node.value, node.count];
33    }
34    let mid = (node.left + node.right) >> 1;
35    if (r <= mid) {
36        return querySegmentTree(idx << 1, l, r);
37    }
38    if (l > mid) {
39        return querySegmentTree((idx << 1) | 1, l, r);
40    }
41    let leftQueryResult = querySegmentTree(idx << 1, l, r);
42    let rightQueryResult = querySegmentTree((idx << 1) | 1, l, r);
43    return mergeQueryResults(leftQueryResult, rightQueryResult);
44}
45
46// Recursively build the segment tree
47function buildTree(idx: number, l: number, r: number): void {
48    let node = tr[idx];
49    node.left = l;
50    node.right = r;
51    if (l === r) {
52        node.value = nums[l - 1];
53        node.count = 1;
54        return;
55    }
56    let mid = (l + r) >> 1;
57    buildTree(idx << 1, l, mid);
58    buildTree((idx << 1) | 1, mid + 1, r);
59    pushUp(idx);
60}
61
62// Update parent node based on the values from its children
63function pushUp(idx: number): void {
64    let leftChild = tr[idx << 1];
65    let rightChild = tr[(idx << 1) | 1];
66    let parent = tr[idx];
67    parent.value = (leftChild.count >= rightChild.count) ? leftChild.value : rightChild.value;
68    parent.count = Math.abs(leftChild.count - rightChild.count);
69}
70
71// Merge two query results into one
72function mergeQueryResults(leftResult: Pair, rightResult: Pair): Pair {
73    if (leftResult[0] === rightResult[0]) {
74        return [leftResult[0], leftResult[1] + rightResult[1]];
75    }
76    return (leftResult[1] >= rightResult[1]) ?
77        [leftResult[0], leftResult[1] - rightResult[1]] :
78        [rightResult[0], rightResult[1] - leftResult[1]];
79}
80
81// Maps each element to the indices where it appears in the array
82let elementsPositionsMap: { [key: number]: number[] };
83
84// Initialize the majority checker with the array of elements and build the segment tree
85function createMajorityChecker(arr: number[]): void {
86    elementsPositionsMap = {};
87    arr.forEach((element, index) => {
88        if (!elementsPositionsMap[element]) {
89            elementsPositionsMap[element] = [];
90        }
91        elementsPositionsMap[element].push(index);
92    });
93    createSegmentTree(arr);
94}
95
96// Queries for the majority element in the range [left, right] with minimum threshold.
97// Returns the majority element if it exists, or -1 otherwise.
98function majorityQuery(left: number, right: number, threshold: number): number {
99    let candidate = querySegmentTree(1, left + 1, right + 1)[0];
100    let positions = elementsPositionsMap[candidate] ?? [];
101    let leftBound = lowerBound(positions, left);
102    let rightBound = lowerBound(positions, right + 1);
103    if (rightBound - leftBound >= threshold) {
104        return candidate;
105    }
106    return -1;
107}
108
Not Sure What to Study? Take the 2-min Quiz:

What's the relationship between a tree and a graph?

Time and Space Complexity

Time Complexity

SegmentTree __init__ (Build Tree):

  • The tree is built using a recursive divide-and-conquer approach. Each node of the tree represents a segment of the initial array, with leaf nodes representing individual elements. Building this tree involves visiting each element once, thus each node is built in O(1) time. However, since we visit each of the n elements once, and each element corresponds to a node, the overall complexity of building the tree is O(n).

SegmentTree query:

  • Querying the segment tree requires traversing from the root of the tree down to the segments which might contain the query range. This traversal goes through at most O(log n) segments from top to bottom, as the tree has a height of log n.

MajorityChecker __init__:

  • Here, we initialize the segment tree as described previously, which is O(n). We also construct a hashmap (self.d) where keys are elements of the array and values are lists of indices where these elements are found. Since this requires iterating through the entire array once and appending indices, which takes O(1) time for each element, this process is also O(n).

MajorityChecker query:

  • We perform a segment tree query which has O(log n) complexity. We then do two binary searches using bisect_left, which are O(log k) each, where k is the number of occurrences of the element x. In the worst case scenario, an element could occur everywhere which would make k = n, hence binary search complexity would be O(log n). But typically k can be considered much smaller than n depending on the array. Finally, we perform a simple calculation to check if the found element appears at least threshold times between left and right.

Space Complexity

SegmentTree:

  • The segment tree is a full binary tree containing 4n nodes, as it is a common practice to allocate 4 times the size of the array space for convenience when using segment trees. So, the space complexity of the segment tree is O(n).

MajorityChecker:

  • Apart from the segment tree, we store a dictionary which contains all indices of each unique number in the given array. In the worst case, where all elements are distinct, the space taken would be proportional to 2n (each element with a single index). Hence the space complexity would remain O(n). However, in general, the space complexity of the dictionary is O(n) (which may require less space than that needed by the segment tree in practice).

Therefore, the total space complexity of the code is O(n). We only consider the larger term for space complexity, which is from the segment tree, as the space for the dictionary is not necessarily larger than that.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which data structure is used to implement recursion?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫