Priority Queue and Heap
What is the relationship between priority queue and heap?
Priority Queue is an Abstract Data Type, and Heap is the concrete data structure we use to implement a priority queue.
Priority Queue
A priority queue is a data structure that consists of a collection of items and supports the following operations:
- insert: insert an item with a key.
- delete_min/delete_max: remove the item with the smallest/largest key and return it.
Note that
- we only allow getting and deleting the element with the min/max key and NOT any arbitrary key.
Use cases
The hospital triage process is a quintessential use of a priority queue. Patients are sorted based on the severity of their condition. For example, a person with the cold comes in and he is placed near the end of the queue. Then a person who had a car accident comes in, and he is placed before the person with the cold even though he came in later because his condition is more severe. Severity is the key in this case.
Consider a problem like Merge K sorted lists. We need to keep track of the min value among k elements (smallest element in each sorted list) and remove the min value and insert new values at will while still having access to the min value at any point in time. Here are some other problems where the priority queue comes in handy.
- k closest pointers to origin
- kth largest element
- kth largest element in a stream
- Median of a data stream
- Uniform Cost Search
Implement Priority Queue using an array
To do this, we could try using
- an unsorted array, insert would be
O(1)
as we just have to put it at the end, but finding and deleting min value would beO(N)
since we have to loop through the entire array to find it - a sorted array, finding min value would be easy
O(1)
, but it would beO(N)
to insert since we have to loop through to find the correct position of the value and move elements after the position to make space and insert into the space
There must be a better way! "Patience you must have, my young padawan." Enter Heap.
Heap
Heaps are special tree based data structures. Usually when we say heap, we refer to the binary heap that uses the binary tree structure.
However, the tree isn't necessarily always binary, in particular, a k
-ary heap (A.K.A. k
-heap) is a tree where the nodes have k
children.
As long as the nodes follow the 2 heap properties, it is a valid heap.
Max Heap and Min Heap
There are two kinds of heaps - Min Heap and Max Heap. A Min Heap is a tree that has two properties:
- almost complete, i.e. every level is filled except possibly the last(deepest) level. The filled items in the last level are left-justified.
- for any node, its key (priority) is greater than its parent's key (Min Heap).
A Max Heap has the same property #1 and opposite property #2, i.e. for any node, its key is less than its parent's key.
Let's play "is it a heap?" game on binary heaps:
Note that
- the number in each node is the key, not value (remember a tree node has a value). Keys are used to sort the nodes/construct the tree, and values are the data we want heap to store.
- unlike a binary search tree, there is no comparable relationship between children. For example, the third example in the first row, 17 and 8 are both larger than 2 but they are NOT sorted left to right. In fact, there is no comparable relationship across a level of a heap at all.
Why heaps are useful
At first glance, the heap is an odd data structure - it's required to be a complete tree but unlike binary search tree, it's not sorted across a level.
What makes it so useful is
- because a heap is a complete tree, the height of a heap is guaranteed to be
O(log(N))
. This bound operations that go from root to leaf to be inO(log(N))
. - because only nodes in a root-to-leaf path are sorted (nodes in the same level are not sorted), when we add/remove a node, we only have to fix the order in the vertical path the node is in. This makes inserting and deleting
O(log(N))
too. - being complete also makes array a good choice to store a heap since data is continuous. As we will see later in this module, we can find the parent and children of a node simply by doing index arithmetics.
Operations
insert
To insert a key into a heap,
- place the new key at the first free leaf
- if property #2 is violated, perform a bubble-up
def bubble_up(node):
while node.parent exist and node.parent.key > node.key:
swap node and node.parent
node = node.parent
As the name of the algorithm suggests, it "bubbles up" the new node by swapping it with its parent until the order is correct
Since the height of a heap is O(log(N))
, the complexity of bubble-up is O(log(N))
.
delete_min
What this operation does is:
- delete a node with min key and return it
- reorganize the heap so the two properties still hold
To do that, we:
- remove and return the root since the node with the minimum key is always at the root
- replace the root with the last node (the rightmost node at the bottom) of the heap
- if property #2 is violated, perform a bubble-down
def bubble_down(node):
while node is not a leaf:
smallest_child = child of node with smallest key
if smallest_child < node:
swap node and smallest_child
node = smallest_child
else:
break
What this says is we keep swapping between the current node and its smallest child until we get to the leaf, hence a "bubble down". Again the time complexity is O(log(N))
since the height of a heap is O(log(N))
.
Implementing Heap
Being a complete tree makes an array a natural choice to implement a heap since it can be stored compactly and no space is wasted. Pointers are not needed. The parent and children of each node can be calculated with index arithmetic
For node i
, its children are stored at 2i+1
and 2i+2
, and its parent is at floor((i-1)/2)
. So instead of node.left
we'd do 2*i+1
.
(Note that if we are implementing a k
-ary heap, then the childrens are at ki+1
to ki+k
, and its parent is at floor((i-1)/k)
.)
This is exactly how the python language implements heaps. https://github.com/python/cpython/blob/d3a8d616faf3364b22fde18dce8c168de9368146/Lib/heapq.py#L263
tldr
A Heap
- is a complete tree
- each node of a min heap is less than all of its children
- allows
O(log(N))
to insert and remove an element with priority - is implemented with arrays.
Heap in Python
Python comes with a built-in heapq
we can use, and it is a min heap, i.e. element at top is smallest.
heapq.heappush
takes two arguments, the first is the heap (an array/list) we want to push the element into, and the second argument can be anything as long as it can be used for comparison. Typically, we push a tuple since Python tuples are compared position by position. For example, (1, 10)
is smaller than (2, 0)
because the first element in (1, 0)
is smaller than the first element in (2, 0)
. (1, 10)
is smaller than (1, 20)
because when the first elements are equal, we compare the next one. In this case, 10 is smaller than 20.
heapq.heappop
takes a single argument heap, and pop and return the smallest element in the heap.
import heapq >>> h = [] >>> heapq.heappush(h, (5, 'write code')) >>> heapq.heappush(h, (7, 'release product')) >>> heapq.heappush(h, (1, 'write spec')) >>> heapq.heappush(h, (3, 'create tests')) >>> heapq.heappop(h) (1, 'write spec') >>> h[0] >>> (3, 'create tests')
The simplest way to implement a max heap is to reverse the sign of the number before we push it into the heap and reverse it again when we pop it out.
import heapq
>>> h = []
>>> heapq.heappush(h, -1 * 10)
>>> heapq.heappush(h, -1 * 30)
>>> heapq.heappush(h, -1 * 20)
>>> print(-1 * heapq.heappop(h))
30
>>> print(-1 * h[0])
20
If the list is known beforehand, we can create a heap out of it by simply using heapify
, which is actually an O(N)
operation.
The math of proving O(N)
requires master thereom and is a bit complex. You don't really need to know it for the interviews. If you are still curious, check out this stackoverflow answer.
import heapq arr = [3, 1, 2] heapq.heapify(arr)
Heap in Java
By default, the Java class java.util.PriorityQueue
is a min heap. Most of the time the elements are sorted using their natural comparison methods, but when there isn't one (for example, sorting points by distance) it can be provided by passing in a Comparator
at construction time.
A Comparator
is required if the elements don't implement the Comparable
interface.
To create a PriorityQueue
of non-comparable objects, we can use lambda expression or method reference introduced in Java 8 and provide an inline comparator to the constructor.
import java.util.Comparator;
import java.util.PriorityQueue;
public class Task {
private int priority;
private String description;
public Task(int priority, String description) {
this.priority = priority;
this.description = description;
}
}
public static void main(String[] args) {
// initialize a min heap sorted by the priority of tasks with an initial capacity of 5
PriorityQueue<Task> tasks = new PriorityQueue<>(5, Comparator.comparingInt(t -> t.priority));
tasks.add(new Task(5, "write code"));
tasks.add(new Task(7, "release product"));
tasks.add(new Task(1, "write spec"));
tasks.add(new Task(3, "create tests"));
Task head = tasks.poll(); // return the task with min priority
System.out.println(head.description); // print out "write spec"
}
To create a max heap, we can use Collections.reverseOrder()
to get a reverse behavior of the default comparator when initializing the PriorityQueue
.
PriorityQueue<Integer> maxHeap = new PriorityQueue<Integer>(Collections.reverseOrder());
Heap in Javascript
Javascript does not support the heap data structure natively, so you might have to implement your own heap during an interview. Here is a common implementation of min-heap in Javascript:
class HeapItem {
constructor(item, priority = item) {
this.item = item;
this.priority = priority;
}
}
class MinHeap {
constructor() {
this.heap = [];
}
push(node) {
// insert the new node at the end of the heap array
this.heap.push(node);
// find the correct position for the new node
this.bubble_up();
}
bubble_up() {
let index = this.heap.length - 1;
while (index > 0) {
const element = this.heap[index];
const parentIndex = Math.floor((index - 1) / 2);
const parent = this.heap[parentIndex];
if (parent.priority <= element.priority) break;
// if the parent is bigger than the child then swap the parent and child
this.heap[index] = parent;
this.heap[parentIndex] = element;
index = parentIndex;
}
}
pop() {
const min = this.heap[0];
this.heap[0] = this.heap[this.size() - 1];
this.heap.pop();
this.bubble_down();
return min;
}
bubble_down() {
let index = 0;
let min = index;
const n = this.heap.length;
while (index < n) {
const left = 2 * index + 1;
const right = left + 1;
// check if left or right child is smaller than parent
if ((left < n && this.heap[left].priority < this.heap[min].priority) ||
(right < n && this.heap[right].priority < this.heap[min].priority)) {
// pick the smaller child if both child is present
if (right < n) {
min = this.heap[left].priority < this.heap[right].priority ? left : right;
} else {
min = left;
}
}
if (min === index) break;
[this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
index = min;
}
}
peek() {
return this.heap[0];
}
size() {
return this.heap.length;
}
}
We can use MinHeap
to create a max heap by reversing the sign of the number before we push it into the heap and reverse it again when we pop it out. For example, if we want to build a max heap out of [3, 1, 2]
, we can push [-3, -1, -2]
into a MinHeap. After pop()
, -3
will be popped out and its reverse 3
is the max of the three and thus we have a max heap.
Heap in C++
By default, the C++ class std::priorityqueue<T>
is a max heap. The elements are sorted using std::less<T>
, which invokes the operator <
on the elements to be compared. priority_queue
also supports a template that takes two extra arguments to change the ordering.
std::priority_queue <T, std::vector<T>, ComparatorType> max_heap;
The third argument ComparatorType
can either be a function or a function object with bool
return-type and takes 2 arguments. For example, we can pass std::greater<T>
as the ComparatorType
to create a min heap.
std::priority_queue<int, std::vector<int>, std::greater<int>> minheap;
Since C++11, we can declare a lambda expression as comparator.
#include <queue> //priority_queue
#include <string>
#include <vector>
struct Task {
int priority;
std::string description;
Element(int priority, std::string description)
: priority{vpriorityal}, description{description} {}
};
int main() {
auto compare_task = [](Task& a, Task& b) {
// use > for min heap, < for max heap
return a.priority > b.priority;
};
std::priority_queue<Task, std::vector<Task>, decltype(compare_task)> minheap(compare_task);
Task t1(5, "write code");
minheap.push(t1);
Task t2(7, "release product");
minheap.push(t2);
Task t3(3, "create tests");
minheap.push(t3);
Task current_task = minheap.top(); // current_task = "create tests"
minheap.pop(); // removes the task "create tests"
Task next_task = minheap.top(); // next_task = "write code"
}
Note that pop()
removes the top element from the heap and returns void, so we cannot use it to retrieve the top element in the heap.
Try it yourself
Given a list of numbers, return top 3 smallest numbers
Here's the solution:
1from heapq import heapify, heappop
2from typing import List
3
4def heap_top_3(arr: List[int]) -> List[int]:
5 heapify(arr)
6 res = []
7 for _ in range(3):
8 res.append(heappop(arr))
9 return res
10
11if __name__ == "__main__":
12 arr = [int(x) for x in input().split()]
13 res = heap_top_3(arr)
14 print(" ".join(map(str, res)))
15
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4import java.util.PriorityQueue;
5import java.util.Scanner;
6import java.util.stream.Collectors;
7
8class Solution {
9 public static List<Integer> heapTop3(List<Integer> arr) {
10 PriorityQueue<Integer> heap = new PriorityQueue<>();
11 for (int x : arr) {
12 heap.add(x);
13 }
14 List<Integer> res = new ArrayList<>();
15 for (int i = 0; i < 3; i++) {
16 res.add(heap.poll());
17 }
18 return res;
19 }
20
21 public static List<String> splitWords(String s) {
22 return s.isEmpty() ? List.of() : Arrays.asList(s.split(" "));
23 }
24
25 public static void main(String[] args) {
26 Scanner scanner = new Scanner(System.in);
27 List<Integer> arr = splitWords(scanner.nextLine()).stream().map(Integer::parseInt).collect(Collectors.toList());
28 scanner.close();
29 List<Integer> res = heapTop3(arr);
30 System.out.println(res.stream().map(String::valueOf).collect(Collectors.joining(" ")));
31 }
32}
33
1"use strict";
2
3class HeapItem {
4 constructor(item, priority = item) {
5 this.item = item;
6 this.priority = priority;
7 }
8}
9
10class MinHeap {
11 constructor() {
12 this.heap = [];
13 }
14
15 push(node) {
16 // insert the new node at the end of the heap array
17 this.heap.push(node);
18 // find the correct position for the new node
19 this.bubbleUp();
20 }
21
22 bubbleUp() {
23 let index = this.heap.length - 1;
24
25 while (index > 0) {
26 const element = this.heap[index];
27 const parentIndex = Math.floor((index - 1) / 2);
28 const parent = this.heap[parentIndex];
29
30 if (parent.priority <= element.priority) break;
31 // if the parent is bigger than the child then swap the parent and child
32 this.heap[index] = parent;
33 this.heap[parentIndex] = element;
34 index = parentIndex;
35 }
36 }
37
38 pop() {
39 const min = this.heap[0];
40 this.heap[0] = this.heap[this.size() - 1];
41 this.heap.pop();
42 this.bubbleDown();
43 return min;
44 }
45
46 bubbleDown() {
47 let index = 0;
48 let min = index;
49 const n = this.heap.length;
50
51 while (index < n) {
52 const left = 2 * index + 1;
53 const right = left + 1;
54
55 if (left < n && this.heap[left].priority < this.heap[min].priority) {
56 min = left;
57 }
58 if (right < n && this.heap[right].priority < this.heap[min].priority) {
59 min = right;
60 }
61 if (min === index) break;
62 [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
63 index = min;
64 }
65 }
66
67 peek() {
68 return this.heap[0];
69 }
70
71 size() {
72 return this.heap.length;
73 }
74}
75
76function heapTop3(arr) {
77 const heap = new MinHeap();
78 for (const x of arr) {
79 heap.push(new HeapItem(x));
80 }
81 const res = [];
82 for (let i = 0; i < 3; i++) {
83 res.push(heap.pop().item);
84 }
85 return res;
86}
87
88function splitWords(s) {
89 return s === "" ? [] : s.split(" ");
90}
91
92function* main() {
93 const arr = splitWords(yield).map((v) => parseInt(v));
94 const res = heapTop3(arr);
95 console.log(res.join(" "));
96}
97
98class EOFError extends Error {}
99{
100 const gen = main();
101 const next = (line) => gen.next(line).done && process.exit();
102 let buf = "";
103 next();
104 process.stdin.setEncoding("utf8");
105 process.stdin.on("data", (data) => {
106 const lines = (buf + data).split("\n");
107 buf = lines.pop();
108 lines.forEach(next);
109 });
110 process.stdin.on("end", () => {
111 buf && next(buf);
112 gen.throw(new EOFError());
113 });
114}
115
1#include <algorithm>
2#include <functional>
3#include <iostream>
4#include <iterator>
5#include <queue>
6#include <sstream>
7#include <string>
8#include <utility>
9#include <vector>
10
11std::vector<int> heap_top_3(std::vector<int>& arr) {
12 std::priority_queue<int, std::vector<int>, std::greater<>> heap({}, std::move(arr));
13 std::vector<int> res;
14 for (int i = 0; i < 3; i++) {
15 res.emplace_back(heap.top());
16 heap.pop();
17 }
18 return res;
19}
20
21template<typename T>
22std::vector<T> get_words() {
23 std::string line;
24 std::getline(std::cin, line);
25 std::istringstream ss{line};
26 ss >> std::boolalpha;
27 std::vector<T> v;
28 std::copy(std::istream_iterator<T>{ss}, std::istream_iterator<T>{}, std::back_inserter(v));
29 return v;
30}
31
32template<typename T>
33void put_words(const std::vector<T>& v) {
34 if (!v.empty()) {
35 std::copy(v.begin(), std::prev(v.end()), std::ostream_iterator<T>{std::cout, " "});
36 std::cout << v.back();
37 }
38 std::cout << '\n';
39}
40
41int main() {
42 std::vector<int> arr = get_words<int>();
43 std::vector<int> res = heap_top_3(arr);
44 put_words(res);
45}
46