Priority Queue and Heap

What is the relationship between priority queue and heap?

Priority Queue is an Abstract Data Type, and Heap is the concrete data structure we use to implement a priority queue.

Priority Queue

A priority queue is a data structure that consists of a collection of items and supports the following operations:

  • insert: insert an item with a key
  • delete_min/delete_max: remove the item with the smallest/largest key and returning it

Note that

  • we only allow getting and deleting the element with min/max key and NOT arbitrary key.

Use cases

The hospital triage process is a quintessential priority queue. Patients are sorted based on the severity of their condition. For example, a person with the cold comes in and he is placed near the end of the queue. Then a person who had a car accident comes in, and he is placed before the person with the cold even though he came in later because his condition is more severe. Severity is the key in this case.

Consider a problem like Merge K sorted lists. We need to keep track of min value among k elements (smallest element in each sorted list) and remove the min value and insert new values at will while still having access to the min value at any point in time. Here are some other problems where the priority queue comes in handy.

Implement Priority Queue using an array

To do this, we could try using

  • an unsorted array, insert would be O(1) as we just have to put it at the end, but finding and deleting min value would be O(N) since we have to loop through the entire array to find it
  • a sorted array, finding min value would be easy O(1), but it would be O(N) to insert since we have to loop through to find the correct position of the value and move elements after the position to make space and insert into the space

There must be a better way! "Patience you must have, my young padawan." Enter Heap.

Heap

Max Heap and Min Heap

There are two kinds of heaps - Min Heap and Max Heap. A Min Heap is a tree that has two properties:

  1. almost complete, i.e. every level is filled except possibly the last(deepest) level. The filled items in the last level are left-justified.
  2. for any node, its key (priority) is greater than its parent's key (Min Heap).

A Max Heap has the same property #1 and opposite property #2, i.e. for any node, its key is less than its parent's key.

Let's play "is it a heap?" game:

Note that

  • the number in each node is the key, not value (remember a tree node has a value). Keys are used to sort the nodes/construct the tree, and values are the data we want heap to store.
  • unlike a binary search tree, there is no comparable relationship between children. For example, the third example in the first row, 17 and 8 are both larger than 2 but they are NOT sorted left to right. In fact, there is no comparable relationship across a level of a heap at all.

Why heaps are useful

By the first look, the heap is an odd data structure - it's required to be a complete tree but unlike binary search tree, it's not sorted across a level.

What makes it so useful is

  • because a heap is a complete tree, the height of a heap is guaranteed to be O(log(N)). This makes operations that go from root to leaf guaranteed to be O(log(N)).
  • because only nodes in a root-to-leaf path are sorted (nodes in the same level are not sorted), when we add/remove a node, we only have to fix the order in the vertical path the node is in. This makes inserting and deleting O(log(N)) too.
  • being complete also makes array a good choice to store a heap since data is continuous. As we will see later in this module, we can find the parent and children of a node simply by doing index arithmetics.

Operations

insert

To insert a key into a heap,

  • place the new key at the first free leaf
  • if property #2 is violated, perform a bubble-up
1def bubble_up(node):
2    while node.parent exist and node.parent.key > node.key:
3        swap node and node.parent
4        node = node.parent

As the name of the algorithm suggests, it "bubbles up" the new node by swapping it with its parent until the order is correct

Since the height of a heap is O(log(N)), the complexity of bubble-up is O(log(N)).

delete_min

What this operation does is:

  • delete a node with min key and return it
  • reorganize the heap so the two properties still hold

To do that, we:

  • remove and return the root since the node with the minimum key is always at the root
  • replace the root with the last node (the rightmost node at the bottom) of the heap
  • if property #2 is violated, perform a bubble-down
1def bubble_down(node):
2    while node is not a leaf:
3        smallest_child = child of node with smallest key
4        if smallest_child < node:
5            swap node and smallest_child
6            node = smallest_child
7        else:
8            break
9

What this says is we keep swapping between the current node and its smallest child until we get to the leaf, hence a "bubble down". Again the time complexity is O(log(N)) since the height of a heap is O(log(N)).

Implementing Heap

Being a complete tree makes an array a natural choice to implement a heap since it can be stored compactly and no space is wasted. Pointers are not needed. The parent and children of each node can be calculated with index arithmetic

For node i, its children are stored at 2i+1 and 2i+2, and its parent is at floor((i-1)/2). So instead of node.left we'd do 2*i+1.

This is exactly how the python language implements heaps. https://github.com/python/cpython/blob/d3a8d616faf3364b22fde18dce8c168de9368146/Lib/heapq.py#L263

tldr

A Heap

  • is a complete tree
  • each node of a min heap is less than all of its children
  • allows O(log(N)) to insert and remove an element with priority
  • is implemented with arrays

Heap in Python

Python comes with a built-in heapq we can use, and it is a min heap, i.e. element at top is smallest.

heapq.heappush takes two arguments, the first is the heap (an array/list) we want to push the element into, and the second argument can be anything as long as it can be used for comparison. Typically, we push a tuple since Python tuples are compared position by position. For example, (1, 10) is smaller than (2, 0) because the first element in (1, 0) is smaller than the first element in (2, 0). (1, 10) is smaller than (1, 20) because when the first elements are equal, we compare the next one. In this case, 10 is smaller than 20.

heapq.heappop takes a single argument heap, and pop and return the smallest element in the heap.

1import heapq
2>>> h = []
3>>> heapq.heappush(h, (5, 'write code'))
4>>> heapq.heappush(h, (7, 'release product'))
5>>> heapq.heappush(h, (1, 'write spec'))
6>>> heapq.heappush(h, (3, 'create tests'))
7>>> heapq.heappop(h)
8(1, 'write spec')
9>>> h[0]
10>>> (3, 'create tests')

The simplest way to implement a max heap is to reverse the sign of the number before we push it into the heap and reverse it again when we pop it out.

1import heapq
2>>> h = []
3>>> heapq.heappush(h, -1 * 10)
4>>> heapq.heappush(h, -1 * 30)
5>>> heapq.heappush(h, -1 * 20)
6>>> print(-1 * heapq.heappop(h))
730
8>>> print(-1 * h[0])
920

If the list is known beforehand, we can create a heap out of it by simply using heapify, which is actually an O(N) operation.

1import heapq
2arr = [3, 1, 2]
3heapq.heapify(arr)

Heap in Java

By default, the Java class java.util.PriorityQueue is a min heap. Most of the time the elements are sorted using their natural comparison methods, but when there isn't one (for example, sorting points by distance) it can be provided by passing in a Comparator at construction time. A Comparator is required if the elements don't implement the Comparable interface.

To create a PriorityQueue of non-comparable objects, we can use lambda expression or method reference introduced in Java 8 and provide an inline comparator to the constructor.

1import java.util.Comparator;
2import java.util.PriorityQueue;
3
4public class Task {
5    private int priority;
6    private String description;
7
8    public Task(int priority, String description) {
9        this.priority = priority;
10        this.description = description;
11    }
12}
13
14public static void main(String[] args) {
15    // initialize a min heap sorted by the priority of tasks with an initial capacity of 5
16    PriorityQueue<Task> tasks = new PriorityQueue<>(5, Comparator.comparingInt(t -> t.priority));
17    tasks.add(new Task(5, "write code"));
18    tasks.add(new Task(7, "release product"));
19    tasks.add(new Task(1, "write spec"));
20    tasks.add(new Task(3, "create tests"));
21    Task head = tasks.poll();  // return the task with min priority
22    System.out.println(head.description);  // print out "write spec"
23}

To create a max heap, we can use Collections.reverseOrder() to get a reverse behavior of the default comparator when initializing the PriorityQueue.

1PriorityQueue<Integer> maxHeap = new PriorityQueue<Integer>(Collections.reverseOrder());

Heap in Javascript

Javascript does not support the heap data structure natively, so you might have to implement your own heap during an interview. Here is a common implementation of min-heap in Javascript:

1class HeapItem {
2    constructor(item, priority = item) {
3        this.item = item;
4        this.priority = priority;
5    }
6}
7
8class MinHeap {
9    constructor() {
10        this.heap = [];
11    }
12
13    push(node) {
14        // insert the new node at the end of the heap array
15        this.heap.push(node);
16        // find the correct position for the new node
17        this.bubble_up();
18    }
19
20    bubble_up() {
21        let index = this.heap.length - 1;
22
23        while (index > 0) {
24            const element = this.heap[index];
25            const parentIndex = Math.floor((index - 1) / 2);
26            const parent = this.heap[parentIndex];
27
28            if (parent.priority <= element.priority) break;
29            // if the parent is bigger than the child then swap the parent and child
30            this.heap[index] = parent;
31            this.heap[parentIndex] = element;
32            index = parentIndex;
33        }
34    }
35
36    pop() {
37        const min = this.heap[0];
38        this.heap[0] = this.heap[this.size() - 1];
39        this.heap.pop();
40        this.bubble_down();
41        return min;
42    }
43
44    bubble_down() {
45        let index = 0;
46        let min = index;
47        const n = this.heap.length;
48
49        while (index < n) {
50            const left = 2 * index + 1;
51            const right = left + 1;
52            // check if left or right child is smaller than parent
53            if ((left < n && this.heap[left].priority < this.heap[min].priority) ||
54            (right < n && this.heap[right].priority < this.heap[min].priority)) {
55                // pick the smaller child if both child is present
56                if (right < n) {
57                    min = this.heap[left].priority < this.heap[right].priority ? left : right;
58                } else {
59                    min = left;
60                }
61            }
62
63            if (min === index) break;
64            [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
65            index = min;
66        }
67    }
68
69    peek() {
70        return this.heap[0];
71    }
72
73    size() {
74        return this.heap.length;
75    }
76}

We can use MinHeap to create a max heap by reversing the sign of the number before we push it into the heap and reverse it again when we pop it out. For example, if we want to build a max heap out of [3, 1, 2], we can push [-3, -1, -2] into a MinHeap. After pop(), -3 will be popped out and its reverse 3 is the max of the three and thus we have a max heap.

Heap in C++

By default, the C++ class std::priorityqueue<T> is a max heap. The elements are sorted using std::less<T>, which invokes the operator < on the elements to be compared. priority_queue also supports a template that takes two extra arguments to change the ordering.

1std::priority_queue <T, std::vector<T>, ComparatorType> max_heap;

The third argument ComparatorType can either be a function or a function object with bool return-type and takes 2 arguments. For example, we can pass std::greater<T> as the ComparatorType to create a min heap.

1std::priority_queue<int, std::vector<int>, std::greater<int>> minheap;

Since C++11, we can declare a lambda expression as comparator.

1#include <queue>  //priority_queue
2#include <string>
3#include <vector>
4
5struct Task {
6    int priority;
7    std::string description;
8
9    Element(int priority, std::string description)
10        : priority{vpriorityal}, description{description} {}
11};
12
13int main() {
14    auto compare_task = [](Task& a, Task& b) {
15        // use > for min heap, < for max heap
16        return a.priority > b.priority; 
17    };
18    std::priority_queue<Task, std::vector<Task>, decltype(compare_task)> minheap(compare_task);
19    Task t1(5, "write code");
20    minheap.push(t1);
21    Task t2(7, "release product");
22    minheap.push(t2);
23    Task t3(3, "create tests");
24    minheap.push(t3);
25
26    Task current_task = minheap.top();   // current_task = "create tests"
27    minheap.pop();    // removes the task "create tests"
28    Task next_task = minheap.top();   // next_task = "write code"
29}

Note that pop() removes the top element from the heap and returns void, so we cannot use it to retrieve the top element in the heap.

Try it yourself

Given a list of numbers, return top 3 smallest numbers

1from heapq import heapify, heappop
2from typing import List
3
4def heap_top_3(arr: List[int]) -> List[int]:
5    heapify(arr)
6    res = []
7    for i in range(3):
8        res.append(heappop(arr))
9    return res
10
11if __name__ == '__main__':
12    arr = [int(x) for x in input().split()]
13    res = heap_top_3(arr)
14    print(' '.join(map(str, res)))
15
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4import java.util.PriorityQueue;
5import java.util.Scanner;
6import java.util.stream.Collectors;
7
8class Solution {
9    public static List<Integer> heapTop3(List<Integer> arr) {
10        PriorityQueue<Integer> heap = new PriorityQueue<>();
11        for (int x : arr) {
12            heap.add(x);
13        }
14        List<Integer> res = new ArrayList<>();
15        for (int i = 0; i < 3; i++) {
16            res.add(heap.poll());
17        }
18        return res;
19    }
20
21    public static List<String> splitWords(String s) {
22        return s.isEmpty() ? List.of() : Arrays.asList(s.split(" "));
23    }
24
25    public static void main(String[] args) {
26        Scanner scanner = new Scanner(System.in);
27        List<Integer> arr = splitWords(scanner.nextLine()).stream().map(Integer::parseInt).collect(Collectors.toList());
28        scanner.close();
29        List<Integer> res = heapTop3(arr);
30        System.out.println(res.stream().map(String::valueOf).collect(Collectors.joining(" ")));
31    }
32}
33
1class HeapItem {
2    constructor(item, priority = item) {
3        this.item = item;
4        this.priority = priority;
5    }
6}
7
8class MinHeap {
9    constructor() {
10        this.heap = [];
11    }
12
13    push(node) {
14        // insert the new node at the end of the heap array
15        this.heap.push(node);
16        // find the correct position for the new node
17        this.bubble_up();
18    }
19
20    bubble_up() {
21        let index = this.heap.length - 1;
22
23        while (index > 0) {
24            const element = this.heap[index];
25            const parentIndex = Math.floor((index - 1) / 2);
26            const parent = this.heap[parentIndex];
27
28            if (parent.priority <= element.priority) break;
29            // if the parent is bigger than the child then swap the parent and child
30            this.heap[index] = parent;
31            this.heap[parentIndex] = element;
32            index = parentIndex;
33        }
34    }
35
36    pop() {
37        const min = this.heap[0];
38        this.heap[0] = this.heap[this.size() - 1];
39        this.heap.pop();
40        this.bubble_down();
41        return min;
42    }
43
44    bubble_down() {
45        let index = 0;
46        let min = index;
47        const n = this.heap.length;
48
49        while (index < n) {
50            const left = 2 * index + 1;
51            const right = left + 1;
52
53            if (left < n && this.heap[left].priority < this.heap[min].priority) {
54                min = left;
55            }
56            if (right < n && this.heap[right].priority < this.heap[min].priority) {
57                min = right;
58            }
59            if (min === index) break;
60            [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
61            index = min;
62        }
63    }
64
65    peek() {
66        return this.heap[0];
67    }
68
69    size() {
70        return this.heap.length;
71    }
72}
73
74function heapTop3(arr) {
75    const heap = new MinHeap();
76    for (const x of arr) {
77        heap.push(new HeapItem(x));
78    }
79    const res = [];
80    for (let i = 0; i < 3; i++) {
81        res.push(heap.pop().item);
82    }
83    return res;
84}
85
86function splitWords(s) {
87    return s == "" ? [] : s.split(' ');
88}
89
90function* main() {
91    const arr = splitWords(yield).map((v) => parseInt(v));
92    const res = heapTop3(arr);
93    console.log(res.join(' '));
94}
95
96class EOFError extends Error {}
97{
98    const gen = main();
99    const next = (line) => gen.next(line).done && process.exit();
100    let buf = '';
101    next();
102    process.stdin.setEncoding('utf8');
103    process.stdin.on('data', (data) => {
104        const lines = (buf + data).split('\n');
105        buf = lines.pop();
106        lines.forEach(next);
107    });
108    process.stdin.on('end', () => {
109        buf && next(buf);
110        gen.throw(new EOFError());
111    });
112}
113
1#include <algorithm> // copy
2#include <functional> // greater
3#include <iostream> // cin, cout
4#include <iterator> // back_inserter, istream_iterator, ostream_iterator, prev
5#include <queue> // priority_queue
6#include <sstream> // istringstream
7#include <string> // getline, string
8#include <utility> // move
9#include <vector> // vector
10
11std::vector<int> heap_top_3(std::vector<int> arr) {
12    std::priority_queue<int, std::vector<int>, std::greater<int>> heap(arr.begin(), arr.end());
13    std::vector<int> res;
14    for (int i = 0; i < 3; i++) {
15        res.emplace_back(heap.top());
16        heap.pop();
17    }
18    return res;
19}
20
21template<typename T>
22std::vector<T> get_words() {
23    std::string line;
24    std::getline(std::cin, line);
25    std::istringstream ss{line};
26    std::vector<T> v;
27    std::copy(std::istream_iterator<T>{ss}, std::istream_iterator<T>{}, std::back_inserter(v));
28    return v;
29}
30
31template<typename T>
32void put_words(const std::vector<T>& v) {
33    if (!v.empty()) {
34        std::copy(v.begin(), std::prev(v.end()), std::ostream_iterator<T>{std::cout, " "});
35        std::cout << v.back();
36    }
37    std::cout << '\n';
38}
39
40int main() {
41    std::vector<int> arr = get_words<int>();
42    std::vector<int> res = heap_top_3(arr);
43    put_words(res);
44}
45