1649. Create Sorted Array through Instructions
Problem Description
The problem presents a scenario where we are given an integer array instructions
and asked to build a sorted array using those instructions. Starting with an empty array nums
, we insert each element from instructions
from left to right. The key element is to calculate the cost associated with each insertion. This cost is computed by determining the minimum of two quantities: the count of elements currently in nums
that are strictly less than the element being inserted, and the count of elements currently in nums
that are strictly greater than the element being inserted.
After an element is inserted, if there are already duplicates of that element in nums
, the insertion point can be before or after the existing ones, which doesn't affect the calculation of the cost.
The output is the sum of all individual costs (modulus 10^9 + 7
to handle large numbers) for inserting the entire instructions
array into nums
.
The challenge of the problem is to efficiently handle the cost calculations and insertions, as straightforward methods may not suffice for large datasets due to time complexity constraints.
Intuition
To arrive at the solution approach, one should recognize that the operations of insertion in nums
and computing the count of elements less than or greater than a given value are crucial points where time efficiency matters. A naive iteration for each element would lead to a time complexity of O(n^2), which is not efficient.
The intuition is to use a Binary Indexed Tree (BIT) or a Segment Tree to efficiently query and update counts while maintaining a cumulative frequency of elements up to a certain value. A Binary Indexed Tree offers a more memory-efficient way to achieve this compared to a traditional Segment Tree, and both have a time complexity of O(log n) for updates and queries.
A BIT is a data structure that supports two operations efficiently on an array of numbers: updating the value of an element and computing the prefix sum of elements up to a certain index.
- When we insert an element
x
, we update the BIT to reflect that there is now one more occurrence ofx
. - To find the cost of inserting
x
, we query the BIT:- To find the number of elements strictly less than
x
, we get a prefix sum query ofx - 1
, since this will give us the total count of elements belowx
. - To find the number of elements strictly greater than
x
, we subtract from the total elements inserted so far, the prefix sum ofx
(which includesx
itself and all numbers below it).
- To find the number of elements strictly less than
With this approach, we can incrementally build our sorted array nums
and calculate the insertion cost for each element efficiently.
The solution's implementation defines a Binary Indexed Tree class, with update
and query
methods. As we iterate through instructions
, for each element, we use the BIT to calculate the cost of insertion, update the cumulative frequency of elements, and keep a running total of the cost modulo 10^9 + 7
.
This approach ensures efficient handling of the required operations to calculate the cost with a time complexity of O(n log m), where n
is the length of the instructions
list and m
is the range of numbers in instructions
.
Learn more about Segment Tree, Binary Search, Divide and Conquer and Merge Sort patterns.
Solution Approach
The given solution makes use of a Binary Indexed Tree (BIT), which is also known as a Fenwick Tree. This data structure is particularly useful for solving problems that involve frequent updates and queries on prefix sums, which is the situation we encounter when determining the insertion cost for the instructions
array.
Binary Indexed Tree
The Binary Indexed Tree is constructed to store the frequencies of numbers up to the maximum value present in the instructions
array. Its size is initialized based on the maximum element to ensure that all elements can be accounted for.
Update Operation
To update the BIT, increment the frequency count of a number, tree.update(x, 1)
is called. This affects all the subsequent elements that contain x
in their binary representation sum path. Here's how the update method works:
- Starting from our index, denoted by
x
, increment the corresponding BIT bucket by an update value (which, in this case, is1
for each insertion). - The tree's buckets are designed such that each bucket holds a cumulative frequency of a range of indexes, determined by the least significant bit of the bucket's index.
- After updating the current bucket, the next index to update is found by incrementing the current index
x
with the valuex & -x
(x AND the 2's complement of x
), which gives the least significant bit that represents how much you can jump to the next affected bucket in the BIT.
Query Operation
To query the BIT (for example, for a value x
), tree.query(x)
is called, which returns the prefix sum for the first x
elements. This operation is used to determine the number of elements less than the current element that we are trying to insert. Here's how the query method works:
- Start from the index
x
, and keep adding the values at the indexed buckets to a running sum. - To move to the next index to be checked, we subtract from the current index
x
the valuex & -x
to step backwards through the BIT. - The process repeats until we reach the start of the array (index zero), by which point
s
contains the cumulative count of frequencies for indexes up tox
.
The BIT's update and query operations optimize the calculation of the insertion cost as they allow for both the insertion and the count of elements less than or greater than a given value to be computed in O(log n)
time complexity.
Integration in the Solution
In the context of the problem at hand, the BIT is used to keep track of the number of times each number in the instructions
has been inserted into nums
.
As each element in instructions
is processed:
- We calculate the cost of insertion as the minimum of all numbers strictly less than (
tree.query(x - 1)
) and strictly greater than it (i - tree.query(x)
). - We update the BIT with the current number to reflect its insertion by incrementing its frequency in
tree.update(x, 1)
. - We accumulate the cost after each insertion operation. To prevent integer overflow, we take modulo
10^9 + 7
of the result.
By continuously updating the BIT and querying it for each element's associated cost, we arrive at the total insertion cost for the entire instructions
array with the desired time efficiency.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach using a small example with the array of integers instructions
given as [1, 5, 6, 2]
.
When we start, our BIT is all zeroes, since no elements have been inserted yet.
-
We insert the first element
1
. There are no elements in the bit, so the cost is0
. Our BIT is updated to1
at the index representing the number1
.Current BIT state (index: count):
[0: 0, 1: 1, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0]
Total cost:
0
-
Inserting
5
: The cost is the minimum of the count of numbers less than5
(tree.query(4)
which is1
) and the count of numbers greater than5
(0
as we only have1
and5
so far). The cost is0
since that's the minimum.Update the BIT for
5
:[0: 0, 1: 1, 2: 0, 3: 0, 4: 0, 5: 1, 6: 0]
Total cost:
0 + 0 = 0
-
Inserting
6
: The cost is the minimum of the count of numbers less than6
(tree.query(5)
which is the sum of counts for1
and5
, hence2
) and the count of numbers greater than6
(0
).Update the BIT for
6
:[0: 0, 1: 1, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1]
Total cost:
0 + 0 = 0
-
Inserting
2
: We calculate the cost as the minimum of the count of numbers less than2
(tree.query(1)
which is1
) and the count of numbers greater than2
(2
- the sum of counts for1 and 5
).So the cost for inserting
2
is1
, and we update the BIT for2
:[0: 0, 1: 1, 2: 1, 3: 0, 4: 0, 5: 1, 6: 1]
Total cost:
0 + 0 + 1 = 1
The final total cost for inserting all elements is 1
, and as we insert the elements one by one, we use the BIT to efficiently calculate the cost rather than querying the nums
array directly which would be less efficient. The final cost is outputted as the answer, which in this case remains 1
(since it's under the modulo 10^9 + 7
).
Solution Implementation
1from typing import List
2
3class BinaryIndexedTree:
4 def __init__(self, size: int):
5 self.size = size
6 # Initialize the tree with 0 values, +1 because index 0 is not used
7 self.tree = [0] * (size + 1)
8
9 def update(self, index: int, value: int):
10 # Updates the tree with 'value' at the position 'index'
11 while index <= self.size:
12 self.tree[index] += value
13 # Move to next element to update
14 index += index & -index
15
16 def query(self, index: int) -> int:
17 # Queries the cumulative frequency up to the index
18 result = 0
19 while index:
20 result += self.tree[index]
21 # Move to parent element
22 index -= index & -index
23 return result
24
25
26class Solution:
27 def createSortedArray(self, instructions: List[int]) -> int:
28 # Find the maximum value in the instructions list for tree size
29 max_value = max(instructions)
30 # Initialize the Binary Indexed Tree with size max_value
31 tree = BinaryIndexedTree(max_value)
32 cost_sum = 0
33 # A large number for modulo operation to ensure result fits in 32-bit integer
34 modulo = 10**9 + 7
35
36 # Iterate over the instructions to calculate the cost
37 for i, number in enumerate(instructions):
38 # Calculate the cost as the minimum of numbers less than the current number
39 # and numbers greater than it that came before the current (i-th) position
40 cost = min(tree.query(number - 1), i - tree.query(number))
41 cost_sum += cost
42 # Update the tree with current number
43 tree.update(number, 1)
44
45 # Return the total cost modulo 10^9 + 7 to avoid large numbers
46 return cost_sum % modulo
47
1class BinaryIndexedTree {
2 private int size; // Size of the array
3 private int[] tree; // The Binary Indexed Tree (also, called Fenwick Tree)
4
5 // Constructor
6 public BinaryIndexedTree(int size) {
7 this.size = size;
8 this.tree = new int[size + 1];
9 }
10
11 // Updates the tree with an integer value 'v' at position 'x'
12 public void update(int x, int v) {
13 while (x <= size) {
14 tree[x] += v; // Increment the value at index x
15 x += x & -x; // Move to the next index to update
16 }
17 }
18
19 // Queries the sum of the first 'x' elements in the tree
20 public int query(int x) {
21 int sum = 0;
22 while (x > 0) {
23 sum += tree[x]; // Add the value at index x to the sum
24 x -= x & -x; // Move to the previous index to continue querying
25 }
26 return sum;
27 }
28}
29
30class Solution {
31 // Method to create a sorted array and return the minimum cost to do so
32 public int createSortedArray(int[] instructions) {
33 int maxValue = 0;
34 // Find the maximum value in instructions to determine the size of BinaryIndexedTree
35 for (int x : instructions) {
36 maxValue = Math.max(maxValue, x);
37 }
38 BinaryIndexedTree tree = new BinaryIndexedTree(maxValue);
39
40 int totalCost = 0; // Initialize the total cost of insertion operations
41 final int mod = (int) 1e9 + 7; // The modulus value to ensure the result is within the bounds of an integer
42 for (int i = 0; i < instructions.length; ++i) {
43 int x = instructions[i];
44 // Cost is the minimum of the number of elements smaller than 'x' and the
45 // number of elements greater than 'x' that are already in the sorted array
46 int cost = Math.min(tree.query(x - 1), i - tree.query(x));
47 totalCost = (totalCost + cost) % mod; // Update the total cost while keeping it within the int bounds
48 tree.update(x, 1); // Insert 'x' by updating the tree
49 }
50 return totalCost; // Return the computed total cost
51 }
52}
53
1#include <vector>
2#include <algorithm>
3
4using std::vector;
5using std::min;
6using std::max_element;
7
8// BinaryIndexedTree (also known as a Fenwick Tree) is a data structure that provides efficient methods for cumulative frequency queries and updates.
9class BinaryIndexedTree {
10public:
11 // Constructor to initialize a BinaryIndexedTree of size n.
12 explicit BinaryIndexedTree(int size)
13 : size_(size)
14 , tree_(size + 1, 0) {} // tree_ array holds the binary indexed tree with an extra element for simplicity.
15
16 // Function to update the tree with a value at a given index.
17 void update(int index, int delta) {
18 while (index <= size_) {
19 tree_[index] += delta;
20 index += index & -index; // Climb up the tree by adding the least significant bit (LSB).
21 }
22 }
23
24 // Query the cumulative frequency up to and including index.
25 int query(int index) {
26 int sum = 0;
27 while (index > 0) {
28 sum += tree_[index];
29 index -= index & -index; // Move to parent by subtracting the least significant bit (LSB).
30 }
31 return sum;
32 }
33
34private:
35 int size_; // Represents the size of the input array for which the tree is built.
36 vector<int> tree_; // The representation of the Binary Indexed Tree.
37};
38
39class Solution {
40public:
41 // Function to create a sorted array and calculate the cost of sorting.
42 int createSortedArray(vector<int>& instructions) {
43 int maxElement = *max_element(instructions.begin(), instructions.end()); // Find the maximum element in the instructions.
44 BinaryIndexedTree tree(maxElement); // Initialize a BinaryIndexedTree with size equal to the max element.
45 const int mod = 1e9 + 7; // Modulo value for output to prevent integer overflow.
46 int cost = 0; // Variable to store the total cost of inserting elements.
47
48 for (int i = 0; i < instructions.size(); ++i) {
49 int num = instructions[i]; // Current instruction/element.
50
51 // Calculate the cost of the current instruction:
52 // It is the minimum of all elements less than the current
53 // or all elements greater than the current that have already been processed.
54 int currentCost = min(tree.query(num - 1), i - tree.query(num));
55
56 cost = (cost + currentCost) % mod; // Update the cost with modulo operation to keep it within bounds.
57 tree.update(num, 1); // Update the BinaryIndexedTree after processing an element.
58 }
59
60 return cost; // Return the total cost of creating the sorted array.
61 }
62};
63
1// Define the size of the Binary Indexed Tree (BIT) and initialize the array.
2let size: number;
3let bitArray: number[];
4
5// Initialize the BIT with the given size.
6function initializeBIT(n: number): void {
7 size = n;
8 bitArray = new Array(n + 1).fill(0);
9}
10
11// Update the BIT at index 'index' with value 'value'.
12// This ultimately represents adding 'value' to all elements in the original array represented by the BIT at 'index' and beyond.
13function updateBIT(index: number, value: number): void {
14 while (index <= size) {
15 bitArray[index] += value;
16 index += index & -index;
17 }
18}
19
20// Query the BIT up to index 'index' to get the prefix sum of the original array.
21// This gives us the sum of elements from the start of the original array up to 'index'.
22function queryBIT(index: number): number {
23 let sum = 0;
24 while (index > 0) {
25 sum += bitArray[index];
26 index -= index & -index;
27 }
28 return sum;
29}
30
31// Function to create a sorted array following the instructions and calculate the cost.
32// 'instructions' is an array where 'instructions[i]' is the number to be added at step 'i'.
33// The cost is calculated as the minimum between the number of elements already in the array that are less than the current number
34// and the number of elements that are greater than the current number.
35function createSortedArray(instructions: number[]): number {
36 // Maximum number from the instructions to define the size of the BIT.
37 const maxElement = Math.max(...instructions);
38
39 // Initialize the BIT with the maximum element.
40 initializeBIT(maxElement);
41 let totalCost = 0;
42 const mod = 10 ** 9 + 7; // Modulus to ensure the result stays within integer limits after each addition.
43
44 // Process each instruction and calculate cost.
45 instructions.forEach((value, index) => {
46 // Calculate the cost of placing value into its correct position.
47 const cost = Math.min(queryBIT(value - 1), index - queryBIT(value));
48 // Update total cost, ensuring it does not exceed the modulus.
49 totalCost = (totalCost + cost) % mod;
50 // Update BIT to include the current value.
51 updateBIT(value, 1);
52 });
53
54 // Return the total cost after all instructions have been processed.
55 return totalCost;
56}
57
Time and Space Complexity
The provided code defines a BinaryIndexedTree
(also known as a Fenwick Tree) and uses it in a Solution
class to determine the cost of creating a sorted array from the given instructions
.
Time Complexity
The main operations involved are update
and query
operations on the Binary Indexed Tree.
-
The
update
method has a time complexity ofO(log n)
, wheren
is the size of the tree (or the maximum number in theinstructions
). This is because in the worst case, updating a single element requires updating all the tree levels that the element contributes to, and since a Binary Indexed Tree is a binary tree, there areO(log n)
levels. -
Similarly, the
query
method also has a time complexity ofO(log n)
for the same reasoning as theupdate
method.
These operations are performed for every element in instructions
, leading to a total time complexity of O(m * log n)
, where m
is the number of elements in instructions
and n
is the highest value in instructions
.
Space Complexity
The space complexity is determined by the storage requirements of the Binary Indexed Tree.
- The Binary Indexed Tree
c
array has a size ofn + 1
, wheren
is the maximum number ininstructions
. Thus, the space complexity isO(n)
.
So overall the time complexity of the provided solution is O(m * log n)
and the space complexity is O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of these properties could exist for a graph but not a tree?
Recommended Readings
Segment Tree Faster Range Queries For this article we want to introduce the idea of a Segment Tree Segment Trees allow us to quickly perform range queries as well as range updates Suppose we have an array and we want to know the sum of a particular range of numbers as well
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
https algomonster s3 us east 2 amazonaws com cover_photos dnd svg Divide and Conquer The main idea of divide and conquer is to split the main problem into two smaller components assuming that each one of the components had already been solved recursively and then try to solve the bigger
Want a Structured Path to Master System Design Too? Donāt Miss This!