546. Remove Boxes


Problem Description

The problem you're tackling involves a game with boxes of different colors, each color represented by a unique positive number. The objective of this game is to score the highest points possible by removing continuous blocks of boxes with the same color, following certain rules. For each round, you can remove a set of contiguous boxes of the same color and earn points equal to the square of the number of boxes removed (if k boxes are removed, you earn k*k points). The game is played in several rounds, and you continue removing boxes until no boxes are left. The challenge is to figure out the order in which to remove the boxes to maximize your total score.

Intuition

When considering a solution to maximize points for removing boxes, we face a complexity due to the various combinations in which boxes can be removed. A brute force approach considering all possible removal sequences would be computationally intensive and inefficient. Hence, a dynamic programming approach is suitable for optimizing the solution. The intuition behind the solution is to break down the larger problem into smaller sub-problems that can be solved once and stored for future reference, thus avoiding redundant computations.

To solve this problem, we can use a technique called memorization (a form of dynamic programming) where we remember the outcomes of sub-problems. We define a recursive function that can explore all possibilities of removing boxes, which will calculate the maximum points for a given range of boxes and a certain number of same-colored boxes adjacent to the rightmost box in that range. By caching the results, we optimize the solution by ensuring that each sub-problem is only solved once, even if it is encountered multiple times through different recursive paths.

The recursion process expands as follows: We consider the rightmost box and the number of same-colored boxes k (initially 'k' would be 1, as it's the box itself). If there are any boxes to the left with the same color, it might be beneficial to remove boxes in between first, to later remove all of them together for more points ((k+1)*(k+1) points instead of k*k plus 1*1). We iterate through each position to the left, checking if removing other groups eventually leads to a higher score. By comparing the result of taking different actions, we can determine the optimal solution.

Notice that the provided solution recycles the boxes list by incrementally removing the elements from the right and calculating the points as if those boxes were being removed at each step of the recursion, while also tracking the count of consecutive boxes with the same color as the one being targeted for removal.

The solution incorporates a decorator @cache over the recursive function to memorize previously computed results, which drastically reduces the number of computations and leads to an efficient algorithm for solving the problem.

Solution Approach

The reference solution approach uses depth-first search (DFS) in combination with dynamic programming for caching (memoization). The cache stores the results of subproblems, avoiding recalculating them multiple times. Let's walk through the implementation.

Data Structures Used

  • A 3-dimensional cache implemented implicitly via the @cache decorator. This decorator caches the return value of the recursive function dfs for each unique set of its arguments.

Algorithm

  1. Define a recursion function dfs(i, j, k):

    • i and j are the starting and ending indices, forming the subarray of boxes which we currently consider.
    • k is the number of boxes of the same color as boxes[j] which are consecutive to boxes[j] from the right (including boxes[j] itself).
  2. Check for the base case where i > j. This means no boxes to remove, so the function returns 0.

  3. Optimize the process by removing all the same-colored boxes adjacent from the right of boxes[j]. Increment k as we do this and move j to the left.

  4. Calculate the current score by removing the groups of k same-colored boxes:

    • ans = dfs(i, j - 1, 0) + (k + 1) * (k + 1): We get (k + 1) * (k + 1) points for removing these boxes, and we solve the subproblem from the remaining ones.
  5. Iterate over the subarray to consider cases where it might be more profitable to split the action:

    • If element at h is the same color as boxes[j], try removing boxes between h and j first, then combine this action with the removal of all contiguous same-colored boxes starting from h backward to i.
  6. Within each iteration, the function dfs calls itself twice:

    • dfs(h + 1, j - 1, 0): solves for the subarray after skipping boxes that could be merged with boxes[j] for a potential bigger score.
    • dfs(i, h, k + 1): assimilates the boxes[h] and consecutive boxes into the group of same-colored boxes towards the right.
  7. The recursion ensures that all combinations of splits and removals are considered, maximizing the score stored in ans.

  8. The main function initializes the process by calling dfs(0, n - 1, 0) where n is the total number of boxes.

  9. Clear the cache at the end if needed, though in Python3.9+ the cache decorator automatically handles this.

The algorithm uses dynamic programming through memoization to refer back to previously computed results of subproblems. This, combined with the DFS search pattern, guarantees that the solution explores all possible ways to maximize the score while ensuring efficiency by not revisiting solved subproblems.

Note that this solution approach effectively uses memoization to break down a complex recursive problem into subproblems that can be independently solved, storing their results to build up to the final solution.

💪
Level Up Your
Algo Skills

Example Walkthrough

Let's use a simplified example with a smaller set of boxes to illustrate the solution approach. Consider the following array of boxes: [1, 2, 2, 1]. In this array, the numbers represent the colors of the boxes. The goal is to maximize the points earned by removing continuous blocks of same-colored boxes.

  1. We start by considering the entire array [1, 2, 2, 1]. We initialize our recursive function dfs with i=0, j=3, and k=0, which represent the start and end of the array and the initial count of same-colored boxes adjacent to boxes[j].

  2. We observe that we can remove the two middle boxes of color 2 for 2*2 points and then remove the two boxes of color 1 for 1*1 + 1*1 points, for a total of 6 points.

  3. However, the recursive function considers the adjacent matching colors at the end (boxes[j]) and evaluates the optimal strategy. Removing the single 1 box at index 3 first would be suboptimal since combining it with the box of color 1 at index 0 in the future could yield more points.

  4. Our function explores the possibility of combining like-colored boxes. The calling of dfs(0, 2, 1) would return 9 (since removing [2, 2, 1] would yield 3*3 points), which would leave us with just the 1 box at index 0, which could then be removed for an additional 1 point for a cumulative total of 10 points.

  5. The caching mechanism ensures that if we come across another subproblem with the same i, j, and k, we don't recompute the result but use the stored value instead.

  6. The function dfs also considers splits where it might be beneficial to remove a subset of boxes to maximize the overall score. It checks if there are configurations where removing boxes between h and j first (where boxes[h] is the same color as boxes[j]) results in a higher score once the contiguous like-colored boxes are combined and removed together.

  7. For our example, the function will also check the score if we remove box at index 1 or 2 (both of color 2) separately, but it will find that it is more beneficial to remove them together with the box at index 3 (of color 1) afterwards.

  8. After exploring all possibilities, the function returns the maximum score possible, which, for our example, was found to be 10 points.

  9. To conclude, the recursive solution with memoization enabled finds that the optimal sequence of removals for the array [1, 2, 2, 1] is to remove the boxes in the following order:

    • Remove the subarray [2, 2, 1] for 9 points.
    • Remove the remaining box [1] for 1 point.
    • This sequence gives us a total of 10 points, which is the maximum score possible.

By caching intermediate results and using a recursive function that iteratively considers all possible moves, the solution is optimized to find the maximum points achievable for any given array of boxes, no matter the size or sequence.

Python Solution

1from functools import lru_cache
2
3class Solution:
4    def removeBoxes(self, boxes: List[int]) -> int:
5        # Using LRU cache to store results of subproblems
6        @lru_cache(None)
7        def dfs(start, end, continuous_count):
8            # Base case: if the start index is greater than the end index,
9            # no boxes are left, so return 0 points
10            if start > end:
11                return 0
12          
13            # Skip the boxes that are the same from the end, increasing the continuous count.
14            # This step merges the later operations for the same color boxes.
15            while start < end and boxes[end] == boxes[end - 1]:
16                end -= 1
17                continuous_count += 1
18          
19            # Initialize the score by removing the last box (and all its continuous equals)
20            # and solve the subproblem without the last box
21            score = dfs(start, end - 1, 0) + (continuous_count + 1) ** 2
22          
23            # Try to merge with the same color boxes by finding a box of the same
24            # color between the start and end pointers and then combine them
25            for middle in range(start, end):
26                if boxes[middle] == boxes[end]:
27                    # Potentially increase the score by merging the box at 'middle'
28                    # with the same color boxes at the end
29                    score = max(score, dfs(middle + 1, end - 1, 0) + dfs(start, middle, continuous_count + 1))
30            return score
31
32        # Call the DFS for the whole array
33        total_score = dfs(0, len(boxes) - 1, 0)
34      
35        # Clear the cache after computation to free memory
36        dfs.cache_clear()
37      
38        return total_score
39
40# Note that List typing should be imported from typing module by adding:
41# from typing import List
42

Java Solution

1class Solution {
2    // f stores the computed results to avoid repetitive calculations (memoization)
3    private int[][][] memo;
4    // boxes is the array representing the input boxes
5    private int[] boxes;
6
7    public int removeBoxes(int[] boxes) {
8        this.boxes = boxes;
9        int n = boxes.length;
10        memo = new int[n][n][n]; // Initialised with default value 0
11        return dfs(0, n - 1, 0); // Start the depth-first search from the whole range
12    }
13
14    /**
15     * Depth-first search function to find the maximum points possible.
16     *
17     * @param start - the starting index of the range of boxes considered
18     * @param end - the ending index of the range of boxes considered
19     * @param consecutive - the number of boxes of the same color consecutive to boxes[end]
20     * @return the maximum points achievable for the subproblem
21     */
22    private int dfs(int start, int end, int consecutive) {
23        if (start > end) {
24            return 0; // Base case: no boxes left to remove
25        }
26
27        // Optimization: Group all boxes of the same color as boxes[end].
28        while (start < end && boxes[end] == boxes[end - 1]) {
29            --end;
30            ++consecutive;
31        }
32
33        // Check for memoized result to avoid recalculating
34        if (memo[start][end][consecutive] > 0) {
35            return memo[start][end][consecutive];
36        }
37
38        // Calculate score for removing all boxes of the same color as boxes[end]
39        int maxPoints = dfs(start, end - 1, 0) + (consecutive + 1) * (consecutive + 1);
40
41        // Try to increase the score by finding a box with the same color earlier in the array
42        for (int i = start; i < end; ++i) {
43            if (boxes[i] == boxes[end]) {
44                int points = 
45                    dfs(i + 1, end - 1, 0) +  // Points after removing boxes between i and end
46                    dfs(start, i, consecutive + 1); // Points from the current segment including i
47                maxPoints = Math.max(maxPoints, points);
48            }
49        }
50
51        // Store the maximum points in the memo array for future reference
52        memo[start][end][consecutive] = maxPoints;
53        return maxPoints;
54    }
55}
56

C++ Solution

1class Solution {
2public:
3    int removeBoxes(vector<int>& boxes) {
4        int n = boxes.size();
5        // Create a 3D dynamic programming (dp) table filled with zeros
6        // dp[i][j][k] : the maximum points for boxes[i] to boxes[j]
7        // with k boxes which are the same as boxes[j] appended at the end.
8        vector<vector<vector<int>>> dp(n, vector<vector<int>>(n, vector<int>(n, 0)));
9      
10        // Define the recursive depth-first search (dfs) function
11        // It uses memoization to store intermediate results in the dp table.
12        function<int(int, int, int)> dfs = [&](int i, int j, int k) -> int {
13            // If the start index is greater than the end index, no boxes to remove
14            if (i > j) return 0;
15          
16            // Optimize by merging same adjacent boxes at the end
17            while (i < j && boxes[j] == boxes[j - 1]) {
18                --j;
19                ++k;
20            }
21          
22            // If we have computed this state before, return the result from dp table
23            if (dp[i][j][k] != 0) return dp[i][j][k];
24          
25            // Handle the base case - remove boxes[j] along with k same boxes,
26            // then remove boxes from i to j - 1 as a subproblem.
27            int ans = dfs(i, j - 1, 0) + (k + 1) * (k + 1);
28          
29            // Try to find a box boxes[h] with the same color as boxes[j]
30            // to maximize points.
31            for (int h = i; h < j; ++h) {
32                if (boxes[h] == boxes[j]) {
33                    // If such a box is found, maximize the points by splitting
34                    // the problem into: boxes from h + 1 to j - 1 (without boxes[h] and boxes[j])
35                    // and boxes from i to h (with an additional same-colored box added to the end).
36                    ans = max(ans, dfs(h + 1, j - 1, 0) + dfs(i, h, k + 1));
37                }
38            }
39          
40            // Memoize the computed result for the current state
41            dp[i][j][k] = ans;
42            return ans;
43        };
44      
45        // Start the dfs from the first box to the last box with no same-colored box at the end.
46        return dfs(0, n - 1, 0);
47    }
48};
49

Typescript Solution

1// Define the type for the 3D dynamic programming table.
2type DPTable = number[][][];
3
4// The `removeBoxes` function calculates the maximum points for removing boxes.
5// It takes an array of boxes as input and returns the maximum points achievable.
6function removeBoxes(boxes: number[]): number {
7    const n: number = boxes.length;
8    // Create a 3D dynamic programming (dp) table filled with zeros
9    // dp[i][j][k] : the maximum points for boxes[i] to boxes[j]
10    // with k boxes which are the same as boxes[j] appended at the end.
11    const dp: DPTable = Array.from({ length: n }, () =>
12        Array.from({ length: n }, () => Array(n).fill(0))
13    );
14
15    // Define the recursive depth-first search (dfs) function
16    // It uses memoization to store intermediate results in the dp table.
17    const dfs = (i: number, j: number, k: number): number => {
18        // If the start index is greater than the end index, no boxes to remove
19        if (i > j) return 0;
20
21        // Optimize by merging the same adjacent boxes at the end
22        while (i < j && boxes[j] === boxes[j - 1]) {
23            --j;
24            ++k;
25        }
26
27        // If we have computed this state before, return the result from the dp table
28        if (dp[i][j][k] !== 0) return dp[i][j][k];
29
30        // Handle the base case - remove boxes[j] along with k same boxes,
31        // then remove boxes from i to j - 1 as a subproblem.
32        let maxPoints: number = dfs(i, j - 1, 0) + (k + 1) * (k + 1);
33
34        // Try to find a box boxes[h] with the same color as boxes[j]
35        // to maximize points.
36        for (let h: number = i; h < j; h++) {
37            if (boxes[h] === boxes[j]) {
38                // If such a box is found, maximize the points by splitting
39                // the problem into: (1) boxes from h + 1 to j - 1 (without boxes[h] and boxes[j])
40                // and (2) boxes from i to h (with an additional same-colored box added to the end).
41                maxPoints = Math.max(maxPoints, dfs(h + 1, j - 1, 0) + dfs(i, h, k + 1));
42            }
43        }
44
45        // Memoize the computed result for the current state
46        dp[i][j][k] = maxPoints;
47        return maxPoints;
48    };
49
50    // Start the dfs from the first box to the last box with no same-colored boxes at the end.
51    return dfs(0, n - 1, 0);
52}
53
54// Example usage:
55// const boxes: number[] = [1, 3, 2, 2, 2, 3, 4, 3, 1];
56// const maxPoints: number = removeBoxes(boxes);
57// console.log(maxPoints); // Output the result
58

Time and Space Complexity

The function removeBoxes is a dynamic programming problem that uses memoization (decorated with @cache) to optimize overlapping subproblems. The performance of the function is evaluated in terms of its time and space complexity.

Time Complexity

The time complexity of this function is O(n^4). Here's why:

  1. There are n * n * n states represented by the parameters (i, j, k). The state space is cubic because i and j can take n different values (where n is the number of elements in boxes), and k can also take up to n values in the worst case (when all boxes are of the same type).
  2. The main contributing factor in the dfs(i, j, k) function is the loop for h in range(i, j):. This loop runs in O(n) time for each recursive call.
  3. In the worst case, for each state (i, j, k), the loop might call the dfs function twice, making the complexity O(2*n) for these recursive calls.

Multiplying the state space by the cost per state gives us n^3 * 2n, which simplifies to O(n^4).

Space Complexity

The space complexity is O(n^3) for the following reasons:

  1. The memoization cache stores each unique call to dfs(i, j, k), which means it could potentially store all n^3 states if every state ends up being visited.
  2. Each recursive call to dfs adds a new layer to the call stack. In the worst case, the deepest recursive call chain would be n (as the function could be called for every single box), but this does not multiply with the number of states, instead, the O(n) call stack depth adds to the O(n^3) memoization space, which still leads to O(n^3) space complexity overall.

Hence, the memoization dominates the space complexity, and we consider the space complexity to be O(n^3), which corresponds to the number of subproblems that can be cached.

😈
Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫