Facebook Pixel

2547. Minimum Cost to Split an Array

Problem Description

You have an integer array nums and an integer k. Your task is to split this array into some number of non-empty subarrays and find the minimum possible total cost of the split.

To calculate the cost of a split, you need to understand two key concepts:

Trimmed Subarray: When you trim a subarray, you remove all numbers that appear only once in that subarray. For example:

  • trimmed([3,1,2,4,3,4]) = [3,4,3,4] (numbers 1 and 2 appear only once, so they're removed)
  • trimmed([1,2,3,3,3,4,4]) = [3,3,3,4,4] (numbers 1 and 2 appear only once, so they're removed)

Importance Value: The importance value of a subarray is calculated as k + length of trimmed subarray. For instance:

  • If a subarray is [1,2,3,3,3,4,4], its trimmed version is [3,3,3,4,4] with length 5
  • The importance value would be k + 5

Total Cost: The total cost of a split is the sum of importance values of all subarrays in that split.

Your goal is to find a way to split the array nums into subarrays such that the total cost is minimized. Each subarray must be contiguous (elements next to each other in the original array) and non-empty.

For example, if you have nums = [1,2,3,4] and k = 2, you could:

  • Keep it as one subarray [1,2,3,4] where all elements appear once, so trimmed is [], giving importance value 2 + 0 = 2
  • Or split it into [1,2] and [3,4], each having importance value 2 + 0 = 2, for total cost 4
  • Or other combinations

The challenge is finding the optimal split that gives the minimum total cost.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that this is an optimal subproblem structure where we need to make decisions about where to split the array. At each position, we face a choice: where should the current subarray end?

Think of it this way: if we're standing at index i, we need to decide where to make the next cut. We could cut immediately after i, or after i+1, or after i+2, and so on. Each choice creates a subarray from i to some position j, and then we need to solve the same problem for the remaining array starting from j+1.

This naturally leads to a recursive approach: dfs(i) represents "what's the minimum cost to split the array starting from index i?" For each starting position i, we try all possible ending positions j and calculate:

  • The cost of the subarray from i to j
  • Plus the minimum cost to handle the rest of the array from j+1

The tricky part is calculating the importance value efficiently. Instead of actually creating the trimmed array, we can be clever: the length of the trimmed array equals the total length minus the count of numbers that appear only once. So we track:

  • cnt: frequency of each number in the current subarray
  • one: count of numbers that appear exactly once

As we extend the subarray by including nums[j]:

  • If cnt[nums[j]] becomes 1, we increment one (a new unique number)
  • If cnt[nums[j]] becomes 2, we decrement one (a number is no longer unique)

The importance value is then k + (j - i + 1) - one, where (j - i + 1) is the subarray length and one is the count of unique numbers.

Since we might recalculate dfs(i) for the same i multiple times through different splitting paths, we use memoization to cache results and avoid redundant calculations.

Learn more about Dynamic Programming patterns.

Solution Approach

The solution uses dynamic programming with memoization to find the minimum cost. Here's how the implementation works:

Main Function Structure: We define a recursive function dfs(i) that calculates the minimum cost to split the array starting from index i. The answer to our problem is dfs(0).

Base Case: When i >= n (reached or passed the end of array), we return 0 since there's nothing left to split.

Recursive Case: For each starting position i, we try all possible ending positions j from i to n-1:

  1. Initialize tracking variables:

    • cnt = Counter(): A hash table to track frequency of each number in current subarray
    • one = 0: Count of numbers appearing exactly once
    • ans = inf: Store the minimum cost found
  2. Extend the subarray: For each j from i to n-1:

    • Add nums[j] to our frequency counter: cnt[nums[j]] += 1
    • Update the one counter:
      • If cnt[nums[j]] == 1: A new unique number, so one += 1
      • If cnt[nums[j]] == 2: A number is no longer unique, so one -= 1
  3. Calculate cost for this split:

    • Length of current subarray: j - i + 1
    • Length of trimmed subarray: (j - i + 1) - one
    • Importance value: k + (j - i + 1) - one
    • Total cost: k + j - i + 1 - one + dfs(j + 1)
  4. Track minimum: Update ans = min(ans, current_cost)

Memoization: The @cache decorator automatically memoizes the results of dfs(i). This prevents recalculating the same subproblem multiple times, reducing time complexity from exponential to O(n²).

Time Complexity: O(n²) - We have n states for dfs(i), and for each state, we iterate through at most n positions.

Space Complexity: O(n) - For the recursion stack and memoization cache.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with nums = [1,2,1,2,1] and k = 2.

We start with dfs(0), which needs to find the minimum cost to split the entire array starting from index 0.

From index 0, we try different ending positions:

Option 1: Subarray [0,0] = [1]

  • cnt = {1: 1}, one = 1 (number 1 appears once)
  • Trimmed length = 1 - 1 = 0
  • Importance = k + 0 = 2 + 0 = 2
  • Total cost = 2 + dfs(1)

Option 2: Subarray [0,1] = [1,2]

  • Start with cnt = {1: 1}, one = 1
  • Add nums[1]=2: cnt = {1: 1, 2: 1}, one = 2 (both appear once)
  • Trimmed length = 2 - 2 = 0
  • Importance = k + 0 = 2 + 0 = 2
  • Total cost = 2 + dfs(2)

Option 3: Subarray [0,2] = [1,2,1]

  • Start with cnt = {1: 1, 2: 1}, one = 2
  • Add nums[2]=1: cnt = {1: 2, 2: 1}, one = 1 (only 2 appears once now)
  • Trimmed length = 3 - 1 = 2
  • Importance = k + 2 = 2 + 2 = 4
  • Total cost = 4 + dfs(3)

Option 4: Subarray [0,3] = [1,2,1,2]

  • Continue from previous: add nums[3]=2
  • cnt = {1: 2, 2: 2}, one = 0 (no numbers appear once)
  • Trimmed length = 4 - 0 = 4
  • Importance = k + 4 = 2 + 4 = 6
  • Total cost = 6 + dfs(4)

Option 5: Subarray [0,4] = [1,2,1,2,1]

  • Continue: add nums[4]=1
  • cnt = {1: 3, 2: 2}, one = 0 (still no numbers appear once)
  • Trimmed length = 5 - 0 = 5
  • Importance = k + 5 = 2 + 5 = 7
  • Total cost = 7 + dfs(5) = 7 + 0 = 7

Now let's evaluate dfs(2) (needed for Option 2):

  • Subarray [2,2] = [1]: importance = 2, total = 2 + dfs(3)
  • Subarray [2,3] = [1,2]: importance = 2, total = 2 + dfs(4)
  • Subarray [2,4] = [1,2,1]: importance = 4, total = 4 + 0 = 4

And dfs(3):

  • Subarray [3,3] = [2]: importance = 2, total = 2 + dfs(4)
  • Subarray [3,4] = [2,1]: importance = 2, total = 2 + 0 = 2

And dfs(4):

  • Subarray [4,4] = [1]: importance = 2, total = 2 + 0 = 2

Working backwards with memoization:

  • dfs(4) = 2
  • dfs(3) = min(2 + 2, 2) = 2
  • dfs(2) = min(2 + 2, 2 + 2, 4) = 4
  • dfs(1) = min(2 + 4, 2 + 2, 4 + 0, 5) = 4
  • dfs(0) = min(2 + 4, 2 + 4, 4 + 2, 6 + 2, 7) = 6

The minimum cost is 6, achieved by splitting as [1] [2,1,2,1] or [1,2] [1,2,1].

Solution Implementation

1class Solution:
2    def minCost(self, nums: List[int], k: int) -> int:
3        from functools import cache
4        from collections import Counter
5        from typing import List
6      
7        @cache
8        def dp(start_idx):
9            """
10            Dynamic programming function to find minimum cost from index start_idx to end.
11          
12            Args:
13                start_idx: Starting index of the current subarray
14              
15            Returns:
16                Minimum cost to partition from start_idx to end of array
17            """
18            # Base case: if we've processed all elements
19            if start_idx >= n:
20                return 0
21          
22            # Track frequency of elements in current subarray
23            frequency_map = Counter()
24            # Count of elements that appear exactly once
25            unique_count = 0
26            # Initialize minimum cost to infinity
27            min_cost = float('inf')
28          
29            # Try all possible end positions for current subarray
30            for end_idx in range(start_idx, n):
31                current_element = nums[end_idx]
32                frequency_map[current_element] += 1
33              
34                # Update count of unique elements
35                if frequency_map[current_element] == 1:
36                    # Element appears for first time
37                    unique_count += 1
38                elif frequency_map[current_element] == 2:
39                    # Element now appears twice, no longer unique
40                    unique_count -= 1
41              
42                # Calculate cost for current subarray:
43                # k (constant cost) + subarray_length - unique_elements + cost of remaining array
44                subarray_length = end_idx - start_idx + 1
45                importance_score = subarray_length - unique_count
46                current_cost = k + importance_score + dp(end_idx + 1)
47              
48                # Update minimum cost
49                min_cost = min(min_cost, current_cost)
50          
51            return min_cost
52      
53        # Store array length for easier access
54        n = len(nums)
55      
56        # Start dynamic programming from index 0
57        return dp(0)
58
1class Solution {
2    private Integer[] memo;           // Memoization array for dynamic programming
3    private int[] nums;               // Input array
4    private int arrayLength;          // Length of the input array
5    private int fixedCost;           // Fixed cost k for each partition
6  
7    /**
8     * Finds the minimum cost to partition the array.
9     * Each partition has a cost of k + length - unique_elements
10     * 
11     * @param nums The input array to partition
12     * @param k The fixed cost for each partition
13     * @return The minimum total cost to partition the entire array
14     */
15    public int minCost(int[] nums, int k) {
16        this.arrayLength = nums.length;
17        this.fixedCost = k;
18        this.nums = nums;
19        this.memo = new Integer[arrayLength];
20      
21        return findMinCostFromIndex(0);
22    }
23  
24    /**
25     * Recursively finds the minimum cost to partition the array starting from index i.
26     * Uses memoization to avoid redundant calculations.
27     * 
28     * @param startIndex The starting index of the current partition
29     * @return The minimum cost to partition the array from startIndex to the end
30     */
31    private int findMinCostFromIndex(int startIndex) {
32        // Base case: if we've processed all elements
33        if (startIndex >= arrayLength) {
34            return 0;
35        }
36      
37        // Check if we've already computed this subproblem
38        if (memo[startIndex] != null) {
39            return memo[startIndex];
40        }
41      
42        // Track frequency of each element in current partition
43        int[] frequency = new int[arrayLength];
44        int uniqueElements = 0;  // Count of elements appearing exactly once
45        int minCost = Integer.MAX_VALUE;
46      
47        // Try all possible end points for the current partition
48        for (int endIndex = startIndex; endIndex < arrayLength; endIndex++) {
49            int currentElement = nums[endIndex];
50            int currentFrequency = ++frequency[currentElement];
51          
52            // Update count of unique elements
53            if (currentFrequency == 1) {
54                uniqueElements++;
55            } else if (currentFrequency == 2) {
56                // Element is no longer unique (appears twice now)
57                uniqueElements--;
58            }
59          
60            // Calculate cost for current partition [startIndex, endIndex]
61            int partitionLength = endIndex - startIndex + 1;
62            int nonUniqueElements = partitionLength - uniqueElements;
63            int currentPartitionCost = fixedCost + nonUniqueElements;
64          
65            // Total cost = current partition cost + cost of remaining array
66            int totalCost = currentPartitionCost + findMinCostFromIndex(endIndex + 1);
67            minCost = Math.min(minCost, totalCost);
68        }
69      
70        // Store and return the result
71        memo[startIndex] = minCost;
72        return minCost;
73    }
74}
75
1class Solution {
2public:
3    int minCost(vector<int>& nums, int k) {
4        int n = nums.size();
5      
6        // dp[i] stores the minimum cost to split nums[i..n-1]
7        // Using memoization to cache results
8        vector<int> dp(n, 0);
9      
10        // Recursive function with memoization to find minimum cost
11        // starting from index 'startIdx'
12        function<int(int)> findMinCost = [&](int startIdx) -> int {
13            // Base case: if we've processed all elements
14            if (startIdx >= n) {
15                return 0;
16            }
17          
18            // Return cached result if already computed
19            if (dp[startIdx] != 0) {
20                return dp[startIdx];
21            }
22          
23            // frequency[i] tracks the count of element i in current subarray
24            vector<int> frequency(n, 0);
25          
26            // Count of elements that appear exactly once in current subarray
27            int uniqueElements = 0;
28          
29            // Initialize with a large value to find minimum
30            int minCostFromHere = INT_MAX;
31          
32            // Try all possible ending positions for the current subarray
33            for (int endIdx = startIdx; endIdx < n; ++endIdx) {
34                // Update frequency of current element
35                int currentElement = nums[endIdx];
36                int newFrequency = ++frequency[currentElement];
37              
38                // Update count of unique elements
39                if (newFrequency == 1) {
40                    // Element appears for the first time
41                    ++uniqueElements;
42                } else if (newFrequency == 2) {
43                    // Element now appears twice, no longer unique
44                    --uniqueElements;
45                }
46              
47                // Calculate cost for current subarray:
48                // k (fixed cost) + subarray length - unique elements + cost of remaining array
49                int subarrayLength = endIdx - startIdx + 1;
50                int nonUniqueCount = subarrayLength - uniqueElements;
51                int currentCost = k + nonUniqueCount + findMinCost(endIdx + 1);
52              
53                // Update minimum cost
54                minCostFromHere = min(minCostFromHere, currentCost);
55            }
56          
57            // Cache and return the result
58            dp[startIdx] = minCostFromHere;
59            return minCostFromHere;
60        };
61      
62        // Start the recursion from index 0
63        return findMinCost(0);
64    }
65};
66
1/**
2 * Finds the minimum cost to partition an array into subarrays
3 * @param nums - The input array of numbers
4 * @param k - The constant cost added for each subarray
5 * @returns The minimum total cost of partitioning
6 */
7function minCost(nums: number[], k: number): number {
8    const n: number = nums.length;
9    // Memoization array to store minimum costs starting from index i
10    const memo: number[] = new Array(n).fill(0);
11  
12    /**
13     * Recursive function to calculate minimum cost starting from index i
14     * @param startIndex - The starting index of the current subarray
15     * @returns The minimum cost from startIndex to the end of array
16     */
17    const calculateMinCost = (startIndex: number): number => {
18        // Base case: reached beyond array bounds
19        if (startIndex >= n) {
20            return 0;
21        }
22      
23        // Return memoized result if already calculated
24        if (memo[startIndex]) {
25            return memo[startIndex];
26        }
27      
28        // Frequency counter for elements in current subarray
29        const frequencyCount: number[] = new Array(n).fill(0);
30        // Count of elements appearing exactly once
31        let uniqueElements: number = 0;
32        // Initialize minimum cost with a large value
33        let minCostResult: number = 1 << 30;
34      
35        // Try all possible ending positions for current subarray
36        for (let endIndex: number = startIndex; endIndex < n; ++endIndex) {
37            // Update frequency of current element
38            const currentFrequency: number = ++frequencyCount[nums[endIndex]];
39          
40            // Update count of unique elements
41            if (currentFrequency === 1) {
42                // Element appears for the first time
43                ++uniqueElements;
44            } else if (currentFrequency === 2) {
45                // Element now appears twice, no longer unique
46                --uniqueElements;
47            }
48          
49            // Calculate cost for current subarray [startIndex, endIndex]
50            // Cost = k + length - uniqueElements + cost of remaining array
51            const subarrayLength: number = endIndex - startIndex + 1;
52            const currentCost: number = k + subarrayLength - uniqueElements + calculateMinCost(endIndex + 1);
53            minCostResult = Math.min(minCostResult, currentCost);
54        }
55      
56        // Store and return the minimum cost
57        memo[startIndex] = minCostResult;
58        return memo[startIndex];
59    };
60  
61    // Start calculation from index 0
62    return calculateMinCost(0);
63}
64

Time and Space Complexity

Time Complexity: O(n^2)

The algorithm uses dynamic programming with memoization. The dfs(i) function is called for each possible starting position from 0 to n-1, resulting in at most n unique states. For each state dfs(i), the function iterates through all positions from i to n-1, performing O(n-i) operations. In the worst case, this gives us:

  • State 0: iterates through positions 0 to n-1 → n operations
  • State 1: iterates through positions 1 to n-1 → n-1 operations
  • ...
  • State n-1: iterates through position n-1 → 1 operation

The total operations sum up to n + (n-1) + ... + 1 = n(n+1)/2 = O(n^2).

Within each iteration, operations like updating the counter and checking conditions take O(1) time, so they don't affect the overall complexity.

Space Complexity: O(n)

The space complexity consists of:

  1. Recursion stack: In the worst case, the recursion depth can be O(n) when each subarray contains only one element.
  2. Memoization cache: The @cache decorator stores results for at most n different states (one for each starting position i).
  3. Counter object: Within each recursive call, a Counter is used which can store at most O(n) distinct elements in the worst case, but only one Counter exists at any point in the recursion path.

The dominant factor is the recursion stack depth plus the memoization storage, both of which are O(n), resulting in an overall space complexity of O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Unique Element Count Update Logic

The Pitfall: A common mistake is incorrectly updating the unique_count when elements transition between different frequency states. Developers often forget to handle the case when an element's frequency increases beyond 2, leading to incorrect trimmed subarray calculations.

Incorrect Implementation:

# WRONG: Only considering transitions between 1 and 2
if frequency_map[current_element] == 1:
    unique_count += 1
elif frequency_map[current_element] == 2:
    unique_count -= 1
# Missing: What if frequency goes from 2 to 3, 3 to 4, etc.?

Why It's Wrong: When an element appears 3 or more times, the unique count shouldn't change, but some might mistakenly try to update it further or reset it.

Correct Solution: The provided code is actually correct - it only updates unique_count when:

  • Frequency changes from 0→1 (element becomes unique): increment
  • Frequency changes from 1→2 (element no longer unique): decrement
  • Frequency changes from 2→3+ (no change needed): do nothing

2. Off-by-One Error in Importance Value Calculation

The Pitfall: Miscalculating the importance value by confusing the trimmed length formula.

Incorrect Calculation:

# WRONG: Subtracting unique count from k instead of subarray length
importance_score = k - unique_count
current_cost = importance_score + dp(end_idx + 1)

Correct Solution:

# CORRECT: Importance = k + (trimmed_length)
# Where trimmed_length = total_length - unique_elements
subarray_length = end_idx - start_idx + 1
trimmed_length = subarray_length - unique_count
importance_score = k + trimmed_length
current_cost = importance_score + dp(end_idx + 1)

3. Using Global Counter Instead of Local

The Pitfall: Using a single Counter object across all recursive calls, causing frequency counts to bleed between different subarray considerations.

Incorrect Implementation:

class Solution:
    def minCost(self, nums: List[int], k: int) -> int:
        frequency_map = Counter()  # WRONG: Global counter
      
        @cache
        def dp(start_idx):
            # Using the global frequency_map
            for end_idx in range(start_idx, n):
                frequency_map[nums[end_idx]] += 1  # Modifies global state
                # ... rest of logic

Why It's Wrong: This would accumulate counts across different recursive paths, leading to incorrect frequency calculations.

Correct Solution: Create a new Counter for each call to dp():

def dp(start_idx):
    frequency_map = Counter()  # Local to this function call
    unique_count = 0
    # ... rest of logic

4. Forgetting to Reset or Clear Cache Between Test Cases

The Pitfall: When testing multiple cases or submitting to an online judge, the @cache decorator might retain values from previous test cases if the function is defined at class level.

Solution: Either:

  1. Define the cached function inside the main method (as shown in the correct code)
  2. Explicitly clear the cache if needed: dp.cache_clear()
  3. Use manual memoization with a dictionary that's initialized within the method

The provided solution correctly handles this by defining the dp function inside the minCost method, ensuring a fresh cache for each problem instance.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of the following is a min heap?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More