2547. Minimum Cost to Split an Array
Problem Description
You have an integer array nums
and an integer k
. Your task is to split this array into some number of non-empty subarrays and find the minimum possible total cost of the split.
To calculate the cost of a split, you need to understand two key concepts:
Trimmed Subarray: When you trim a subarray, you remove all numbers that appear only once in that subarray. For example:
trimmed([3,1,2,4,3,4]) = [3,4,3,4]
(numbers 1 and 2 appear only once, so they're removed)trimmed([1,2,3,3,3,4,4]) = [3,3,3,4,4]
(numbers 1 and 2 appear only once, so they're removed)
Importance Value: The importance value of a subarray is calculated as k + length of trimmed subarray
. For instance:
- If a subarray is
[1,2,3,3,3,4,4]
, its trimmed version is[3,3,3,4,4]
with length 5 - The importance value would be
k + 5
Total Cost: The total cost of a split is the sum of importance values of all subarrays in that split.
Your goal is to find a way to split the array nums
into subarrays such that the total cost is minimized. Each subarray must be contiguous (elements next to each other in the original array) and non-empty.
For example, if you have nums = [1,2,3,4]
and k = 2
, you could:
- Keep it as one subarray
[1,2,3,4]
where all elements appear once, so trimmed is[]
, giving importance value2 + 0 = 2
- Or split it into
[1,2]
and[3,4]
, each having importance value2 + 0 = 2
, for total cost4
- Or other combinations
The challenge is finding the optimal split that gives the minimum total cost.
Intuition
The key insight is that this is an optimal subproblem structure where we need to make decisions about where to split the array. At each position, we face a choice: where should the current subarray end?
Think of it this way: if we're standing at index i
, we need to decide where to make the next cut. We could cut immediately after i
, or after i+1
, or after i+2
, and so on. Each choice creates a subarray from i
to some position j
, and then we need to solve the same problem for the remaining array starting from j+1
.
This naturally leads to a recursive approach: dfs(i)
represents "what's the minimum cost to split the array starting from index i
?" For each starting position i
, we try all possible ending positions j
and calculate:
- The cost of the subarray from
i
toj
- Plus the minimum cost to handle the rest of the array from
j+1
The tricky part is calculating the importance value efficiently. Instead of actually creating the trimmed array, we can be clever: the length of the trimmed array equals the total length minus the count of numbers that appear only once. So we track:
cnt
: frequency of each number in the current subarrayone
: count of numbers that appear exactly once
As we extend the subarray by including nums[j]
:
- If
cnt[nums[j]]
becomes 1, we incrementone
(a new unique number) - If
cnt[nums[j]]
becomes 2, we decrementone
(a number is no longer unique)
The importance value is then k + (j - i + 1) - one
, where (j - i + 1)
is the subarray length and one
is the count of unique numbers.
Since we might recalculate dfs(i)
for the same i
multiple times through different splitting paths, we use memoization to cache results and avoid redundant calculations.
Learn more about Dynamic Programming patterns.
Solution Approach
The solution uses dynamic programming with memoization to find the minimum cost. Here's how the implementation works:
Main Function Structure:
We define a recursive function dfs(i)
that calculates the minimum cost to split the array starting from index i
. The answer to our problem is dfs(0)
.
Base Case:
When i >= n
(reached or passed the end of array), we return 0
since there's nothing left to split.
Recursive Case:
For each starting position i
, we try all possible ending positions j
from i
to n-1
:
-
Initialize tracking variables:
cnt = Counter()
: A hash table to track frequency of each number in current subarrayone = 0
: Count of numbers appearing exactly onceans = inf
: Store the minimum cost found
-
Extend the subarray: For each
j
fromi
ton-1
:- Add
nums[j]
to our frequency counter:cnt[nums[j]] += 1
- Update the
one
counter:- If
cnt[nums[j]] == 1
: A new unique number, soone += 1
- If
cnt[nums[j]] == 2
: A number is no longer unique, soone -= 1
- If
- Add
-
Calculate cost for this split:
- Length of current subarray:
j - i + 1
- Length of trimmed subarray:
(j - i + 1) - one
- Importance value:
k + (j - i + 1) - one
- Total cost:
k + j - i + 1 - one + dfs(j + 1)
- Length of current subarray:
-
Track minimum: Update
ans = min(ans, current_cost)
Memoization:
The @cache
decorator automatically memoizes the results of dfs(i)
. This prevents recalculating the same subproblem multiple times, reducing time complexity from exponential to O(n²)
.
Time Complexity: O(n²)
- We have n
states for dfs(i)
, and for each state, we iterate through at most n
positions.
Space Complexity: O(n)
- For the recursion stack and memoization cache.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with nums = [1,2,1,2,1]
and k = 2
.
We start with dfs(0)
, which needs to find the minimum cost to split the entire array starting from index 0.
From index 0, we try different ending positions:
Option 1: Subarray [0,0] = [1]
cnt = {1: 1}
,one = 1
(number 1 appears once)- Trimmed length = 1 - 1 = 0
- Importance = k + 0 = 2 + 0 = 2
- Total cost = 2 +
dfs(1)
Option 2: Subarray [0,1] = [1,2]
- Start with
cnt = {1: 1}
,one = 1
- Add nums[1]=2:
cnt = {1: 1, 2: 1}
,one = 2
(both appear once) - Trimmed length = 2 - 2 = 0
- Importance = k + 0 = 2 + 0 = 2
- Total cost = 2 +
dfs(2)
Option 3: Subarray [0,2] = [1,2,1]
- Start with
cnt = {1: 1, 2: 1}
,one = 2
- Add nums[2]=1:
cnt = {1: 2, 2: 1}
,one = 1
(only 2 appears once now) - Trimmed length = 3 - 1 = 2
- Importance = k + 2 = 2 + 2 = 4
- Total cost = 4 +
dfs(3)
Option 4: Subarray [0,3] = [1,2,1,2]
- Continue from previous: add nums[3]=2
cnt = {1: 2, 2: 2}
,one = 0
(no numbers appear once)- Trimmed length = 4 - 0 = 4
- Importance = k + 4 = 2 + 4 = 6
- Total cost = 6 +
dfs(4)
Option 5: Subarray [0,4] = [1,2,1,2,1]
- Continue: add nums[4]=1
cnt = {1: 3, 2: 2}
,one = 0
(still no numbers appear once)- Trimmed length = 5 - 0 = 5
- Importance = k + 5 = 2 + 5 = 7
- Total cost = 7 +
dfs(5)
= 7 + 0 = 7
Now let's evaluate dfs(2)
(needed for Option 2):
- Subarray [2,2] = [1]: importance = 2, total = 2 +
dfs(3)
- Subarray [2,3] = [1,2]: importance = 2, total = 2 +
dfs(4)
- Subarray [2,4] = [1,2,1]: importance = 4, total = 4 + 0 = 4
And dfs(3)
:
- Subarray [3,3] = [2]: importance = 2, total = 2 +
dfs(4)
- Subarray [3,4] = [2,1]: importance = 2, total = 2 + 0 = 2
And dfs(4)
:
- Subarray [4,4] = [1]: importance = 2, total = 2 + 0 = 2
Working backwards with memoization:
dfs(4)
= 2dfs(3)
= min(2 + 2, 2) = 2dfs(2)
= min(2 + 2, 2 + 2, 4) = 4dfs(1)
= min(2 + 4, 2 + 2, 4 + 0, 5) = 4dfs(0)
= min(2 + 4, 2 + 4, 4 + 2, 6 + 2, 7) = 6
The minimum cost is 6, achieved by splitting as [1] [2,1,2,1] or [1,2] [1,2,1].
Solution Implementation
1class Solution:
2 def minCost(self, nums: List[int], k: int) -> int:
3 from functools import cache
4 from collections import Counter
5 from typing import List
6
7 @cache
8 def dp(start_idx):
9 """
10 Dynamic programming function to find minimum cost from index start_idx to end.
11
12 Args:
13 start_idx: Starting index of the current subarray
14
15 Returns:
16 Minimum cost to partition from start_idx to end of array
17 """
18 # Base case: if we've processed all elements
19 if start_idx >= n:
20 return 0
21
22 # Track frequency of elements in current subarray
23 frequency_map = Counter()
24 # Count of elements that appear exactly once
25 unique_count = 0
26 # Initialize minimum cost to infinity
27 min_cost = float('inf')
28
29 # Try all possible end positions for current subarray
30 for end_idx in range(start_idx, n):
31 current_element = nums[end_idx]
32 frequency_map[current_element] += 1
33
34 # Update count of unique elements
35 if frequency_map[current_element] == 1:
36 # Element appears for first time
37 unique_count += 1
38 elif frequency_map[current_element] == 2:
39 # Element now appears twice, no longer unique
40 unique_count -= 1
41
42 # Calculate cost for current subarray:
43 # k (constant cost) + subarray_length - unique_elements + cost of remaining array
44 subarray_length = end_idx - start_idx + 1
45 importance_score = subarray_length - unique_count
46 current_cost = k + importance_score + dp(end_idx + 1)
47
48 # Update minimum cost
49 min_cost = min(min_cost, current_cost)
50
51 return min_cost
52
53 # Store array length for easier access
54 n = len(nums)
55
56 # Start dynamic programming from index 0
57 return dp(0)
58
1class Solution {
2 private Integer[] memo; // Memoization array for dynamic programming
3 private int[] nums; // Input array
4 private int arrayLength; // Length of the input array
5 private int fixedCost; // Fixed cost k for each partition
6
7 /**
8 * Finds the minimum cost to partition the array.
9 * Each partition has a cost of k + length - unique_elements
10 *
11 * @param nums The input array to partition
12 * @param k The fixed cost for each partition
13 * @return The minimum total cost to partition the entire array
14 */
15 public int minCost(int[] nums, int k) {
16 this.arrayLength = nums.length;
17 this.fixedCost = k;
18 this.nums = nums;
19 this.memo = new Integer[arrayLength];
20
21 return findMinCostFromIndex(0);
22 }
23
24 /**
25 * Recursively finds the minimum cost to partition the array starting from index i.
26 * Uses memoization to avoid redundant calculations.
27 *
28 * @param startIndex The starting index of the current partition
29 * @return The minimum cost to partition the array from startIndex to the end
30 */
31 private int findMinCostFromIndex(int startIndex) {
32 // Base case: if we've processed all elements
33 if (startIndex >= arrayLength) {
34 return 0;
35 }
36
37 // Check if we've already computed this subproblem
38 if (memo[startIndex] != null) {
39 return memo[startIndex];
40 }
41
42 // Track frequency of each element in current partition
43 int[] frequency = new int[arrayLength];
44 int uniqueElements = 0; // Count of elements appearing exactly once
45 int minCost = Integer.MAX_VALUE;
46
47 // Try all possible end points for the current partition
48 for (int endIndex = startIndex; endIndex < arrayLength; endIndex++) {
49 int currentElement = nums[endIndex];
50 int currentFrequency = ++frequency[currentElement];
51
52 // Update count of unique elements
53 if (currentFrequency == 1) {
54 uniqueElements++;
55 } else if (currentFrequency == 2) {
56 // Element is no longer unique (appears twice now)
57 uniqueElements--;
58 }
59
60 // Calculate cost for current partition [startIndex, endIndex]
61 int partitionLength = endIndex - startIndex + 1;
62 int nonUniqueElements = partitionLength - uniqueElements;
63 int currentPartitionCost = fixedCost + nonUniqueElements;
64
65 // Total cost = current partition cost + cost of remaining array
66 int totalCost = currentPartitionCost + findMinCostFromIndex(endIndex + 1);
67 minCost = Math.min(minCost, totalCost);
68 }
69
70 // Store and return the result
71 memo[startIndex] = minCost;
72 return minCost;
73 }
74}
75
1class Solution {
2public:
3 int minCost(vector<int>& nums, int k) {
4 int n = nums.size();
5
6 // dp[i] stores the minimum cost to split nums[i..n-1]
7 // Using memoization to cache results
8 vector<int> dp(n, 0);
9
10 // Recursive function with memoization to find minimum cost
11 // starting from index 'startIdx'
12 function<int(int)> findMinCost = [&](int startIdx) -> int {
13 // Base case: if we've processed all elements
14 if (startIdx >= n) {
15 return 0;
16 }
17
18 // Return cached result if already computed
19 if (dp[startIdx] != 0) {
20 return dp[startIdx];
21 }
22
23 // frequency[i] tracks the count of element i in current subarray
24 vector<int> frequency(n, 0);
25
26 // Count of elements that appear exactly once in current subarray
27 int uniqueElements = 0;
28
29 // Initialize with a large value to find minimum
30 int minCostFromHere = INT_MAX;
31
32 // Try all possible ending positions for the current subarray
33 for (int endIdx = startIdx; endIdx < n; ++endIdx) {
34 // Update frequency of current element
35 int currentElement = nums[endIdx];
36 int newFrequency = ++frequency[currentElement];
37
38 // Update count of unique elements
39 if (newFrequency == 1) {
40 // Element appears for the first time
41 ++uniqueElements;
42 } else if (newFrequency == 2) {
43 // Element now appears twice, no longer unique
44 --uniqueElements;
45 }
46
47 // Calculate cost for current subarray:
48 // k (fixed cost) + subarray length - unique elements + cost of remaining array
49 int subarrayLength = endIdx - startIdx + 1;
50 int nonUniqueCount = subarrayLength - uniqueElements;
51 int currentCost = k + nonUniqueCount + findMinCost(endIdx + 1);
52
53 // Update minimum cost
54 minCostFromHere = min(minCostFromHere, currentCost);
55 }
56
57 // Cache and return the result
58 dp[startIdx] = minCostFromHere;
59 return minCostFromHere;
60 };
61
62 // Start the recursion from index 0
63 return findMinCost(0);
64 }
65};
66
1/**
2 * Finds the minimum cost to partition an array into subarrays
3 * @param nums - The input array of numbers
4 * @param k - The constant cost added for each subarray
5 * @returns The minimum total cost of partitioning
6 */
7function minCost(nums: number[], k: number): number {
8 const n: number = nums.length;
9 // Memoization array to store minimum costs starting from index i
10 const memo: number[] = new Array(n).fill(0);
11
12 /**
13 * Recursive function to calculate minimum cost starting from index i
14 * @param startIndex - The starting index of the current subarray
15 * @returns The minimum cost from startIndex to the end of array
16 */
17 const calculateMinCost = (startIndex: number): number => {
18 // Base case: reached beyond array bounds
19 if (startIndex >= n) {
20 return 0;
21 }
22
23 // Return memoized result if already calculated
24 if (memo[startIndex]) {
25 return memo[startIndex];
26 }
27
28 // Frequency counter for elements in current subarray
29 const frequencyCount: number[] = new Array(n).fill(0);
30 // Count of elements appearing exactly once
31 let uniqueElements: number = 0;
32 // Initialize minimum cost with a large value
33 let minCostResult: number = 1 << 30;
34
35 // Try all possible ending positions for current subarray
36 for (let endIndex: number = startIndex; endIndex < n; ++endIndex) {
37 // Update frequency of current element
38 const currentFrequency: number = ++frequencyCount[nums[endIndex]];
39
40 // Update count of unique elements
41 if (currentFrequency === 1) {
42 // Element appears for the first time
43 ++uniqueElements;
44 } else if (currentFrequency === 2) {
45 // Element now appears twice, no longer unique
46 --uniqueElements;
47 }
48
49 // Calculate cost for current subarray [startIndex, endIndex]
50 // Cost = k + length - uniqueElements + cost of remaining array
51 const subarrayLength: number = endIndex - startIndex + 1;
52 const currentCost: number = k + subarrayLength - uniqueElements + calculateMinCost(endIndex + 1);
53 minCostResult = Math.min(minCostResult, currentCost);
54 }
55
56 // Store and return the minimum cost
57 memo[startIndex] = minCostResult;
58 return memo[startIndex];
59 };
60
61 // Start calculation from index 0
62 return calculateMinCost(0);
63}
64
Time and Space Complexity
Time Complexity: O(n^2)
The algorithm uses dynamic programming with memoization. The dfs(i)
function is called for each possible starting position from 0
to n-1
, resulting in at most n
unique states. For each state dfs(i)
, the function iterates through all positions from i
to n-1
, performing O(n-i)
operations. In the worst case, this gives us:
- State 0: iterates through positions 0 to n-1 →
n
operations - State 1: iterates through positions 1 to n-1 →
n-1
operations - ...
- State n-1: iterates through position n-1 →
1
operation
The total operations sum up to n + (n-1) + ... + 1 = n(n+1)/2 = O(n^2)
.
Within each iteration, operations like updating the counter and checking conditions take O(1)
time, so they don't affect the overall complexity.
Space Complexity: O(n)
The space complexity consists of:
- Recursion stack: In the worst case, the recursion depth can be
O(n)
when each subarray contains only one element. - Memoization cache: The
@cache
decorator stores results for at mostn
different states (one for each starting positioni
). - Counter object: Within each recursive call, a Counter is used which can store at most
O(n)
distinct elements in the worst case, but only one Counter exists at any point in the recursion path.
The dominant factor is the recursion stack depth plus the memoization storage, both of which are O(n)
, resulting in an overall space complexity of O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Unique Element Count Update Logic
The Pitfall:
A common mistake is incorrectly updating the unique_count
when elements transition between different frequency states. Developers often forget to handle the case when an element's frequency increases beyond 2, leading to incorrect trimmed subarray calculations.
Incorrect Implementation:
# WRONG: Only considering transitions between 1 and 2 if frequency_map[current_element] == 1: unique_count += 1 elif frequency_map[current_element] == 2: unique_count -= 1 # Missing: What if frequency goes from 2 to 3, 3 to 4, etc.?
Why It's Wrong: When an element appears 3 or more times, the unique count shouldn't change, but some might mistakenly try to update it further or reset it.
Correct Solution:
The provided code is actually correct - it only updates unique_count
when:
- Frequency changes from 0→1 (element becomes unique): increment
- Frequency changes from 1→2 (element no longer unique): decrement
- Frequency changes from 2→3+ (no change needed): do nothing
2. Off-by-One Error in Importance Value Calculation
The Pitfall: Miscalculating the importance value by confusing the trimmed length formula.
Incorrect Calculation:
# WRONG: Subtracting unique count from k instead of subarray length importance_score = k - unique_count current_cost = importance_score + dp(end_idx + 1)
Correct Solution:
# CORRECT: Importance = k + (trimmed_length) # Where trimmed_length = total_length - unique_elements subarray_length = end_idx - start_idx + 1 trimmed_length = subarray_length - unique_count importance_score = k + trimmed_length current_cost = importance_score + dp(end_idx + 1)
3. Using Global Counter Instead of Local
The Pitfall: Using a single Counter object across all recursive calls, causing frequency counts to bleed between different subarray considerations.
Incorrect Implementation:
class Solution:
def minCost(self, nums: List[int], k: int) -> int:
frequency_map = Counter() # WRONG: Global counter
@cache
def dp(start_idx):
# Using the global frequency_map
for end_idx in range(start_idx, n):
frequency_map[nums[end_idx]] += 1 # Modifies global state
# ... rest of logic
Why It's Wrong: This would accumulate counts across different recursive paths, leading to incorrect frequency calculations.
Correct Solution:
Create a new Counter for each call to dp()
:
def dp(start_idx):
frequency_map = Counter() # Local to this function call
unique_count = 0
# ... rest of logic
4. Forgetting to Reset or Clear Cache Between Test Cases
The Pitfall:
When testing multiple cases or submitting to an online judge, the @cache
decorator might retain values from previous test cases if the function is defined at class level.
Solution: Either:
- Define the cached function inside the main method (as shown in the correct code)
- Explicitly clear the cache if needed:
dp.cache_clear()
- Use manual memoization with a dictionary that's initialized within the method
The provided solution correctly handles this by defining the dp
function inside the minCost
method, ensuring a fresh cache for each problem instance.
Which of the following is a min heap?
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!