960. Delete Columns to Make Sorted III


Problem Description

The problem provides an array of n strings, strs, with each string being of the same length. The goal is to determine the fewest number of index deletions required so that, after the deletions, the remaining letters within each string form a non-decreasing sequence in lexicographic (alphabetical) order.

To visualize, imagine each string as a row in a grid, with the columns aligned vertically. Deleting an index corresponds to removing a column from this grid. The challenge is to delete the fewest columns necessary so that the letters in each row of the grid strictly increase or stay the same as you move left to right.

The result of the process is a set of indices that, if deleted, would satisfy the condition for every string in the array to be lexicographically ordered from the first to the last character. The task is to return the minimum size of such a set of deletion indices.

Intuition

The approach to this problem involves dynamic programming. Dynamic programming is a strategy for solving problems by breaking them down into simpler subproblems and storing the solutions to these subproblems to avoid redundant work.

Here's the intuition behind the solution:

  1. The problem resembles the classic problem of finding the longest increasing subsequence (LIS), but inverted. Instead of finding the longest subsequence, we're trying to find the minimum number of deletions, which correlates to finding the maximum length of an "ordered" subsequence that doesn't require deletion.

  2. We initialize a dynamic programming array dp where dp[i] represents the length of the longest ordered subsequence ending with the ith character (inclusive).

  3. We iterate over every pair i and j, where i > j. If the jth character is less than or equal to the ith character for all strings, it means we can extend the ordered subsequence ending at j to include i. We update dp[i] if this subsequence is longer than the current one.

  4. The maximum value in the dp array represents the length of the longest possible ordered subsequence that we can obtain without any deletions. Therefore, we subtract this value from the total number of columns n to get the minimum number of deletions required.

This solution essentially finds the longest subsequence of columns that are already in non-decreasing order for all rows and removes the rest, minimizing the number of deletions required while achieving the goal.

Learn more about Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the two traversal algorithms (BFS and DFS) can be used to find whether two nodes are connected?

Solution Approach

The implementation of the solution utilizes dynamic programming, which is evident from the use of the dp array where each dp[i] keeps track of the length of the longest non-decreasing subsequence of characters ending with the ith column. Let's walk through the approach step by step:

  1. Define n to be the length of the strings, which is also the number of columns if each string is considered a row in a grid.

  2. Initialize the dp array of size n with all elements set to 1. Each dp[i] represents the length of the longest non-decreasing subsequence of columns when considering columns 0 to i. Initially, we set them all to 1 since each column by itself can be a subsequence.

  3. Use two nested loops with indices i and j, where i ranges from 1 to n-1 and j ranges from 0 to i-1. These loops are used to find the length of the longest non-decreasing subsequence ending at each column i.

  4. For each pair of i and j, check if the jth column is less than or equal to the ith column for all strings (all(s[j] <= s[i] for s in strs)). This ensures that including the character at column i after j maintains the non-decreasing order.

  5. If the condition is true, update dp[i] to the maximum of its current value or dp[j] + 1. In other words, extend the length of the ordered subsequence ending at j by one to include i.

  6. After filling the dp array, find the length of the longest non-decreasing subsequence by finding the maximum value in dp (max(dp)).

  7. Since we need to find the minimum number of deletions, subtract the length of the longest non-decreasing subsequence from the total number of columns, n - max(dp). This gives us the minimum columns that need to be deleted to ensure all strings are in lexicographic order.

The principle algorithms and patterns used in this solution include:

  • Dynamic Programming: Through the dp array to store intermediate results and avoid redundant computations.
  • Nested Loops: To compare every possible pair of columns.
  • Greedy Choice: At each step, choosing the longest subsequence ending at j that can be extended by i.

By incorporating dynamic programming, the solution effectively leverages previously computed results to build upon and find the longest subsequence, reducing the complexity compared to naive approaches that might check every possible combination of deletions.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What is the worst case running time for finding an element in a binary search tree(not necessarily balanced) of size n?

Example Walkthrough

Let's use a small example to illustrate the solution approach with the given strings array strs = ["cba", "daf", "ghi"].

  1. Since each string has three characters, we have n = 3.

  2. Initialize the dp array with n elements to [1, 1, 1] because each character alone is a non-decreasing subsequence.

  3. We start the nested loop iteration.

    • For i = 1, we compare with j = 0.
      • Compare every string's character at index j with index i.
      • For "cba", "daf", and "ghi", we compare pairs ("c", "b"), ("d", "a"), ("g", "h").
      • Since "c" > "b", "d" > "a", "g" > "h", the condition isn't met and we don't update dp[1].
    • Move to i = 2, again, compare with j = 0 and then j = 1.
      • Comparing index 0 with 2, for "cba" ("c", "a"), "daf" ("d", "f"), and "ghi" ("g", "i"), it's not non-decreasing for all sequences.
      • Now compare index 1 with 2, for "cba" ("b", "a"), "daf" ("a", "f"), and "ghi" ("h", "i"). Only in "ghi" is "h" <= "i". Since not all strings fulfill the condition, dp[2] remains unchanged.
  4. At this point, our dp array is still [1, 1, 1] as no increasing subsequences were found that include more than one column.

  5. The length of the longest non-decreasing subsequence is simply the max value in dp, which is 1.

  6. To find the minimum number of deletions, n - max(dp), we subtract the longest subsequence's length (1) from the total number of columns (3).

Therefore, 3 - 1 = 2 is the minimum number of deletions required. For this example:

  • Deleting the first and second columns ("c" from "cba", "d" from "daf", "g" from "ghi" and "b" from "cba", "a" from "daf", "h" from "ghi") leaves us with ["a", "f", "i"], which are in non-decreasing order for each string.

By following the dynamic programming approach, we efficiently found the minimum number of columns to delete without exhaustively checking every combination. This example illustrates the steps and logic of the solution approach clearly.

Not Sure What to Study? Take the 2-min Quiz:

Which of the following is a min heap?

Python Solution

1from typing import List
2
3class Solution:
4    def minDeletionSize(self, strs: List[str]) -> int:
5        # Calculate the length of the strings (assuming all strings have the same length)
6        str_length = len(strs[0])
7      
8        # Initialize the dynamic programming (DP) array where dp[i] represents the 
9        # length of the longest subsequence that ends with the i-th character.
10        dp = [1] * str_length
11      
12        # Iterate over each character in the strings starting from the second character
13        for i in range(1, str_length):
14            # Compare the current character with all characters before it
15            for j in range(i):
16                # Check if the current character is greater than or equal to all 
17                # corresponding characters in previous columns
18                if all(s[j] <= s[i] for s in strs):
19                    # Update the DP array by considering the length of the subsequence 
20                    # ending at the j-th character plus the current character
21                    dp[i] = max(dp[i], dp[j] + 1)
22      
23        # Calculate the minimum number of columns to delete by subtracting the length 
24        # of the longest increasing subsequence from the total number of columns
25        return str_length - max(dp)
26
27# The above code finds the length of the longest subsequence of characters that is
28# increasing across all strings. Then, it subtracts this length from the total number
29# of columns (characters in a string) to find the minimum number of deletions required.
30

Java Solution

1class Solution {
2    public int minDeletionSize(String[] strs) {
3        int numColumns = strs[0].length(); // Length of the strings, representing the number of columns.
4        int[] longestIncreasingSubsequence = new int[numColumns]; // DP array to store the length of longest increasing subsequences ending at each index.
5        Arrays.fill(longestIncreasingSubsequence, 1); // Initially, each sequence is of length 1 (the character itself).
6        int maxSequenceLength = 1; // Initialize the maximum subsequence length to 1.
7
8        // Iterate over all pairs of columns.
9        for (int currentIndex = 1; currentIndex < numColumns; ++currentIndex) {
10            for (int previousIndex = 0; previousIndex < currentIndex; ++previousIndex) {
11                // If the current column can follow the previous column in the increasing subsequence...
12                if (isNonDecreasingAcrossStrings(previousIndex, currentIndex, strs)) {
13                    // Update the longest subsequence for the current column if it's greater after adding the current column.
14                    longestIncreasingSubsequence[currentIndex] = Math.max(
15                            longestIncreasingSubsequence[currentIndex],
16                            longestIncreasingSubsequence[previousIndex] + 1
17                    );
18                }
19            }
20            // Update maximum subsequence length found so far.
21            maxSequenceLength = Math.max(maxSequenceLength, longestIncreasingSubsequence[currentIndex]);
22        }
23      
24        // The minimum number of deletions is the total columns minus the length of the longest increasing subsequence.
25        return numColumns - maxSequenceLength;
26    }
27
28    // Helper method to check if all strings have a non-decreasing order from column 'prevIndex' to 'currentIndex'.
29    private boolean isNonDecreasingAcrossStrings(int prevIndex, int currentIndex, String[] strs) {
30        for (String str : strs) {
31            if (str.charAt(currentIndex) < str.charAt(prevIndex)) {
32                return false; // If any string has a decreasing pair, return false.
33            }
34        }
35        return true; // All strings have non-decreasing order for this column index pair.
36    }
37}
38

C++ Solution

1#include <vector>
2#include <string>
3#include <algorithm>
4
5using namespace std;
6
7class Solution {
8public:
9    // This method returns the minimum number of columns that need to be deleted
10    // so that the remaining columns are in non-decreasing sorted order
11    int minDeletionSize(vector<string>& strs) {
12        // Get the number of columns in the strings
13        int columnCount = strs[0].size();
14      
15        // Create a dynamic programming table where dp[i] represents the length
16        // of the longest non-decreasing subsequence that ends with column i
17        vector<int> dp(columnCount, 1);
18      
19        // Store the maximum length of non-decreasing subsequence found
20        int maxLength = 1;
21      
22        // Iterate over each column to find the longest non-decreasing subsequence
23        for (int current = 1; current < columnCount; ++current) {
24            for (int previous = 0; previous < current; ++previous) {
25              
26                // If the current column is in non-decreasing order compared to the previous column
27                if (isNonDecreasing(current, previous, strs)) {
28                    // Update dp[current] with the maximum length sequence found thus far
29                    dp[current] = max(dp[current], dp[previous] + 1);
30                }
31            }
32          
33            // Update the maximum length
34            maxLength = max(maxLength, dp[current]);
35        }
36      
37        // The number of columns to delete is the total number minus the length of the longest sequence
38        return columnCount - maxLength;
39    }
40
41    // This helper method checks if column 'current' is in non-decreasing order compared to column 'previous'
42    bool isNonDecreasing(int current, int previous, vector<string>& strs) {
43        // Iterate over each row
44        for (string& s : strs) {
45            // If any corresponding pair of characters in current and previous columns is in decreasing order, return false
46            if (s[current] < s[previous])
47                return false;
48        }
49      
50        // Return true if all pairs are in non-decreasing order
51        return true;
52    }
53};
54

Typescript Solution

1function minDeletionSize(strs: string[]): number {
2    // Get the number of columns in the strings
3    const columnCount = strs[0].length;
4  
5    // Create a dynamic programming array where dp[i] represents the length
6    // of the longest non-decreasing subsequence that ends with column i
7    const dp: number[] = new Array(columnCount).fill(1);
8  
9    // Store the maximum length of non-decreasing subsequence found
10    let maxLength = 1;
11  
12    // Iterate over each column to find the longest non-decreasing subsequence
13    for (let current = 1; current < columnCount; ++current) {
14        for (let previous = 0; previous < current; ++previous) {
15          
16            // If the current column is in non-decreasing order compared to the previous column
17            if (isNonDecreasing(current, previous, strs)) {
18                // Update dp[current] with the maximum length sequence found thus far
19                dp[current] = Math.max(dp[current], dp[previous] + 1);
20            }
21        }
22      
23        // Update the maximum length
24        maxLength = Math.max(maxLength, dp[current]);
25    }
26  
27    // The number of columns to delete is the total number of columns minus the length of the longest sequence
28    return columnCount - maxLength;
29}
30
31// This helper function checks if column 'current' is in non-decreasing order compared to column 'previous'
32function isNonDecreasing(current: number, previous: number, strs: string[]): boolean {
33    // Iterate over each row
34    for (let s of strs) {
35        // If any corresponding pair of characters in current and previous columns is in decreasing order, return false
36        if (s[current] < s[previous]) {
37            return false;
38        }
39    }
40  
41    // Return true if all pairs are in non-decreasing order
42    return true;
43}
44
Fast Track Your Learning with Our Quick Skills Quiz:

What are the most two important steps in writing a depth first search function? (Select 2)

Time and Space Complexity

The provided code performs a dynamic programming approach to find the minimum number of columns that need to be deleted from a list of equal-length strings to ensure that the remaining columns are lexicographically sorted. Here's an analysis of its complexities:

Time Complexity

The time complexity of the code is O(n^2 * m), where n is the length of each string in strs and m is the number of strings. This is because the code uses a nested loop structure where the outer loop runs n times (the number of columns), and the inner loop also runs up to n times for each iteration of the outer loop. Inside the inner loop, there is a comparison that runs m times for checking if every string maintains the lexicographic order for the pair of columns being considered. Multiplying all these together gives O(n^2 * m) as the time complexity.

Space Complexity

The space complexity of the function is O(n), which is due to the dynamic programming array dp of size n, where each element represents the length of the longest subsequence of sorted columns including the current column as the last sorted column. Additional space usage is constant and doesn't scale with input size (n or m), hence the overall space complexity is O(n).

Learn more about how to find time and space complexity quickly.


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫