Facebook Pixel

955. Delete Columns to Make Sorted II

Problem Description

You have an array of n strings where all strings have the same length. Your task is to delete some columns (character positions) from all strings such that the remaining strings are in lexicographic order.

When you delete a column, you remove the character at that position from every string in the array. For instance, if you have strs = ["abcdef","uvwxyz"] and you delete columns at indices {0, 2, 3}, you remove the characters at positions 0, 2, and 3 from both strings, resulting in ["bef", "vyz"].

Your goal is to find the minimum number of columns you need to delete so that after deletion, the strings satisfy the condition: strs[0] <= strs[1] <= strs[2] <= ... <= strs[n - 1] in lexicographic order.

The solution uses a greedy approach that processes each column from left to right. For each column, it checks if keeping that column would violate the lexicographic order. The cut array tracks which adjacent pairs of strings have already been "separated" (meaning their order has been determined by earlier columns).

The algorithm works as follows:

  • For each column j, check all adjacent pairs of strings
  • If a pair hasn't been separated yet (!cut[i]) and keeping this column would make A[i] > A[i+1] at position j, then this column must be deleted
  • If the column is kept, update the cut array for pairs where A[i].charAt(j) < A[i+1].charAt(j), marking them as separated
  • Count the total number of columns that need to be deleted

The key insight is that once two adjacent strings are properly ordered by a column (one has a smaller character than the other at that position), we don't need to check their order for subsequent columns.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that we need to process columns from left to right because lexicographic ordering is determined by the leftmost differing character. Think of it like comparing words in a dictionary - you compare the first letters first, and only look at subsequent letters if the earlier ones are equal.

We start by asking: for each column, should we keep it or delete it? A column must be deleted if keeping it would violate the lexicographic order. But here's the crucial insight: once two adjacent strings have been "separated" (meaning one is definitively less than the other based on columns we've already kept), we don't need to worry about their relative order anymore.

Consider two adjacent strings. As we scan columns from left to right:

  1. If their characters are equal at a column, we need to keep checking future columns
  2. If the first string has a smaller character, they're now properly ordered - we can stop worrying about this pair
  3. If the first string has a larger character, we have a problem - but only if this pair hasn't already been properly ordered by an earlier column

This leads us to the cut array idea. The cut[i] being true means that strs[i] and strs[i+1] have already been properly ordered by some previous column we kept. Once a pair is "cut" (separated), subsequent columns can't mess up their order.

The greedy approach works because deleting a column never helps establish order - it only removes potential ordering information. So we only delete when absolutely necessary: when keeping a column would create a violation for pairs that haven't been separated yet. This minimizes deletions while ensuring the final result maintains lexicographic order.

Learn more about Greedy patterns.

Solution Approach

The implementation uses a greedy algorithm that processes each column from left to right, deciding whether to keep or delete it based on maintaining lexicographic order.

Data Structures:

  • cut[]: A boolean array where cut[i] indicates whether strings at indices i and i+1 have already been properly ordered by previous columns

Algorithm Steps:

  1. Initialize variables:

    • res = 0 to count deleted columns
    • cut[] array initialized to all false
  2. Process each column j from 0 to wordLen-1:

    a. Check if column j should be deleted:

    • For each adjacent pair (i, i+1):
      • If !cut[i] (pair not yet separated) AND A[i].charAt(j) > A[i+1].charAt(j):
        • This column violates order, must delete it
        • Increment res and continue to next column using labeled continue search

    b. If column is kept, update the cut array:

    • For each adjacent pair (i, i+1):
      • If A[i].charAt(j) < A[i+1].charAt(j):
        • Set cut[i] = true (this pair is now properly ordered)
  3. Return res as the minimum number of deletions

Key Implementation Details:

  • The labeled search: loop allows us to immediately skip to the next column when we decide to delete the current one
  • We only check pairs where !cut[i] because once a pair is properly ordered, subsequent columns can't violate their order
  • The algorithm is greedy - we keep every column that doesn't cause a violation, ensuring minimum deletions

Time Complexity: O(n * m) where n is the number of strings and m is the length of each string Space Complexity: O(n) for the cut array

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with strs = ["xga", "xfb", "yfa"].

Initial Setup:

  • We have 3 strings, each with length 3
  • Initialize res = 0 (columns deleted)
  • Initialize cut = [false, false] (tracking pairs: (0,1) and (1,2))

Column 0 (characters: 'x', 'x', 'y'):

  • Check adjacent pairs:
    • Pair (0,1): 'x' vs 'x' β†’ equal, no violation
    • Pair (1,2): 'x' vs 'y' β†’ 'x' < 'y', no violation
  • Column is kept (no deletions needed)
  • Update cut array:
    • Pair (0,1): 'x' = 'x' β†’ remains cut[0] = false
    • Pair (1,2): 'x' < 'y' β†’ set cut[1] = true (this pair is now separated)

Column 1 (characters: 'g', 'f', 'f'):

  • Check adjacent pairs:
    • Pair (0,1): cut[0] = false, so check: 'g' > 'f' β†’ VIOLATION!
    • Must delete this column
  • Increment res = 1
  • Continue to next column (no cut updates)

Column 2 (characters: 'a', 'b', 'a'):

  • Check adjacent pairs:
    • Pair (0,1): cut[0] = false, so check: 'a' < 'b' β†’ no violation
    • Pair (1,2): cut[1] = true β†’ skip check (already separated)
  • Column is kept
  • Update cut array:
    • Pair (0,1): 'a' < 'b' β†’ set cut[0] = true
    • Pair (1,2): already true, no need to update

Result:

  • Deleted 1 column (column 1)
  • After deletion: ["xa", "xb", "ya"] which satisfies "xa" ≀ "xb" ≀ "ya"
  • Return res = 1

The key insight demonstrated: once pair (1,2) was separated at column 0, we never needed to check it again, even though column 2 had 'b' > 'a' for that pair - it didn't matter because their order was already established.

Solution Implementation

1class Solution:
2    def minDeletionSize(self, A: List[str]) -> int:
3        # Handle edge cases: empty list or single string list
4        if not A or len(A) <= 1:
5            return 0
6      
7        num_strings = len(A)
8        string_length = len(A[0])
9        deleted_columns = 0
10      
11        # Track which pairs of adjacent strings are already sorted
12        # If sorted[i] is True, it means A[i] < A[i+1] lexicographically
13        sorted_pairs = [False] * num_strings
14      
15        # Process each column from left to right
16        for col in range(string_length):
17            # Check if current column should be deleted
18            # A column must be deleted if it breaks the sorting order
19            should_delete = False
20          
21            for row in range(num_strings - 1):
22                # Only check pairs that aren't already determined to be sorted
23                if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
24                    # Column breaks sorting order, must delete it
25                    deleted_columns += 1
26                    should_delete = True
27                    break  # Skip to next column
28          
29            # If column was deleted, continue to next column
30            if should_delete:
31                continue
32          
33            # Column is valid, update sorted status for string pairs
34            # Mark pairs as sorted if current column makes them strictly increasing
35            for row in range(num_strings - 1):
36                if A[row][col] < A[row + 1][col]:
37                    sorted_pairs[row] = True
38      
39        return deleted_columns
40
1class Solution {
2    public int minDeletionSize(String[] A) {
3        // Handle edge cases: null or single string array
4        if (A == null || A.length <= 1) {
5            return 0;
6        }
7      
8        int numStrings = A.length;
9        int stringLength = A[0].length();
10        int deletedColumns = 0;
11      
12        // Track which pairs of adjacent strings are already sorted
13        // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
14        boolean[] sorted = new boolean[numStrings];
15      
16        // Process each column from left to right
17        columnLoop:
18        for (int col = 0; col < stringLength; col++) {
19            // Check if current column should be deleted
20            // A column must be deleted if it breaks the sorting order
21            for (int row = 0; row < numStrings - 1; row++) {
22                // Only check pairs that aren't already determined to be sorted
23                if (!sorted[row] && A[row].charAt(col) > A[row + 1].charAt(col)) {
24                    // Column breaks sorting order, must delete it
25                    deletedColumns++;
26                    continue columnLoop;  // Skip to next column
27                }
28            }
29          
30            // Column is valid, update sorted status for string pairs
31            // Mark pairs as sorted if current column makes them strictly increasing
32            for (int row = 0; row < numStrings - 1; row++) {
33                if (A[row].charAt(col) < A[row + 1].charAt(col)) {
34                    sorted[row] = true;
35                }
36            }
37        }
38      
39        return deletedColumns;
40    }
41}
42
1class Solution {
2public:
3    int minDeletionSize(vector<string>& A) {
4        // Handle edge cases: empty array or single string
5        if (A.empty() || A.size() <= 1) {
6            return 0;
7        }
8      
9        int numStrings = A.size();
10        int stringLength = A[0].length();
11        int deletedColumns = 0;
12      
13        // Track which pairs of adjacent strings are already sorted
14        // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
15        vector<bool> sorted(numStrings - 1, false);
16      
17        // Process each column from left to right
18        for (int col = 0; col < stringLength; col++) {
19            // Flag to determine if current column should be deleted
20            bool shouldDelete = false;
21          
22            // Check if current column breaks the sorting order
23            // A column must be deleted if it makes any unsorted pair decrease
24            for (int row = 0; row < numStrings - 1; row++) {
25                // Only check pairs that aren't already determined to be sorted
26                if (!sorted[row] && A[row][col] > A[row + 1][col]) {
27                    // Column breaks sorting order, must delete it
28                    shouldDelete = true;
29                    break;
30                }
31            }
32          
33            if (shouldDelete) {
34                // Increment deletion count and move to next column
35                deletedColumns++;
36                continue;
37            }
38          
39            // Column is valid, update sorted status for string pairs
40            // Mark pairs as sorted if current column makes them strictly increasing
41            for (int row = 0; row < numStrings - 1; row++) {
42                if (A[row][col] < A[row + 1][col]) {
43                    sorted[row] = true;
44                }
45            }
46        }
47      
48        return deletedColumns;
49    }
50};
51
1function minDeletionSize(A: string[]): number {
2    // Handle edge cases: null or single string array
3    if (!A || A.length <= 1) {
4        return 0;
5    }
6  
7    const numStrings: number = A.length;
8    const stringLength: number = A[0].length;
9    let deletedColumns: number = 0;
10  
11    // Track which pairs of adjacent strings are already sorted
12    // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
13    const sorted: boolean[] = new Array(numStrings).fill(false);
14  
15    // Process each column from left to right
16    columnLoop:
17    for (let col = 0; col < stringLength; col++) {
18        // Check if current column should be deleted
19        // A column must be deleted if it breaks the sorting order
20        for (let row = 0; row < numStrings - 1; row++) {
21            // Only check pairs that aren't already determined to be sorted
22            if (!sorted[row] && A[row].charAt(col) > A[row + 1].charAt(col)) {
23                // Column breaks sorting order, must delete it
24                deletedColumns++;
25                continue columnLoop;  // Skip to next column
26            }
27        }
28      
29        // Column is valid, update sorted status for string pairs
30        // Mark pairs as sorted if current column makes them strictly increasing
31        for (let row = 0; row < numStrings - 1; row++) {
32            if (A[row].charAt(col) < A[row + 1].charAt(col)) {
33                sorted[row] = true;
34            }
35        }
36    }
37  
38    return deletedColumns;
39}
40

Time and Space Complexity

Time Complexity: O(n * m)

The algorithm uses nested loops to process the input:

  • The outer loop iterates through each column, running m times where m is the length of each string (wordLen)
  • For each column, there are two inner loops that iterate through the array of strings:
    • The first inner loop checks if the column should be deleted by comparing adjacent strings, running up to n-1 times where n is the number of strings (len)
    • The second inner loop updates the cut array, also running n-1 times
  • In the worst case, both inner loops execute completely for each column

Therefore, the total time complexity is O(m * n) where n is the number of strings and m is the length of each string.

Space Complexity: O(n)

The algorithm uses additional space for:

  • The cut boolean array of size n to track which positions have been "cut" (i.e., where we've already established lexicographic order)
  • A few integer variables (len, wordLen, res) which take O(1) space

The dominant space usage comes from the cut array, making the overall space complexity O(n) where n is the number of strings in the input array.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrectly Handling Equal Characters

Pitfall: A common mistake is to mark pairs as "sorted" when characters are equal at a position. If A[i][col] == A[i+1][col], some might incorrectly set sorted_pairs[i] = True, thinking the strings are in order.

Why it's wrong: Equal characters don't determine the lexicographic order - we need to look at future columns to decide. Setting sorted_pairs[i] = True prematurely would prevent us from checking this pair in future columns, potentially missing violations.

Example:

A = ["ba", "bb"]
# At column 0: both have 'b', strings are equal so far
# If we incorrectly mark as sorted, we'd miss that at column 1: 'a' > 'b'
# This would give us 0 deletions when we actually need 1

Solution: Only mark pairs as sorted when there's a strict inequality (A[i][col] < A[i+1][col]).

2. Forgetting to Skip Remaining Checks After Deletion Decision

Pitfall: After deciding to delete a column, continuing to check remaining pairs or trying to update the sorted_pairs array for that column.

Why it's wrong: Once we decide to delete a column, we should immediately move to the next column without any further processing. Continuing to update sorted_pairs based on a deleted column would be incorrect since those characters won't exist in the final result.

Incorrect Code:

for col in range(string_length):
    for row in range(num_strings - 1):
        if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
            deleted_columns += 1
            # WRONG: Not breaking here, continues checking other pairs
  
    # WRONG: Updates sorted_pairs even after deciding to delete
    for row in range(num_strings - 1):
        if A[row][col] < A[row + 1][col]:
            sorted_pairs[row] = True

Solution: Use a flag or break statement to skip all remaining processing for a deleted column:

should_delete = False
for row in range(num_strings - 1):
    if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
        deleted_columns += 1
        should_delete = True
        break  # Stop checking immediately

if should_delete:
    continue  # Skip to next column without updating sorted_pairs

3. Resetting the sorted_pairs Array

Pitfall: Resetting or reinitializing the sorted_pairs array for each column.

Why it's wrong: The whole point of the sorted_pairs array is to remember which pairs have been definitively ordered by previous columns. Resetting it would lose this crucial information and lead to incorrect results.

Solution: Initialize sorted_pairs only once at the beginning and maintain its state throughout the algorithm. Once a pair is marked as sorted, it stays sorted for all subsequent columns.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

You are given an array of intervals where intervals[i] = [start_i, end_i] represent the start and end of the ith interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More