955. Delete Columns to Make Sorted II
Problem Description
You have an array of n
strings where all strings have the same length. Your task is to delete some columns (character positions) from all strings such that the remaining strings are in lexicographic order.
When you delete a column, you remove the character at that position from every string in the array. For instance, if you have strs = ["abcdef","uvwxyz"]
and you delete columns at indices {0, 2, 3}
, you remove the characters at positions 0, 2, and 3 from both strings, resulting in ["bef", "vyz"]
.
Your goal is to find the minimum number of columns you need to delete so that after deletion, the strings satisfy the condition: strs[0] <= strs[1] <= strs[2] <= ... <= strs[n - 1]
in lexicographic order.
The solution uses a greedy approach that processes each column from left to right. For each column, it checks if keeping that column would violate the lexicographic order. The cut
array tracks which adjacent pairs of strings have already been "separated" (meaning their order has been determined by earlier columns).
The algorithm works as follows:
- For each column
j
, check all adjacent pairs of strings - If a pair hasn't been separated yet (
!cut[i]
) and keeping this column would makeA[i] > A[i+1]
at positionj
, then this column must be deleted - If the column is kept, update the
cut
array for pairs whereA[i].charAt(j) < A[i+1].charAt(j)
, marking them as separated - Count the total number of columns that need to be deleted
The key insight is that once two adjacent strings are properly ordered by a column (one has a smaller character than the other at that position), we don't need to check their order for subsequent columns.
Intuition
The key observation is that we need to process columns from left to right because lexicographic ordering is determined by the leftmost differing character. Think of it like comparing words in a dictionary - you compare the first letters first, and only look at subsequent letters if the earlier ones are equal.
We start by asking: for each column, should we keep it or delete it? A column must be deleted if keeping it would violate the lexicographic order. But here's the crucial insight: once two adjacent strings have been "separated" (meaning one is definitively less than the other based on columns we've already kept), we don't need to worry about their relative order anymore.
Consider two adjacent strings. As we scan columns from left to right:
- If their characters are equal at a column, we need to keep checking future columns
- If the first string has a smaller character, they're now properly ordered - we can stop worrying about this pair
- If the first string has a larger character, we have a problem - but only if this pair hasn't already been properly ordered by an earlier column
This leads us to the cut
array idea. The cut[i]
being true
means that strs[i]
and strs[i+1]
have already been properly ordered by some previous column we kept. Once a pair is "cut" (separated), subsequent columns can't mess up their order.
The greedy approach works because deleting a column never helps establish order - it only removes potential ordering information. So we only delete when absolutely necessary: when keeping a column would create a violation for pairs that haven't been separated yet. This minimizes deletions while ensuring the final result maintains lexicographic order.
Learn more about Greedy patterns.
Solution Approach
The implementation uses a greedy algorithm that processes each column from left to right, deciding whether to keep or delete it based on maintaining lexicographic order.
Data Structures:
cut[]
: A boolean array wherecut[i]
indicates whether strings at indicesi
andi+1
have already been properly ordered by previous columns
Algorithm Steps:
-
Initialize variables:
res = 0
to count deleted columnscut[]
array initialized to allfalse
-
Process each column
j
from 0 towordLen-1
:a. Check if column
j
should be deleted:- For each adjacent pair
(i, i+1)
:- If
!cut[i]
(pair not yet separated) ANDA[i].charAt(j) > A[i+1].charAt(j)
:- This column violates order, must delete it
- Increment
res
and continue to next column using labeledcontinue search
- If
b. If column is kept, update the
cut
array:- For each adjacent pair
(i, i+1)
:- If
A[i].charAt(j) < A[i+1].charAt(j)
:- Set
cut[i] = true
(this pair is now properly ordered)
- Set
- If
- For each adjacent pair
-
Return
res
as the minimum number of deletions
Key Implementation Details:
- The labeled
search:
loop allows us to immediately skip to the next column when we decide to delete the current one - We only check pairs where
!cut[i]
because once a pair is properly ordered, subsequent columns can't violate their order - The algorithm is greedy - we keep every column that doesn't cause a violation, ensuring minimum deletions
Time Complexity: O(n * m)
where n
is the number of strings and m
is the length of each string
Space Complexity: O(n)
for the cut
array
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the algorithm with strs = ["xga", "xfb", "yfa"]
.
Initial Setup:
- We have 3 strings, each with length 3
- Initialize
res = 0
(columns deleted) - Initialize
cut = [false, false]
(tracking pairs: (0,1) and (1,2))
Column 0 (characters: 'x', 'x', 'y'):
- Check adjacent pairs:
- Pair (0,1): 'x' vs 'x' β equal, no violation
- Pair (1,2): 'x' vs 'y' β 'x' < 'y', no violation
- Column is kept (no deletions needed)
- Update
cut
array:- Pair (0,1): 'x' = 'x' β remains
cut[0] = false
- Pair (1,2): 'x' < 'y' β set
cut[1] = true
(this pair is now separated)
- Pair (0,1): 'x' = 'x' β remains
Column 1 (characters: 'g', 'f', 'f'):
- Check adjacent pairs:
- Pair (0,1):
cut[0] = false
, so check: 'g' > 'f' β VIOLATION! - Must delete this column
- Pair (0,1):
- Increment
res = 1
- Continue to next column (no
cut
updates)
Column 2 (characters: 'a', 'b', 'a'):
- Check adjacent pairs:
- Pair (0,1):
cut[0] = false
, so check: 'a' < 'b' β no violation - Pair (1,2):
cut[1] = true
β skip check (already separated)
- Pair (0,1):
- Column is kept
- Update
cut
array:- Pair (0,1): 'a' < 'b' β set
cut[0] = true
- Pair (1,2): already true, no need to update
- Pair (0,1): 'a' < 'b' β set
Result:
- Deleted 1 column (column 1)
- After deletion:
["xa", "xb", "ya"]
which satisfies"xa" β€ "xb" β€ "ya"
- Return
res = 1
The key insight demonstrated: once pair (1,2) was separated at column 0, we never needed to check it again, even though column 2 had 'b' > 'a' for that pair - it didn't matter because their order was already established.
Solution Implementation
1class Solution:
2 def minDeletionSize(self, A: List[str]) -> int:
3 # Handle edge cases: empty list or single string list
4 if not A or len(A) <= 1:
5 return 0
6
7 num_strings = len(A)
8 string_length = len(A[0])
9 deleted_columns = 0
10
11 # Track which pairs of adjacent strings are already sorted
12 # If sorted[i] is True, it means A[i] < A[i+1] lexicographically
13 sorted_pairs = [False] * num_strings
14
15 # Process each column from left to right
16 for col in range(string_length):
17 # Check if current column should be deleted
18 # A column must be deleted if it breaks the sorting order
19 should_delete = False
20
21 for row in range(num_strings - 1):
22 # Only check pairs that aren't already determined to be sorted
23 if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
24 # Column breaks sorting order, must delete it
25 deleted_columns += 1
26 should_delete = True
27 break # Skip to next column
28
29 # If column was deleted, continue to next column
30 if should_delete:
31 continue
32
33 # Column is valid, update sorted status for string pairs
34 # Mark pairs as sorted if current column makes them strictly increasing
35 for row in range(num_strings - 1):
36 if A[row][col] < A[row + 1][col]:
37 sorted_pairs[row] = True
38
39 return deleted_columns
40
1class Solution {
2 public int minDeletionSize(String[] A) {
3 // Handle edge cases: null or single string array
4 if (A == null || A.length <= 1) {
5 return 0;
6 }
7
8 int numStrings = A.length;
9 int stringLength = A[0].length();
10 int deletedColumns = 0;
11
12 // Track which pairs of adjacent strings are already sorted
13 // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
14 boolean[] sorted = new boolean[numStrings];
15
16 // Process each column from left to right
17 columnLoop:
18 for (int col = 0; col < stringLength; col++) {
19 // Check if current column should be deleted
20 // A column must be deleted if it breaks the sorting order
21 for (int row = 0; row < numStrings - 1; row++) {
22 // Only check pairs that aren't already determined to be sorted
23 if (!sorted[row] && A[row].charAt(col) > A[row + 1].charAt(col)) {
24 // Column breaks sorting order, must delete it
25 deletedColumns++;
26 continue columnLoop; // Skip to next column
27 }
28 }
29
30 // Column is valid, update sorted status for string pairs
31 // Mark pairs as sorted if current column makes them strictly increasing
32 for (int row = 0; row < numStrings - 1; row++) {
33 if (A[row].charAt(col) < A[row + 1].charAt(col)) {
34 sorted[row] = true;
35 }
36 }
37 }
38
39 return deletedColumns;
40 }
41}
42
1class Solution {
2public:
3 int minDeletionSize(vector<string>& A) {
4 // Handle edge cases: empty array or single string
5 if (A.empty() || A.size() <= 1) {
6 return 0;
7 }
8
9 int numStrings = A.size();
10 int stringLength = A[0].length();
11 int deletedColumns = 0;
12
13 // Track which pairs of adjacent strings are already sorted
14 // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
15 vector<bool> sorted(numStrings - 1, false);
16
17 // Process each column from left to right
18 for (int col = 0; col < stringLength; col++) {
19 // Flag to determine if current column should be deleted
20 bool shouldDelete = false;
21
22 // Check if current column breaks the sorting order
23 // A column must be deleted if it makes any unsorted pair decrease
24 for (int row = 0; row < numStrings - 1; row++) {
25 // Only check pairs that aren't already determined to be sorted
26 if (!sorted[row] && A[row][col] > A[row + 1][col]) {
27 // Column breaks sorting order, must delete it
28 shouldDelete = true;
29 break;
30 }
31 }
32
33 if (shouldDelete) {
34 // Increment deletion count and move to next column
35 deletedColumns++;
36 continue;
37 }
38
39 // Column is valid, update sorted status for string pairs
40 // Mark pairs as sorted if current column makes them strictly increasing
41 for (int row = 0; row < numStrings - 1; row++) {
42 if (A[row][col] < A[row + 1][col]) {
43 sorted[row] = true;
44 }
45 }
46 }
47
48 return deletedColumns;
49 }
50};
51
1function minDeletionSize(A: string[]): number {
2 // Handle edge cases: null or single string array
3 if (!A || A.length <= 1) {
4 return 0;
5 }
6
7 const numStrings: number = A.length;
8 const stringLength: number = A[0].length;
9 let deletedColumns: number = 0;
10
11 // Track which pairs of adjacent strings are already sorted
12 // If sorted[i] is true, it means A[i] < A[i+1] lexicographically
13 const sorted: boolean[] = new Array(numStrings).fill(false);
14
15 // Process each column from left to right
16 columnLoop:
17 for (let col = 0; col < stringLength; col++) {
18 // Check if current column should be deleted
19 // A column must be deleted if it breaks the sorting order
20 for (let row = 0; row < numStrings - 1; row++) {
21 // Only check pairs that aren't already determined to be sorted
22 if (!sorted[row] && A[row].charAt(col) > A[row + 1].charAt(col)) {
23 // Column breaks sorting order, must delete it
24 deletedColumns++;
25 continue columnLoop; // Skip to next column
26 }
27 }
28
29 // Column is valid, update sorted status for string pairs
30 // Mark pairs as sorted if current column makes them strictly increasing
31 for (let row = 0; row < numStrings - 1; row++) {
32 if (A[row].charAt(col) < A[row + 1].charAt(col)) {
33 sorted[row] = true;
34 }
35 }
36 }
37
38 return deletedColumns;
39}
40
Time and Space Complexity
Time Complexity: O(n * m)
The algorithm uses nested loops to process the input:
- The outer loop iterates through each column, running
m
times wherem
is the length of each string (wordLen
) - For each column, there are two inner loops that iterate through the array of strings:
- The first inner loop checks if the column should be deleted by comparing adjacent strings, running up to
n-1
times wheren
is the number of strings (len
) - The second inner loop updates the
cut
array, also runningn-1
times
- The first inner loop checks if the column should be deleted by comparing adjacent strings, running up to
- In the worst case, both inner loops execute completely for each column
Therefore, the total time complexity is O(m * n)
where n
is the number of strings and m
is the length of each string.
Space Complexity: O(n)
The algorithm uses additional space for:
- The
cut
boolean array of sizen
to track which positions have been "cut" (i.e., where we've already established lexicographic order) - A few integer variables (
len
,wordLen
,res
) which takeO(1)
space
The dominant space usage comes from the cut
array, making the overall space complexity O(n)
where n
is the number of strings in the input array.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrectly Handling Equal Characters
Pitfall: A common mistake is to mark pairs as "sorted" when characters are equal at a position. If A[i][col] == A[i+1][col]
, some might incorrectly set sorted_pairs[i] = True
, thinking the strings are in order.
Why it's wrong: Equal characters don't determine the lexicographic order - we need to look at future columns to decide. Setting sorted_pairs[i] = True
prematurely would prevent us from checking this pair in future columns, potentially missing violations.
Example:
A = ["ba", "bb"] # At column 0: both have 'b', strings are equal so far # If we incorrectly mark as sorted, we'd miss that at column 1: 'a' > 'b' # This would give us 0 deletions when we actually need 1
Solution: Only mark pairs as sorted when there's a strict inequality (A[i][col] < A[i+1][col]
).
2. Forgetting to Skip Remaining Checks After Deletion Decision
Pitfall: After deciding to delete a column, continuing to check remaining pairs or trying to update the sorted_pairs
array for that column.
Why it's wrong: Once we decide to delete a column, we should immediately move to the next column without any further processing. Continuing to update sorted_pairs
based on a deleted column would be incorrect since those characters won't exist in the final result.
Incorrect Code:
for col in range(string_length):
for row in range(num_strings - 1):
if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
deleted_columns += 1
# WRONG: Not breaking here, continues checking other pairs
# WRONG: Updates sorted_pairs even after deciding to delete
for row in range(num_strings - 1):
if A[row][col] < A[row + 1][col]:
sorted_pairs[row] = True
Solution: Use a flag or break statement to skip all remaining processing for a deleted column:
should_delete = False
for row in range(num_strings - 1):
if not sorted_pairs[row] and A[row][col] > A[row + 1][col]:
deleted_columns += 1
should_delete = True
break # Stop checking immediately
if should_delete:
continue # Skip to next column without updating sorted_pairs
3. Resetting the sorted_pairs Array
Pitfall: Resetting or reinitializing the sorted_pairs
array for each column.
Why it's wrong: The whole point of the sorted_pairs
array is to remember which pairs have been definitively ordered by previous columns. Resetting it would lose this crucial information and lead to incorrect results.
Solution: Initialize sorted_pairs
only once at the beginning and maintain its state throughout the algorithm. Once a pair is marked as sorted, it stays sorted for all subsequent columns.
You are given an array of intervals where intervals[i] = [start_i, end_i]
represent the start and end of the ith
interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Donβt Miss This!