944. Delete Columns to Make Sorted
Problem Description
You have an array of n
strings called strs
, where each string has the same length. These strings can be arranged to form a grid where each string occupies one row.
For example, if strs = ["abc", "bce", "cae"]
, the grid would look like:
abc bce cae
Your task is to find and count the columns that are not sorted in lexicographical order (not in alphabetical order from top to bottom).
Looking at each column:
- Column 0 contains
'a'
,'b'
,'c'
- this is sorted (a ≤ b ≤ c) - Column 1 contains
'b'
,'c'
,'a'
- this is NOT sorted (b ≤ c but c > a) - Column 2 contains
'c'
,'e'
,'e'
- this is sorted (c ≤ e ≤ e)
Since column 1 is not sorted lexicographically, you would need to delete it.
The function should return the total number of columns that need to be deleted because they are not in lexicographical order.
Intuition
To determine if a column needs to be deleted, we need to check if it's sorted lexicographically. The key insight is that we don't need to look at the entire column at once - we can check it incrementally by comparing consecutive pairs of characters.
Think of it this way: a column is sorted if and only if each character is greater than or equal to the one above it. So for a column to be sorted, we need strs[0][j] ≤ strs[1][j] ≤ strs[2][j] ≤ ...
for column j
.
The moment we find any pair where a character is less than the one above it (like strs[i][j] < strs[i-1][j]
), we know this column violates the sorting requirement and must be deleted. Once we find such a violation, there's no need to check the rest of that column - we already know it needs to be deleted.
This leads us to a straightforward approach: iterate through each column, and for each column, compare consecutive pairs of characters from top to bottom. If we find a violation, increment our deletion counter and move to the next column. If we check all pairs in a column without finding a violation, the column is sorted and doesn't need deletion.
The time complexity is O(n * m)
where n
is the number of strings and m
is the length of each string, as we potentially need to check every character in the grid once.
Solution Approach
The implementation follows a column-by-column comparison strategy:
-
Initialize variables: We first get the dimensions of our grid -
m
represents the number of columns (length of each string), andn
represents the number of rows (number of strings). We also initialize a counterans
to track how many columns need to be deleted. -
Iterate through each column: We use the outer loop
for j in range(m)
to traverse each column index from 0 tom-1
. -
Check if column is sorted: For each column
j
, we compare consecutive pairs of characters:- Start from row index 1 (second row) using
for i in range(1, n)
- Compare the current character
strs[i][j]
with the previous row's characterstrs[i-1][j]
- If
strs[i][j] < strs[i-1][j]
, the column is not sorted lexicographically
- Start from row index 1 (second row) using
-
Count unsorted columns: When we find a violation (current character is less than the previous one), we:
- Increment the deletion counter
ans += 1
- Use
break
to exit the inner loop immediately since we've already determined this column needs deletion - Move on to check the next column
- Increment the deletion counter
-
Return the result: After checking all columns, return
ans
which contains the total number of columns that need to be deleted.
The algorithm efficiently identifies unsorted columns with early termination - once a column is found to be unsorted, we don't waste time checking the remaining characters in that column. This optimization ensures we only do the minimum necessary comparisons while maintaining O(n * m)
worst-case complexity when all columns are sorted.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with strs = ["cba", "daf", "ghi"]
.
First, we arrange this as a grid:
cba daf ghi
Initial Setup:
n = 3
(number of strings/rows)m = 3
(length of each string/columns)ans = 0
(counter for columns to delete)
Column 0 (index j=0):
- Characters:
'c'
,'d'
,'g'
- Compare row 1 with row 0:
'd' >= 'c'
✓ (continues) - Compare row 2 with row 1:
'g' >= 'd'
✓ (continues) - Column is sorted, no deletion needed
Column 1 (index j=1):
- Characters:
'b'
,'a'
,'h'
- Compare row 1 with row 0:
'a' < 'b'
✗ (violation found!) - Increment
ans = 1
- Break early (no need to check
'h'
vs'a'
)
Column 2 (index j=2):
- Characters:
'a'
,'f'
,'i'
- Compare row 1 with row 0:
'f' >= 'a'
✓ (continues) - Compare row 2 with row 1:
'i' >= 'f'
✓ (continues) - Column is sorted, no deletion needed
Result: Return ans = 1
(only column 1 needs deletion)
The key efficiency here is that for column 1, as soon as we found 'a' < 'b'
, we immediately marked it for deletion and moved on without checking the remaining pair in that column. This early termination saves unnecessary comparisons while still correctly identifying all unsorted columns.
Solution Implementation
1class Solution:
2 def minDeletionSize(self, strs: List[str]) -> int:
3 # Get dimensions: number of columns (string length) and rows (number of strings)
4 num_cols = len(strs[0])
5 num_rows = len(strs)
6
7 # Counter for columns that need to be deleted
8 columns_to_delete = 0
9
10 # Check each column
11 for col_idx in range(num_cols):
12 # Check if current column is sorted by comparing adjacent elements
13 for row_idx in range(1, num_rows):
14 # If current character is less than previous character in same column,
15 # the column is not sorted and must be deleted
16 if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
17 columns_to_delete += 1
18 break # No need to check rest of column once we find it's unsorted
19
20 return columns_to_delete
21
1class Solution {
2 public int minDeletionSize(String[] strs) {
3 // Get dimensions: number of columns (string length) and rows (array length)
4 int columnCount = strs[0].length();
5 int rowCount = strs.length;
6
7 // Counter for columns that need to be deleted
8 int deletionCount = 0;
9
10 // Iterate through each column
11 for (int columnIndex = 0; columnIndex < columnCount; columnIndex++) {
12 // Check if current column is sorted by comparing adjacent rows
13 for (int rowIndex = 1; rowIndex < rowCount; rowIndex++) {
14 // If character in current row is less than character in previous row,
15 // the column is not sorted in non-decreasing order
16 if (strs[rowIndex].charAt(columnIndex) < strs[rowIndex - 1].charAt(columnIndex)) {
17 // Increment deletion count for this unsorted column
18 deletionCount++;
19 // No need to check remaining rows for this column
20 break;
21 }
22 }
23 }
24
25 // Return total number of columns to delete
26 return deletionCount;
27 }
28}
29
1class Solution {
2public:
3 int minDeletionSize(vector<string>& strs) {
4 // Get dimensions: number of columns (string length) and rows (number of strings)
5 int numColumns = strs[0].size();
6 int numRows = strs.size();
7
8 // Counter for columns that need to be deleted
9 int deletionCount = 0;
10
11 // Iterate through each column
12 for (int col = 0; col < numColumns; ++col) {
13 // Check if current column is sorted in non-decreasing order
14 for (int row = 1; row < numRows; ++row) {
15 // If current character is less than previous character in same column,
16 // the column is not sorted and needs to be deleted
17 if (strs[row][col] < strs[row - 1][col]) {
18 ++deletionCount;
19 break; // No need to check rest of this column
20 }
21 }
22 }
23
24 return deletionCount;
25 }
26};
27
1/**
2 * Finds the minimum number of columns to delete to make each column sorted
3 * @param strs - Array of equal-length strings
4 * @returns Number of columns that need to be deleted
5 */
6function minDeletionSize(strs: string[]): number {
7 // Get the length of each string (number of columns) and the number of strings (number of rows)
8 const columnCount: number = strs[0].length;
9 const rowCount: number = strs.length;
10
11 // Initialize counter for columns to delete
12 let columnsToDelete: number = 0;
13
14 // Iterate through each column
15 for (let columnIndex = 0; columnIndex < columnCount; columnIndex++) {
16 // Check if the current column is sorted by comparing adjacent rows
17 for (let rowIndex = 1; rowIndex < rowCount; rowIndex++) {
18 // If current character is less than previous character in the same column,
19 // the column is not sorted
20 if (strs[rowIndex][columnIndex] < strs[rowIndex - 1][columnIndex]) {
21 // Increment the deletion counter
22 columnsToDelete++;
23 // No need to check further rows for this column
24 break;
25 }
26 }
27 }
28
29 // Return the total number of columns that need to be deleted
30 return columnsToDelete;
31}
32
Time and Space Complexity
The time complexity is O(m * n)
, where m
is the length of each string and n
is the number of strings in the array. This can also be expressed as O(L)
, where L
is the total length of all strings combined (since L = m * n
). The algorithm iterates through each column position (m
iterations) and for each column, it checks up to n-1
adjacent pairs of strings in the worst case.
The space complexity is O(1)
as the algorithm only uses a constant amount of extra space for variables m
, n
, ans
, and loop indices i
and j
, regardless of the input size.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Confusion Between Row and Column Indices
One of the most common mistakes is mixing up row and column indices when accessing characters. Since we're working with a list of strings, strs[i]
gives us the i-th string (row), and strs[i][j]
gives us the j-th character (column) in that string.
Incorrect approach:
# Wrong: Trying to access column first, then row
for col_idx in range(num_cols):
for row_idx in range(1, num_rows):
if strs[col_idx][row_idx] < strs[col_idx][row_idx - 1]: # This will cause IndexError
Correct approach:
# Correct: Access row first (string), then column (character position)
for col_idx in range(num_cols):
for row_idx in range(1, num_rows):
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
2. Forgetting to Break After Finding Unsorted Column
Without the break
statement, the code might count the same column multiple times if it has multiple unsorted pairs.
Incorrect approach:
for col_idx in range(num_cols):
for row_idx in range(1, num_rows):
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
columns_to_delete += 1
# Missing break - will count same column multiple times!
Correct approach:
for col_idx in range(num_cols):
for row_idx in range(1, num_rows):
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
columns_to_delete += 1
break # Essential to avoid counting the same column multiple times
3. Misunderstanding Lexicographical Order
Some might think we need strict increasing order (no equal consecutive elements), but lexicographical order allows equal consecutive elements.
Incorrect interpretation:
# Wrong: Marking column as unsorted if characters are equal if strs[row_idx][col_idx] <= strs[row_idx - 1][col_idx]: # This is wrong! columns_to_delete += 1
Correct interpretation:
# Correct: Only mark as unsorted if current is strictly less than previous if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]: columns_to_delete += 1
4. Starting Inner Loop from Wrong Index
The inner loop should start from index 1 (second row) to compare with the previous row. Starting from 0 would cause an out-of-bounds access when trying to compare with row_idx - 1
.
Incorrect approach:
for row_idx in range(num_rows): # Starting from 0
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]: # row_idx - 1 = -1 when row_idx = 0
Correct approach:
for row_idx in range(1, num_rows): # Starting from 1
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]: # Safe comparison
You are given an array of intervals where intervals[i] = [start_i, end_i]
represent the start and end of the ith
interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!