Facebook Pixel

944. Delete Columns to Make Sorted

Problem Description

You have an array of n strings called strs, where each string has the same length. These strings can be arranged to form a grid where each string occupies one row.

For example, if strs = ["abc", "bce", "cae"], the grid would look like:

abc
bce
cae

Your task is to find and count the columns that are not sorted in lexicographical order (not in alphabetical order from top to bottom).

Looking at each column:

  • Column 0 contains 'a', 'b', 'c' - this is sorted (a ≤ b ≤ c)
  • Column 1 contains 'b', 'c', 'a' - this is NOT sorted (b ≤ c but c > a)
  • Column 2 contains 'c', 'e', 'e' - this is sorted (c ≤ e ≤ e)

Since column 1 is not sorted lexicographically, you would need to delete it.

The function should return the total number of columns that need to be deleted because they are not in lexicographical order.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

To determine if a column needs to be deleted, we need to check if it's sorted lexicographically. The key insight is that we don't need to look at the entire column at once - we can check it incrementally by comparing consecutive pairs of characters.

Think of it this way: a column is sorted if and only if each character is greater than or equal to the one above it. So for a column to be sorted, we need strs[0][j] ≤ strs[1][j] ≤ strs[2][j] ≤ ... for column j.

The moment we find any pair where a character is less than the one above it (like strs[i][j] < strs[i-1][j]), we know this column violates the sorting requirement and must be deleted. Once we find such a violation, there's no need to check the rest of that column - we already know it needs to be deleted.

This leads us to a straightforward approach: iterate through each column, and for each column, compare consecutive pairs of characters from top to bottom. If we find a violation, increment our deletion counter and move to the next column. If we check all pairs in a column without finding a violation, the column is sorted and doesn't need deletion.

The time complexity is O(n * m) where n is the number of strings and m is the length of each string, as we potentially need to check every character in the grid once.

Solution Approach

The implementation follows a column-by-column comparison strategy:

  1. Initialize variables: We first get the dimensions of our grid - m represents the number of columns (length of each string), and n represents the number of rows (number of strings). We also initialize a counter ans to track how many columns need to be deleted.

  2. Iterate through each column: We use the outer loop for j in range(m) to traverse each column index from 0 to m-1.

  3. Check if column is sorted: For each column j, we compare consecutive pairs of characters:

    • Start from row index 1 (second row) using for i in range(1, n)
    • Compare the current character strs[i][j] with the previous row's character strs[i-1][j]
    • If strs[i][j] < strs[i-1][j], the column is not sorted lexicographically
  4. Count unsorted columns: When we find a violation (current character is less than the previous one), we:

    • Increment the deletion counter ans += 1
    • Use break to exit the inner loop immediately since we've already determined this column needs deletion
    • Move on to check the next column
  5. Return the result: After checking all columns, return ans which contains the total number of columns that need to be deleted.

The algorithm efficiently identifies unsorted columns with early termination - once a column is found to be unsorted, we don't waste time checking the remaining characters in that column. This optimization ensures we only do the minimum necessary comparisons while maintaining O(n * m) worst-case complexity when all columns are sorted.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with strs = ["cba", "daf", "ghi"].

First, we arrange this as a grid:

cba
daf
ghi

Initial Setup:

  • n = 3 (number of strings/rows)
  • m = 3 (length of each string/columns)
  • ans = 0 (counter for columns to delete)

Column 0 (index j=0):

  • Characters: 'c', 'd', 'g'
  • Compare row 1 with row 0: 'd' >= 'c' ✓ (continues)
  • Compare row 2 with row 1: 'g' >= 'd' ✓ (continues)
  • Column is sorted, no deletion needed

Column 1 (index j=1):

  • Characters: 'b', 'a', 'h'
  • Compare row 1 with row 0: 'a' < 'b' ✗ (violation found!)
  • Increment ans = 1
  • Break early (no need to check 'h' vs 'a')

Column 2 (index j=2):

  • Characters: 'a', 'f', 'i'
  • Compare row 1 with row 0: 'f' >= 'a' ✓ (continues)
  • Compare row 2 with row 1: 'i' >= 'f' ✓ (continues)
  • Column is sorted, no deletion needed

Result: Return ans = 1 (only column 1 needs deletion)

The key efficiency here is that for column 1, as soon as we found 'a' < 'b', we immediately marked it for deletion and moved on without checking the remaining pair in that column. This early termination saves unnecessary comparisons while still correctly identifying all unsorted columns.

Solution Implementation

1class Solution:
2    def minDeletionSize(self, strs: List[str]) -> int:
3        # Get dimensions: number of columns (string length) and rows (number of strings)
4        num_cols = len(strs[0])
5        num_rows = len(strs)
6      
7        # Counter for columns that need to be deleted
8        columns_to_delete = 0
9      
10        # Check each column
11        for col_idx in range(num_cols):
12            # Check if current column is sorted by comparing adjacent elements
13            for row_idx in range(1, num_rows):
14                # If current character is less than previous character in same column,
15                # the column is not sorted and must be deleted
16                if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
17                    columns_to_delete += 1
18                    break  # No need to check rest of column once we find it's unsorted
19      
20        return columns_to_delete
21
1class Solution {
2    public int minDeletionSize(String[] strs) {
3        // Get dimensions: number of columns (string length) and rows (array length)
4        int columnCount = strs[0].length();
5        int rowCount = strs.length;
6      
7        // Counter for columns that need to be deleted
8        int deletionCount = 0;
9      
10        // Iterate through each column
11        for (int columnIndex = 0; columnIndex < columnCount; columnIndex++) {
12            // Check if current column is sorted by comparing adjacent rows
13            for (int rowIndex = 1; rowIndex < rowCount; rowIndex++) {
14                // If character in current row is less than character in previous row,
15                // the column is not sorted in non-decreasing order
16                if (strs[rowIndex].charAt(columnIndex) < strs[rowIndex - 1].charAt(columnIndex)) {
17                    // Increment deletion count for this unsorted column
18                    deletionCount++;
19                    // No need to check remaining rows for this column
20                    break;
21                }
22            }
23        }
24      
25        // Return total number of columns to delete
26        return deletionCount;
27    }
28}
29
1class Solution {
2public:
3    int minDeletionSize(vector<string>& strs) {
4        // Get dimensions: number of columns (string length) and rows (number of strings)
5        int numColumns = strs[0].size();
6        int numRows = strs.size();
7      
8        // Counter for columns that need to be deleted
9        int deletionCount = 0;
10      
11        // Iterate through each column
12        for (int col = 0; col < numColumns; ++col) {
13            // Check if current column is sorted in non-decreasing order
14            for (int row = 1; row < numRows; ++row) {
15                // If current character is less than previous character in same column,
16                // the column is not sorted and needs to be deleted
17                if (strs[row][col] < strs[row - 1][col]) {
18                    ++deletionCount;
19                    break;  // No need to check rest of this column
20                }
21            }
22        }
23      
24        return deletionCount;
25    }
26};
27
1/**
2 * Finds the minimum number of columns to delete to make each column sorted
3 * @param strs - Array of equal-length strings
4 * @returns Number of columns that need to be deleted
5 */
6function minDeletionSize(strs: string[]): number {
7    // Get the length of each string (number of columns) and the number of strings (number of rows)
8    const columnCount: number = strs[0].length;
9    const rowCount: number = strs.length;
10  
11    // Initialize counter for columns to delete
12    let columnsToDelete: number = 0;
13  
14    // Iterate through each column
15    for (let columnIndex = 0; columnIndex < columnCount; columnIndex++) {
16        // Check if the current column is sorted by comparing adjacent rows
17        for (let rowIndex = 1; rowIndex < rowCount; rowIndex++) {
18            // If current character is less than previous character in the same column,
19            // the column is not sorted
20            if (strs[rowIndex][columnIndex] < strs[rowIndex - 1][columnIndex]) {
21                // Increment the deletion counter
22                columnsToDelete++;
23                // No need to check further rows for this column
24                break;
25            }
26        }
27    }
28  
29    // Return the total number of columns that need to be deleted
30    return columnsToDelete;
31}
32

Time and Space Complexity

The time complexity is O(m * n), where m is the length of each string and n is the number of strings in the array. This can also be expressed as O(L), where L is the total length of all strings combined (since L = m * n). The algorithm iterates through each column position (m iterations) and for each column, it checks up to n-1 adjacent pairs of strings in the worst case.

The space complexity is O(1) as the algorithm only uses a constant amount of extra space for variables m, n, ans, and loop indices i and j, regardless of the input size.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Confusion Between Row and Column Indices

One of the most common mistakes is mixing up row and column indices when accessing characters. Since we're working with a list of strings, strs[i] gives us the i-th string (row), and strs[i][j] gives us the j-th character (column) in that string.

Incorrect approach:

# Wrong: Trying to access column first, then row
for col_idx in range(num_cols):
    for row_idx in range(1, num_rows):
        if strs[col_idx][row_idx] < strs[col_idx][row_idx - 1]:  # This will cause IndexError

Correct approach:

# Correct: Access row first (string), then column (character position)
for col_idx in range(num_cols):
    for row_idx in range(1, num_rows):
        if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:

2. Forgetting to Break After Finding Unsorted Column

Without the break statement, the code might count the same column multiple times if it has multiple unsorted pairs.

Incorrect approach:

for col_idx in range(num_cols):
    for row_idx in range(1, num_rows):
        if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
            columns_to_delete += 1
            # Missing break - will count same column multiple times!

Correct approach:

for col_idx in range(num_cols):
    for row_idx in range(1, num_rows):
        if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
            columns_to_delete += 1
            break  # Essential to avoid counting the same column multiple times

3. Misunderstanding Lexicographical Order

Some might think we need strict increasing order (no equal consecutive elements), but lexicographical order allows equal consecutive elements.

Incorrect interpretation:

# Wrong: Marking column as unsorted if characters are equal
if strs[row_idx][col_idx] <= strs[row_idx - 1][col_idx]:  # This is wrong!
    columns_to_delete += 1

Correct interpretation:

# Correct: Only mark as unsorted if current is strictly less than previous
if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:
    columns_to_delete += 1

4. Starting Inner Loop from Wrong Index

The inner loop should start from index 1 (second row) to compare with the previous row. Starting from 0 would cause an out-of-bounds access when trying to compare with row_idx - 1.

Incorrect approach:

for row_idx in range(num_rows):  # Starting from 0
    if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:  # row_idx - 1 = -1 when row_idx = 0

Correct approach:

for row_idx in range(1, num_rows):  # Starting from 1
    if strs[row_idx][col_idx] < strs[row_idx - 1][col_idx]:  # Safe comparison
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

You are given an array of intervals where intervals[i] = [start_i, end_i] represent the start and end of the ith interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More