1143. Longest Common Subsequence


Problem Description

In this LeetCode problem, we are dealing with finding the longest common subsequence between two strings text1 and text2. A subsequence is defined as a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements. A common subsequence means a sequence that exists in both strings.

The challenge is to determine the length of the longest sequence that exists in both text1 and text2. If there is no such sequence, then the result should be 0.

The task is not to be confused with finding common substrings (which are required to occupy consecutive positions within the original strings). Subsequences are more forgiving as they are not bound by the need for continuity.

Intuition

The solution leverages Dynamic Programming (DP), a method for solving complex problems by breaking them down into simpler subproblems. The idea is to build up a solution using previously computed results for smaller problems.

To approach this, we create a 2D array, f, with dimensions (m+1) x (n+1), where m and n are the lengths of text1 and text2, respectively. The value of f[i][j] will hold the length of the longest common subsequence between the first i characters of text1 and the first j characters of text2.

We iterate over both strings and fill this table based on the following rules:

  • If the characters at the current position in both strings match, then the longest common subsequence would be that of the previous characters of both strings plus one (because we include this matching character).
  • If the characters do not match, we take the maximum of two possible cases:
    • Including one less character from text1 and the current number of characters from text2.
    • Including the current number of characters from text1 and one less from text2.

The final answer, the length of the longest common subsequence, will be the value stored in f[m][n] after the entire table is filled.

Learn more about Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

What is the worst case running time for finding an element in a binary tree (not necessarily binary search tree) of size n?

Solution Approach

The implemented solution uses a two-dimensional dynamic programming table f to store the lengths of longest common subsequences for different parts of text1 and text2. The structure of f makes it easy to build the solution incrementally. Each entry f[i][j] in this table represents the length of the longest common subsequence between the first i characters of text1 and the first j characters of text2.

The dynamic programming algorithm follows these steps:

  1. Initialize a 2D array f with (m + 1) rows and (n + 1) columns to 0, where m is the length of text1 and n is the length of text2. The extra row and column are used to handle the base case where one of the strings is of length 0, which naturally results in a common subsequence of length 0.

  2. Iterate over the rows of the array (corresponding to each character in text1) and the columns (corresponding to each character in text2), starting from index 1 to include the first character from each string.

  3. At each (i, j) position, check if characters text1[i - 1] and text2[j - 1] are equal. If they are, set f[i][j] = f[i - 1][j - 1] + 1. This step follows from the fact that if the current characters match, the longest common subsequence would extend from the longest subsequence obtained without including these characters, hence we add 1.

  4. If the characters at text1[i - 1] and text2[j - 1] aren't equal, we have to choose the longer subsequence that we could obtain by either discarding the current character from text1 or text2. Thus we set f[i][j] = max(f[i - 1][j], f[i][j - 1]).

  5. After completing the iteration process, the value f[m][n] gives the length of the longest common subsequence of text1 and text2.

This algorithm uses dynamic programming's bottom-up approach, where we start by solving the smallest subproblems (i.e., subsequences of length 0 or 1) and use their solutions to build up to the answer for the entire sequence. The choice between when to add + 1 to the subsequence length or when to take the maximum of two choices is determined by the match between characters.

The beauty of this approach lies in its efficiency and the fact that it guarantees the optimal solution by systematically exploring all possibilities and making the optimal choice at each step, based on the choices made in the previous steps.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What are the most two important steps in writing a depth first search function? (Select 2)

Example Walkthrough

Let's take a small example with text1 = "abcde" and text2 = "ace". Here, we want to find the length of the longest common subsequence between these two strings.

  1. We initialize a 2D array f with dimensions (5 + 1) x (3 + 1) to 0, resulting in a table with 6 rows and 4 columns. Initially, the table is filled with zeros. The extra row and column act as base cases for subproblems where one of the strings is empty.

  2. We start iterating from i = 1 to 5 (for text1) and j = 1 to 3 (for text2). For understanding, let's describe each character's ASCII value from each string: text1 'a'=97, 'b'=98, etc.; text2 'a'=97, 'c'=99, 'e'=101.

  3. When i = 1 and j = 1, we compare text1[i - 1] (which is 'a') with text2[j - 1] (which is also 'a'). Since they match, we set f[i][j] to f[i - 1][j - 1] + 1. Thus f[1][1] = f[0][0] + 1 = 1.

  4. We continue this process. When i = 2 and j = 1, the characters text1[i - 1] (which is 'b') and text2[j - 1] (which is 'a') do not match. We compare f[i - 1][j] and f[i][j - 1] (both are 1), and take the maximum, which is 1. So f[2][1] remains 1.

  5. The process repeats. At i = 3, j = 2, 'c' matches 'c', so f[3][2] = f[2][1] + 1 = 2.

After iterating through all the characters, the table f would look like this:

1 0  0  0  0
2 0  1  1  1
3 0  1  1  1
4 0  1  2  2
5 0  1  2  2
6 0  1  2  3
  1. The last cell f[5][3] contains the value 3, which is the length of the longest common subsequence of text1 and text2. This sequence can be read off from the table by tracing back the cells that gave rise to the maximum values. In this case, the sequence is "ace" which is indeed the longest common subsequence of text1 and text2.

This example walkthrough illustrates how dynamic programming methodically fills out the 2D table to come up with the length of the longest common subsequence. Starting from small subproblems, it builds up the solution to the entire problem, ensuring that all possibilities are considered and the most optimal decision is made at every step.

Solution Implementation

1class Solution:
2    def longestCommonSubsequence(self, text1: str, text2: str) -> int:
3        # Get the lengths of both input strings
4        len_text1, len_text2 = len(text1), len(text2)
5      
6        # Initialize a 2D array (list of lists) with zeros for dynamic programming
7        # The array has (len_text1 + 1) rows and (len_text2 + 1) columns
8        dp_matrix = [[0] * (len_text2 + 1) for _ in range(len_text1 + 1)]
9      
10        # Loop through each character index of text1 and text2
11        for i in range(1, len_text1 + 1):
12            for j in range(1, len_text2 + 1):
13                # If the characters match, take the diagonal value and add 1
14                if text1[i - 1] == text2[j - 1]:
15                    dp_matrix[i][j] = dp_matrix[i - 1][j - 1] + 1
16                else:
17                    # If the characters do not match, take the maximum of the value from the left and above
18                    dp_matrix[i][j] = max(dp_matrix[i - 1][j], dp_matrix[i][j - 1])
19      
20        # The bottom-right value in the matrix contains the length of the longest common subsequence
21        return dp_matrix[len_text1][len_text2]
22
1class Solution {
2    public int longestCommonSubsequence(String text1, String text2) {
3        // Lengths of the input strings
4        int length1 = text1.length();
5        int length2 = text2.length();
6      
7        // Create a 2D array to store the lengths of longest common subsequences
8        // for all subproblems, initialized with zero
9        int[][] dp = new int[length1 + 1][length2 + 1];
10
11        // Build the dp array from the bottom up
12        for (int i = 1; i <= length1; ++i) {
13            for (int j = 1; j <= length2; ++j) {
14                // If characters match, take diagonal value and add 1
15                if (text1.charAt(i - 1) == text2.charAt(j - 1)) {
16                    dp[i][j] = dp[i - 1][j - 1] + 1;
17                }
18                // If characters do not match, take the maximum value from 
19                // the left (dp[i][j-1]) or above (dp[i-1][j])
20                else {
21                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
22                }
23            }
24        }
25        // The bottom-right cell contains the length of the longest
26        // common subsequence of text1 and text2
27        return dp[length1][length2];
28    }
29}
30
1class Solution {
2public:
3    // Function to find the length of the longest common subsequence in two strings.
4    int longestCommonSubsequence(string text1, string text2) {
5        int text1Length = text1.size(), text2Length = text2.size();
6        // Create a 2D array to store lengths of common subsequence at each index.
7        int dp[text1Length + 1][text2Length + 1];
8      
9        // Initialize the 2D array with zero.
10        memset(dp, 0, sizeof dp);
11      
12        // Loop through both strings and fill the dp array.
13        for (int i = 1; i <= text1Length; ++i) {
14            for (int j = 1; j <= text2Length; ++j) {
15                // If current characters match, add 1 to the length of the sequence
16                // until the previous character from both strings.
17                if (text1[i - 1] == text2[j - 1]) {
18                    dp[i][j] = dp[i - 1][j - 1] + 1;
19                } else {
20                    // If current characters do not match, take the maximum length
21                    // achieved by either skipping the current character of text1 or text2.
22                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
23                }
24            }
25        }
26      
27        // Return the value in the bottom-right cell which contains the
28        // length of the longest common subsequence for the entire strings.
29        return dp[text1Length][text2Length];
30    }
31};
32
1// Defines a function that computes the length of the longest common subsequence
2// between two strings, 'text1' and 'text2'.
3function longestCommonSubsequence(text1: string, text2: string): number {
4    const text1Length = text1.length; // length of the first input text
5    const text2Length = text2.length; // length of the second input text
6    // Create a 2D array 'dp' (for Dynamic Programming) to store lengths of
7    // subsequences. +1 is for the base case where one of the strings is empty.
8    const dp: number[][] = Array.from({ length: text1Length + 1 }, () => Array(text2Length + 1).fill(0));
9
10    // Populate the 'dp' array
11    for (let i = 1; i <= text1Length; i++) {
12        for (let j = 1; j <= text2Length; j++) {
13            // Check if the current character of 'text1' matches with the current
14            // character of 'text2'
15            if (text1[i - 1] === text2[j - 1]) {
16                // If there's a match, increment the length of the subsequence
17                // considering characters up to i and j are part of the subsequence
18                dp[i][j] = dp[i - 1][j - 1] + 1;
19            } else {
20                // If there's no match, take the maximum length from either
21                // excluding the current character of 'text1' or 'text2'
22                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
23            }
24        }
25    }
26  
27    // The bottom-right value in 'dp' matrix contains the length of the longest
28    // common subsequence of 'text1' and 'text2'
29    return dp[text1Length][text2Length];
30}
31
Not Sure What to Study? Take the 2-min Quiz

Which type of traversal does breadth first search do?

Time and Space Complexity

The given code defines a function longestCommonSubsequence that calculates the length of the longest common subsequence between two strings, text1 and text2. Here is an analysis of its time and space complexities:

  • Time Complexity: The function uses two nested loops that iterate over the lengths of text1 and text2. Each cell f[i][j] is computed only once and in constant time. Therefore, the total number of operations is proportional to the product of the lengths of the two strings. This results in a time complexity of O(m * n) where m is the length of text1 and n is the length of text2.

  • Space Complexity: Space is allocated for a 2D list f with (m + 1) * (n + 1) elements, where each element takes up constant space. Considering m and n as the lengths of text1 and text2 respectively, the space complexity is O(m * n), because it is proportional to the product of the two lengths.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which algorithm is best for finding the shortest distance between two points in an unweighted graph?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«