1062. Longest Repeating Substring

MediumStringBinary SearchDynamic ProgrammingSuffix ArrayHash FunctionRolling Hash
Leetcode Link

Problem Description

The problem asks for the length of the longest repeating substring within a given string s. A repeating substring is defined as a sequence of characters that appears more than once in the original string. To clarify, if the string was "abab", the longest repeating substring would be "ab", which has a length of 2. If no such repeating substring exists, such as in a string with all unique characters like "abc", the function should return 0.

Intuition

The solution to this problem involves dynamic programming, a method for solving complex problems by breaking them down into simpler subproblems. The intuition behind using dynamic programming for this problem is the idea of comparing characters at different positions and building a table that records the lengths of repeating substrings ended at those positions.

This is accomplished by initializing a square matrix dp with the dimensions of the string's length, where dp[i][j] will be used to store the length of the longest repeating substring that ends with the characters s[i] and s[j]. We can then iterate over each position i in the string, and for each i, iterate again through each position j from i+1 to the end of the string. If the characters at positions i and j are the same, it means there's a repeating character, and we can build upon the length of previously found repeating substrings.

Specifically, if s[i] equals s[j], then dp[i][j] is the length of the repeating substring that ended just before i and j (which is dp[i - 1][j - 1]), plus 1 for the current matching characters. If i is 0, there's no previous character, so dp[i][j] is just set to 1. Along the way, we keep track of the maximum length found in variable ans.

Every time we find a matching pair of characters, we update ans to hold the maximum length of any repeating substring found so far. Through this process, we arrive at the longest repeating substring's length without having to check each possible substring individually.

Learn more about Binary Search and Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

What's the output of running the following function using input 56?

1KEYBOARD = {
2    '2': 'abc',
3    '3': 'def',
4    '4': 'ghi',
5    '5': 'jkl',
6    '6': 'mno',
7    '7': 'pqrs',
8    '8': 'tuv',
9    '9': 'wxyz',
10}
11
12def letter_combinations_of_phone_number(digits):
13    def dfs(path, res):
14        if len(path) == len(digits):
15            res.append(''.join(path))
16            return
17
18        next_number = digits[len(path)]
19        for letter in KEYBOARD[next_number]:
20            path.append(letter)
21            dfs(path, res)
22            path.pop()
23
24    res = []
25    dfs([], res)
26    return res
27
1private static final Map<Character, char[]> KEYBOARD = Map.of(
2    '2', "abc".toCharArray(),
3    '3', "def".toCharArray(),
4    '4', "ghi".toCharArray(),
5    '5', "jkl".toCharArray(),
6    '6', "mno".toCharArray(),
7    '7', "pqrs".toCharArray(),
8    '8', "tuv".toCharArray(),
9    '9', "wxyz".toCharArray()
10);
11
12public static List<String> letterCombinationsOfPhoneNumber(String digits) {
13    List<String> res = new ArrayList<>();
14    dfs(new StringBuilder(), res, digits.toCharArray());
15    return res;
16}
17
18private static void dfs(StringBuilder path, List<String> res, char[] digits) {
19    if (path.length() == digits.length) {
20        res.add(path.toString());
21        return;
22    }
23    char next_digit = digits[path.length()];
24    for (char letter : KEYBOARD.get(next_digit)) {
25        path.append(letter);
26        dfs(path, res, digits);
27        path.deleteCharAt(path.length() - 1);
28    }
29}
30
1const KEYBOARD = {
2    '2': 'abc',
3    '3': 'def',
4    '4': 'ghi',
5    '5': 'jkl',
6    '6': 'mno',
7    '7': 'pqrs',
8    '8': 'tuv',
9    '9': 'wxyz',
10}
11
12function letter_combinations_of_phone_number(digits) {
13    let res = [];
14    dfs(digits, [], res);
15    return res;
16}
17
18function dfs(digits, path, res) {
19    if (path.length === digits.length) {
20        res.push(path.join(''));
21        return;
22    }
23    let next_number = digits.charAt(path.length);
24    for (let letter of KEYBOARD[next_number]) {
25        path.push(letter);
26        dfs(digits, path, res);
27        path.pop();
28    }
29}
30

Solution Approach

The implementation of the solution uses dynamic programming, explicitly utilizing a 2D array (referred to as a matrix) for storing the lengths of repeating substrings. Here's how the code works:

  1. Initialize a 2D list (matrix) dp, where each element dp[i][j] represents the length of the longest repeating substring that ends with s[i] and s[j]. Set the entire matrix to 0 initially because we haven't found any repeating substrings yet.

  2. Set a variable ans to 0. This variable will keep track of the length of the longest repeating substring found so far.

  3. Iterate over each character in the string using two nested loops:

    a. The outer loop goes through each character in the string with index i.

    b. The inner loop starts from i+1 and goes to the end of the string with index j.

    c. For each pair of indices i and j, check if s[i] is equal to s[j]. If they are:

    1 i. If `i` is greater than 0, update `dp[i][j]` to be `dp[i - 1][j - 1] + 1`. This is because a matching pair of characters extends the length of the repeating substring by 1 compared to the substring that ended just before these indices.
    2
    3 ii. If `i` is equal to 0, set `dp[i][j]` to 1 since there are no previous characters to consider, and thus we have found a repeating substring of length 1.

    d. Update the ans variable with the larger of its current value or dp[i][j], ensuring ans always holds the length of the longest repeating substring found.

  4. After completing the iterations, return the value stored in ans, which now represents the length of the longest repeating substring in s.

The algorithm uses a dynamic programming matrix to avoid redundant work, building up the solution based on previously computed values. By checking only pairs of indices where j is greater than i, the algorithm ensures that only substrings are considered, not individual characters or the entire string. This approach efficiently solves the problem with O(n^2) time complexity and O(n^2) space complexity, where n is the length of the string s.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

How many ways can you arrange the three letters A, B and C?

Example Walkthrough

Let's take a small example string s = "aabba". According to the given solution approach, we want to find the length of the longest repeating substring in s. Here is how the approach works step-by-step:

  1. We initialize a matrix dp of 5x5 dimensions (since the string s has a length of 5) with all values set to 0. The matrix looks like this:

    1dp = [
    2  [0, 0, 0, 0, 0],
    3  [0, 0, 0, 0, 0],
    4  [0, 0, 0, 0, 0],
    5  [0, 0, 0, 0, 0],
    6  [0, 0, 0, 0, 0]
    7]
  2. We set ans to 0.

  3. We start iterating over each character in s using two nested loops:

    • Outer loop with index i from 0 to 4.

    • Inner loop with index j from i+1 to 4.

  4. Now we work through the loops:

    • On the first iteration (i=0, j=1), we find that s[0] (which is 'a') is equal to s[1] (also 'a'). Since i is 0, we set dp[0][1] to 1. ans is updated to 1.

      1dp = [
      2  [0, 1, 0, 0, 0],
      3  [0, 0, 0, 0, 0],
      4  ...
      5]
    • On iteration (i=1, j=2), s[1] is 'a' but s[2] is 'b', so we do nothing as they don't match.

    • On iteration (i=3, j=4), we find a match as both s[3] and s[4] are 'a'. Since i > 0, we set dp[3][4] to dp[2][3] + 1 which is 1 (since 'b'[3] and 'b'[4] don't match, dp[2][3] is 0). Now, ans is updated to 1.

      1dp = [
      2  ...
      3  [0, 0, 0, 0, 1]
      4]
  5. None of the remaining iterations of the loops yield any greater repeating substring, so ans remains 1.

  6. After iterating through the string and updating the dp matrix, we find that the value of ans remains 1.

Thus, the length of the longest repeating substring in the example string "aabba" is 1, and that repeating substring is "a".

Solution Implementation

1class Solution:
2    def longestRepeatingSubstring(self, s: str) -> int:
3        # Length of the input string
4        n = len(s)
5
6        # Initialize a 2D array(dynamically programming table)
7        # to store lengths of repeating substrings
8        dp_table = [[0] * n for _ in range(n)]
9
10        # Variable to store the length of the largest repeating substring
11        max_length = 0
12
13        # Iterate through each character in the string
14        for i in range(n):
15            # Compare with other characters in the string to the right
16            for j in range(i + 1, n):
17                # Check if the characters s[i] and s[j] match
18                if s[i] == s[j]:
19                    # Extend the length of the repeating substring
20                    # If i is not 0, get the previous length from the dp table, else just start with length 1.
21                    dp_table[i][j] = dp_table[i - 1][j - 1] + 1 if i else 1
22                    # Update the max_length if the current repeating substring is longer
23                    max_length = max(max_length, dp_table[i][j])
24
25        # Return the length of the longest repeating substring
26        return max_length
27
1class Solution {
2    public int longestRepeatingSubstring(String s) {
3        int length = s.length(); // Length of the string 's'
4        int longestLength = 0; // Variable to store the length of the longest repeating substring
5        int[][] dp = new int[length][length]; // dp[i][j] will hold the length of the longest repeating substring ending at i and j
6
7        // Iterate through the string 's'
8        for (int i = 0; i < length; ++i) {
9            for (int j = i + 1; j < length; ++j) {
10                // Check if characters at positions i and j are the same.
11                if (s.charAt(i) == s.charAt(j)) {
12                    // If they match and are not at the first character,
13                    // increment the length of the repeating substring by 1.
14                    // Otherwise, set the repeating substring length to 1 as it's the start of a possible repeating pattern.
15                    dp[i][j] = i > 0 ? dp[i - 1][j - 1] + 1 : 1;
16                    // Update the longestLength if the current repeating substring is the longest found so far.
17                    longestLength = Math.max(longestLength, dp[i][j]);
18                }
19                // If they don't match, the default value of dp[i][j] remains 0, indicating no repetition.
20            }
21        }
22        return longestLength; // Return the length of the longest repeating substring.
23    }
24}
25
1class Solution {
2public:
3    int longestRepeatingSubstring(string s) {
4        int length = s.size(); // Get the size of the string
5        // Create a 2D DP table with 'length' rows and 'length' columns initialized to 0
6        vector<vector<int>> dp(length, vector<int>(length, 0)); 
7        int longest = 0; // Variable to store the length of the longest repeating substring
8
9        // Iterate over the string with two pointers
10        for (int i = 0; i < length; ++i) {
11            for (int j = i + 1; j < length; ++j) {
12                // If characters at the current position in both pointers match
13                if (s[i] == s[j]) {
14                    // Check if it is not the first character, then add 1 to the length of substring
15                    // found till the previous characters else start with 1
16                    dp[i][j] = (i > 0) ? dp[i - 1][j - 1] + 1 : 1;
17                    // Update the longest substring found so far
18                    longest = max(longest, dp[i][j]);
19                }
20                // If characters don't match, dp[i][j] remains 0 as initialized
21            }
22        }
23        // Return the length of longest repeating substring
24        return longest;
25    }
26};
27
1function longestRepeatingSubstring(s: string): number {
2    const length = s.length; // Get the length of the string
3    // Create a 2D DP array with 'length' rows and 'length' columns initialized to 0
4    const dp: number[][] = Array.from({ length }, () => Array(length).fill(0));
5    let longest = 0; // Variable to store the length of the longest repeating substring
6
7    // Iterate over the string with two pointers
8    for (let i = 0; i < length; ++i) {
9        for (let j = i + 1; j < length; ++j) {
10            // If characters at the current position in both pointers match
11            if (s[i] === s[j]) {
12                // Check if it is not the first character, then add 1 to the length of the substring
13                // found till the previous characters; otherwise, start with 1
14                dp[i][j] = (i > 0) ? dp[i - 1][j - 1] + 1 : 1;
15                // Update the longest substring found so far
16                longest = Math.max(longest, dp[i][j]);
17            }
18            // If characters don't match, dp[i][j] remains 0 as initialized
19        }
20    }
21    // Return the length of the longest repeating substring
22    return longest;
23}
24
Not Sure What to Study? Take the 2-min Quiz

What is the worst case running time for finding an element in a binary tree (not necessarily binary search tree) of size n?

Time and Space Complexity

Time Complexity

The provided code uses a double for-loop to iterate over all possible starting positions of substrings in the string s, which has the length n. Inside these loops, it checks if two characters in s are equal and if so, updates the dynamic programming (DP) table.

The outer loop runs n times, and the inner loop can run up to n times as well, which could suggest an O(n^2) complexity. However, since the inner loop starts at i + 1, the number of iterations in the inner loop decreases as i increases. The total number of iterations could then be roughly calculated as the sum of the first n natural numbers, minus the diagonal elements (which are not accessed), which is (n * (n - 1)) / 2, which is still in the order of O(n^2).

Therefore, the time complexity of the code is O(n^2).

Space Complexity

The space complexity is dominated by the space used to store the dynamic programming (DP) table called dp, which is a 2D array of size n x n. Since the DP table is filled with 0 values at the beginning, and no other data structures scale with the size of n grow, the space complexity is O(n^2).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which of the tree traversal order can be used to obtain elements in a binary search tree in sorted order?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«