Leetcode 1316. Distinct Echo Substrings

Problem Explanation:

The problem wants us to find the number of distinct non-empty substrings of the given text or string that can be written as the concatenation of some string with itself.

Let's look at an example for better understanding:

Input: text = "abcabcabc"

For this input string, the substrings are "abcabc", "bcabca", and "cabcab". All these substrings are the concatenation of a string with itself. In this case "abc", "bca" and "cab". So the output is 3.

Solution Approach:

The approach to solve this problem is based on using Sliding Window and Set data structure.

A variable 'k' is taken to track the target length of the substring. It ranges from 1 to half of the input string length. Another variable 'same' is initiated to keep a record of the count of the same characters from the left and right of the input string in the current comparison.

Then, we traverse through the string using a sliding window of size 'k'. If during traversal, the character at the left index is the same as the character at the right index, we increment 'same'. If 'same' equals 'k', it implies we have found a substring that can be written as the concatenation of some string with itself. Then, we insert this substring into a Set (to avoid duplicates) by extracting the substring from the main string and decrement 'same' by one.

At the end, return the size of the Set, which is the total count of distinct substrings, as per our problem statement.

Example Walkthrough:

Let's walk through this logic by taking an example:

Input: text = "abcdefdef"

For the first iteration k = 1, we have:

  • Compare "a" with "b", not equal.
  • Compare "b" with "c", not equal.
  • Compare "c" with "d", not equal.
  • Compare "d" with "e", not equal.
  • Compare "e" with "f", not equal.
  • Compare "f" with "d", not equal.
  • Compare "d" with "e", not equal.
  • Compare "e" with "f", not equal.

For the next iteration k = 2, we have:

  • Compare "ab" with "cd", not equal.
  • Compare "bc" with "de", not equal.
  • Compare "cd" with "ef", not equal.
  • Compare "de" with "fd", not equal.
  • Compare "ef" with "de", not equal.
  • Compare "fd" with "ef", not equal.

For the next iteration k = 3, we have:

  • Compare "abc" with "def", not equal.
  • Compare "bcd" with "efd", not equal.
  • Compare "cde" with "fde", not equal.

Finally for k = 4, we have:

  • Compare "abcd" with "efde", not equal.
  • Compare "bcde" with "fdef", not equal.

So, there's no such substring that can be written as the concatenation of some string with itself. Hence, the output is 0.

Solution in Python:

1
2python
3class Solution:
4    def distinctEchoSubstrings(self, text):
5        seen = set()
6        n = len(text)
7
8        for k in range(1, n // 2 + 1):
9            same = 0
10            for l in range(0, n - 2 * k + 1):
11                if text[l] == text[l+k]:
12                    same += 1
13                    if same == k:
14                        seen.add(text[l-k+1:l+1])
15                else:
16                    same = 0
17
18        return len(seen)

Solution in Java:

1
2java
3class Solution {
4    public int distinctEchoSubstrings(String text) {
5        Set<String> seen = new HashSet<>();
6        int n = text.length();
7
8        for (int k = 1; k <= n / 2; k++) {
9            int same = 0;
10            for (int i = 0; i < n - k; i++) {
11                if (text.charAt(i) == text.charAt(i+k))
12                    same++;
13                else
14                    same = 0;
15                if (same == k) {
16                    seen.add(text.substring(i + 1 - k, i + 1));
17                    same--;
18                }
19            }
20        }
21
22        return seen.size();
23    }
24}

Solution in JavaScript:

1
2javascript
3class Solution {
4    distinctEchoSubstrings(text) {
5        const seen = new Set();
6        const n = text.length;
7
8        for (let k = 1; k <= Math.floor(n / 2); k++) {
9            let same = 0;
10            for (let i = 0; i < n - k; i++) {
11                if (text.charAt(i) === text.charAt(i + k))
12                    same++;
13                else
14                    same = 0;
15                if (same === k) {
16                    seen.add(text.slice(i + 1 - k, i + 1));
17                    same--;
18                }
19            }
20        }
21
22        return seen.size; 
23    }
24}

Solution in C++:

1
2c++
3class Solution {
4public:
5    int distinctEchoSubstrings(string text) {
6        unordered_set<string> seen;
7        int n = text.length();
8
9        for (int k = 1; k <= n / 2; ++k) {
10            int same = 0;
11            for (int l = 0; l < n - k; ++l) {
12                if (text[l] == text[l + k])
13                    ++same;
14                else
15                    same = 0;
16                if (same == k) {
17                    seen.insert(text.substr(l - k + 1, k));
18                    --same;
19                }
20            }
21        }
22
23        return seen.size();
24    }
25};

Solution in C#:

1
2csharp
3public class Solution {
4    public int DistinctEchoSubstrings(string text) {
5        var seen = new HashSet<string>();
6        int n = text.Length;
7        
8        for (int k = 1; k <= n / 2; ++k) {
9            int same = 0;
10            for (int i = 0; i < n - k; ++i) {
11                if (text[i] == text[i + k])
12                    ++same;
13                else
14                    same = 0;
15                if (same == k) {
16                    seen.Add(text.Substring(i-k+1, k));
17                    --same;
18                }
19            }
20        }
21
22        return seen.Count;
23    }
24}

All solutions are using a sliding window of length 'k' to check for all possible substrings that can be written as concatenation of a string with itself. These substrings are stored in a Set and finally the size of the Set is returned as output.All of the above solutions have a time complexity of O(n^2), where n is the length of the string 'text'. This is because there are two nested loops in the solution; the outer one iterates for every possible length of the substring from 1 to n/2, and the inner one iterates through each character of the string 'text' from the current length of the substring onwards.

The space complexity is also O(n^2), as in the worst-case scenario, all possible substrings of 'text' are stored in the Set. In the worst-case scenario, this could lead to n*(n+1)/2 substrings being stored (which simplifies to an O(n^2) space complexity). Examples where this can happen include strings containing only one type of character (like 'aaaaa...').

Optimizations and Improvements:

To improve the time complexity, other algorithms such as Suffix Trees or Suffix Arrays can be used. These data structures are used in many string-manipulation applications and can be helpful for solving this problem in a more efficient way.

To reduce space complexity, we can use Rolling Hashing or Rabin-Karp algorithm. This algorithm calculates a unique hash for each possible substring and checks whether it's equal to the hash of the corresponding substring (i.e., the next substring of length k). This allows for finding repeating substrings without actually storing the substrings.

However, implementing these optimizations would greatly increase the complexity of the code, and may not be feasible depending on the constraints of the problem and the performance requirements of the system.

In conclusion, the problem of finding distinct echo substrings can be effectively solved using sliding window and set data structures in multiple programming languages. These solutions are both simple and efficient for most practical inputs, but can also be further optimized if necessary.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫