Leetcode 3. Longest Substring Without Repeating Characters

Problem

We're given a string, and we must find the length of the longest substring without repeating characters.

Let's take an example, if the string is "abcabcbb", the longest substring without repeating characters is "abc", its length is 3.

If the string is "pwwkew", the longest substring without repeating characters is "wke" its length is 3. Please note that the substring should not have any repeating characters, It should also be a continuous chunk from the original string. So, "pwke" would not be a valid substring because it's not a continuous chunk from the original string even though it does not have repeating characters.

Approach and Solution

The solution to this problem involves a sliding window approach. The main idea of this approach is maintaining a set window from the given string which will be expanded or reduced based on the condition of the problem. We traverse the string from left to right, adding characters to our window and if a repeating character is added to our window we remove characters from the start of our window until window is valid again, i.e., until the repeating character is removed.

Here is a walk through with "abcabcbb" example:

  • Start with an empty window at the start of the string. As string is empty, longest substring length is 0.

  • Now, expand the window to the right by one element. "a", the length of the longest substring now becomes 1.

  • Continue to the right, "ab", the longest substring is now of length 2.

  • Expand window more to the right, "abc", the length of longest substring now becomes 3.

  • Now when we expand the window more to the right and add another 'a', we find a repeating character. So we slide our window from the left until 'a' is removed. Our window becomes "bca", and the longest substring length remains 3.

  • Follow the same steps until end of the string, and at last, the longest substring length remains 3.

To implement this, we need a way to check if a character is currently in our window which we can do with an array to serve as a frequency count table, where index is the ASCII value of a character and value at that index is count of that character in our window. This gives us O(1) time to check and add a character.

Python Solution

1
2python
3class Solution:
4    def lengthOfLongestSubstring(self, s: str) -> int:
5        freq_table = [0] * 128  # ASCII table size
6        l = r = ans = 0
7
8        for r in range(len(s)):
9            freq_table[ord(s[r])] += 1
10            while freq_table[ord(s[r])] > 1:   # Slide window from left if repeating character
11                freq_table[ord(s[l])] -= 1
12                l += 1
13            ans = max(ans, r - l + 1)
14
15        return ans

Java Solution

1
2java
3class Solution {
4    public int lengthOfLongestSubstring(String s) {
5        int frequency[] = new int[128];
6        int l = 0;
7        int ans = 0;
8
9        for (int r = 0; r < s.length(); r++) {
10            frequency[s.charAt(r)]++;
11            while (frequency[s.charAt(r)] > 1)
12                frequency[s.charAt(l++)]--;
13            ans = Math.max(ans, r - l + 1);
14        }
15
16        return ans;
17    }
18}

JavaScript Solution

1
2javascript
3const lengthOfLongestSubstring = function(s) {
4    let l = 0, r = 0, ans = 0;
5    let freq = new Array(128).fill(0);
6
7    while (r < s.length) {
8        freq[s.charCodeAt(r)]++;
9        while (freq[s.charCodeAt(r)] > 1)
10            freq[s.charCodeAt(l++)]--;
11        ans = Math.max(ans, r - l + 1);
12        r++;
13    }
14
15    return ans;
16};

C++ Solution

1
2cpp
3class Solution {
4public:
5    int lengthOfLongestSubstring(string s) {
6        vector<int> frequency(128, 0);
7        int l = 0;
8        int ans = 0;
9
10        for (int r = 0; r < s.length(); r++) {
11            frequency[s[r]]++;
12            while (frequency[s[r]] > 1)
13                frequency[s[l++]]--;
14            ans = max(ans, r - l + 1);
15        }
16
17        return ans;
18    }
19};

C# Solution

1
2Csharp
3public class Solution {
4    public int LengthOfLongestSubstring(string s) {
5        int[] frequency = new int[128];
6        int l = 0, ans = 0;
7
8        for (int r = 0; r < s.Length; r++) {
9            frequency[s[r]]++;
10            while (frequency[s[r]] > 1)
11                frequency[s[l++]]--;
12            ans = Math.Max(ans, r - l + 1);
13        }
14
15        return ans;
16    }
17}

In the above solutions, maps are used to mark the characters in the string. The two pointers or indices l and r essentially define a sliding window of characters in the string. Initially the window is empty, then it uninterruptedly includes new characters. If a character is already in the window, the repeating characters from the current window are removed from the left side. The maximum window size during this process is our ultimate answer.

Let's try understanding the solution with another example, suppose we have string "tmmzuxt". As we move from right to left:

  • We start with an empty window at "t", longest substring length is 1.
  • Increase the window size to include "m", longest substring length becomes 2 (tm).
  • Continue to include next "m", as we got repeating character we reduce the window size from the left, longest substring length remains 2 (m).
  • Then include "z", "u", "x" and "t" without any repeating character, making longest substring without repeating characters equal to 5 (mzuxt).

Testing the solutions with big inputs ensures their efficiency. All of these time complexities are O(n) where n is the size of the string. The space complexity is O(128), which is equivalent to O(1) since we have a fixed array of size 128 that is used to hold character frequencies. If the problem was modified to include all Unicode characters, the frequency array would have a size equivalent to the number of Unicode characters, and the space complexity would become O(n).

In conclusion, the sliding window technique is a very commonly used approach in string algorithms, especially when dealing with continua and contiguous sub-arrays or substrings. It is very advantageous in terms of time complexity as it only requires going through the characters of the string once. With this technique, we achieved an impressive linear time complexity solution to our given problem.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫