3104. Find Longest Self-Contained Substring 🔒
Problem Description
Given a string s
, the task is to find the length of the longest self-contained substring of s
.
A substring t
of a string s
is called self-contained if t
is not equal to s
and for every character in t
, it doesn't exist in the rest of s
.
Return the length of the longest self-contained substring of s
if it exists, otherwise, return -1.
Intuition
To solve this problem, we need to identify a substring t
such that every character in t
is unique to that substring and not found elsewhere in the string s
. Initially, we recognize that for a substring to be self-contained, it should satisfy that none of its characters appear outside it in s
.
The strategy involves tracking the first and last occurrence of each character in the string. Using two dictionaries, first
and last
, we store the earliest and latest appearance of every character, respectively. This helps understand the possible boundaries of our potential self-contained substrings.
Next, we iterate through each unique character, marking the starting position for a possible self-contained substring. From this position, we attempt to expand the substring by checking each subsequent character. If any character's initial occurrence is before our starting point, it means that character is shared with the portion of s
we've already considered, so the current expansion must halt.
For valid candidates, we keep expanding until reaching a maximum boundary, determined by the latest occurrence (last
) of any characters in our current substring. Anytime we find a plausible self-contained substring (ensuring it's shorter than s
itself), we update our answer with its length if it exceeds the previous maximum found.
Finally, the answer will reflect the length of the longest valid self-contained substring, or -1 if none exist.
Learn more about Binary Search and Prefix Sum patterns.
Solution Approach
To implement the solution for finding the length of the longest self-contained substring, we will follow a structured approach using hash tables to efficiently track character positions.
Step-by-Step Implementation
-
Data Structures:
- Use two dictionaries:
first
andlast
to store the first and last occurrence indexes of each character in the strings
.
- Use two dictionaries:
-
Populating Occurrence Dictionaries:
- Traverse the string
s
while updatingfirst
andlast
for each character.first[c]
should store the earliest index where characterc
appears, andlast[c]
stores the latest index ofc
. This can be done by iterating through the string with an index.
- Traverse the string
-
Initialization:
- Begin with
ans = -1
to handle the case when no valid substring is found. This variable will hold the maximum length of a self-contained substring.
- Begin with
-
Enumerating Possible Substrings:
- For every character
c
infirst
, assume this character's first occurrence is the starting indexi
of a potential substring. - Initialize
mx
aslast[c]
to keep track of the farthest index up to which the current substring can extend without repeating characters outside it.
- For every character
-
Expanding the Substring:
- Start extending the substring from
i
towards the right. For each character at positionj
, check if the first occurrencea
of the current character is beforei
. If it is, break out because it means the character belongs partly to a previously processed section of the string.
- Start extending the substring from
-
Updating Maximum Boundary:
- Update
mx
with the largest occurrence index encountered (b = last[s[j]]
) as we iterate. - If
mx
equalsj
(indicating the end of our current valid consideration) andj - i + 1
is less than the length ofs
(to ensure it's not the whole string), updateans
with the current substring's length usingans = max(ans, j - i + 1)
.
- Update
-
Result:
- After considering all possible starting points,
ans
will provide the length of the longest self-contained substring, or it will remain as-1
if no such substring exists.
- After considering all possible starting points,
By carefully managing the first and last occurrence of characters and iteratively checking each potential substring, this approach efficiently identifies the longest self-contained portion of the string.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through an example using the string s = "abacded"
to identify the longest self-contained substring.
-
Data Structures: We begin by initializing two dictionaries,
first
andlast
, to track the first and last occurrences of each character in the string.- After traversing
s
, the dictionaries are populated as follows:first = {'a': 0, 'b': 1, 'c': 2, 'd': 4, 'e': 5}
last = {'a': 2, 'b': 1, 'c': 2, 'd': 6, 'e': 5}
- After traversing
-
Initialization: We set
ans = -1
to start with, indicating that we haven't found a valid self-contained substring yet. -
Enumerating Possible Substrings: We iterate over the characters using their first occurrence as potential starting points.
- Starting with character
'a'
at index0
:- Set
mx = 2
(the last occurrence of'a'
).
- Set
- Starting with character
-
Expanding the Substring:
- Extend from index
0
and check each subsequent character:- At index
1
, the character'b'
has its first occurrence at1
, which is not before0
. Updatemx = 1
. - At index
2
, the character'a'
itself, confirms end of the potential substring (mx = 2
).
- At index
- The length is
2 - 0 + 1 = 3
, which is less than the length ofs
. Updateans = max(-1, 3) = 3
.
- Extend from index
-
Continuing with other characters:
- Starting with
'b'
at index1
:- Set
mx = 1
. However,'a'
reappears in subsequent indices (2
), so this expansion is halted.
- Set
- Starting with
'c'
at index2
:- Set
mx = 2
. No further valid expansion found.
- Set
- Starting with
'd'
at index4
:- Set
mx = 6
. Check indices4
to6
, the substring "ded" found with length3
but equal to currentans
.
- Set
- Starting with
'e'
at index5
:- The character
d
appears again later in the string, halting potential expansions.
- The character
- Starting with
-
Result: After exploring all possible starting points, the longest self-contained substring is of length
3
, given by the substring"aba"
starting at index0
.
This walkthrough highlights the method of using dictionaries to track character positions, managing indices efficiently, and iterating through potential substrings systematically for the solution.
Solution Implementation
1class Solution:
2 def maxSubstringLength(self, s: str) -> int:
3 first_indices, last_indices = {}, {}
4
5 # Record the first and last occurrence of each character
6 for index, char in enumerate(s):
7 if char not in first_indices:
8 first_indices[char] = index
9 last_indices[char] = index
10
11 max_length, string_length = -1, len(s)
12
13 # Iterate through each character's first occurrence
14 for char, start_index in first_indices.items():
15 max_last_index = last_indices[char]
16
17 # Check each substring starting from start_index
18 for j in range(start_index, string_length):
19 first_index_of_current = first_indices[s[j]]
20 last_index_of_current = last_indices[s[j]]
21
22 # If the first occurrence is before the current start, break
23 if first_index_of_current < start_index:
24 break
25
26 # Update max_last_index based on current character's last occurrence
27 max_last_index = max(max_last_index, last_index_of_current)
28
29 # Check if the current range forms a valid substring
30 if max_last_index == j and j - start_index + 1 < string_length:
31 max_length = max(max_length, j - start_index + 1)
32
33 return max_length
34
1import java.util.Arrays;
2
3class Solution {
4 // Method to find the maximum length of a substring where all characters appear together
5 public int maxSubstringLength(String s) {
6 int[] firstOccurrence = new int[26]; // Array to store the first occurrence index of each character
7 int[] lastOccurrence = new int[26]; // Array to store the last occurrence index of each character
8 Arrays.fill(firstOccurrence, -1); // Initialize all elements of the firstOccurrence array to -1
9
10 int length = s.length(); // Length of the input string
11
12 // Populate the firstOccurrence and lastOccurrence arrays
13 for (int i = 0; i < length; ++i) {
14 int charIndex = s.charAt(i) - 'a'; // Calculate the index of the character in the alphabet
15 if (firstOccurrence[charIndex] == -1) {
16 firstOccurrence[charIndex] = i; // Set first occurrence if it has not been set
17 }
18 lastOccurrence[charIndex] = i; // Update last occurrence to the current index
19 }
20
21 int maxSubstringLen = -1; // Variable to store the maximum substring length found
22
23 // Iterate over each character in the alphabet
24 for (int k = 0; k < 26; ++k) {
25 int firstIndex = firstOccurrence[k]; // Get the first occurrence index
26 if (firstIndex == -1) {
27 continue; // If the character does not appear in the string, skip
28 }
29 int maxLastIndex = lastOccurrence[k]; // Get the last occurrence index
30
31 // Iterate over the substring starting from the first occurrence of character k
32 for (int j = firstIndex; j < length; ++j) {
33 int start = firstOccurrence[s.charAt(j) - 'a'];
34 int end = lastOccurrence[s.charAt(j) - 'a'];
35 if (start < firstIndex) {
36 break; // If a character appears before the start of this substring, break
37 }
38 maxLastIndex = Math.max(maxLastIndex, end); // Update maxLastIndex to ensure covering the full substring
39
40 // Check if current segment is a valid substring and update the max length
41 if (maxLastIndex == j && j - firstIndex + 1 < length) {
42 maxSubstringLen = Math.max(maxSubstringLen, j - firstIndex + 1);
43 }
44 }
45 }
46
47 return maxSubstringLen;
48 }
49}
50
1class Solution {
2public:
3 // Method to find the maximum length of a substring which contains each character
4 // appearing an even number of times, or -1 if no such substring exists
5 int maxSubstringLength(std::string s) {
6 // Vectors to store the first and last occurrence of each character
7 std::vector<int> firstOccurrence(26, -1);
8 std::vector<int> lastOccurrence(26);
9
10 int n = s.length();
11
12 // Iterate over the string to fill the first and last occurrence vectors
13 for (int i = 0; i < n; ++i) {
14 int charIndex = s[i] - 'a';
15 // If it's the first time we've seen this character
16 if (firstOccurrence[charIndex] == -1) {
17 firstOccurrence[charIndex] = i; // Save the first occurrence
18 }
19 lastOccurrence[charIndex] = i; // Update the last occurrence
20 }
21
22 int maxLength = -1; // Initialize the result with -1 (consider only positive lengths)
23
24 // Check each character for possible valid substrings
25 for (int k = 0; k < 26; ++k) {
26 int start = firstOccurrence[k];
27 if (start == -1) {
28 continue; // Skip if the character didn't appear in the string
29 }
30
31 int maxIndex = lastOccurrence[k];
32
33 // Iterate from start to see if we can find a valid substring
34 for (int j = start; j < n; ++j) {
35 int currentStart = firstOccurrence[s[j] - 'a'];
36 int currentEnd = lastOccurrence[s[j] - 'a'];
37
38 if (currentStart < start) {
39 // If any character starts before current start, break
40 break;
41 }
42
43 // Update the maximum end index seen so far
44 maxIndex = std::max(maxIndex, currentEnd);
45
46 // Check if current segment is valid
47 if (maxIndex == j && j - start + 1 < n) {
48 maxLength = std::max(maxLength, j - start + 1); // Update max length
49 }
50 }
51 }
52
53 return maxLength;
54 }
55};
56
1function maxSubstringLength(s: string): number {
2 // Arrays to store the first and last occurrence of each letter in the string
3 const first: number[] = Array(26).fill(-1);
4 const last: number[] = Array(26).fill(0);
5 const n = s.length; // Length of the string
6
7 // Populate the first and last occurrence arrays
8 for (let i = 0; i < n; ++i) {
9 const index = s.charCodeAt(i) - 97; // Get the index of the character
10 if (first[index] === -1) {
11 first[index] = i; // Store first occurrence if not recorded
12 }
13 last[index] = i; // Update last occurrence
14 }
15
16 let maxLength = -1; // Initialize the maximum length of the substring
17
18 for (let k = 0; k < 26; ++k) {
19 const start = first[k]; // Start position of the current letter
20
21 if (start === -1) {
22 continue; // Skip if the letter doesn't exist in the string
23 }
24
25 let maxEnd = last[k]; // Initialize the maximum end position for the substring
26
27 for (let j = start; j < n; ++j) {
28 const currentCharIndex = s.charCodeAt(j) - 97; // Current character's index
29 const firstOccurrence = first[currentCharIndex]; // First occurrence of the current character
30
31 if (firstOccurrence < start) {
32 break; // Stop if any character in the substring starts before the selected start
33 }
34
35 const lastOccurrence = last[currentCharIndex]; // Last occurrence of the current character
36 maxEnd = Math.max(maxEnd, lastOccurrence); // Update max end position if needed
37
38 // Update maxLength if the current substring is smaller than the entire string
39 if (maxEnd === j && j - start + 1 < n) {
40 maxLength = Math.max(maxLength, j - start + 1);
41 }
42 }
43 }
44 return maxLength;
45}
46
Time and Space Complexity
The time complexity of the code is O(n × |\Sigma|)
, where n
is the length of the string s
, and |\Sigma|
represents the size of the character set. In this case, the character set consists of lowercase letters, resulting in |\Sigma| = 26
.
The primary reason for this complexity is the nested loops: the outer loop iterates over each character (unique characters, thus |\Sigma|
iterations), and the inner loop iterates over each subsequent character in the string (up to n
iterations).
The space complexity is O(|\Sigma|)
. This arises from the use of the first
and last
dictionaries, each of which stores at most one entry for every character in the character set, leading to constant space usage given |\Sigma| = 26
.
Learn more about how to find time and space complexity quickly.
The three-steps of Depth First Search are:
- Identify states;
- Draw the state-space tree;
- DFS on the state-space tree.
Recommended Readings
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Prefix Sum The prefix sum is an incredibly powerful and straightforward technique Its primary goal is to allow for constant time range sum queries on an array What is Prefix Sum The prefix sum of an array at index i is the sum of all numbers from index 0 to i By
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Want a Structured Path to Master System Design Too? Don’t Miss This!