Leetcode 828. Unique Letter String
Problem Explanation
The problem at hand asks to calculate the sum of unique characters for all possible substrings of a given string. It requires to iterate over all substrings of a given string and compute the count of unique letters in each. For instance, for a given string "ABC", all possible substrings are "A","B","C","AB","BC" and "ABC". Each of these substrings has unique characters so the total sum would be 1 + 1 + 1 + 2 + 2 + 3 = 10.
The challenge of this problem lies in calculating the sum of unique characters for substrings efficiently, as there can be a large number of substrings, especially for long strings.
Walkthrough of Solution
The approach used for the solution is to iterate over the string while keeping track of the count of unique letters until the current position and the last seen position for each letter. For each position in the string, the count is updated by subtracting the last count for the current letter and adding the current count. The new count is then added to the total sum of unique characters.
The data structures used are two vector arrays:
- lastCount array that stores the last count of each letter.
- lastSeen array that keeps the position of the last occurrence of each letter.
Starting from the beginning of the string, for each character, we calculate how many substrings it's the end point with this new unique character, which is the length from its previous occurrence to the current position.
For example, consider a string "ABCA". For the first three characters, we have unique characters and there are 6 substrings as result("A","B","C","AB","BC","ABC"). When we come to the last 'A', "ABCA" is not a unique string and the count should be decreased by one because there is an 'A' already. But "BCA" is still a unique string and we should count it in. So, the substrings that contribute to the result when we add 'A' to "ABC" are "A", "CA", "BCA". The length of these strings to the current position is i - lastSeen[c]
.
Solution in Python
1 2python 3class Solution: 4 def uniqueLetterString(self, s: str) -> int: 5 ans = 0 6 count = 0 7 lastCount = [0]*26 8 lastSeen = [-1]*26 9 10 # Iterate over the string 11 for i in range(len(s)): 12 c = ord(s[i]) - ord('A') 13 currentCount = i - lastSeen[c] 14 15 # Update the count and total sum of unique characters 16 count = count - lastCount[c] + currentCount 17 lastCount[c] = currentCount 18 lastSeen[c] = i 19 ans += count 20 21 # Return the total sum of unique characters 22 return ans % (10**9 + 7)
Solution in JAVA
1 2java 3class Solution { 4 public int uniqueLetterString(String s) { 5 int ans = 0; 6 int count = 0; 7 int[] lastCount = new int[26]; 8 int[] lastSeen = new int[26]; 9 10 // Initialize lastSeen array to -1 11 Arrays.fill(lastSeen, -1); 12 13 // Iterate over the string 14 for (int i = 0; i < s.length(); i++) { 15 int c = s.charAt(i) - 'A'; 16 int currentCount = i - lastSeen[c]; 17 18 // Update the count and total sum of unique characters 19 count = count - lastCount[c] + currentCount; 20 lastCount[c] = currentCount; 21 lastSeen[c] = i; 22 ans += count; 23 } 24 25 // Return the total sum of unique characters 26 return ans % (1000000000 + 7); 27 } 28}
Solution in Javascript
1 2javascript 3var uniqueLetterString = function(s) { 4 let ans = 0; 5 let count = 0; 6 let lastCount = new Array(26).fill(0); 7 let lastSeen = new Array(26).fill(-1); 8 9 for (let i = 0; i < s.length; i++) { 10 let c = s.charCodeAt(i) - 65; 11 let currentCount = i - lastSeen[c]; 12 13 // update count and ans 14 count = count - lastCount[c] + currentCount; 15 lastCount[c] = currentCount; 16 lastSeen[c] = i; 17 ans += count; 18 } 19 20 return ans % 1000000007; 21};
Solution in C++
1 2c++ 3class Solution { 4public: 5 int uniqueLetterString(string s) { 6 int ans = 0, count = 0; 7 vector<int> lastCount(26), lastSeen(26, -1); 8 9 for (int i = 0; i < s.length(); ++i) { 10 int c = s[i] - 'A'; 11 int currentCount = i - lastSeen[c]; 12 13 // update count and ans 14 count = count - lastCount[c] + currentCount; 15 lastCount[c] = currentCount; 16 lastSeen[c] = i; 17 ans += count; 18 } 19 20 return ans % 1000000007; 21 } 22};
Solution in C#
1 2csharp 3public class Solution { 4 public int UniqueLetterString(string s) { 5 int ans = 0, count = 0; 6 int[] lastCount = new int[26]; 7 int[] lastSeen = new int[26]; 8 9 for(int i=0; i<26; i++) { 10 lastSeen[i] = -1; 11 } 12 13 for (int i = 0; i < s.Length; ++i) { 14 int c = s[i] - 'A'; 15 int currentCount = i - lastSeen[c]; 16 17 // update count and ans 18 count = count - lastCount[c] + currentCount; 19 lastCount[c] = currentCount; 20 lastSeen[c] = i; 21 ans += count; 22 } 23 24 return ans % 1000000007; 25 } 26}
Overall, it is clear that tracking the nth-to-last position of occurrence of a character being analyzed can be a powerful strategy for this type of problem. As the solution requires us to persistently track two data points (the character's nth-to-last position and the count of unique substrings), we make use of fixed-size arrays to keep track of these data points for all 26 alphabetic characters.
This type of problem exposes the programmer to thinking about how to use data structures like arrays to record past information to solve a current problem efficiently, a strategy commonly referred to as dynamic programming.
Additionally, the problem also demonstrates that hashing to arbitrary indices can be done to improve program performance, as with the use of the ord() function in Python and charCodeAt() in Javascript which maps a character to its ASCII equivalent, then subtracting the ASCII value of 'A' or 'a' to get '0' through '25' values for each letter.
C++, C#, and Java provide a similar solution to the explained Python solution in their respective syntax. All of them also rely on persisting count information in arrays or vectors indexed by character from 'A' to 'Z' to solve this problem.
Conclusion
The varying solutions contain algorithms that enable you to find the count of unique characters for all possible substrings of a string. Taking you through Python, Java, JavaScript, C++, and C# languages, this problem along with these diverse solutions provide not only resolutions to finding the track of unique characters in substrings but also gives an insight into the dynamic programming strategy. This can be beneficial in solving many problems that ask for optimization or counting the number of ways to do something, and are a common theme in competitive programming and many interview questions. It also demonstrates the effective use of basic data structures and built-in functions in different programming languages.
Got a question?ย Ask the Teaching Assistantย anything you don't understand.
Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.