2416. Sum of Prefix Scores of Strings
Problem Description
The problem presents us with an array of non-empty strings called words
. The task is to calculate a score for each string, which represents the count of other strings in the array that have this string as a prefix. Specifically, for a given string word
from the array, its score is determined by how many times it serves as a prefix for the other strings in words
. It’s worth highlighting that a string is always considered a prefix of itself.
As an example, if words
contains ["a", "ab", "abc", "cab"]
, the score for the string "ab"
is 2
because it is a prefix for both "ab"
and "abc"
. We are asked to calculate and return an array answer
where answer[i]
is equal to the sum of scores for all non-empty prefixes of words[i]
.
Intuition
The core idea to solve this problem is to use a Trie data structure. A Trie is an efficient information retrieval data structure that is used to represent the "retrieval" of data. It specializes in managing strings that are in a particularly large dataset, allowing for fast and efficient prefix-based searches.
Here's why using Trie is an effective solution:
-
When we insert each word into the Trie, we simultaneously increase the count at each node. This count can be interpreted as the number of words in which the string represented by the path to that node serves as a prefix.
-
Inserting all the words into the Trie ensures that every prefix is accounted for, and their respective scores are updated properly throughout the Trie nodes.
-
To find the total score of all non-empty prefixes of a word, we search the Trie for that word. As we go through each character of the word, we add to the answer the counts encountered at each node. These counts represent the score of the prefix ending at that node.
-
By doing this for each word in
words
, we end up with an array of total prefix scores corresponding to each word, as required.
Using this approach, we can efficiently calculate the score for each word without re-comparing each string with all other strings in every iteration, avoiding a brute-force solution that would be much less efficient.
Learn more about Trie patterns.
Solution Approach
The provided solution uses a Trie data structure to efficiently compute the sum of scores for the prefixes of each word in the input array words
. Here's a step-by-step explanation of how the solution works:
Class Design
A class [Trie](/problems/trie_intro)
is defined with two primary attributes: children
and cnt
. children
is a list of length 26, which corresponds to the 26 letters of the English alphabet, and cnt
is an integer representing the count of words that pass through this particular node.
-
Initialization: The
[Trie](/problems/trie_intro)
instance is initialized withchildren
set to an array ofNone
values (since it's empty initially) and thecnt
count set to0
. -
Insert Method: The
insert
method takes a word (w
) and adds it to the Trie. With each character insertion, the respective character index is computed (ord(c) - ord('a')
) to identify the relative position of the node inchildren
. If the node isNone
, a newTrie
node is created. Every time a character is inserted,node.cnt
is incremented, indicating that one more word that has this prefix has been added to the Trie. -
Search Method: The
search
method computes the total score of prefixes for a given word (w
). It iterates through each character of the word, moving through the Trie nodes and summing up thecnt
values encountered at each step. If it encountersNone
, meaning that no further prefix exists, it returns the sum collected so far. Otherwise, it continues until all characters of the word have been processed, accumulating the total score.
Solution Class
The Solution
class contains the sumPrefixScores
method, which accepts the words
list and returns a list of total prefix scores for each word in words
.
-
A new
[Trie](/problems/trie_intro)
instance is created. -
Each word in the
words
array is inserted into the Trie using theinsert
method of theTrie
class. This builds the Trie structure with the correctcnt
values for each node, representing the scores of the corresponding prefixes. -
Finally, the
search
method of the[Trie](/problems/trie_intro)
class is used to calculate the total score of non-empty prefixes for each word. This is done inside a list comprehension that returns the resulting list of scores.
By following this approach, the algorithm efficiently calculates the score of each word by leveraging the Trie structure, avoiding redundancy and excessive comparisons that would be inherent in a brute-force approach.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution approach using the example array words = ["top", "tops", "toss", "taco", "tap"]
. We’d like to calculate the score for each string in the array by determining the number of times it serves as a prefix for other strings in the array.
Initial Trie Construction
- We start by creating an empty Trie.
- We insert the first word "top". Now "t", "to", and "top" all have a count of 1 in their respective nodes.
- We insert the second word "tops". Since "top" is already there, when we insert "s", "tops" also has a count of 1, and the "top" node count increases to 2.
- We proceed similarly with "toss", increasing the count of "tos" and "toss" to 1, and the node "t" will have a count of 3 because 3 words have been inserted starting with "t".
- We insert "taco", similarly building up counts for "t", "ta", "tac", and "taco".
- We insert "tap", also building up counts for those nodes.
Now, for example, if we consider the node representing "ta", it has a count of 2 because it serves as a prefix for "taco" and "tap".
Computing Scores
-
For each word in
words
, we will traverse the Trie node by node, summing the counts. -
Taking "top" as an example, the traversal would be 't' -> 'to' -> 'top', and for each of those we would sum the counts encountered: 3 (at 't') + 3 (at 'to') + 2 (at 'top') = 8.
-
Similarly, we traverse for "tops", resulting in counts of 3 (at 't') + 3 (at 'to') + 2 (at 'top') + 1 (at 'tops') = 9.
-
We repeat this for each word: "toss" (8), "taco" (5), and "tap" (6).
Result
As a result, for the given 'words' array, the sumPrefixScores
function will return the array [8, 9, 8, 5, 6]
. Each element of the returned array corresponds to the total score of all the non-empty prefixes for each word in the words
array.
Thus, the Trie-based approach efficiently computes the prefix scores for all words in the given array without redundant comparisons.
Solution Implementation
1class Trie:
2 def __init__(self):
3 # Initialize each Trie node with 26 pointers for each lowercase letter in the alphabet
4 self.children = [None] * 26
5 # Initialize a variable to keep count of the number of words passing through this node
6 self.count = 0
7
8 def insert(self, word):
9 # Start from the root node
10 node = self
11 for char in word:
12 # Convert character to the corresponding index (0-25 for 'a'-'z')
13 index = ord(char) - ord('a')
14 # If the child node for this character doesn't exist, create it
15 if node.children[index] is None:
16 node.children[index] = Trie()
17 # Move to the child node
18 node = node.children[index]
19 # Increment the count for each node that is part of a word
20 node.count += 1
21
22 def search(self, word):
23 # Start from the root node
24 node = self
25 # To gather the cumulative count of all prefixes of the searched word
26 total_count = 0
27 for char in word:
28 # Convert character to index
29 index = ord(char) - ord('a')
30 # If the node doesn't have a child at this index, return current total count
31 if node.children[index] is None:
32 return total_count
33 # Move to the child node
34 node = node.children[index]
35 # Add the node's count to the total count
36 total_count += node.count
37 # Return the total count which represents the sum of the prefix scores for the word
38 return total_count
39
40
41class Solution:
42 def sum_prefix_scores(self, words):
43 # Create an instance of the Trie
44 trie = Trie()
45 # Insert each word into the trie, building up prefix counts
46 for word in words:
47 trie.insert(word)
48 # Retrieve and return the sum of the prefix scores for each word
49 return [trie.search(word) for word in words]
50
1class Trie {
2 private Trie[] children = new Trie[26]; // Each trie node can have up to 26 children for each letter of the alphabet
3 private int count; // This variable counts the number of times a node is visited during insertions
4
5 // Function to insert a word into the Trie
6 public void insert(String word) {
7 Trie node = this;
8 for (char c : word.toCharArray()) {
9 int index = c - 'a'; // Obtain the index by subtracting the ASCII value of 'a' from the character
10 if (node.children[index] == null) {
11 node.children[index] = new Trie(); // Create a new Trie node if it does not exist
12 }
13 node = node.children[index];
14 node.count++; // Increment the count for each node visited
15 }
16 }
17
18 // Function to compute the sum of the prefix scores for the given word
19 public int search(String word) {
20 Trie node = this;
21 int sum = 0;
22 for (char c : word.toCharArray()) {
23 int index = c - 'a'; // Obtain the index as done in insert function
24 if (node.children[index] == null) {
25 return sum; // Return the current sum if the node is null (prefix does not exist)
26 }
27 node = node.children[index];
28 sum += node.count; // Add the count of the visited node to the sum
29 }
30 return sum;
31 }
32}
33
34class Solution {
35 // Function to compute the sum of prefix scores for each word in the array
36 public int[] sumPrefixScores(String[] words) {
37 Trie trie = new Trie(); // Initialize a new Trie
38 for (String word : words) {
39 trie.insert(word); // Insert each word from the array into the Trie
40 }
41 int[] answer = new int[words.length]; // Create an array to hold the result
42 for (int i = 0; i < words.length; i++) {
43 answer[i] = trie.search(words[i]); // Compute the sum of prefix scores for each word in the array
44 }
45 return answer; // Return the computed scores
46 }
47}
48
1#include <vector>
2#include <string>
3using namespace std;
4
5// Trie node structure to store the children and the word count.
6class Trie {
7private:
8 vector<Trie*> children; // Children nodes of the current node.
9 int count; // Count of words passing through this node.
10
11public:
12 // Constructor initializes the children to hold 26 Trie pointers, one for each letter.
13 Trie()
14 : children(26), count(0) {}
15
16 // Insert a word into the Trie.
17 void insert(const string& word) {
18 Trie* node = this;
19 for (char letter : word) {
20 int index = letter - 'a'; // Convert character to index (0-25).
21 // If no child exists at the index, create a new Trie node.
22 if (!node->children[index]) {
23 node->children[index] = new Trie();
24 }
25 node = node->children[index];
26 ++node->count; // Increment count on traversed nodes.
27 }
28 }
29
30 // Search for a word in the Trie and return the sum of the counts of all its prefixes.
31 int search(const string& word) {
32 Trie* node = this;
33 int sum = 0;
34 for (char letter : word) {
35 int index = letter - 'a'; // Convert character to index (0-25).
36 // If the child at the index doesn't exist, return the current sum.
37 if (!node->children[index]) return sum;
38 node = node->children[index];
39 sum += node->count; // Add count at the node to the sum.
40 }
41 return sum; // Return the total sum for the word.
42 }
43};
44
45// Main Solution class to solve the problem using the Trie.
46class Solution {
47public:
48 // Method to calculate the sum of prefix scores for each word in the input vector.
49 vector<int> sumPrefixScores(vector<string>& words) {
50 Trie* trie = new Trie(); // Create a new Trie.
51 // Insert each word from the input vector into the Trie.
52 for (const string& word : words) {
53 trie->insert(word);
54 }
55 vector<int> answer; // Vector to store the result.
56 // Search each word and calculate the sum of prefix scores.
57 for (const string& word : words) {
58 answer.push_back(trie->search(word));
59 }
60 return answer; // Return the resulting vector of sums.
61 }
62};
63
1// Declare the trie node structure.
2interface TrieNode {
3 children: TrieNode[];
4 count: number;
5}
6
7// Create a function to initialize a trie node.
8function createTrieNode(): TrieNode {
9 return {
10 children: new Array(26),
11 count: 0
12 };
13}
14
15// Initialize the trie root node.
16let root: TrieNode = createTrieNode();
17
18// Function to insert a word into the trie.
19function insert(word: string): void {
20 let node = root;
21 for (const char of word) {
22 const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
23 if (!node.children[index]) {
24 node.children[index] = createTrieNode();
25 }
26 node = node.children[index];
27 node.count++;
28 }
29}
30
31// Function to search for a word in the trie and calculate the sum of counts of its prefixes.
32function search(word: string): number {
33 let node = root;
34 let sum = 0;
35 for (const char of word) {
36 const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
37 if (!node.children[index]) {
38 return sum;
39 }
40 node = node.children[index];
41 sum += node.count;
42 }
43 return sum;
44}
45
46// Function to calculate the prefix scores for a list of words.
47function sumPrefixScores(words: string[]): number[] {
48 // Reset the trie for new entries.
49 root = createTrieNode();
50
51 // Insert words to the trie.
52 for (const word of words) {
53 insert(word);
54 }
55
56 // Compute and collect the prefix scores for each word.
57 let scores: number[] = [];
58 for (const word of words) {
59 scores.push(search(word));
60 }
61
62 // Return the collected scores.
63 return scores;
64}
65
Time and Space Complexity
Time Complexity
The time complexity of the insert
method in the Trie
class is O(L)
, where L
is the length of the word being inserted. This is because we iterate through each character in the word once to insert it into the trie.
Given N
words, inserting them all into the trie would take O(N * L)
time, where L
is the average length of the words.
The search
method also runs in O(L)
for a similar reason – we go through each character of the word to compute the accumulated scores.
Therefore, when searching for the prefix scores for all words, assuming L
is again the average length of the words, this operation would also take O(N * L)
time.
Overall, the combined time complexity for constructing the trie and then performing the search for all words is O(N * L)
.
Space Complexity
The space complexity of the trie is O(M * 26)
where M
is the total number of unique nodes in the trie. M
depends on the total number of unique letters in all inserted words. Since each node has an array of 26 elements to hold references to its children, the 26
factor is included. In the worst-case scenario where there is no overlap of prefixes, M
can be as large as N * L
, leading to a space complexity of O(N * L * 26)
. However, in practice, since there is likely significant overlap (common prefixes) when inserting words in English, the actual space used is usually much less than this upper bound.
Thus, we can more accurately denote the space complexity as O(M)
, acknowledging that in the average case, this does not reach the worst-case scenario of O(N * L * 26)
because of the common prefixes in the trie.
Learn more about how to find time and space complexity quickly using problem constraints.
Which technique can we use to find the middle of a linked list?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first