2416. Sum of Prefix Scores of Strings

HardTrieArrayStringCounting
Leetcode Link

Problem Description

The problem presents us with an array of non-empty strings called words. The task is to calculate a score for each string, which represents the count of other strings in the array that have this string as a prefix. Specifically, for a given string word from the array, its score is determined by how many times it serves as a prefix for the other strings in words. It’s worth highlighting that a string is always considered a prefix of itself.

As an example, if words contains ["a", "ab", "abc", "cab"], the score for the string "ab" is 2 because it is a prefix for both "ab" and "abc". We are asked to calculate and return an array answer where answer[i] is equal to the sum of scores for all non-empty prefixes of words[i].

Intuition

The core idea to solve this problem is to use a Trie data structure. A Trie is an efficient information retrieval data structure that is used to represent the "retrieval" of data. It specializes in managing strings that are in a particularly large dataset, allowing for fast and efficient prefix-based searches.

Here's why using Trie is an effective solution:

  1. When we insert each word into the Trie, we simultaneously increase the count at each node. This count can be interpreted as the number of words in which the string represented by the path to that node serves as a prefix.

  2. Inserting all the words into the Trie ensures that every prefix is accounted for, and their respective scores are updated properly throughout the Trie nodes.

  3. To find the total score of all non-empty prefixes of a word, we search the Trie for that word. As we go through each character of the word, we add to the answer the counts encountered at each node. These counts represent the score of the prefix ending at that node.

  4. By doing this for each word in words, we end up with an array of total prefix scores corresponding to each word, as required.

Using this approach, we can efficiently calculate the score for each word without re-comparing each string with all other strings in every iteration, avoiding a brute-force solution that would be much less efficient.

Learn more about Trie patterns.

Solution Approach

The provided solution uses a Trie data structure to efficiently compute the sum of scores for the prefixes of each word in the input array words. Here's a step-by-step explanation of how the solution works:

Class Design

A class [Trie](/problems/trie_intro) is defined with two primary attributes: children and cnt. children is a list of length 26, which corresponds to the 26 letters of the English alphabet, and cnt is an integer representing the count of words that pass through this particular node.

  1. Initialization: The [Trie](/problems/trie_intro) instance is initialized with children set to an array of None values (since it's empty initially) and the cnt count set to 0.

  2. Insert Method: The insert method takes a word (w) and adds it to the Trie. With each character insertion, the respective character index is computed (ord(c) - ord('a')) to identify the relative position of the node in children. If the node is None, a new Trie node is created. Every time a character is inserted, node.cnt is incremented, indicating that one more word that has this prefix has been added to the Trie.

  3. Search Method: The search method computes the total score of prefixes for a given word (w). It iterates through each character of the word, moving through the Trie nodes and summing up the cnt values encountered at each step. If it encounters None, meaning that no further prefix exists, it returns the sum collected so far. Otherwise, it continues until all characters of the word have been processed, accumulating the total score.

Solution Class

The Solution class contains the sumPrefixScores method, which accepts the words list and returns a list of total prefix scores for each word in words.

  1. A new [Trie](/problems/trie_intro) instance is created.

  2. Each word in the words array is inserted into the Trie using the insert method of the Trie class. This builds the Trie structure with the correct cnt values for each node, representing the scores of the corresponding prefixes.

  3. Finally, the search method of the [Trie](/problems/trie_intro) class is used to calculate the total score of non-empty prefixes for each word. This is done inside a list comprehension that returns the resulting list of scores.

By following this approach, the algorithm efficiently calculates the score of each word by leveraging the Trie structure, avoiding redundancy and excessive comparisons that would be inherent in a brute-force approach.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution approach using the example array words = ["top", "tops", "toss", "taco", "tap"]. We’d like to calculate the score for each string in the array by determining the number of times it serves as a prefix for other strings in the array.

Initial Trie Construction

  1. We start by creating an empty Trie.
  2. We insert the first word "top". Now "t", "to", and "top" all have a count of 1 in their respective nodes.
  3. We insert the second word "tops". Since "top" is already there, when we insert "s", "tops" also has a count of 1, and the "top" node count increases to 2.
  4. We proceed similarly with "toss", increasing the count of "tos" and "toss" to 1, and the node "t" will have a count of 3 because 3 words have been inserted starting with "t".
  5. We insert "taco", similarly building up counts for "t", "ta", "tac", and "taco".
  6. We insert "tap", also building up counts for those nodes.

Now, for example, if we consider the node representing "ta", it has a count of 2 because it serves as a prefix for "taco" and "tap".

Computing Scores

  1. For each word in words, we will traverse the Trie node by node, summing the counts.

  2. Taking "top" as an example, the traversal would be 't' -> 'to' -> 'top', and for each of those we would sum the counts encountered: 3 (at 't') + 3 (at 'to') + 2 (at 'top') = 8.

  3. Similarly, we traverse for "tops", resulting in counts of 3 (at 't') + 3 (at 'to') + 2 (at 'top') + 1 (at 'tops') = 9.

  4. We repeat this for each word: "toss" (8), "taco" (5), and "tap" (6).

Result

As a result, for the given 'words' array, the sumPrefixScores function will return the array [8, 9, 8, 5, 6]. Each element of the returned array corresponds to the total score of all the non-empty prefixes for each word in the words array.

Thus, the Trie-based approach efficiently computes the prefix scores for all words in the given array without redundant comparisons.

Solution Implementation

1class Trie:
2    def __init__(self):
3        # Initialize each Trie node with 26 pointers for each lowercase letter in the alphabet
4        self.children = [None] * 26
5        # Initialize a variable to keep count of the number of words passing through this node
6        self.count = 0
7
8    def insert(self, word):
9        # Start from the root node
10        node = self
11        for char in word:
12            # Convert character to the corresponding index (0-25 for 'a'-'z')
13            index = ord(char) - ord('a')
14            # If the child node for this character doesn't exist, create it
15            if node.children[index] is None:
16                node.children[index] = Trie()
17            # Move to the child node
18            node = node.children[index]
19            # Increment the count for each node that is part of a word
20            node.count += 1
21
22    def search(self, word):
23        # Start from the root node
24        node = self
25        # To gather the cumulative count of all prefixes of the searched word
26        total_count = 0
27        for char in word:
28            # Convert character to index
29            index = ord(char) - ord('a')
30            # If the node doesn't have a child at this index, return current total count
31            if node.children[index] is None:
32                return total_count
33            # Move to the child node
34            node = node.children[index]
35            # Add the node's count to the total count
36            total_count += node.count
37        # Return the total count which represents the sum of the prefix scores for the word
38        return total_count
39
40
41class Solution:
42    def sum_prefix_scores(self, words):
43        # Create an instance of the Trie
44        trie = Trie()
45        # Insert each word into the trie, building up prefix counts
46        for word in words:
47            trie.insert(word)
48        # Retrieve and return the sum of the prefix scores for each word
49        return [trie.search(word) for word in words]
50
1class Trie {
2    private Trie[] children = new Trie[26]; // Each trie node can have up to 26 children for each letter of the alphabet
3    private int count; // This variable counts the number of times a node is visited during insertions
4
5    // Function to insert a word into the Trie
6    public void insert(String word) {
7        Trie node = this;
8        for (char c : word.toCharArray()) {
9            int index = c - 'a'; // Obtain the index by subtracting the ASCII value of 'a' from the character
10            if (node.children[index] == null) {
11                node.children[index] = new Trie(); // Create a new Trie node if it does not exist
12            }
13            node = node.children[index];
14            node.count++; // Increment the count for each node visited
15        }
16    }
17
18    // Function to compute the sum of the prefix scores for the given word
19    public int search(String word) {
20        Trie node = this;
21        int sum = 0;
22        for (char c : word.toCharArray()) {
23            int index = c - 'a'; // Obtain the index as done in insert function
24            if (node.children[index] == null) {
25                return sum; // Return the current sum if the node is null (prefix does not exist)
26            }
27            node = node.children[index];
28            sum += node.count; // Add the count of the visited node to the sum
29        }
30        return sum;
31    }
32}
33
34class Solution {
35    // Function to compute the sum of prefix scores for each word in the array
36    public int[] sumPrefixScores(String[] words) {
37        Trie trie = new Trie(); // Initialize a new Trie
38        for (String word : words) {
39            trie.insert(word); // Insert each word from the array into the Trie
40        }
41        int[] answer = new int[words.length]; // Create an array to hold the result
42        for (int i = 0; i < words.length; i++) {
43            answer[i] = trie.search(words[i]); // Compute the sum of prefix scores for each word in the array
44        }
45        return answer; // Return the computed scores
46    }
47}
48
1#include <vector>
2#include <string>
3using namespace std;
4
5// Trie node structure to store the children and the word count.
6class Trie {
7private:
8    vector<Trie*> children; // Children nodes of the current node.
9    int count; // Count of words passing through this node.
10
11public:
12    // Constructor initializes the children to hold 26 Trie pointers, one for each letter.
13    Trie()
14        : children(26), count(0) {}
15
16    // Insert a word into the Trie.
17    void insert(const string& word) {
18        Trie* node = this;
19        for (char letter : word) {
20            int index = letter - 'a'; // Convert character to index (0-25).
21            // If no child exists at the index, create a new Trie node.
22            if (!node->children[index]) {
23                node->children[index] = new Trie();
24            }
25            node = node->children[index];
26            ++node->count; // Increment count on traversed nodes.
27        }
28    }
29
30    // Search for a word in the Trie and return the sum of the counts of all its prefixes.
31    int search(const string& word) {
32        Trie* node = this; 
33        int sum = 0;
34        for (char letter : word) {
35            int index = letter - 'a'; // Convert character to index (0-25).
36            // If the child at the index doesn't exist, return the current sum.
37            if (!node->children[index]) return sum;
38            node = node->children[index];
39            sum += node->count; // Add count at the node to the sum.
40        }
41        return sum; // Return the total sum for the word.
42    }
43};
44
45// Main Solution class to solve the problem using the Trie.
46class Solution {
47public:
48    // Method to calculate the sum of prefix scores for each word in the input vector.
49    vector<int> sumPrefixScores(vector<string>& words) {
50        Trie* trie = new Trie(); // Create a new Trie.
51        // Insert each word from the input vector into the Trie.
52        for (const string& word : words) {
53            trie->insert(word);
54        }
55        vector<int> answer; // Vector to store the result.
56        // Search each word and calculate the sum of prefix scores.
57        for (const string& word : words) {
58            answer.push_back(trie->search(word));
59        }
60        return answer; // Return the resulting vector of sums.
61    }
62};
63
1// Declare the trie node structure.
2interface TrieNode {
3  children: TrieNode[];
4  count: number;
5}
6
7// Create a function to initialize a trie node.
8function createTrieNode(): TrieNode {
9  return {
10    children: new Array(26),
11    count: 0
12  };
13}
14
15// Initialize the trie root node.
16let root: TrieNode = createTrieNode();
17
18// Function to insert a word into the trie.
19function insert(word: string): void {
20  let node = root;
21  for (const char of word) {
22    const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
23    if (!node.children[index]) {
24      node.children[index] = createTrieNode();
25    }
26    node = node.children[index];
27    node.count++;
28  }
29}
30
31// Function to search for a word in the trie and calculate the sum of counts of its prefixes.
32function search(word: string): number {
33  let node = root;
34  let sum = 0;
35  for (const char of word) {
36    const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
37    if (!node.children[index]) {
38      return sum;
39    }
40    node = node.children[index];
41    sum += node.count;
42  }
43  return sum;
44}
45
46// Function to calculate the prefix scores for a list of words.
47function sumPrefixScores(words: string[]): number[] {
48  // Reset the trie for new entries.
49  root = createTrieNode();
50
51  // Insert words to the trie.
52  for (const word of words) {
53    insert(word);
54  }
55
56  // Compute and collect the prefix scores for each word.
57  let scores: number[] = [];
58  for (const word of words) {
59    scores.push(search(word));
60  }
61
62  // Return the collected scores.
63  return scores;
64}
65

Time and Space Complexity

Time Complexity

The time complexity of the insert method in the Trie class is O(L), where L is the length of the word being inserted. This is because we iterate through each character in the word once to insert it into the trie.

Given N words, inserting them all into the trie would take O(N * L) time, where L is the average length of the words.

The search method also runs in O(L) for a similar reason – we go through each character of the word to compute the accumulated scores.

Therefore, when searching for the prefix scores for all words, assuming L is again the average length of the words, this operation would also take O(N * L) time.

Overall, the combined time complexity for constructing the trie and then performing the search for all words is O(N * L).

Space Complexity

The space complexity of the trie is O(M * 26) where M is the total number of unique nodes in the trie. M depends on the total number of unique letters in all inserted words. Since each node has an array of 26 elements to hold references to its children, the 26 factor is included. In the worst-case scenario where there is no overlap of prefixes, M can be as large as N * L, leading to a space complexity of O(N * L * 26). However, in practice, since there is likely significant overlap (common prefixes) when inserting words in English, the actual space used is usually much less than this upper bound.

Thus, we can more accurately denote the space complexity as O(M), acknowledging that in the average case, this does not reach the worst-case scenario of O(N * L * 26) because of the common prefixes in the trie.

Learn more about how to find time and space complexity quickly using problem constraints.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

Which technique can we use to find the middle of a linked list?


Recommended Readings