Facebook Pixel

527. Word Abbreviation πŸ”’

Problem Description

You are given an array of distinct strings words. Your task is to return the minimal possible abbreviations for every word.

The abbreviation rules work as follows:

  1. Initial abbreviation: Each word starts with an abbreviation formed by taking the first character, followed by the count of characters in between, followed by the last character. For example, "abcdef" becomes "a4f" (first char 'a', 4 characters in between, last char 'f').

  2. Handling conflicts: When multiple words share the same abbreviation, you need to resolve the conflict by:

    • Increasing the prefix (the first part) of their abbreviations by 1 character at a time
    • Continue this process until all abbreviations become unique
    • For example: If "abcdef" and "abndef" both initially abbreviate to "a4f", you would:
      • First expand to "ab3f" and "ab3f" (still conflicting)
      • Then expand to "abc2f" and "abn2f" (now unique)
  3. Final check: After finding the unique abbreviation, if the abbreviation is not shorter than the original word, keep the original word instead.

The goal is to find the shortest possible unique abbreviation for each word following these rules. The output should be an array where each element is either the abbreviated form or the original word (if abbreviation doesn't save space).

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that words can only have the same abbreviation if they share certain properties: they must have the same length and the same last letter. This is because the abbreviation format is prefix + count + last_letter, where the count depends on the total length minus the prefix length minus 1.

For example, "apple" (length 5, last letter 'e') can only conflict with other 5-letter words ending in 'e'. It could never conflict with "apply" (different last letter) or "sample" (different length).

This observation suggests we can group words by (length, last_letter) pairs and handle each group independently. Within each group, we need to find the minimum prefix length that makes each word's abbreviation unique.

To efficiently find the minimum distinguishing prefix for each word in a group, we can use a Trie (prefix tree). The Trie allows us to track how many words share each prefix. As we traverse down the Trie for a word, the first node where cnt = 1 indicates we've found a prefix that uniquely identifies this word among its group.

The algorithm becomes:

  1. Group words by (length, last_letter)
  2. Build a separate Trie for each group
  3. For each word, traverse its group's Trie to find the minimum prefix length where it becomes unique
  4. Generate the abbreviation using this prefix length, but only use it if it's shorter than the original word

This approach is efficient because we only compare words that could potentially conflict, and the Trie structure allows us to find the minimum distinguishing prefix in a single traversal per word.

Learn more about Greedy, Trie and Sorting patterns.

Solution Approach

The solution implements the grouped Trie approach through two main components: a Trie class and the main solution logic.

Trie Implementation

The [Trie](/problems/trie_intro) class has the following structure:

  • children: An array of size 26 representing child nodes for each lowercase letter
  • cnt: Counter tracking how many words pass through this node

The Trie provides two key methods:

insert(w): Inserts a word into the Trie

  • Traverse each character of the word
  • For each character, create a child node if it doesn't exist
  • Increment the cnt at each node visited
  • This cnt value tells us how many words share this prefix

search(w): Finds the minimum prefix length for unique identification

  • Traverse the word character by character
  • Keep track of the prefix length (cnt)
  • When we find a node where node.cnt == 1, we've found the unique prefix
  • Return this prefix length, or the full word length if no unique prefix exists before the end

Main Algorithm

  1. Group words by (length, last_letter):

    tries = {}
    for w in words:
        m = len(w)
        if (m, w[-1]) not in tries:
            tries[(m, w[-1])] = [Trie](/problems/trie_intro)()
        tries[(m, w[-1])].insert(w)
    • Create a dictionary where keys are (word_length, last_character) tuples
    • Each key maps to a separate Trie containing all words with that length and ending
  2. Find minimum abbreviation for each word:

    for w in words:
        cnt = tries[(len(w), w[-1])].search(w)
    • For each word, search in its corresponding group's Trie
    • Get the minimum prefix length needed for uniqueness
  3. Generate the final abbreviation:

    ans.append(
        w if cnt + 2 >= len(w) else w[:cnt] + str(len(w) - cnt - 1) + w[-1]
    )
    • If cnt + 2 >= len(w): The abbreviation won't be shorter (we need at least the prefix, one digit, and last letter), so keep the original word
    • Otherwise: Create abbreviation as prefix + middle_count + last_letter
      • w[:cnt]: The unique prefix
      • str(len(w) - cnt - 1): Count of characters between prefix and last letter
      • w[-1]: The last letter

Time and Space Complexity

  • Time Complexity: O(N Γ— L) where N is the number of words and L is the average word length

    • Inserting all words into Tries: O(N Γ— L)
    • Searching for each word: O(N Γ— L)
  • Space Complexity: O(N Γ— L) for storing all words in the Tries

The beauty of this approach is that it only compares words that could potentially have the same abbreviation, making it much more efficient than comparing all pairs of words.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with a small example: words = ["apple", "apply", "ample"]

Step 1: Group words by (length, last_letter)

  • "apple": length = 5, last letter = 'e' β†’ group (5, 'e')
  • "apply": length = 5, last letter = 'y' β†’ group (5, 'y')
  • "ample": length = 5, last letter = 'e' β†’ group (5, 'e')

So we have two groups:

  • Group (5, 'e'): ["apple", "ample"]
  • Group (5, 'y'): ["apply"]

Step 2: Build Tries for each group

For group (5, 'e'), we build a Trie with "apple" and "ample":

     root (cnt=2)
        |
        a (cnt=2)
       / \
      p   m (cnt=1)
   (cnt=1) |
      |    p (cnt=1)
      p    |
      |    l (cnt=1)
      l    |
      |    e (cnt=1)
      e

For group (5, 'y'), we build a Trie with just "apply":

     root (cnt=1)
        |
        a β†’ p β†’ p β†’ l β†’ y (all cnt=1)

Step 3: Find minimum prefix for each word

  • "apple": Traverse the (5, 'e') Trie

    • 'a': cnt=2 (not unique, keep going)
    • 'ap': cnt=1 (unique!)
    • Minimum prefix length = 2
  • "ample": Traverse the (5, 'e') Trie

    • 'a': cnt=2 (not unique, keep going)
    • 'am': cnt=1 (unique!)
    • Minimum prefix length = 2
  • "apply": Traverse the (5, 'y') Trie

    • 'a': cnt=1 (unique!)
    • Minimum prefix length = 1

Step 4: Generate abbreviations

  • "apple": prefix length = 2

    • Abbreviation: "ap" + (5-2-1) + "e" = "ap2e"
    • Length check: 4 < 5, so use "ap2e"
  • "ample": prefix length = 2

    • Abbreviation: "am" + (5-2-1) + "e" = "am2e"
    • Length check: 4 < 5, so use "am2e"
  • "apply": prefix length = 1

    • Abbreviation: "a" + (5-1-1) + "y" = "a3y"
    • Length check: 3 < 5, so use "a3y"

Final Result: ["ap2e", "a3y", "am2e"]

Note how "apple" and "ample" needed longer prefixes (2 characters) because they're in the same group and would conflict with just "a3e", while "apply" only needs 1 character prefix since it's alone in its group.

Solution Implementation

1class Trie:
2    """Trie node for storing words and tracking prefix frequencies."""
3    __slots__ = ["children", "cnt"]
4
5    def __init__(self):
6        # Array to store 26 lowercase letter children nodes
7        self.children = [None] * 26
8        # Count of words passing through this node
9        self.cnt = 0
10
11    def insert(self, w: str):
12        """Insert a word into the trie and update counts along the path."""
13        node = self
14        for c in w:
15            # Calculate index for the character (0-25)
16            idx = ord(c) - ord("a")
17            # Create new node if it doesn't exist
18            if not node.children[idx]:
19                node.children[idx] = Trie()
20            # Move to child node
21            node = node.children[idx]
22            # Increment count for this prefix
23            node.cnt += 1
24
25    def search(self, w: str) -> int:
26        """
27        Find the minimum prefix length that uniquely identifies the word.
28        Returns the number of characters needed for a unique prefix.
29        """
30        node = self
31        cnt = 0
32        for c in w:
33            cnt += 1
34            idx = ord(c) - ord("a")
35            node = node.children[idx]
36            # If only one word passes through this node, we found unique prefix
37            if node.cnt == 1:
38                return cnt
39        # If no unique prefix found, return full word length
40        return len(w)
41
42
43class Solution:
44    def wordsAbbreviation(self, words: List[str]) -> List[str]:
45        """
46        Generate abbreviations for a list of words.
47        Words with same length and last character are grouped together.
48        """
49        # Dictionary to store tries grouped by (word_length, last_character)
50        tries = {}
51      
52        # Build tries for each group of words with same length and last character
53        for w in words:
54            m = len(w)
55            key = (m, w[-1])
56            # Create new trie for this group if it doesn't exist
57            if key not in tries:
58                tries[key] = Trie()
59            # Insert word into the appropriate trie
60            tries[key].insert(w)
61      
62        # Generate abbreviations for each word
63        ans = []
64        for w in words:
65            # Find minimum unique prefix length for this word
66            cnt = tries[(len(w), w[-1])].search(w)
67            # Check if abbreviation would be shorter than original
68            # Need at least: prefix + number + last_char (cnt + 1 + 1)
69            if cnt + 2 >= len(w):
70                # Abbreviation wouldn't save space, use full word
71                ans.append(w)
72            else:
73                # Create abbreviation: prefix + count + last_character
74                abbreviation = w[:cnt] + str(len(w) - cnt - 1) + w[-1]
75                ans.append(abbreviation)
76      
77        return ans
78
1/**
2 * Trie (Prefix Tree) data structure to store words and count occurrences at each node
3 */
4class Trie {
5    // Array to store child nodes for each letter (a-z)
6    private final Trie[] children = new Trie[26];
7    // Counter to track how many words pass through this node
8    private int count;
9
10    /**
11     * Inserts a word into the trie and increments counters along the path
12     * @param word The word to insert into the trie
13     */
14    public void insert(String word) {
15        Trie currentNode = this;
16      
17        // Traverse each character in the word
18        for (char character : word.toCharArray()) {
19            int index = character - 'a';
20          
21            // Create new node if it doesn't exist
22            if (currentNode.children[index] == null) {
23                currentNode.children[index] = new Trie();
24            }
25          
26            // Move to child node and increment its counter
27            currentNode = currentNode.children[index];
28            currentNode.count++;
29        }
30    }
31
32    /**
33     * Searches for the minimum prefix length needed to uniquely identify the word
34     * @param word The word to search for
35     * @return The minimum number of characters needed as prefix
36     */
37    public int search(String word) {
38        Trie currentNode = this;
39        int prefixLength = 0;
40      
41        // Traverse each character in the word
42        for (char character : word.toCharArray()) {
43            prefixLength++;
44            int index = character - 'a';
45            currentNode = currentNode.children[index];
46          
47            // If only one word passes through this node, we found unique prefix
48            if (currentNode.count == 1) {
49                return prefixLength;
50            }
51        }
52      
53        // If no unique prefix found, return full word length
54        return word.length();
55    }
56}
57
58/**
59 * Solution class to generate word abbreviations
60 */
61class Solution {
62    /**
63     * Generates abbreviations for a list of words
64     * Words with same length and last character are grouped together
65     * @param words List of words to abbreviate
66     * @return List of abbreviated words
67     */
68    public List<String> wordsAbbreviation(List<String> words) {
69        // Map to group words by their length and last character
70        // Key: [word length, last character index], Value: Trie for that group
71        Map<List<Integer>, Trie> trieMap = new HashMap<>();
72      
73        // Build tries for each group of words
74        for (String word : words) {
75            // Create key based on word length and last character
76            List<Integer> groupKey = List.of(word.length(), word.charAt(word.length() - 1) - 'a');
77          
78            // Create new trie for this group if it doesn't exist
79            trieMap.putIfAbsent(groupKey, new Trie());
80          
81            // Insert word into the appropriate trie
82            trieMap.get(groupKey).insert(word);
83        }
84      
85        // Generate abbreviations for each word
86        List<String> result = new ArrayList<>();
87      
88        for (String word : words) {
89            int wordLength = word.length();
90          
91            // Get the trie for this word's group
92            List<Integer> groupKey = List.of(wordLength, word.charAt(wordLength - 1) - 'a');
93          
94            // Find minimum prefix length needed for unique identification
95            int prefixLength = trieMap.get(groupKey).search(word);
96          
97            // Check if abbreviation would be shorter than original word
98            // Abbreviation format: prefix + number + last character (minimum 3 chars)
99            if (prefixLength + 2 >= wordLength) {
100                // Abbreviation wouldn't save space, use original word
101                result.add(word);
102            } else {
103                // Create abbreviation: prefix + count of omitted chars + last char
104                String abbreviation = word.substring(0, prefixLength) + 
105                                     (wordLength - prefixLength - 1) + 
106                                     word.substring(wordLength - 1);
107                result.add(abbreviation);
108            }
109        }
110      
111        return result;
112    }
113}
114
1class Trie {
2public:
3    // Constructor initializes the trie node
4    Trie() : prefixCount(0) {
5        fill(children.begin(), children.end(), nullptr);
6    }
7
8    // Insert a word into the trie and update prefix counts
9    void insert(const string& word) {
10        Trie* currentNode = this;
11      
12        for (char ch : word) {
13            int index = ch - 'a';
14          
15            // Create new node if path doesn't exist
16            if (currentNode->children[index] == nullptr) {
17                currentNode->children[index] = new Trie();
18            }
19          
20            // Move to child node and increment its prefix count
21            currentNode = currentNode->children[index];
22            ++currentNode->prefixCount;
23        }
24    }
25
26    // Search for minimum unique prefix length for the given word
27    int search(const string& word) {
28        Trie* currentNode = this;
29        int prefixLength = 0;
30      
31        for (char ch : word) {
32            ++prefixLength;
33            int index = ch - 'a';
34            currentNode = currentNode->children[index];
35          
36            // If only one word has this prefix, we found the minimum unique prefix
37            if (currentNode->prefixCount == 1) {
38                return prefixLength;
39            }
40        }
41      
42        // If no unique prefix found, return full word length
43        return word.size();
44    }
45
46private:
47    array<Trie*, 26> children;  // Pointers to child nodes (26 letters)
48    int prefixCount;             // Number of words sharing this prefix
49};
50
51class Solution {
52public:
53    vector<string> wordsAbbreviation(vector<string>& words) {
54        // Map to group words by (length, last character)
55        // This ensures abbreviations are only compared among words with same length and ending
56        map<pair<int, int>, Trie*> trieGroups;
57      
58        // Build tries for each group of words
59        for (const auto& word : words) {
60            // Create key based on word length and last character
61            pair<int, int> groupKey = {
62                static_cast<int>(word.size()), 
63                word.back() - 'a'
64            };
65          
66            // Create new trie for this group if it doesn't exist
67            if (trieGroups.find(groupKey) == trieGroups.end()) {
68                trieGroups[groupKey] = new Trie();
69            }
70          
71            // Insert word into corresponding trie
72            trieGroups[groupKey]->insert(word);
73        }
74
75        vector<string> result;
76      
77        // Generate abbreviation for each word
78        for (const auto& word : words) {
79            int wordLength = word.size();
80          
81            // Find the trie group for this word
82            pair<int, int> groupKey = {wordLength, word.back() - 'a'};
83          
84            // Find minimum unique prefix length
85            int uniquePrefixLength = trieGroups[groupKey]->search(word);
86          
87            // Check if abbreviation would be shorter than original word
88            // Abbreviation format: prefix + number + last char (at least 3 chars)
89            if (uniquePrefixLength + 2 >= wordLength) {
90                // Abbreviation wouldn't save space, use original word
91                result.push_back(word);
92            } else {
93                // Create abbreviation: prefix + count of middle chars + last char
94                string abbreviation = word.substr(0, uniquePrefixLength) + 
95                                    to_string(wordLength - uniquePrefixLength - 1) + 
96                                    word.back();
97                result.push_back(abbreviation);
98            }
99        }
100
101        return result;
102    }
103};
104
1// Trie node structure for prefix tree
2interface TrieNode {
3    children: (TrieNode | null)[];  // Array of 26 child nodes (for 'a' to 'z')
4    count: number;                   // Number of words passing through this node
5}
6
7// Create a new Trie node
8function createTrieNode(): TrieNode {
9    return {
10        children: Array(26).fill(null),
11        count: 0
12    };
13}
14
15// Insert a word into the Trie and increment count for each node
16function insertIntoTrie(root: TrieNode, word: string): void {
17    let currentNode = root;
18  
19    for (const char of word) {
20        // Calculate index (0-25) for the character
21        const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
22      
23        // Create child node if it doesn't exist
24        if (!currentNode.children[charIndex]) {
25            currentNode.children[charIndex] = createTrieNode();
26        }
27      
28        // Move to child node and increment its count
29        currentNode = currentNode.children[charIndex]!;
30        currentNode.count++;
31    }
32}
33
34// Search for the minimum prefix length needed to uniquely identify the word
35function searchPrefixLength(root: TrieNode, word: string): number {
36    let currentNode = root;
37    let prefixLength = 0;
38  
39    for (const char of word) {
40        prefixLength++;
41      
42        // Calculate index for the character
43        const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
44        currentNode = currentNode.children[charIndex]!;
45      
46        // If only one word passes through this node, we found unique prefix
47        if (currentNode.count === 1) {
48            return prefixLength;
49        }
50    }
51  
52    // Return full word length if no unique prefix found
53    return word.length;
54}
55
56// Generate abbreviations for a list of words
57function wordsAbbreviation(words: string[]): string[] {
58    // Map to store Tries grouped by word length and last character
59    const trieMap: Map<string, TrieNode> = new Map();
60  
61    // Build Tries for each group of words with same length and last character
62    for (const word of words) {
63        // Create key using word length and last character
64        const wordLength = word.length;
65        const lastCharIndex = word.charCodeAt(wordLength - 1) - 'a'.charCodeAt(0);
66        const key = `${wordLength}-${lastCharIndex}`;
67      
68        // Create new Trie if key doesn't exist
69        if (!trieMap.has(key)) {
70            trieMap.set(key, createTrieNode());
71        }
72      
73        // Insert word into corresponding Trie
74        insertIntoTrie(trieMap.get(key)!, word);
75    }
76  
77    // Generate abbreviations for each word
78    const result: string[] = [];
79  
80    for (const word of words) {
81        const wordLength = word.length;
82      
83        // Create key to find corresponding Trie
84        const lastCharIndex = word.charCodeAt(wordLength - 1) - 'a'.charCodeAt(0);
85        const key = `${wordLength}-${lastCharIndex}`;
86      
87        // Find minimum prefix length for unique identification
88        const prefixLength = searchPrefixLength(trieMap.get(key)!, word);
89      
90        // Check if abbreviation would be shorter than original word
91        // Abbreviation format: prefix + number + last character (minimum 3 characters)
92        if (prefixLength + 2 >= wordLength) {
93            // Keep original word if abbreviation wouldn't save space
94            result.push(word);
95        } else {
96            // Create abbreviation: prefix + count of omitted characters + last character
97            const prefix = word.substring(0, prefixLength);
98            const omittedCount = wordLength - prefixLength - 1;
99            const lastChar = word.substring(wordLength - 1);
100            result.push(prefix + omittedCount + lastChar);
101        }
102    }
103  
104    return result;
105}
106

Time and Space Complexity

Time Complexity: O(L)

The algorithm processes each word twice - once during insertion into the trie and once during search:

  • Building tries: Each word is inserted once. For a word of length l, insertion takes O(l) time. Across all n words, this totals to O(L) where L is the sum of all word lengths.
  • Searching in tries: Each word is searched once. For a word of length l, search takes O(l) time in the worst case. Across all n words, this again totals to O(L).
  • The grouping by (length, last_character) and final abbreviation construction are both O(1) per word or O(n) total, which is dominated by O(L) since L β‰₯ n.

Therefore, the overall time complexity is O(L) + O(L) = O(L).

Space Complexity: O(L)

The space is primarily consumed by the trie structures:

  • Each character in each word creates at most one trie node. In the worst case where all words have unique prefixes within their groups, we store O(L) characters across all tries.
  • Each trie node contains an array of 26 pointers and a counter, which is O(1) space per node.
  • The tries dictionary stores at most O(n) entries (one per unique (length, last_character) pair), which is bounded by O(L).
  • The output array ans stores n abbreviated strings, with total length at most O(L).

Therefore, the overall space complexity is O(L).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Abbreviation Length Calculation

One of the most common mistakes is incorrectly calculating when an abbreviation is actually shorter than the original word. The pitfall occurs when checking if the abbreviation saves space.

Incorrect Implementation:

# Wrong: Only considering prefix length + 1
if cnt + 1 >= len(w):
    ans.append(w)

Why it's wrong: The abbreviation format is prefix + number + last_char. Even if the number is a single digit, we need at least 2 additional characters beyond the prefix. For example:

  • Word: "word" (length 4)
  • If prefix length is 2: "wo", the abbreviation would be "wo1d" (length 4)
  • This doesn't save any space!

Correct Implementation:

# Correct: Considering prefix + at least 1 digit + last character
if cnt + 2 >= len(w):
    ans.append(w)

2. Not Handling Multi-Digit Numbers in Abbreviations

Another subtle pitfall is assuming the middle count will always be a single digit. While the check cnt + 2 >= len(w) works for determining if abbreviation is worthwhile, you might encounter issues if trying to optimize further.

Example scenario:

  • Word: "internationalization" (length 20)
  • Prefix needed: "i" (length 1)
  • Middle count: 18
  • Abbreviation: "i18n" (length 4)

The actual abbreviation length is 1 + 2 + 1 = 4 (prefix + two digits + last char), not 1 + 1 + 1 = 3.

Better check for exact abbreviation length:

def get_abbr_length(prefix_len, word_len):
    middle_count = word_len - prefix_len - 1
    digit_count = len(str(middle_count))
    return prefix_len + digit_count + 1

# Use this for more precise checking
if get_abbr_length(cnt, len(w)) >= len(w):
    ans.append(w)

3. Forgetting to Group by Both Length AND Last Character

A critical mistake is only grouping by one criterion instead of both.

Incorrect grouping:

# Wrong: Only grouping by length
tries = {}
for w in words:
    m = len(w)
    if m not in tries:
        tries[m] = Trie()
    tries[m].insert(w)

Why it fails: Words like "abcd" and "abce" both have length 4 but abbreviate to "a2d" and "a2e" respectively. They never conflict, so grouping them together unnecessarily increases the prefix length needed.

Correct grouping:

# Correct: Group by (length, last_character)
tries = {}
for w in words:
    key = (len(w), w[-1])
    if key not in tries:
        tries[key] = Trie()
    tries[key].insert(w)

4. Off-by-One Error in Middle Count Calculation

When calculating the number of characters between the prefix and last character, it's easy to make an off-by-one error.

Incorrect:

# Wrong: Forgetting to subtract both prefix and last char
middle_count = len(w) - cnt
# or
# Wrong: Subtracting incorrectly
middle_count = len(w) - cnt - 2

Correct:

# Correct: Total length - prefix length - 1 (for last char)
middle_count = len(w) - cnt - 1

Example verification:

  • Word: "apple" (length 5)
  • Prefix: "ap" (cnt = 2)
  • Middle characters: "pl" (2 characters)
  • Calculation: 5 - 2 - 1 = 2 βœ“
  • Abbreviation: "ap2e"
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement recursion?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More