527. Word Abbreviation π
Problem Description
You are given an array of distinct strings words
. Your task is to return the minimal possible abbreviations for every word.
The abbreviation rules work as follows:
-
Initial abbreviation: Each word starts with an abbreviation formed by taking the first character, followed by the count of characters in between, followed by the last character. For example,
"abcdef"
becomes"a4f"
(first char 'a', 4 characters in between, last char 'f'). -
Handling conflicts: When multiple words share the same abbreviation, you need to resolve the conflict by:
- Increasing the prefix (the first part) of their abbreviations by 1 character at a time
- Continue this process until all abbreviations become unique
- For example: If
"abcdef"
and"abndef"
both initially abbreviate to"a4f"
, you would:- First expand to
"ab3f"
and"ab3f"
(still conflicting) - Then expand to
"abc2f"
and"abn2f"
(now unique)
- First expand to
-
Final check: After finding the unique abbreviation, if the abbreviation is not shorter than the original word, keep the original word instead.
The goal is to find the shortest possible unique abbreviation for each word following these rules. The output should be an array where each element is either the abbreviated form or the original word (if abbreviation doesn't save space).
Intuition
The key insight is recognizing that words can only have the same abbreviation if they share certain properties: they must have the same length and the same last letter. This is because the abbreviation format is prefix + count + last_letter
, where the count depends on the total length minus the prefix length minus 1.
For example, "apple"
(length 5, last letter 'e') can only conflict with other 5-letter words ending in 'e'. It could never conflict with "apply"
(different last letter) or "sample"
(different length).
This observation suggests we can group words by (length, last_letter)
pairs and handle each group independently. Within each group, we need to find the minimum prefix length that makes each word's abbreviation unique.
To efficiently find the minimum distinguishing prefix for each word in a group, we can use a Trie (prefix tree). The Trie allows us to track how many words share each prefix. As we traverse down the Trie for a word, the first node where cnt = 1
indicates we've found a prefix that uniquely identifies this word among its group.
The algorithm becomes:
- Group words by
(length, last_letter)
- Build a separate Trie for each group
- For each word, traverse its group's Trie to find the minimum prefix length where it becomes unique
- Generate the abbreviation using this prefix length, but only use it if it's shorter than the original word
This approach is efficient because we only compare words that could potentially conflict, and the Trie structure allows us to find the minimum distinguishing prefix in a single traversal per word.
Solution Approach
The solution implements the grouped Trie approach through two main components: a Trie
class and the main solution logic.
Trie Implementation
The [Trie](/problems/trie_intro)
class has the following structure:
children
: An array of size 26 representing child nodes for each lowercase lettercnt
: Counter tracking how many words pass through this node
The Trie provides two key methods:
insert(w)
: Inserts a word into the Trie
- Traverse each character of the word
- For each character, create a child node if it doesn't exist
- Increment the
cnt
at each node visited - This
cnt
value tells us how many words share this prefix
search(w)
: Finds the minimum prefix length for unique identification
- Traverse the word character by character
- Keep track of the prefix length (
cnt
) - When we find a node where
node.cnt == 1
, we've found the unique prefix - Return this prefix length, or the full word length if no unique prefix exists before the end
Main Algorithm
-
Group words by (length, last_letter):
tries = {} for w in words: m = len(w) if (m, w[-1]) not in tries: tries[(m, w[-1])] = [Trie](/problems/trie_intro)() tries[(m, w[-1])].insert(w)
- Create a dictionary where keys are
(word_length, last_character)
tuples - Each key maps to a separate Trie containing all words with that length and ending
- Create a dictionary where keys are
-
Find minimum abbreviation for each word:
for w in words: cnt = tries[(len(w), w[-1])].search(w)
- For each word, search in its corresponding group's Trie
- Get the minimum prefix length needed for uniqueness
-
Generate the final abbreviation:
ans.append( w if cnt + 2 >= len(w) else w[:cnt] + str(len(w) - cnt - 1) + w[-1] )
- If
cnt + 2 >= len(w)
: The abbreviation won't be shorter (we need at least the prefix, one digit, and last letter), so keep the original word - Otherwise: Create abbreviation as
prefix + middle_count + last_letter
w[:cnt]
: The unique prefixstr(len(w) - cnt - 1)
: Count of characters between prefix and last letterw[-1]
: The last letter
- If
Time and Space Complexity
-
Time Complexity:
O(N Γ L)
where N is the number of words and L is the average word length- Inserting all words into Tries:
O(N Γ L)
- Searching for each word:
O(N Γ L)
- Inserting all words into Tries:
-
Space Complexity:
O(N Γ L)
for storing all words in the Tries
The beauty of this approach is that it only compares words that could potentially have the same abbreviation, making it much more efficient than comparing all pairs of words.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with a small example: words = ["apple", "apply", "ample"]
Step 1: Group words by (length, last_letter)
"apple"
: length = 5, last letter = 'e' β group (5, 'e')"apply"
: length = 5, last letter = 'y' β group (5, 'y')"ample"
: length = 5, last letter = 'e' β group (5, 'e')
So we have two groups:
- Group (5, 'e'): ["apple", "ample"]
- Group (5, 'y'): ["apply"]
Step 2: Build Tries for each group
For group (5, 'e'), we build a Trie with "apple" and "ample":
root (cnt=2) | a (cnt=2) / \ p m (cnt=1) (cnt=1) | | p (cnt=1) p | | l (cnt=1) l | | e (cnt=1) e
For group (5, 'y'), we build a Trie with just "apply":
root (cnt=1) | a β p β p β l β y (all cnt=1)
Step 3: Find minimum prefix for each word
-
"apple": Traverse the (5, 'e') Trie
- 'a': cnt=2 (not unique, keep going)
- 'ap': cnt=1 (unique!)
- Minimum prefix length = 2
-
"ample": Traverse the (5, 'e') Trie
- 'a': cnt=2 (not unique, keep going)
- 'am': cnt=1 (unique!)
- Minimum prefix length = 2
-
"apply": Traverse the (5, 'y') Trie
- 'a': cnt=1 (unique!)
- Minimum prefix length = 1
Step 4: Generate abbreviations
-
"apple": prefix length = 2
- Abbreviation: "ap" + (5-2-1) + "e" = "ap2e"
- Length check: 4 < 5, so use "ap2e"
-
"ample": prefix length = 2
- Abbreviation: "am" + (5-2-1) + "e" = "am2e"
- Length check: 4 < 5, so use "am2e"
-
"apply": prefix length = 1
- Abbreviation: "a" + (5-1-1) + "y" = "a3y"
- Length check: 3 < 5, so use "a3y"
Final Result: ["ap2e", "a3y", "am2e"]
Note how "apple" and "ample" needed longer prefixes (2 characters) because they're in the same group and would conflict with just "a3e", while "apply" only needs 1 character prefix since it's alone in its group.
Solution Implementation
1class Trie:
2 """Trie node for storing words and tracking prefix frequencies."""
3 __slots__ = ["children", "cnt"]
4
5 def __init__(self):
6 # Array to store 26 lowercase letter children nodes
7 self.children = [None] * 26
8 # Count of words passing through this node
9 self.cnt = 0
10
11 def insert(self, w: str):
12 """Insert a word into the trie and update counts along the path."""
13 node = self
14 for c in w:
15 # Calculate index for the character (0-25)
16 idx = ord(c) - ord("a")
17 # Create new node if it doesn't exist
18 if not node.children[idx]:
19 node.children[idx] = Trie()
20 # Move to child node
21 node = node.children[idx]
22 # Increment count for this prefix
23 node.cnt += 1
24
25 def search(self, w: str) -> int:
26 """
27 Find the minimum prefix length that uniquely identifies the word.
28 Returns the number of characters needed for a unique prefix.
29 """
30 node = self
31 cnt = 0
32 for c in w:
33 cnt += 1
34 idx = ord(c) - ord("a")
35 node = node.children[idx]
36 # If only one word passes through this node, we found unique prefix
37 if node.cnt == 1:
38 return cnt
39 # If no unique prefix found, return full word length
40 return len(w)
41
42
43class Solution:
44 def wordsAbbreviation(self, words: List[str]) -> List[str]:
45 """
46 Generate abbreviations for a list of words.
47 Words with same length and last character are grouped together.
48 """
49 # Dictionary to store tries grouped by (word_length, last_character)
50 tries = {}
51
52 # Build tries for each group of words with same length and last character
53 for w in words:
54 m = len(w)
55 key = (m, w[-1])
56 # Create new trie for this group if it doesn't exist
57 if key not in tries:
58 tries[key] = Trie()
59 # Insert word into the appropriate trie
60 tries[key].insert(w)
61
62 # Generate abbreviations for each word
63 ans = []
64 for w in words:
65 # Find minimum unique prefix length for this word
66 cnt = tries[(len(w), w[-1])].search(w)
67 # Check if abbreviation would be shorter than original
68 # Need at least: prefix + number + last_char (cnt + 1 + 1)
69 if cnt + 2 >= len(w):
70 # Abbreviation wouldn't save space, use full word
71 ans.append(w)
72 else:
73 # Create abbreviation: prefix + count + last_character
74 abbreviation = w[:cnt] + str(len(w) - cnt - 1) + w[-1]
75 ans.append(abbreviation)
76
77 return ans
78
1/**
2 * Trie (Prefix Tree) data structure to store words and count occurrences at each node
3 */
4class Trie {
5 // Array to store child nodes for each letter (a-z)
6 private final Trie[] children = new Trie[26];
7 // Counter to track how many words pass through this node
8 private int count;
9
10 /**
11 * Inserts a word into the trie and increments counters along the path
12 * @param word The word to insert into the trie
13 */
14 public void insert(String word) {
15 Trie currentNode = this;
16
17 // Traverse each character in the word
18 for (char character : word.toCharArray()) {
19 int index = character - 'a';
20
21 // Create new node if it doesn't exist
22 if (currentNode.children[index] == null) {
23 currentNode.children[index] = new Trie();
24 }
25
26 // Move to child node and increment its counter
27 currentNode = currentNode.children[index];
28 currentNode.count++;
29 }
30 }
31
32 /**
33 * Searches for the minimum prefix length needed to uniquely identify the word
34 * @param word The word to search for
35 * @return The minimum number of characters needed as prefix
36 */
37 public int search(String word) {
38 Trie currentNode = this;
39 int prefixLength = 0;
40
41 // Traverse each character in the word
42 for (char character : word.toCharArray()) {
43 prefixLength++;
44 int index = character - 'a';
45 currentNode = currentNode.children[index];
46
47 // If only one word passes through this node, we found unique prefix
48 if (currentNode.count == 1) {
49 return prefixLength;
50 }
51 }
52
53 // If no unique prefix found, return full word length
54 return word.length();
55 }
56}
57
58/**
59 * Solution class to generate word abbreviations
60 */
61class Solution {
62 /**
63 * Generates abbreviations for a list of words
64 * Words with same length and last character are grouped together
65 * @param words List of words to abbreviate
66 * @return List of abbreviated words
67 */
68 public List<String> wordsAbbreviation(List<String> words) {
69 // Map to group words by their length and last character
70 // Key: [word length, last character index], Value: Trie for that group
71 Map<List<Integer>, Trie> trieMap = new HashMap<>();
72
73 // Build tries for each group of words
74 for (String word : words) {
75 // Create key based on word length and last character
76 List<Integer> groupKey = List.of(word.length(), word.charAt(word.length() - 1) - 'a');
77
78 // Create new trie for this group if it doesn't exist
79 trieMap.putIfAbsent(groupKey, new Trie());
80
81 // Insert word into the appropriate trie
82 trieMap.get(groupKey).insert(word);
83 }
84
85 // Generate abbreviations for each word
86 List<String> result = new ArrayList<>();
87
88 for (String word : words) {
89 int wordLength = word.length();
90
91 // Get the trie for this word's group
92 List<Integer> groupKey = List.of(wordLength, word.charAt(wordLength - 1) - 'a');
93
94 // Find minimum prefix length needed for unique identification
95 int prefixLength = trieMap.get(groupKey).search(word);
96
97 // Check if abbreviation would be shorter than original word
98 // Abbreviation format: prefix + number + last character (minimum 3 chars)
99 if (prefixLength + 2 >= wordLength) {
100 // Abbreviation wouldn't save space, use original word
101 result.add(word);
102 } else {
103 // Create abbreviation: prefix + count of omitted chars + last char
104 String abbreviation = word.substring(0, prefixLength) +
105 (wordLength - prefixLength - 1) +
106 word.substring(wordLength - 1);
107 result.add(abbreviation);
108 }
109 }
110
111 return result;
112 }
113}
114
1class Trie {
2public:
3 // Constructor initializes the trie node
4 Trie() : prefixCount(0) {
5 fill(children.begin(), children.end(), nullptr);
6 }
7
8 // Insert a word into the trie and update prefix counts
9 void insert(const string& word) {
10 Trie* currentNode = this;
11
12 for (char ch : word) {
13 int index = ch - 'a';
14
15 // Create new node if path doesn't exist
16 if (currentNode->children[index] == nullptr) {
17 currentNode->children[index] = new Trie();
18 }
19
20 // Move to child node and increment its prefix count
21 currentNode = currentNode->children[index];
22 ++currentNode->prefixCount;
23 }
24 }
25
26 // Search for minimum unique prefix length for the given word
27 int search(const string& word) {
28 Trie* currentNode = this;
29 int prefixLength = 0;
30
31 for (char ch : word) {
32 ++prefixLength;
33 int index = ch - 'a';
34 currentNode = currentNode->children[index];
35
36 // If only one word has this prefix, we found the minimum unique prefix
37 if (currentNode->prefixCount == 1) {
38 return prefixLength;
39 }
40 }
41
42 // If no unique prefix found, return full word length
43 return word.size();
44 }
45
46private:
47 array<Trie*, 26> children; // Pointers to child nodes (26 letters)
48 int prefixCount; // Number of words sharing this prefix
49};
50
51class Solution {
52public:
53 vector<string> wordsAbbreviation(vector<string>& words) {
54 // Map to group words by (length, last character)
55 // This ensures abbreviations are only compared among words with same length and ending
56 map<pair<int, int>, Trie*> trieGroups;
57
58 // Build tries for each group of words
59 for (const auto& word : words) {
60 // Create key based on word length and last character
61 pair<int, int> groupKey = {
62 static_cast<int>(word.size()),
63 word.back() - 'a'
64 };
65
66 // Create new trie for this group if it doesn't exist
67 if (trieGroups.find(groupKey) == trieGroups.end()) {
68 trieGroups[groupKey] = new Trie();
69 }
70
71 // Insert word into corresponding trie
72 trieGroups[groupKey]->insert(word);
73 }
74
75 vector<string> result;
76
77 // Generate abbreviation for each word
78 for (const auto& word : words) {
79 int wordLength = word.size();
80
81 // Find the trie group for this word
82 pair<int, int> groupKey = {wordLength, word.back() - 'a'};
83
84 // Find minimum unique prefix length
85 int uniquePrefixLength = trieGroups[groupKey]->search(word);
86
87 // Check if abbreviation would be shorter than original word
88 // Abbreviation format: prefix + number + last char (at least 3 chars)
89 if (uniquePrefixLength + 2 >= wordLength) {
90 // Abbreviation wouldn't save space, use original word
91 result.push_back(word);
92 } else {
93 // Create abbreviation: prefix + count of middle chars + last char
94 string abbreviation = word.substr(0, uniquePrefixLength) +
95 to_string(wordLength - uniquePrefixLength - 1) +
96 word.back();
97 result.push_back(abbreviation);
98 }
99 }
100
101 return result;
102 }
103};
104
1// Trie node structure for prefix tree
2interface TrieNode {
3 children: (TrieNode | null)[]; // Array of 26 child nodes (for 'a' to 'z')
4 count: number; // Number of words passing through this node
5}
6
7// Create a new Trie node
8function createTrieNode(): TrieNode {
9 return {
10 children: Array(26).fill(null),
11 count: 0
12 };
13}
14
15// Insert a word into the Trie and increment count for each node
16function insertIntoTrie(root: TrieNode, word: string): void {
17 let currentNode = root;
18
19 for (const char of word) {
20 // Calculate index (0-25) for the character
21 const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
22
23 // Create child node if it doesn't exist
24 if (!currentNode.children[charIndex]) {
25 currentNode.children[charIndex] = createTrieNode();
26 }
27
28 // Move to child node and increment its count
29 currentNode = currentNode.children[charIndex]!;
30 currentNode.count++;
31 }
32}
33
34// Search for the minimum prefix length needed to uniquely identify the word
35function searchPrefixLength(root: TrieNode, word: string): number {
36 let currentNode = root;
37 let prefixLength = 0;
38
39 for (const char of word) {
40 prefixLength++;
41
42 // Calculate index for the character
43 const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
44 currentNode = currentNode.children[charIndex]!;
45
46 // If only one word passes through this node, we found unique prefix
47 if (currentNode.count === 1) {
48 return prefixLength;
49 }
50 }
51
52 // Return full word length if no unique prefix found
53 return word.length;
54}
55
56// Generate abbreviations for a list of words
57function wordsAbbreviation(words: string[]): string[] {
58 // Map to store Tries grouped by word length and last character
59 const trieMap: Map<string, TrieNode> = new Map();
60
61 // Build Tries for each group of words with same length and last character
62 for (const word of words) {
63 // Create key using word length and last character
64 const wordLength = word.length;
65 const lastCharIndex = word.charCodeAt(wordLength - 1) - 'a'.charCodeAt(0);
66 const key = `${wordLength}-${lastCharIndex}`;
67
68 // Create new Trie if key doesn't exist
69 if (!trieMap.has(key)) {
70 trieMap.set(key, createTrieNode());
71 }
72
73 // Insert word into corresponding Trie
74 insertIntoTrie(trieMap.get(key)!, word);
75 }
76
77 // Generate abbreviations for each word
78 const result: string[] = [];
79
80 for (const word of words) {
81 const wordLength = word.length;
82
83 // Create key to find corresponding Trie
84 const lastCharIndex = word.charCodeAt(wordLength - 1) - 'a'.charCodeAt(0);
85 const key = `${wordLength}-${lastCharIndex}`;
86
87 // Find minimum prefix length for unique identification
88 const prefixLength = searchPrefixLength(trieMap.get(key)!, word);
89
90 // Check if abbreviation would be shorter than original word
91 // Abbreviation format: prefix + number + last character (minimum 3 characters)
92 if (prefixLength + 2 >= wordLength) {
93 // Keep original word if abbreviation wouldn't save space
94 result.push(word);
95 } else {
96 // Create abbreviation: prefix + count of omitted characters + last character
97 const prefix = word.substring(0, prefixLength);
98 const omittedCount = wordLength - prefixLength - 1;
99 const lastChar = word.substring(wordLength - 1);
100 result.push(prefix + omittedCount + lastChar);
101 }
102 }
103
104 return result;
105}
106
Time and Space Complexity
Time Complexity: O(L)
The algorithm processes each word twice - once during insertion into the trie and once during search:
- Building tries: Each word is inserted once. For a word of length
l
, insertion takesO(l)
time. Across alln
words, this totals toO(L)
whereL
is the sum of all word lengths. - Searching in tries: Each word is searched once. For a word of length
l
, search takesO(l)
time in the worst case. Across alln
words, this again totals toO(L)
. - The grouping by
(length, last_character)
and final abbreviation construction are bothO(1)
per word orO(n)
total, which is dominated byO(L)
sinceL β₯ n
.
Therefore, the overall time complexity is O(L) + O(L) = O(L)
.
Space Complexity: O(L)
The space is primarily consumed by the trie structures:
- Each character in each word creates at most one trie node. In the worst case where all words have unique prefixes within their groups, we store
O(L)
characters across all tries. - Each trie node contains an array of 26 pointers and a counter, which is
O(1)
space per node. - The
tries
dictionary stores at mostO(n)
entries (one per unique(length, last_character)
pair), which is bounded byO(L)
. - The output array
ans
storesn
abbreviated strings, with total length at mostO(L)
.
Therefore, the overall space complexity is O(L)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Abbreviation Length Calculation
One of the most common mistakes is incorrectly calculating when an abbreviation is actually shorter than the original word. The pitfall occurs when checking if the abbreviation saves space.
Incorrect Implementation:
# Wrong: Only considering prefix length + 1
if cnt + 1 >= len(w):
ans.append(w)
Why it's wrong: The abbreviation format is prefix + number + last_char
. Even if the number is a single digit, we need at least 2 additional characters beyond the prefix. For example:
- Word:
"word"
(length 4) - If prefix length is 2:
"wo"
, the abbreviation would be"wo1d"
(length 4) - This doesn't save any space!
Correct Implementation:
# Correct: Considering prefix + at least 1 digit + last character
if cnt + 2 >= len(w):
ans.append(w)
2. Not Handling Multi-Digit Numbers in Abbreviations
Another subtle pitfall is assuming the middle count will always be a single digit. While the check cnt + 2 >= len(w)
works for determining if abbreviation is worthwhile, you might encounter issues if trying to optimize further.
Example scenario:
- Word:
"internationalization"
(length 20) - Prefix needed:
"i"
(length 1) - Middle count: 18
- Abbreviation:
"i18n"
(length 4)
The actual abbreviation length is 1 + 2 + 1 = 4
(prefix + two digits + last char), not 1 + 1 + 1 = 3
.
Better check for exact abbreviation length:
def get_abbr_length(prefix_len, word_len):
middle_count = word_len - prefix_len - 1
digit_count = len(str(middle_count))
return prefix_len + digit_count + 1
# Use this for more precise checking
if get_abbr_length(cnt, len(w)) >= len(w):
ans.append(w)
3. Forgetting to Group by Both Length AND Last Character
A critical mistake is only grouping by one criterion instead of both.
Incorrect grouping:
# Wrong: Only grouping by length
tries = {}
for w in words:
m = len(w)
if m not in tries:
tries[m] = Trie()
tries[m].insert(w)
Why it fails: Words like "abcd"
and "abce"
both have length 4 but abbreviate to "a2d"
and "a2e"
respectively. They never conflict, so grouping them together unnecessarily increases the prefix length needed.
Correct grouping:
# Correct: Group by (length, last_character)
tries = {}
for w in words:
key = (len(w), w[-1])
if key not in tries:
tries[key] = Trie()
tries[key].insert(w)
4. Off-by-One Error in Middle Count Calculation
When calculating the number of characters between the prefix and last character, it's easy to make an off-by-one error.
Incorrect:
# Wrong: Forgetting to subtract both prefix and last char
middle_count = len(w) - cnt
# or
# Wrong: Subtracting incorrectly
middle_count = len(w) - cnt - 2
Correct:
# Correct: Total length - prefix length - 1 (for last char)
middle_count = len(w) - cnt - 1
Example verification:
- Word:
"apple"
(length 5) - Prefix:
"ap"
(cnt = 2) - Middle characters:
"pl"
(2 characters) - Calculation:
5 - 2 - 1 = 2
β - Abbreviation:
"ap2e"
Which data structure is used to implement recursion?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Want a Structured Path to Master System Design Too? Donβt Miss This!