Leetcode 527. Word Abbreviation

Problem Explanation

The problem gives you a list of distinct non-empty words, and your task is to generate the minimal possible abbreviation for each word by following certain rules:

  1. Start with the first character, followed by the number of characters that have been abbreviated, and then the last character. For example, if we have "like", we can abbreviate it as "l2e", where '2' represents the two characters 'i' and 'k' which have been abbreviated.

  2. If there is a conflict, we need to use a longer prefix to make the abbreviation unique. For example, if we have "internal" and "interval", we can't both abbreviate them as "i6l", we need to use "inte6l" for "internal" and "inter5l" for "interval".

  3. If the abbreviation doesn't make the word shorter, then keep it as original. For example, if we have "god", we keep it as "god" as the abbreviation "g1d" is not shorter.

The goal is to return the list of new abbreviations in the same order as the original words.

This problem is like creating a dictionary for the words where each word maps to a unique abbreviation. There exists some conditions which force to keep some more characters in abbreviated version to avoid the conflict of having same abbreviations for multiple words.

Helpful Illustration

Let's see an example to understand the problem better:

If we are given ["like", "god", "internal", "me", "internet", "interval", "intension", "face", "intrusion"]

"like" will be abbreviated as "l2e"

"god" will be kept as "god" because its abbreviation "g1d" is not shorter.

"internal" will be abbreviated as "i6l"

However for "internet", if we abbreviate it as "i6t", it will be same as the already present abbreviation of "internal". So, we have to update our abbreviation for "internal" by taking one more alphabet to "inte6l".

Simillarly for "interval" which will be keept as "interval" because both "i6l" and "i7l" are already taken.

The answer will hence be ["l2e", "god", "i6l", "me", "inte6l", "interval", "i6n", "f2e", "i6n"]

Algorithm and Approach because of collision:

The implemented solution uses an iterative approach. For each word, an initial abbreviation is generated. If this abbreviation is unique, then it is added to the abbreviation list. If it's not unique, then the conflict words generate new abbreviations by taking one more character and the process continues until all words have a unique abbreviation.

Solutions

Firstly, let's define a helper function getAbbrev, which returns the abbreviation of given word and prefix length.

Secondly, in the main function, initialize a prefix array where its i-th item denotes the length of prefix kept in corresponding abbreviation.

Finally, for each word, if it has other words sharing the same abbreviation then extend their prefix length by 1 and repeat above steps until all words map to a unique abbreviation.

Python Solution

1
2python
3class Solution:
4    def wordsAbbreviation(self, words):
5        def short(word, i):
6            if i >= len(word) - 2:
7                return word
8            return word[:i+1] + str(len(word) - 1 - i) + word[-1]
9
10        n = len(words)
11        answer, prefix = [""] * n, [1] * n
12        for i in range(n):
13            answer[i] = short(words[i], prefix[i])
14
15        while True:
16            duplicates = collections.defaultdict(list)
17            for i in range(n):
18                duplicates[answer[i]].append(i)
19            if not any(len(indices) > 1 for indices in duplicates.values()):
20                break
21            for indices in duplicates.values():
22                if len(indices) <= 1: continue
23                for i in indices:
24                    prefix[i] += 1
25                    answer[i] = short(words[i], prefix[i])
26
27        return answer

Java Solution

1
2java
3class Solution {
4    private String getAbbr(String s, int p) {
5        if (p >= s.length() - 2) return s;
6        String abbr = s.substring(0, p + 1) + (s.length() - 1 - p) + s.charAt(s.length() - 1);
7        return abbr;
8    }
9
10    public List<String> wordsAbbreviation(List<String> words) {
11        int n = words.size();
12        int[] prefix = new int[n];
13        String[] ans = new String[n];
14        for (int i = 0; i < n; i++) {
15            prefix[i] = 1;
16            ans[i] = getAbbr(words.get(i), prefix[i]);
17        }
18
19        for (int i = 0; i < n; i++) {
20            while (true) {
21                Set<Integer> set = new HashSet<>();
22                for (int j = i + 1; j < n; j++) {
23                    if (ans[j].equals(ans[i])) {
24                        set.add(j);
25                    }
26                }
27                if (set.isEmpty()) break;
28                set.add(i);
29                for (Integer index : set) {
30                    ans[index] = getAbbr(words.get(index), ++prefix[index]);
31                }
32            }
33        }
34
35        return Arrays.asList(ans);
36    }
37}

C++ Solution

1
2cpp
3class Solution {
4public:
5    vector<string> wordsAbbreviation(vector<string>& words) {
6        int n = words.size();
7        vector<int> prefix(n, 1);
8        vector<string> ans(n);
9        for (int i = 0; i < n; i++)
10            ans[i] = abbrev(words[i], prefix[i]);
11        for (int i = 0; i < n; i++) {
12            while (true) {
13                set<int> s;
14                for (int j = i+1; j < n; j++) {
15                    if (ans[j] != ans[i]) continue;
16                    s.insert(j);
17                }
18                if (s.empty()) break;
19                s.insert(i);
20                for (int j : s)
21                    ans[j] = abbrev(words[j], ++prefix[j]);
22            }
23        }
24        return ans;
25    }
26
27private:
28    string abbrev(string s, int k) {
29        return s.length() <= k+2 ? s : s.substr(0,k) + to_string(s.length()-k-1) + s[s.length()-1];
30    }
31};

C# Solution

1
2csharp
3public class Solution {
4    public string[] WordsAbbreviation(string[] words) {
5        var abbrevs = words.Select((w, i) => new { Word = w, Abbrev = GetAbbrev(w, 1), Index = i }).ToList();
6
7        foreach (var g in abbrevs.GroupBy(a => a.Abbrev).Where(g => g.Count() > 1))
8            for (var prefixLength = 2; ; prefixLength++)
9                foreach (var subG in g.GroupBy(a => GetAbbrev(a.Word, prefixLength))) {
10                    if (subG.Count() == 1)
11                        abbrevs[subG.Single().Index] = subG.Single();
12                    if (subG.All(a => abbrevs[a.Index].Abbrev != g.Key)) break;
13                }
14
15        return abbrevs.OrderBy(a => a.Index).Select(a => a.Abbrev).ToArray();
16    }
17
18    private string GetAbbrev(string s, int i) {
19        return s.Length <= i + 2 ? s : s.Substring(0, i) + (s.Length - i - 1) + s[s.Length - 1];
20    }
21}

JavaScript Solution

1
2javascript
3var getAbbr = function(word, prefix) {
4    if (prefix >= word.length - 2) return word;
5    return word.substring(0, prefix + 1) + (word.length - 1 - prefix) + word.charAt(word.length - 1);
6};
7
8var wordsAbbreviation = function(words) {
9    let n = words.length;
10    let prefix = Array(n).fill(1);
11    let ans = Array(n);
12    for (let i = 0; i < n; i++) {
13        ans[i] = getAbbr(words[i], prefix[i]);
14    }
15
16    for (let i = 0; i < n; i++) {
17        while (true) {
18            let set = new Set();
19            for (let j = i + 1; j < n; j++) {
20                if (ans[j] !== ans[i]) continue;
21                set.add(j);
22            }
23            if (set.size === 0) break;
24            set.add(i);
25            for (const index of set) {
26                ans[index] = getAbbr(words[index], ++prefix[index]);
27            }
28        }
29    }
30
31    return ans;
32};

Conclusion

The solutions demonstrated above are using a greedy approach for generating the abbreviations for the given words. We generate conflicts and resolve them using more characters from words. The overall approach needs a deeper understanding of every stage and iterative optimizations for reaching the optimal solution. Validate and handle this using all the conditions to solve the problem within optimal time complexity. This problem's complexity lies in the conflict resolution, increasing the prefix length, and iteratively checking for conflicts until none exist. It involves understanding the problem thoroughly and designing the algorithm accordingly. Care must be taken while handling edge cases and ensuring the abbreviations are shorter than the original words; else the words are used as they are.

Therefore, understanding the above algorithms could help problem solvers develop a thorough understanding of concepts involved in string manipulation, arrays, control statements, and iterative improvements in research problems using programming languages.

The respective solutions in Python, Java, JavaScript, C#, and C++ are efficient and utilize each language's unique capabilities to solve the problem effectively. By comparing these solutions, one can observe the similarities and differences in syntax and methods available in different programming languages.

In summary, such problems could be an excellent exercise for software engineers to improve both their problem-solving skills and advance their knowledge across multiple programming languages.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫