1487. Making File Names Unique
Problem Description
You need to create n
folders in a file system, where you're given an array of strings names
containing the desired folder names. The folders are created one by one in order.
The key constraint is that no two folders can have the same name. When you try to create a folder with a name that already exists, the system automatically appends a suffix in the format (k)
to make it unique, where k
is the smallest positive integer that results in a unique name.
For example:
- If you try to create a folder named
"doc"
but it already exists, the system will name it"doc(1)"
- If
"doc"
and"doc(1)"
both exist, the next"doc"
would become"doc(2)"
- The suffix always uses the smallest available number that makes the name unique
Your task is to return an array showing the actual names assigned by the system to each folder in the order they were created.
The solution uses a hash table to track:
- Which folder names have been used
- For repeated names, what's the next available suffix number to try
The algorithm works by:
- Maintaining a dictionary
d
whered[name]
stores the next suffix number to try for that base name - For each folder name in the input:
- If the name already exists, find the smallest
k
such thatname(k)
is available - Update the tracking information for future occurrences
- Add the final name (with or without suffix) to the result
- If the name already exists, find the smallest
Intuition
The core challenge is efficiently handling name collisions. When we encounter a duplicate name, we need to find the smallest available suffix number quickly.
The naive approach would be to check name(1)
, name(2)
, name(3)
, and so on until we find an unused name. However, this could be inefficient if we have many duplicates of the same base name.
The key insight is that we need to track two pieces of information:
- Which exact folder names have been used (including those with suffixes)
- For each base name, what's the next suffix number we should try
Why do we need both? Consider this scenario: we create "doc"
, then "doc(1)"
, then encounter "doc"
again. Without tracking that "doc(1)"
is already used, we might try to create it again. And without tracking the next available suffix for "doc"
, we'd have to start checking from (1)
every time.
By maintaining d[name] = k
, we're essentially keeping a "bookmark" that says: "The next time you see this base name, start checking from suffix k
." This optimization prevents us from rechecking smaller numbers that we know are already taken.
When we successfully use a suffix k
for a name, we update our bookmark to k + 1
because we know all suffixes up to k
are now taken. This way, future occurrences of the same base name can start their search from a more efficient position.
The hash table serves a dual purpose: it tells us whether a name exists (for collision detection) and helps us track the next available suffix (for efficiency). This combination allows us to handle the folder naming process in an optimal manner.
Solution Approach
We use a hash table d
to track folder names and their next available suffix indices. The hash table serves dual purposes: checking if a name exists and storing the minimum available suffix for each base name.
Algorithm Steps:
-
Initialize an empty hash table
d = defaultdict(int)
to store folder names and their suffix tracking. -
Iterate through each name in the input array
names
:a. Check if the name already exists in
d
:- If
name in d
, it means this folder name was already used - Retrieve the starting suffix number:
k = d[name]
- Keep incrementing
k
whilef'{name}({k})'
exists ind
- Once we find an available suffix, update
d[name] = k + 1
for future occurrences - Modify the current name to include the suffix:
names[i] = f'{name}({k})'
b. Add the final name to the hash table:
- Set
d[names[i]] = 1
to mark this exact name (with or without suffix) as used - The value
1
here indicates that if this exact string appears again, we should start checking from suffix(1)
- If
-
Return the modified array
names
which now contains all the actual folder names assigned by the system.
Key Implementation Details:
- We modify the input array in-place rather than creating a new array
- The
defaultdict(int)
returns0
for missing keys, but we use it to store the next suffix to try - When a base name like
"doc"
gets a suffix to become"doc(2)"
, we store both:d["doc"] = 3
(next suffix to try for base name "doc")d["doc(2)"] = 1
(marking "doc(2)" as used)
This approach ensures O(1) average-case lookup for checking name existence and maintains optimal suffix tracking for repeated base names.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the algorithm with the input: names = ["doc", "doc", "image", "doc(1)", "doc"]
Initial state: d = {}
(empty hash table), result = []
Step 1: Process "doc"
- Check if "doc" exists in
d
: No, it doesn't - Add to hash table:
d["doc"] = 1
- Result:
["doc"]
- Hash table:
{"doc": 1}
Step 2: Process second "doc"
- Check if "doc" exists in
d
: Yes, it does! - Get starting suffix:
k = d["doc"] = 1
- Check if "doc(1)" exists: No, it's available
- Update:
d["doc"] = 2
(next time start from 2) - Modify name to "doc(1)"
- Add to hash table:
d["doc(1)"] = 1
- Result:
["doc", "doc(1)"]
- Hash table:
{"doc": 2, "doc(1)": 1}
Step 3: Process "image"
- Check if "image" exists in
d
: No, it doesn't - Add to hash table:
d["image"] = 1
- Result:
["doc", "doc(1)", "image"]
- Hash table:
{"doc": 2, "doc(1)": 1, "image": 1}
Step 4: Process "doc(1)"
- Check if "doc(1)" exists in
d
: Yes, it does! - Get starting suffix:
k = d["doc(1)"] = 1
- Check if "doc(1)(1)" exists: No, it's available
- Update:
d["doc(1)"] = 2
- Modify name to "doc(1)(1)"
- Add to hash table:
d["doc(1)(1)"] = 1
- Result:
["doc", "doc(1)", "image", "doc(1)(1)"]
- Hash table:
{"doc": 2, "doc(1)": 2, "image": 1, "doc(1)(1)": 1}
Step 5: Process third "doc"
- Check if "doc" exists in
d
: Yes, it does! - Get starting suffix:
k = d["doc"] = 2
- Check if "doc(2)" exists: No, it's available
- Update:
d["doc"] = 3
- Modify name to "doc(2)"
- Add to hash table:
d["doc(2)"] = 1
- Final Result:
["doc", "doc(1)", "image", "doc(1)(1)", "doc(2)"]
The algorithm efficiently handles collisions by tracking the next available suffix for each base name, avoiding redundant checks of already-used suffixes.
Solution Implementation
1from typing import List
2from collections import defaultdict
3
4class Solution:
5 def getFolderNames(self, names: List[str]) -> List[str]:
6 # Dictionary to track the next available suffix number for each base name
7 name_counter = defaultdict(int)
8
9 for i, name in enumerate(names):
10 # Check if the current name already exists
11 if name in name_counter:
12 # Get the next suffix number to try
13 suffix_num = name_counter[name]
14
15 # Find the smallest available suffix by incrementing
16 while f'{name}({suffix_num})' in name_counter:
17 suffix_num += 1
18
19 # Update the next available suffix for this base name
20 name_counter[name] = suffix_num + 1
21
22 # Modify the current name with the suffix
23 names[i] = f'{name}({suffix_num})'
24
25 # Mark the final name (original or modified) as used
26 name_counter[names[i]] = 1
27
28 return names
29
1class Solution {
2 public String[] getFolderNames(String[] names) {
3 // Map to store folder names and their next available suffix number
4 Map<String, Integer> folderNameToNextSuffix = new HashMap<>();
5
6 // Process each name in the input array
7 for (int i = 0; i < names.length; i++) {
8 String currentName = names[i];
9
10 // Check if the current name already exists in the map
11 if (folderNameToNextSuffix.containsKey(currentName)) {
12 // Get the next suffix number to try
13 int suffixNumber = folderNameToNextSuffix.get(currentName);
14
15 // Find the smallest available suffix by incrementing until we find an unused name
16 while (folderNameToNextSuffix.containsKey(currentName + "(" + suffixNumber + ")")) {
17 suffixNumber++;
18 }
19
20 // Update the next available suffix for the original name
21 folderNameToNextSuffix.put(currentName, suffixNumber);
22
23 // Modify the current name with the suffix
24 names[i] = currentName + "(" + suffixNumber + ")";
25 }
26
27 // Mark the final name as used with initial suffix value of 1
28 folderNameToNextSuffix.put(names[i], 1);
29 }
30
31 return names;
32 }
33}
34
1class Solution {
2public:
3 vector<string> getFolderNames(vector<string>& names) {
4 // Map to track used folder names and their next available suffix number
5 unordered_map<string, int> nameToNextSuffix;
6
7 for (auto& name : names) {
8 // Get the next available suffix number for this base name
9 int suffixNumber = nameToNextSuffix[name];
10
11 // If the name already exists (suffix > 0), find the next available variant
12 if (suffixNumber > 0) {
13 // Keep incrementing suffix until we find an unused name combination
14 while (nameToNextSuffix[name + "(" + to_string(suffixNumber) + ")"]) {
15 suffixNumber++;
16 }
17
18 // Update the next available suffix for this base name
19 nameToNextSuffix[name] = suffixNumber;
20
21 // Append the suffix to create the unique folder name
22 name += "(" + to_string(suffixNumber) + ")";
23 }
24
25 // Mark this final name as used (with value 1 indicating it exists)
26 nameToNextSuffix[name] = 1;
27 }
28
29 return names;
30 }
31};
32
1/**
2 * Processes an array of folder names to ensure uniqueness by appending suffixes
3 * @param names - Array of folder names that may contain duplicates
4 * @returns Array of unique folder names with suffixes added where necessary
5 */
6function getFolderNames(names: string[]): string[] {
7 // Map to track the next available suffix number for each base name
8 const nameCountMap: Map<string, number> = new Map();
9
10 // Process each name in the array
11 for (let i = 0; i < names.length; i++) {
12 // Check if this name has been encountered before
13 if (nameCountMap.has(names[i])) {
14 // Get the next suffix number to try (default to 1 if undefined)
15 let suffixNumber: number = nameCountMap.get(names[i]) || 1;
16
17 // Find the smallest available suffix by incrementing until we find an unused name
18 while (nameCountMap.has(`${names[i]}(${suffixNumber})`)) {
19 suffixNumber++;
20 }
21
22 // Update the next available suffix for this base name
23 nameCountMap.set(names[i], suffixNumber + 1);
24
25 // Append the suffix to make the name unique
26 names[i] = `${names[i]}(${suffixNumber})`;
27 } else {
28 // First occurrence of this name, set initial suffix counter to 1
29 nameCountMap.set(names[i], 1);
30 }
31
32 // Mark the final name (with or without suffix) as used
33 nameCountMap.set(names[i], 1);
34 }
35
36 return names;
37}
38
Time and Space Complexity
The time complexity is O(L)
where L
is the sum of the lengths of all file names in the names
array.
Breaking down the analysis:
- The outer loop iterates through each name once:
O(n)
iterations wheren
is the number of names - For each name, we perform string operations:
- Checking if a name exists in the dictionary:
O(len(name))
- Creating new strings with suffixes
f'{name}({k})'
:O(len(name))
- The while loop seems problematic at first glance, but it's amortized
O(1)
per name because:- Each suffix
(k)
is only generated once across all iterations - The counter
d[name]
tracks the next available suffix, avoiding re-checking previously used suffixes - Each unique string is added to the dictionary exactly once
- Each suffix
- Checking if a name exists in the dictionary:
- Dictionary operations (insertion and lookup) with string keys:
O(len(key))
Since each character in the input is processed a constant number of times, the total time complexity is O(L)
.
The space complexity is O(L)
:
- The dictionary
d
stores all unique folder names (original and modified) - In the worst case, all names are unique, so we store each name once
- The modified
names
array reuses the existing space (in-place modification) - The total space used is proportional to the total length of all stored strings:
O(L)
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Not Tracking Both Base Name and Suffixed Names
The Problem:
A critical mistake is only tracking the base name without also marking the suffixed versions as occupied. For example, if the input is ["doc", "doc(1)", "doc"]
, failing to track "doc(1)"
as used would cause the third "doc"
to incorrectly become "doc(1)"
again, creating a duplicate.
Wrong Approach:
# INCORRECT - Only tracks base names
name_counter = {}
for i, name in enumerate(names):
if name in name_counter:
suffix = name_counter[name]
names[i] = f'{name}({suffix})'
name_counter[name] = suffix + 1
else:
name_counter[name] = 1
Correct Solution: The code must track BOTH the base name's next suffix AND mark each created name (including suffixed ones) as used:
name_counter[name] = suffix_num + 1 # Update next suffix for base name name_counter[names[i]] = 1 # Mark the actual created name as used
Pitfall 2: Inefficient Suffix Search Without Tracking
The Problem:
Starting the suffix search from 1 every time a duplicate is encountered leads to O(n²) complexity in worst cases. For input like ["a", "a", "a", ..., "a"]
with n occurrences, you'd check suffixes 1, then 1-2, then 1-2-3, etc.
Wrong Approach:
# INEFFICIENT - Always starts from suffix 1
if name in name_counter:
suffix = 1
while f'{name}({suffix})' in name_counter:
suffix += 1
names[i] = f'{name}({suffix})'
Correct Solution: Store and use the next available suffix for each base name to avoid redundant checks:
suffix_num = name_counter[name] # Start from the stored next suffix
while f'{name}({suffix_num})' in name_counter:
suffix_num += 1
name_counter[name] = suffix_num + 1 # Update for next time
Pitfall 3: Confusion About Dictionary Values
The Problem: The dictionary values serve different purposes depending on the key type:
- For base names: stores the next suffix number to try
- For suffixed names: just marks them as used (value is typically 1)
This dual purpose can cause confusion and lead to incorrect implementations.
Example to Illustrate:
After processing ["doc", "doc", "doc(2)"]
:
name_counter["doc"] = 3
(next suffix to try is 3)name_counter["doc(1)"] = 1
(just marking as used)name_counter["doc(2)"] = 1
(just marking as used)
Understanding this distinction is crucial for correct implementation.
Which of the tree traversal order can be used to obtain elements in a binary search tree in sorted order?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!