811. Subdomain Visit Count
Problem Description
You are given a list of count-paired domains, where each entry represents how many times a particular domain was visited. When a domain is visited, all of its parent domains are also implicitly visited with the same count.
A count-paired domain is formatted as "count domain"
, where:
count
is an integer representing the number of visitsdomain
is a string like"discuss.leetcode.com"
or"leetcode.com"
For example, if "9001 discuss.leetcode.com"
is given:
"discuss.leetcode.com"
was visited 9001 times"leetcode.com"
was also visited 9001 times (parent domain)"com"
was also visited 9001 times (top-level domain)
Given an array cpdomains
containing count-paired domains, you need to return an array that shows the total visit count for each domain and subdomain across all inputs. The output format should also be count-paired domains in the format "count domain"
.
Example breakdown:
- Input:
["9001 discuss.leetcode.com"]
- Output should include:
"9001 discuss.leetcode.com"
"9001 leetcode.com"
"9001 com"
If multiple entries contribute to the same domain, their counts should be summed. The order of the output doesn't matter.
Intuition
The key insight is that we need to extract all possible subdomains from each given domain and accumulate their visit counts. When we see a domain like "discuss.leetcode.com"
, we need to identify all its parent domains by looking at the dot separators.
Think of a domain as a hierarchy: the rightmost part ("com"
) is the top level, and as we add more parts to the left, we get more specific subdomains. We can find all parent domains by locating each dot .
in the string and taking the substring from that position onwards.
For parsing, we notice that each count-paired domain has a space separating the count from the domain. After extracting the count, we need to find all valid subdomains. The trick is to iterate through the string and whenever we encounter a space
or a dot .
, we know that what comes after it is a valid domain or subdomain.
For example, with "9001 discuss.leetcode.com"
:
- After the space:
"discuss.leetcode.com"
- After the first dot:
"leetcode.com"
- After the second dot:
"com"
We use a hash map (Counter in Python) to accumulate counts because the same domain might appear in multiple input entries. This allows us to efficiently sum up visits for each unique domain.
The solution cleverly combines the initial space and the dots into a single check if c in ' .'
to identify all positions where a valid domain starts (at index i + 1
). This unified approach elegantly handles both the full domain after the space and all subdomains after each dot.
Solution Approach
The solution uses a hash map (Counter) to accumulate visit counts for all domains and subdomains. Here's the step-by-step implementation:
-
Initialize a Counter: Create a
Counter
objectcnt
to store the accumulated visit counts for each domain. -
Process each count-paired domain:
- For each string
s
incpdomains
, extract the visit count by finding the substring before the first space:v = int(s[:s.index(' ')])
- For each string
-
Extract all subdomains:
- Iterate through each character in the string
s
with its index - Check if the current character is either a space
' '
or a dot'.'
- When we find a space or dot at position
i
, the substring starting fromi + 1
to the end represents a valid domain or subdomain - Add the visit count
v
to this domain in our counter:cnt[s[i + 1:]] += v
- Iterate through each character in the string
-
Format the output:
- Convert the counter entries back to the required format
- For each domain
s
with countv
in the counter, create a stringf'{v} {s}'
- Return a list of all these formatted strings
Example walkthrough with "9001 discuss.leetcode.com"
:
- Extract count:
v = 9001
- At index 4 (space):
s[5:]
="discuss.leetcode.com"
, add 9001 visits - At index 12 (first dot):
s[13:]
="leetcode.com"
, add 9001 visits - At index 21 (second dot):
s[22:]
="com"
, add 9001 visits
The algorithm efficiently processes each input string once with O(n) time complexity per string, where n is the length of the string. The space complexity is O(m) where m is the total number of unique domains across all inputs.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example with input: ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
Step 1: Initialize Counter
- Create an empty counter:
cnt = {}
Step 2: Process "900 google.mail.com"
- Extract count:
v = 900
(substring before space) - Iterate through string:
- Index 3: character is space
' '
→ extract"google.mail.com"
→cnt["google.mail.com"] = 900
- Index 10: character is dot
'.'
→ extract"mail.com"
→cnt["mail.com"] = 900
- Index 15: character is dot
'.'
→ extract"com"
→cnt["com"] = 900
- Index 3: character is space
- Counter state:
{"google.mail.com": 900, "mail.com": 900, "com": 900}
Step 3: Process "50 yahoo.com"
- Extract count:
v = 50
- Iterate through string:
- Index 2: character is space
' '
→ extract"yahoo.com"
→cnt["yahoo.com"] = 50
- Index 8: character is dot
'.'
→ extract"com"
→cnt["com"] = 900 + 50 = 950
- Index 2: character is space
- Counter state:
{"google.mail.com": 900, "mail.com": 900, "com": 950, "yahoo.com": 50}
Step 4: Process "1 intel.mail.com"
- Extract count:
v = 1
- Iterate through string:
- Index 1: character is space
' '
→ extract"intel.mail.com"
→cnt["intel.mail.com"] = 1
- Index 7: character is dot
'.'
→ extract"mail.com"
→cnt["mail.com"] = 900 + 1 = 901
- Index 12: character is dot
'.'
→ extract"com"
→cnt["com"] = 950 + 1 = 951
- Index 1: character is space
- Counter state:
{"google.mail.com": 900, "mail.com": 901, "com": 951, "yahoo.com": 50, "intel.mail.com": 1}
Step 5: Process "5 wiki.org"
- Extract count:
v = 5
- Iterate through string:
- Index 1: character is space
' '
→ extract"wiki.org"
→cnt["wiki.org"] = 5
- Index 6: character is dot
'.'
→ extract"org"
→cnt["org"] = 5
- Index 1: character is space
- Final counter:
{"google.mail.com": 900, "mail.com": 901, "com": 951, "yahoo.com": 50, "intel.mail.com": 1, "wiki.org": 5, "org": 5}
Step 6: Format output Convert each counter entry to "count domain" format:
- Output:
["900 google.mail.com", "901 mail.com", "951 com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org", "5 org"]
The key insight is how the algorithm accumulates counts when the same subdomain appears in multiple inputs (like "mail.com" appearing in both "google.mail.com" and "intel.mail.com").
Solution Implementation
1from collections import Counter
2from typing import List
3
4class Solution:
5 def subdomainVisits(self, cpdomains: List[str]) -> List[str]:
6 # Counter to store the visit count for each domain/subdomain
7 domain_counter = Counter()
8
9 # Process each count-paired domain
10 for count_paired_domain in cpdomains:
11 # Extract the visit count from the beginning of the string
12 space_index = count_paired_domain.index(' ')
13 visit_count = int(count_paired_domain[:space_index])
14
15 # Iterate through each character to find domain separators
16 for index, char in enumerate(count_paired_domain):
17 # When we find a space or dot, the substring after it is a valid domain
18 if char in ' .':
19 # Extract the domain starting from the character after the separator
20 domain = count_paired_domain[index + 1:]
21 # Add the visit count to this domain
22 domain_counter[domain] += visit_count
23
24 # Format the results as "count domain" strings
25 result = []
26 for domain, count in domain_counter.items():
27 result.append(f'{count} {domain}')
28
29 return result
30
1class Solution {
2 public List<String> subdomainVisits(String[] cpdomains) {
3 // Map to store subdomain visit counts
4 Map<String, Integer> visitCounts = new HashMap<>();
5
6 // Process each count-paired domain
7 for (String countPairedDomain : cpdomains) {
8 // Find the space separator between count and domain
9 int spaceIndex = countPairedDomain.indexOf(" ");
10
11 // Extract the visit count from the string
12 int visitCount = Integer.parseInt(countPairedDomain.substring(0, spaceIndex));
13
14 // Iterate through the string to find all subdomains
15 for (int i = spaceIndex; i < countPairedDomain.length(); i++) {
16 // Check if current character is a space or dot (subdomain separator)
17 if (countPairedDomain.charAt(i) == ' ' || countPairedDomain.charAt(i) == '.') {
18 // Extract subdomain starting from the character after space/dot
19 String subdomain = countPairedDomain.substring(i + 1);
20
21 // Update the visit count for this subdomain
22 visitCounts.put(subdomain, visitCounts.getOrDefault(subdomain, 0) + visitCount);
23 }
24 }
25 }
26
27 // Build the result list with formatted count-paired domains
28 List<String> result = new ArrayList<>();
29 for (Map.Entry<String, Integer> entry : visitCounts.entrySet()) {
30 // Format: "count domain"
31 result.add(entry.getValue() + " " + entry.getKey());
32 }
33
34 return result;
35 }
36}
37
1class Solution {
2public:
3 vector<string> subdomainVisits(vector<string>& cpdomains) {
4 // Hash map to store domain visit counts
5 unordered_map<string, int> domainCount;
6
7 // Process each count-paired domain
8 for (const auto& cpdomain : cpdomains) {
9 // Find the space separator between count and domain
10 int spacePos = cpdomain.find(' ');
11
12 // Extract the visit count from the string
13 int visitCount = stoi(cpdomain.substr(0, spacePos));
14
15 // Iterate through the domain string to find all subdomains
16 for (int i = spacePos; i < cpdomain.size(); ++i) {
17 // When we encounter a space or dot, the substring after it is a valid domain
18 if (cpdomain[i] == ' ' || cpdomain[i] == '.') {
19 // Extract subdomain starting from position i+1 and add visit count
20 string subdomain = cpdomain.substr(i + 1);
21 domainCount[subdomain] += visitCount;
22 }
23 }
24 }
25
26 // Build the result vector with formatted strings
27 vector<string> result;
28 for (const auto& [domain, count] : domainCount) {
29 // Format: "count domain"
30 result.push_back(to_string(count) + " " + domain);
31 }
32
33 return result;
34 }
35};
36
1function subdomainVisits(cpdomains: string[]): string[] {
2 // Hash map to store domain visit counts
3 const domainCount: Map<string, number> = new Map();
4
5 // Process each count-paired domain
6 for (const cpdomain of cpdomains) {
7 // Find the space separator between count and domain
8 const spacePos = cpdomain.indexOf(' ');
9
10 // Extract the visit count from the string
11 const visitCount = parseInt(cpdomain.substring(0, spacePos));
12
13 // Iterate through the domain string to find all subdomains
14 for (let i = spacePos; i < cpdomain.length; i++) {
15 // When we encounter a space or dot, the substring after it is a valid domain
16 if (cpdomain[i] === ' ' || cpdomain[i] === '.') {
17 // Extract subdomain starting from position i+1 and add visit count
18 const subdomain = cpdomain.substring(i + 1);
19 domainCount.set(subdomain, (domainCount.get(subdomain) || 0) + visitCount);
20 }
21 }
22 }
23
24 // Build the result array with formatted strings
25 const result: string[] = [];
26 for (const [domain, count] of domainCount) {
27 // Format: "count domain"
28 result.push(`${count} ${domain}`);
29 }
30
31 return result;
32}
33
Time and Space Complexity
Time Complexity: O(N * M)
where N
is the number of domains in the input list and M
is the average length of each domain string.
- The outer loop iterates through all
N
domain strings incpdomains
- For each domain string, we extract the count value using
s.index(' ')
which takesO(M)
time - The inner loop iterates through each character in the domain string, which is
O(M)
operations - For each space or dot character found, we:
- Create a substring
s[i + 1:]
which takesO(M)
time in the worst case - Update the counter with this substring, which is
O(1)
average case for hash table operations
- Create a substring
- The maximum number of subdomains for any domain is bounded by the length of the string (at most
M
characters means at mostM/2
dots plus the space) - Finally, creating the result list requires iterating through all unique subdomains in the counter, which is at most
O(N * M)
entries - Overall:
O(N * M)
for processing all domains
Space Complexity: O(N * M)
where N
is the number of domains and M
is the average length of each domain string.
- The
Counter
objectcnt
stores all unique subdomains and their counts - In the worst case, each domain contributes multiple unique subdomains (e.g., "discuss.leetcode.com" contributes "discuss.leetcode.com", "leetcode.com", and "com")
- The total number of unique subdomains is bounded by
O(N * M)
since each domain of lengthM
can contribute at mostO(M)
subdomains - The output list also requires
O(N * M)
space to store all the formatted strings - Overall space complexity:
O(N * M)
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Forgetting to Process the Full Domain
A common mistake is only processing subdomains that appear after dots (.
), forgetting that the initial full domain (appearing after the space) also needs to be counted.
Incorrect approach:
# This would miss counting "discuss.leetcode.com"
for i, c in enumerate(s):
if c == '.': # Only looking for dots
cnt[s[i + 1:]] += v
Solution: Include both space and dot as separators in the condition:
for i, c in enumerate(s):
if c in ' .': # Check for both space and dot
cnt[s[i + 1:]] += v
2. String Parsing Errors
Using split()
incorrectly or making assumptions about string format can lead to errors.
Incorrect approach:
# Assumes there's always exactly one space
parts = s.split(' ')
v = int(parts[0])
domain = parts[1] # Could fail if format is unexpected
Solution: Use index()
to find the first space position for more robust parsing:
space_idx = s.index(' ')
v = int(s[:space_idx])
# Then process from this known position
3. Not Accumulating Counts
Overwriting counts instead of accumulating them when the same domain appears multiple times.
Incorrect approach:
cnt[domain] = v # This overwrites instead of accumulating
Solution: Always use addition to accumulate counts:
cnt[domain] += v # Correctly accumulates visits
4. Off-by-One Errors in Substring Extraction
Getting the wrong substring by not adding 1 to skip the separator character.
Incorrect approach:
if c in ' .': cnt[s[i:]] += v # Includes the separator in the domain
Solution: Always use i + 1
to skip the separator:
if c in ' .': cnt[s[i + 1:]] += v # Correctly skips the separator
Which type of traversal does breadth first search do?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!