1152. Analyze User Website Visit Pattern π
Problem Description
You are given three arrays that record website visit data:
username
: array of usernameswebsite
: array of website namestimestamp
: array of visit times
These arrays have the same length, and [username[i], website[i], timestamp[i]]
means user username[i]
visited website website[i]
at time timestamp[i]
.
A pattern is defined as a sequence of exactly three websites (can be the same website repeated). For example:
["home", "away", "love"]
- three different websites["leetcode", "love", "leetcode"]
- same website can appear multiple times["luffy", "luffy", "luffy"]
- all three can be the same website
The score of a pattern is the number of unique users who visited those three websites in that exact order (based on timestamps). The visits don't need to be consecutive - there can be other website visits in between, as long as the pattern websites are visited in the specified order.
For example:
- If a user visits
["home", "google", "away", "youtube", "love"]
in that time order, they match the pattern["home", "away", "love"]
- The pattern requires the websites to be visited in order according to their timestamps
Your task is to find the pattern with the highest score. If multiple patterns have the same highest score, return the lexicographically smallest one (comparing the patterns as tuples).
The solution approach involves:
- Grouping website visits by user and sorting them by timestamp
- For each user with at least 3 website visits, generating all possible 3-website combinations while preserving order
- Counting how many unique users match each pattern
- Returning the pattern with the highest count (or lexicographically smallest if there's a tie)
Intuition
The key insight is that we need to find patterns of three websites that are visited by multiple users in the same order. To approach this, we need to think about what information we actually need from the raw data.
First, we realize that the absolute timestamps don't matter - what matters is the relative order of visits for each user. So for each user, we need to know their sequence of website visits sorted by time.
Once we have each user's chronological visit sequence, we need to find all possible 3-website subsequences from their visits. Why subsequences and not just consecutive triplets? Because the problem states that the websites don't need to be visited contiguously - there can be other visits in between. For example, if a user visits ["A", "B", "C", "D", "E"]
, the pattern ["A", "C", "E"]
is valid.
For a user with n
website visits where n β₯ 3
, we can form C(n, 3)
different ordered triplets. We generate all these combinations using three nested loops: choosing the first website, then the second website from those that come after it, and finally the third website from those that come after the second.
An important detail is that we only want to count each pattern once per user. If a user visits ["A", "B", "A", "B"]
, the pattern ["A", "B", "B"]
can be formed in multiple ways, but it should only count as 1 for this user. That's why we use a set to store unique patterns for each user before adding them to our global counter.
Finally, after counting how many users match each pattern, we need to handle ties. When multiple patterns have the same count, we return the lexicographically smallest one. This is naturally handled by sorting the patterns first by count (descending) and then by the pattern itself (ascending).
Learn more about Sorting patterns.
Solution Approach
The implementation follows these key steps:
Step 1: Organize data by user and timestamp
First, we create a dictionary d
to group websites by user. We sort the input data by timestamp to ensure each user's website visits are in chronological order:
d = defaultdict(list)
for user, _, site in sorted(zip(username, timestamp, website), key=lambda x: x[1]):
d[user].append(site)
The sorted()
function with key=lambda x: x[1]
sorts by the timestamp (second element). After this step, d[user]
contains a list of websites that the user visited in chronological order.
Step 2: Generate all possible 3-website patterns for each user
For each user's website sequence, we generate all possible ordered triplets:
cnt = Counter()
for sites in d.values():
m = len(sites)
s = set()
if m > 2:
for i in range(m - 2):
for j in range(i + 1, m - 1):
for k in range(j + 1, m):
s.add((sites[i], sites[j], sites[k]))
The three nested loops ensure we pick three websites in order: i < j < k
. This maintains the chronological order requirement. We use a set s
to store unique patterns for each user, preventing duplicate counting if the same pattern can be formed multiple ways from one user's visits.
Step 3: Count pattern occurrences across all users
After generating all unique patterns for a user, we add them to the global counter:
for t in s: cnt[t] += 1
Each pattern in the set increases the counter by 1, representing one user who matches that pattern.
Step 4: Find the pattern with highest score
Finally, we sort all patterns by their count (descending) and lexicographically (ascending) to handle ties:
return sorted(cnt.items(), key=lambda x: (-x[1], x[0]))[0][0]
The sorting key (-x[1], x[0])
means:
-x[1]
: Sort by count in descending order (negative for reverse)x[0]
: Sort by pattern tuple lexicographically in ascending order
The [0][0]
at the end extracts the pattern tuple from the first item (highest score, lexicographically smallest).
Time Complexity: O(n log n + u * m^3)
where n
is the total number of records, u
is the number of unique users, and m
is the maximum number of websites visited by a user.
Space Complexity: O(n + p)
where p
is the total number of unique patterns generated.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Input:
- username = ["joe", "joe", "joe", "james", "james", "james", "james", "mary", "mary", "mary"]
- website = ["home", "about", "career", "home", "cart", "maps", "home", "home", "about", "career"]
- timestamp = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Step 1: Organize data by user and timestamp
After sorting by timestamp and grouping by user:
- joe: ["home", "about", "career"]
- james: ["home", "cart", "maps", "home"]
- mary: ["home", "about", "career"]
Step 2: Generate all possible 3-website patterns for each user
For joe with ["home", "about", "career"]:
- Only one possible triplet: ("home", "about", "career")
For james with ["home", "cart", "maps", "home"]:
- Choosing indices (0,1,2): ("home", "cart", "maps")
- Choosing indices (0,1,3): ("home", "cart", "home")
- Choosing indices (0,2,3): ("home", "maps", "home")
- Choosing indices (1,2,3): ("cart", "maps", "home")
For mary with ["home", "about", "career"]:
- Only one possible triplet: ("home", "about", "career")
Step 3: Count pattern occurrences across all users
Pattern counts:
- ("home", "about", "career"): 2 users (joe and mary)
- ("home", "cart", "maps"): 1 user (james)
- ("home", "cart", "home"): 1 user (james)
- ("home", "maps", "home"): 1 user (james)
- ("cart", "maps", "home"): 1 user (james)
Step 4: Find the pattern with highest score
The pattern ("home", "about", "career") has the highest count of 2, so it's returned as the answer.
If there were a tie, for example if both ("home", "about", "career") and ("a", "b", "c") had count 2, we would return ("a", "b", "c") as it's lexicographically smaller.
Solution Implementation
1from collections import defaultdict, Counter
2from typing import List
3
4class Solution:
5 def mostVisitedPattern(
6 self, username: List[str], timestamp: List[int], website: List[str]
7 ) -> List[str]:
8 # Create a dictionary to store websites visited by each user
9 user_websites = defaultdict(list)
10
11 # Sort visits by timestamp and group websites by user
12 for user, time, site in sorted(
13 zip(username, timestamp, website), key=lambda x: x[1]
14 ):
15 user_websites[user].append(site)
16
17 # Counter to track frequency of each 3-sequence pattern
18 pattern_counter = Counter()
19
20 # For each user, find all possible 3-sequence patterns
21 for websites_list in user_websites.values():
22 num_sites = len(websites_list)
23 unique_patterns = set()
24
25 # Only process if user visited at least 3 websites
26 if num_sites > 2:
27 # Generate all possible 3-sequence combinations
28 for i in range(num_sites - 2):
29 for j in range(i + 1, num_sites - 1):
30 for k in range(j + 1, num_sites):
31 # Add pattern to set (ensures each pattern counted once per user)
32 unique_patterns.add(
33 (websites_list[i], websites_list[j], websites_list[k])
34 )
35
36 # Increment counter for each unique pattern from this user
37 for pattern in unique_patterns:
38 pattern_counter[pattern] += 1
39
40 # Sort patterns by frequency (descending) and lexicographically (ascending)
41 # Return the first pattern (most frequent, lexicographically smallest)
42 sorted_patterns = sorted(
43 pattern_counter.items(),
44 key=lambda x: (-x[1], x[0])
45 )
46 return list(sorted_patterns[0][0])
47
1class Solution {
2 public List<String> mostVisitedPattern(String[] username, int[] timestamp, String[] website) {
3 // Map to store each user's visit history
4 Map<String, List<Visit>> userVisitsMap = new HashMap<>();
5 int n = username.length;
6
7 // Group visits by user
8 for (int i = 0; i < n; i++) {
9 String user = username[i];
10 int time = timestamp[i];
11 String site = website[i];
12 userVisitsMap.computeIfAbsent(user, k -> new ArrayList<>())
13 .add(new Visit(user, time, site));
14 }
15
16 // Map to count occurrences of each 3-sequence pattern
17 Map<String, Integer> patternCount = new HashMap<>();
18
19 // Process each user's visit history
20 for (List<Visit> visits : userVisitsMap.values()) {
21 int visitCount = visits.size();
22 Set<String> uniquePatterns = new HashSet<>();
23
24 // Only process users with at least 3 visits
25 if (visitCount > 2) {
26 // Sort visits by timestamp
27 Collections.sort(visits, (a, b) -> a.timestamp - b.timestamp);
28
29 // Generate all possible 3-sequences in chronological order
30 for (int i = 0; i < visitCount - 2; i++) {
31 for (int j = i + 1; j < visitCount - 1; j++) {
32 for (int k = j + 1; k < visitCount; k++) {
33 // Create pattern string with comma separator
34 String pattern = visits.get(i).website + "," +
35 visits.get(j).website + "," +
36 visits.get(k).website;
37 uniquePatterns.add(pattern);
38 }
39 }
40 }
41 }
42
43 // Count each unique pattern for this user
44 for (String pattern : uniquePatterns) {
45 patternCount.put(pattern, patternCount.getOrDefault(pattern, 0) + 1);
46 }
47 }
48
49 // Find the most frequent pattern (lexicographically smallest if tie)
50 int maxCount = 0;
51 String mostFrequentPattern = "";
52
53 for (Map.Entry<String, Integer> entry : patternCount.entrySet()) {
54 String currentPattern = entry.getKey();
55 int currentCount = entry.getValue();
56
57 // Update if higher count or same count but lexicographically smaller
58 if (currentCount > maxCount ||
59 (currentCount == maxCount && currentPattern.compareTo(mostFrequentPattern) < 0)) {
60 maxCount = currentCount;
61 mostFrequentPattern = currentPattern;
62 }
63 }
64
65 // Split the pattern string and return as list
66 return Arrays.asList(mostFrequentPattern.split(","));
67 }
68}
69
70/**
71 * Class to represent a user's visit to a website at a specific time
72 */
73class Visit {
74 String user;
75 int timestamp;
76 String website;
77
78 Visit(String user, int timestamp, String website) {
79 this.user = user;
80 this.timestamp = timestamp;
81 this.website = website;
82 }
83}
84
1class Solution {
2public:
3 vector<string> mostVisitedPattern(vector<string>& username, vector<int>& timestamp, vector<string>& website) {
4 // Map to store user -> list of (timestamp, website) pairs
5 unordered_map<string, vector<pair<int, string>>> userVisits;
6
7 // Collect all visits for each user
8 int n = username.size();
9 for (int i = 0; i < n; ++i) {
10 string user = username[i];
11 int ts = timestamp[i];
12 string site = website[i];
13 userVisits[user].emplace_back(ts, site);
14 }
15
16 // Map to count frequency of each 3-sequence pattern
17 unordered_map<string, int> patternCount;
18
19 // Process each user's visit history
20 for (auto& [user, visits] : userVisits) {
21 int visitCount = visits.size();
22 // Set to store unique patterns for current user (avoid counting same pattern multiple times per user)
23 unordered_set<string> uniquePatterns;
24
25 // Only process users with at least 3 visits
26 if (visitCount >= 3) {
27 // Sort visits by timestamp to maintain chronological order
28 sort(visits.begin(), visits.end());
29
30 // Generate all possible 3-sequences from user's visits
31 for (int i = 0; i < visitCount - 2; ++i) {
32 for (int j = i + 1; j < visitCount - 1; ++j) {
33 for (int k = j + 1; k < visitCount; ++k) {
34 // Create pattern string by concatenating websites with comma separator
35 string pattern = visits[i].second + "," +
36 visits[j].second + "," +
37 visits[k].second;
38 uniquePatterns.insert(pattern);
39 }
40 }
41 }
42 }
43
44 // Increment count for each unique pattern from this user
45 for (const string& pattern : uniquePatterns) {
46 patternCount[pattern]++;
47 }
48 }
49
50 // Find the pattern with maximum count (lexicographically smallest if tie)
51 int maxCount = 0;
52 string bestPattern;
53
54 for (const auto& [pattern, count] : patternCount) {
55 // Update if we found a pattern with higher count, or same count but lexicographically smaller
56 if (count > maxCount || (count == maxCount && pattern < bestPattern)) {
57 maxCount = count;
58 bestPattern = pattern;
59 }
60 }
61
62 // Split the best pattern string into individual websites
63 return split(bestPattern, ',');
64 }
65
66private:
67 // Helper function to split a string by delimiter
68 vector<string> split(string& s, char delimiter) {
69 vector<string> result;
70 stringstream ss(s);
71 string token;
72
73 // Extract each token separated by delimiter
74 while (getline(ss, token, delimiter)) {
75 result.push_back(token);
76 }
77
78 return result;
79 }
80};
81
1function mostVisitedPattern(username: string[], timestamp: number[], website: string[]): string[] {
2 // Map to store user -> list of [timestamp, website] pairs
3 const userVisits: Map<string, [number, string][]> = new Map();
4
5 // Collect all visits for each user
6 const n = username.length;
7 for (let i = 0; i < n; i++) {
8 const user = username[i];
9 const ts = timestamp[i];
10 const site = website[i];
11
12 if (!userVisits.has(user)) {
13 userVisits.set(user, []);
14 }
15 userVisits.get(user)!.push([ts, site]);
16 }
17
18 // Map to count frequency of each 3-sequence pattern
19 const patternCount: Map<string, number> = new Map();
20
21 // Process each user's visit history
22 for (const [user, visits] of userVisits) {
23 const visitCount = visits.length;
24 // Set to store unique patterns for current user (avoid counting same pattern multiple times per user)
25 const uniquePatterns: Set<string> = new Set();
26
27 // Only process users with at least 3 visits
28 if (visitCount >= 3) {
29 // Sort visits by timestamp to maintain chronological order
30 visits.sort((a, b) => a[0] - b[0]);
31
32 // Generate all possible 3-sequences from user's visits
33 for (let i = 0; i < visitCount - 2; i++) {
34 for (let j = i + 1; j < visitCount - 1; j++) {
35 for (let k = j + 1; k < visitCount; k++) {
36 // Create pattern string by concatenating websites with comma separator
37 const pattern = visits[i][1] + "," +
38 visits[j][1] + "," +
39 visits[k][1];
40 uniquePatterns.add(pattern);
41 }
42 }
43 }
44 }
45
46 // Increment count for each unique pattern from this user
47 for (const pattern of uniquePatterns) {
48 patternCount.set(pattern, (patternCount.get(pattern) || 0) + 1);
49 }
50 }
51
52 // Find the pattern with maximum count (lexicographically smallest if tie)
53 let maxCount = 0;
54 let bestPattern = "";
55
56 for (const [pattern, count] of patternCount) {
57 // Update if we found a pattern with higher count, or same count but lexicographically smaller
58 if (count > maxCount || (count === maxCount && pattern < bestPattern)) {
59 maxCount = count;
60 bestPattern = pattern;
61 }
62 }
63
64 // Split the best pattern string into individual websites
65 return split(bestPattern, ',');
66}
67
68// Helper function to split a string by delimiter
69function split(s: string, delimiter: string): string[] {
70 // Extract each token separated by delimiter
71 return s.split(delimiter);
72}
73
Time and Space Complexity
The time complexity is O(n + m * k^3)
where n
is the total number of entries (length of username/timestamp/website arrays), m
is the number of unique users, and k
is the maximum number of websites visited by any single user. In the worst case where one user visits all n
websites, this becomes O(n^3)
.
The space complexity is O(n + m * k^3)
. The dictionary d
stores at most n
website entries across all users. The set s
for each user can contain at most C(k, 3) = k*(k-1)*(k-2)/6
unique 3-patterns, and the Counter cnt
aggregates all unique patterns across all users. In the worst case with one user visiting all websites, this becomes O(n^3)
.
Breaking down the complexity:
- Sorting the initial data:
O(n log n)
time - Building user-website dictionary:
O(n)
time and space - For each user with
k
websites, generating all 3-patterns:O(k^3)
time - Storing unique patterns in set and counter:
O(k^3)
space per user - Final sorting of counter items:
O(p log p)
wherep
is the number of unique patterns
The reference answer simplifies to O(n^3)
for both time and space complexity by considering the worst-case scenario where a single user visits all n
websites, making k = n
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Not Handling Duplicate Pattern Counting Per User
Issue: A critical mistake is counting the same pattern multiple times for a single user. If a user visits websites ["a", "b", "c", "b", "c"]
, the pattern ["a", "b", "c"]
can be formed in multiple ways:
- Using indices (0, 1, 2)
- Using indices (0, 1, 4)
- Using indices (0, 3, 4)
Without proper handling, this pattern would be counted 3 times for this user, but it should only count once since it's still just one user matching the pattern.
Incorrect Code Example:
# WRONG: Directly adds to counter without deduplication
for sites in user_websites.values():
m = len(sites)
if m > 2:
for i in range(m - 2):
for j in range(i + 1, m - 1):
for k in range(j + 1, m):
pattern_counter[(sites[i], sites[j], sites[k])] += 1
Solution: Use a set to store unique patterns for each user before counting:
# CORRECT: Use a set to ensure each pattern counts only once per user
for sites in user_websites.values():
m = len(sites)
unique_patterns = set()
if m > 2:
for i in range(m - 2):
for j in range(i + 1, m - 1):
for k in range(j + 1, m):
unique_patterns.add((sites[i], sites[j], sites[k]))
# Now add each unique pattern once
for pattern in unique_patterns:
pattern_counter[pattern] += 1
Pitfall 2: Forgetting to Sort by Timestamp Before Grouping
Issue: The problem requires patterns to follow chronological order. If you group websites by user without first sorting by timestamp, the website order might be wrong.
Incorrect Code Example:
# WRONG: Groups websites without sorting by timestamp first
for i in range(len(username)):
user_websites[username[i]].append(website[i])
Solution: Always sort the data by timestamp before grouping:
# CORRECT: Sort by timestamp before grouping
for user, time, site in sorted(zip(username, timestamp, website), key=lambda x: x[1]):
user_websites[user].append(site)
Pitfall 3: Incorrect Lexicographical Comparison for Tie-Breaking
Issue: When multiple patterns have the same highest score, returning the wrong one due to incorrect sorting logic.
Incorrect Code Example:
# WRONG: Only sorts by count, doesn't handle lexicographical ordering
return max(pattern_counter.items(), key=lambda x: x[1])[0]
Solution: Use proper sorting with both count (descending) and lexicographical order (ascending):
# CORRECT: Sort by count (descending) then lexicographically (ascending)
sorted_patterns = sorted(pattern_counter.items(), key=lambda x: (-x[1], x[0]))
return list(sorted_patterns[0][0])
Which of the following problems can be solved with backtracking (select multiple)
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Donβt Miss This!