2782. Number of Unique Categories
Problem Description
In this problem, we are given n
different elements, each with an associated category. Our task is to determine the total number of unique categories among all elements. We are given a class CategoryHandler
, which has a function haveSameCategory(a, b)
that tells us if two elements a
and b
belong to the same category or not. The input n
indicates the total number of elements, indexed from 0
to n - 1
. The haveSameCategory
function returns true
if elements a
and b
are in the same category, false
if they are in different categories, or false
if either a
or b
isn't a valid index within the range. The goal is to find and return the count of unique categories.
Intuition
The approach to solve this problem is akin to a classic Disjoint Set Union (DSU) or Union-Find problem. The idea here is to treat categories as disjoint sets, where each element initially is in its own category set. As we go through the elements, we use the haveSameCategory
method from CategoryHandler
to determine if two elements are in the same category. If they are, we "union" their sets by assigning them the same representative (root). The function find
is used to get the representative of an element's set, and if two elements are in the same category, we make one's representative the other's parent, effectively merging their sets.
This operation is applied for each pair that share a category until all elements that are in the same category are connected and have the same root. After processing all elements, the number of unique categories is equal to the number of sets that have themselves as their representative, which we count at the end and return as the final answer.
The solution uses path compression in the find
function which ensures that each element directly points to the set representative after the first time it's looked up, which optimizes the process and keeps the time complexity under control. By iterating over each element and connecting elements in the same category, we're essentially building a forest of trees, where each tree represents a unique category. The end result is that we obtain the number of unique categories by counting the number of roots in this forest.
Learn more about Union Find patterns.
Solution Approach
The provided solution uses the Disjoint Set Union (DSU) data structure, also known as Union-Find, to efficiently merge elements that share the same category and determine the number of unique categories.
Here's a step-by-step breakdown of the implementation:
- A list
p
is initialized withn
elements, where each element is initially set to its own index, implying that each element is its own representative. - A nested loop is used to iterate through all pairs of elements. For each pair
(a, b)
, we check whether they belong to the same category by callingcategoryHandler.haveSameCategory(a, b)
. - If the returned value is
true
, we find the representatives (parents) of both elements using thefind
function and merge them by setting the representative ofa
to be the representative ofb
. This effectively merges the sets thata
andb
belong to. - The
find
function is a crucial part of the DSU pattern. It follows the path of parent pointers from the elementx
up to the root (representative) of the set containingx
. It also implements path compression, a significant optimization where each element on the path is set to point directly to the root after the first lookup. - After processing all pairs, the categories are merged into disjoint sets with a unique representative for each set. To count the number of unique categories, we iterate over the list
p
and increment the count for each element that is its own representative (indicating it's the root of a set). - The final count, which represents the number of unique categories, is then returned.
The algorithms and patterns used in this solution include:
- Disjoint Set Union (DSU) pattern for maintaining and merging disjoint sets.
- Path Compression optimization to flatten the structure of the tree for any affiliated elements, reducing the time complexity of subsequent
find
operations.
An example of a mathematical formula used in the implementation is the equivalence relation that's being checked in the find
function: if p[x] != x
, it indicates that x
is not a root, and we recursively call find(p[x])
to retrieve the representative of x
.
The primary data structure used is a one-dimensional array (p
) which keeps track of representatives (roots) for each element, allowing us to efficiently perform unions and finds.
The solution has a time complexity that's nearly linear (amortized O(N α(N))
where α
is the Inverse Ackermann function) due to the use of path compression, making the solution highly efficient even for large values of n
.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's say we are given n = 5
elements, indexed from 0
to 4
. Assume the following haveSameCategory
results when checking pairs:
haveSameCategory(0, 1)
returnsfalse
haveSameCategory(1, 2)
returnstrue
haveSameCategory(2, 3)
returnstrue
haveSameCategory(3, 4)
returnsfalse
haveSameCategory(0, 4)
returnstrue
Based on the description above, we start by initializing a list p
with n
elements where p = [0, 1, 2, 3, 4]
, indicating that each element is initially its own representative.
Next, we iterate through all pairs and use the haveSameCategory
function from our CategoryHandler
:
- Since
haveSameCategory(0, 1)
returnsfalse
, we do nothing, as they are not in the same category. - For the pair
(1, 2)
,haveSameCategory
returnstrue
. We find the representative of1
, which is1
itself, and the representative of2
, which is2
itself. We then union these two into the same set, resulting inp = [0, 1, 1, 3, 4]
, indicating that1
is now the representative of2
as well. - With the pair
(2, 3)
,haveSameCategory
returnstrue
. Since they need to be in the same category, we find the representative of2
, which we updated to1
in the previous step, and the representative of3
, which is3
. After union,p = [0, 1, 1, 1, 4]
because1
is now the representative for3
. - Next,
haveSameCategory(3, 4)
returnsfalse
, so no action is taken. - Lastly,
haveSameCategory(0, 4)
returnstrue
. We find the representatives of0
and4
, which are0
and4
, respectively. After union,p = [0, 1, 1, 1, 0]
, which means0
is now the representative for4
.
We have completed the iterations. Now, to count the number of unique categories, we count the number of set representatives:
- Index
0
is its own representative, indicating it is a unique category. - Index
1
is the representative for indices2
and3
, indicating another unique category. - Indices
2
,3
, and4
are not their own representatives.
So, we have two unique categories represented by p = [0, 1, 1, 1, 0]
. The final answer is 2
.
Solution Implementation
1# Class definition for handling categories.
2# Note: The actual implementation of CategoryHandler.haveSameCategory should be provided.
3class CategoryHandler:
4 def haveSameCategory(self, a: int, b: int) -> bool:
5 # This method should determine if elements `a` and `b` have the same category.
6 pass
7
8
9class Solution:
10 def numberOfCategories(
11 self, n: int, category_handler: Optional['CategoryHandler']
12 ) -> int:
13 # Function to find the root of an element `x` using path compression.
14 def find_root(x: int) -> int:
15 if parent[x] != x:
16 parent[x] = find_root(parent[x]) # Path compression for efficiency.
17 return parent[x]
18
19 # Initialization of each element to be its own parent.
20 parent = list(range(n))
21
22 # Iterate through all possible pairs of elements.
23 for a in range(n):
24 for b in range(a + 1, n): # To avoid unnecessary checks, start from a + 1.
25 # If two elements belong to the same category, unite their sets.
26 if category_handler.haveSameCategory(a, b):
27 parent[find_root(a)] = find_root(b)
28
29 # Count the number of unique parents (root elements) to determine
30 # the number of different categories.
31 # This is done by checking if the parent of an element equals its index.
32 return sum(1 for i, x in enumerate(parent) if i == x)
33
34
35# Example usage:
36# Assuming that `CustomCategoryHandler` is a class derived from `CategoryHandler`
37# with a specific implementation for the method `haveSameCategory`.
38# solution = Solution()
39# number_of_categories = solution.numberOfCategories(n, CustomCategoryHandler())
40
1/**
2 * The Solution class implements the numberOfCategories method which uses the union-find algorithm to determine the number of unique categories.
3 */
4class Solution {
5 // The array 'parents' is used to track the parent of each element, initializing each element as its own parent.
6 private int[] parents;
7
8 /**
9 * This method calculates the number of unique categories given the 'CategoryHandler' object.
10 *
11 * @param n The total number of elements.
12 * @param categoryHandler An object that provides a method to check if two elements belong to the same category.
13 * @return The number of unique categories among the elements.
14 */
15 public int numberOfCategories(int n, CategoryHandler categoryHandler) {
16 // Initialize the parent array, setting each item's parent to itself.
17 parents = new int[n];
18 for (int i = 0; i < n; ++i) {
19 parents[i] = i;
20 }
21
22 // Iterate over all pairs of elements and union them if they are in the same category.
23 for (int a = 0; a < n; ++a) {
24 for (int b = a + 1; b < n; ++b) {
25 if (categoryHandler.haveSameCategory(a, b)) {
26 union(a, b);
27 }
28 }
29 }
30
31 // Count the number of unique categories by counting the number of elements that are their own parent.
32 int countCategories = 0;
33 for (int i = 0; i < n; ++i) {
34 if (i == parents[i]) {
35 countCategories++;
36 }
37 }
38 return countCategories;
39 }
40
41 /**
42 * The find method is used for finding the root parent of 'x'. Also, path compression is applied for optimization.
43 *
44 * @param x The element to find the root parent of.
45 * @return The root parent of 'x'.
46 */
47 private int find(int x) {
48 // If 'x' is not its own parent, recursively find its parent and apply path compression.
49 if (parents[x] != x) {
50 parents[x] = find(parents[x]);
51 }
52 return parents[x];
53 }
54
55 /**
56 * The union method merges sets containing 'a' and 'b' by attaching one's root parent to the other's parent.
57 *
58 * @param a The first element to union.
59 * @param b The second element to union.
60 */
61 private void union(int a, int b) {
62 // Find the root parents of both 'a' and 'b' and make one the parent of the other.
63 int rootA = find(a);
64 int rootB = find(b);
65 parents[rootA] = rootB; // Union by attaching rootA's parent to rootB.
66 }
67}
68
1#include <vector>
2#include <numeric>
3#include <functional>
4using namespace std;
5
6/**
7 * Definition for a category handler.
8 */
9class CategoryHandler {
10public:
11 CategoryHandler(vector<int> categories);
12 bool haveSameCategory(int a, int b);
13};
14
15class Solution {
16public:
17 /**
18 * Counts the number of unique categories among 'n' items.
19 * @param n Number of items.
20 * @param categoryHandler Pointer to a CategoryHandler instance.
21 * @return The count of unique categories.
22 */
23 int numberOfCategories(int n, CategoryHandler* categoryHandler) {
24 vector<int> parent(n);
25 // Initialize the parent vector with elements pointing to themselves.
26 iota(parent.begin(), parent.end(), 0);
27
28 // Function to find the root of the set that x belongs to.
29 function<int(int)> find = [&](int x) {
30 if (parent[x] != x) { // Path compression.
31 parent[x] = find(parent[x]);
32 }
33 return parent[x];
34 };
35
36 // Traverse all pairs and union the sets if they are of the same category.
37 for (int a = 0; a < n; ++a) {
38 for (int b = a + 1; b < n; ++b) {
39 if (categoryHandler->haveSameCategory(a, b)) {
40 parent[find(a)] = find(b);
41 }
42 }
43 }
44
45 // Count the number of items that are their own parent (root of the set).
46 int count = 0;
47 for (int i = 0; i < n; ++i) {
48 count += i == parent[i]; // Increment count for each root.
49 }
50 return count;
51 }
52};
53
1// Utility function to find the root of the category set that a given element belongs to
2const findRoot = (element: number, parents: number[]): number => {
3 if (parents[element] !== element) {
4 parents[element] = findRoot(parents[element], parents); // Path compression for optimization
5 }
6 return parents[element];
7};
8
9// Main function to calculate the number of distinct categories
10const calculateNumberOfCategories = (n: number, categoryHandler: CategoryHandler): number => {
11 // Initialize an array to represent the parent of each element, with each element initially being its own parent
12 const parents = new Array(n).fill(0).map((_, index) => index);
13
14 // Iterate through all possible pairs checking if they have the same category
15 for (let a = 0; a < n; ++a) {
16 for (let b = a + 1; b < n; ++b) {
17 if (categoryHandler.haveSameCategory(a, b)) {
18 // If two elements are from the same category, union their sets by assigning the same root
19 parents[findRoot(a, parents)] = findRoot(b, parents);
20 }
21 }
22 }
23
24 // Iterate through all the elements to count the number of distinct roots/categories
25 let distinctCategories = 0;
26 for (let i = 0; i < n; ++i) {
27 // If the element is the root (parent of itself), then it's a distinct category
28 if (i === parents[i]) {
29 ++distinctCategories;
30 }
31 }
32 return distinctCategories;
33};
34
Time and Space Complexity
Time Complexity
The time complexity of this code is primarily influenced by the double loop where we check if two elements have the same category using categoryHandler.haveSameCategory(a, b)
. This loop runs a
from 0
to n-1
and nested loop b
runs from a+1
to n
.
- The outer loop runs
n
times (from 0 ton-1
). - For each iteration of the outer loop, the inner loop runs
n - a - 1
times.
The number of times the inner loop runs can be described by the sum:
Sigma(n - a - 1) from a=0 to n-1
, which equals (n-1) + (n-2) + ... + 1
, which is a known arithmetic series sum that equals to (n*(n-1))/2
. Each call to the inner loop incurs a call to haveSameCategory
, which could have its own time complexity depending on its implementation. If we assume it’s O(1), the total time complexity becomes O(n^2)
.
Additionally, there’s a union-find algorithm applied through the find
function, which has a complexity of O(alpha(n))
for each find
operation, where alpha
is the Inverse Ackermann Function which is very slowly growing and for all practical purposes is considered almost constant.
So we need to also consider this complexity for each call to haveSameCategory
. If there are m
calls to haveSameCategory
that return true (and thus perform the union operation), the total time complexity for union-find would be O(m*alpha(n))
.
However, since every element can be part of at most one union operation (due to the properties of the disjoint set union structure), m
will at most be n
, and thus, we end up with O(n*alpha(n))
.
The final line of the code, which computes the number of categories present, is simply an iteration over the array p
of length n
.
Assuming the complexity of haveSameCategory
is O(1)
, the total time complexity of the code is dominated by the double loop and it is O(n^2 + n*alpha(n))
. But n*alpha(n)
is negligible compared to n^2
, so we can simply state the total time complexity as O(n^2)
.
Space Complexity
The space complexity of the code is O(n)
. This comes from the space required to store the parent array p
, which has one element for each of the n
nodes. Space complexity is not greatly affected by the recursion in the find
function because the disjoint set union-find structure ensures that the trees are very shallow (due to path compression), and thus the call stack depth is effectively bounded by alpha(n)
, which is almost constant.
Therefore, the dominant space requirement is the parent array, resulting in O(n)
space complexity.
Learn more about how to find time and space complexity quickly using problem constraints.
Which algorithm is best for finding the shortest distance between two points in an unweighted graph?
Recommended Readings
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!