3839. Number of Prefix Connected Groups
Problem Description
You are given an array of strings words and an integer k.
Two words a and b located at distinct indices are said to be prefix-connected if their first k characters are identical, that is, a[0..k-1] == b[0..k-1].
A connected group is a set of words in which every pair of words is prefix-connected. In other words, all words in the same group share the exact same prefix of length k.
Your task is to return the number of connected groups that contain at least two words.
There are a couple of important points to keep in mind:
- Any word whose length is less than
kcannot share a length-kprefix with any other word, so it cannot join any group and is simply ignored. - Duplicate strings are treated as separate words. For example, if the same string appears twice and its length is at least
k, those two occurrences count as two distinct words that are prefix-connected to each other.
Essentially, you group every word of length at least k by its first k characters, and then count how many of these groups end up containing two or more words.
How We Pick the Algorithm
Why Hash Table / Counting?
This problem maps to Hash Table / Counting through a short path in the full flowchart.
Group words by prefix using a hash table to count frequencies.
Open in FlowchartIntuition
The key observation is that two words belong to the same connected group if and only if they share the same first k characters. This means the prefix of length k acts as a unique identifier for each potential group.
Since every word in a group must have an identical length-k prefix, all words sharing the same prefix automatically form one connected group, because each pair among them is prefix-connected. So instead of comparing words pairwise, we can simply group words by their length-k prefix.
This naturally leads to a counting strategy. We iterate over the words, and for each word with length at least k, we extract its prefix w[:k] and tally how many words map to that prefix. Words shorter than k are skipped, since they cannot match any prefix of length k.
Once we have the count of words for every distinct prefix, a group qualifies only when it contains at least two words. Therefore, we count how many prefixes have a tally greater than 1, and that result is exactly the number of connected groups with two or more words.
Solution Approach
Solution 1: Hash Table
We use a hash table to count, for each prefix made up of the first characters, how many words map to it.
The implementation works as follows:
-
Initialize a counter (a hash table) that maps each length- prefix to the number of words sharing that prefix.
-
Iterate over every word in . If the length of is greater than or equal to , take its prefix and increment its count in by . Words with length less than are skipped, since they cannot form a length- prefix.
-
After processing all words, traverse the values in and count how many of them are greater than . Each such value corresponds to a connected group containing at least two words. The expression
sum(v > 1 for v in cnt.values())performs this count by summing up the boolean results (whereTruecounts as1).
The time complexity is , where is the total number of characters across all words, since extracting each prefix of length and inserting it into the hash table is proportional to the prefix length. The space complexity is , where is the number of words, accounting for the prefixes stored as keys in the hash table.
Example Walkthrough
Let's trace through a small example to see how the hash table approach works.
Input: words = ["apple", "apply", "ape", "bat", "bath", "ba"], k = 3
We want to group words by their first k = 3 characters and count how many groups end up with at least two words.
Step 1: Initialize the counter
We start with an empty hash table:
cnt = {}
Step 2: Iterate over each word
We process each word one at a time. If a word's length is less than k = 3, we skip it. Otherwise, we extract its first 3 characters as the prefix and increment the count.
| Word | Length | Length ≥ k? | Prefix w[:3] | Action | cnt after step |
|---|---|---|---|---|---|
apple | 5 | ✅ Yes | "app" | increment "app" | {"app": 1} |
apply | 5 | ✅ Yes | "app" | increment "app" | {"app": 2} |
ape | 3 | ✅ Yes | "ape" | increment "ape" | {"app": 2, "ape": 1} |
bat | 3 | ✅ Yes | "bat" | increment "bat" | {"app": 2, "ape": 1, "bat": 1} |
bath | 4 | ✅ Yes | "bat" | increment "bat" | {"app": 2, "ape": 1, "bat": 2} |
ba | 2 | ❌ No | — | skipped (too short) | {"app": 2, "ape": 1, "bat": 2} |
Notice that "ba" is ignored entirely because its length (2) is less than k = 3, so it cannot share a length-3 prefix with anything.
Step 3: Count groups with at least two words
Now we examine the final counter:
cnt = {"app": 2, "ape": 1, "bat": 2}
We count how many values are greater than 1:
| Prefix | Count | Count > 1? |
|---|---|---|
"app" | 2 | ✅ True |
"ape" | 1 | ❌ False |
"bat" | 2 | ✅ True |
The expression sum(v > 1 for v in cnt.values()) evaluates to True + False + True = 1 + 0 + 1 = 2.
Result: 2
There are two connected groups with at least two words:
"app"group:{"apple", "apply"}— both share the prefix"app"."bat"group:{"bat", "bath"}— both share the prefix"bat".
The "ape" group has only one word, so it does not qualify, and "ba" was excluded for being too short.
Solution Implementation
1from typing import List
2from collections import Counter
3
4
5class Solution:
6 def prefixConnected(self, words: List[str], k: int) -> int:
7 # Counter to track how many words share each length-k prefix
8 prefix_count = Counter()
9
10 for word in words:
11 # Only consider words long enough to have a prefix of length k
12 if len(word) >= k:
13 prefix = word[:k]
14 prefix_count[prefix] += 1
15
16 # Count how many distinct prefixes are shared by more than one word
17 return sum(count > 1 for count in prefix_count.values())
181class Solution {
2 public int prefixConnected(String[] words, int k) {
3 // Map to store each length-k prefix and how many times it appears
4 Map<String, Integer> prefixCount = new HashMap<>();
5
6 // Iterate over every word and record its prefix of length k
7 for (String word : words) {
8 // Only consider words that are long enough to have a length-k prefix
9 if (word.length() >= k) {
10 // Extract the prefix of length k and increment its counter
11 prefixCount.merge(word.substring(0, k), 1, Integer::sum);
12 }
13 }
14
15 // Count how many prefixes are shared by more than one word
16 int answer = 0;
17 for (int count : prefixCount.values()) {
18 // A prefix counts only if it is shared by at least two words
19 if (count > 1) {
20 ++answer;
21 }
22 }
23
24 return answer;
25 }
26}
271class Solution {
2public:
3 int prefixConnected(vector<string>& words, int k) {
4 // Map to count how many words share each k-length prefix
5 unordered_map<string, int> prefixCount;
6
7 // Iterate over every word in the input
8 for (const auto& word : words) {
9 // Only consider words long enough to have a k-length prefix
10 if (word.size() >= static_cast<size_t>(k)) {
11 // Extract the first k characters and increment its count
12 ++prefixCount[word.substr(0, k)];
13 }
14 }
15
16 // Count how many distinct prefixes are shared by more than one word
17 int answer = 0;
18 for (const auto& [prefix, count] : prefixCount) {
19 if (count > 1) {
20 ++answer;
21 }
22 }
23
24 return answer;
25 }
26};
271/**
2 * Counts how many distinct k-length prefixes are shared by more than one word.
3 *
4 * @param words - The list of input words.
5 * @param k - The length of the prefix to consider.
6 * @returns The number of k-length prefixes that appear in at least two words.
7 */
8function prefixConnected(words: string[], k: number): number {
9 // Map each k-length prefix to the number of words that start with it.
10 const prefixCount = new Map<string, number>();
11
12 for (const word of words) {
13 // Only words at least k characters long can have a valid k-length prefix.
14 if (word.length >= k) {
15 const prefix = word.substring(0, k);
16 prefixCount.set(prefix, (prefixCount.get(prefix) ?? 0) + 1);
17 }
18 }
19
20 // Count prefixes that are shared by more than one word.
21 let answer = 0;
22 for (const count of prefixCount.values()) {
23 if (count > 1) {
24 answer++;
25 }
26 }
27
28 return answer;
29}
30Time and Space Complexity
Time Complexity: O(n × k)
The code iterates over each word in words, which contains n words. For each word w, when len(w) >= k, the slicing operation w[:k] creates a substring of length k, and computing its hash to insert into the Counter takes O(k) time. Therefore, processing all n words costs O(n × k). The final sum(v > 1 for v in cnt.values()) traverses at most n distinct prefixes, costing O(n), which does not dominate. Hence, the overall time complexity is O(n × k).
Space Complexity: O(n)
The Counter object cnt stores at most n distinct prefix keys (one per word). Although each key is a string of length up to k, the reference answer treats the prefix as a unit of storage, so the space used is proportional to the number of words n. Thus, the overall space complexity is O(n).
Pattern Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Forgetting to Skip Words Shorter Than k
A frequent mistake is slicing the prefix without first checking the word's length. In Python, slicing word[:k] does not raise an error when len(word) < k; instead, it silently returns the entire (shorter) string. This means a short word like "ab" with k = 4 would produce the prefix "ab", which could then incorrectly match another short word or be treated as a valid length-k prefix.
Incorrect code:
for word in words: prefix = word[:k] # No length check! prefix_count[prefix] += 1 # Short words wrongly counted
Solution: Always guard the prefix extraction with an explicit length check, so words too short to form a length-k prefix are ignored entirely.
for word in words:
if len(word) >= k: # Guard against short words
prefix_count[word[:k]] += 1
Pitfall 2: Counting Groups Instead of Members (Off-by-Definition Error)
Another common error is misunderstanding what to count. Some solutions return the total number of distinct prefixes, or the total number of words that belong to some group, rather than the number of groups with at least two words.
Incorrect interpretations:
# Wrong: counts ALL distinct prefixes, including singletons
return len(prefix_count)
# Wrong: counts total words in non-trivial groups, not the group count
return sum(count for count in prefix_count.values() if count > 1)
Solution: Count only the prefixes whose frequency is strictly greater than 1. Each such prefix represents exactly one valid connected group.
return sum(count > 1 for count in prefix_count.values())
Pitfall 3: Mishandling Duplicate Strings
It is tempting to deduplicate the input with a set for "efficiency," but the problem explicitly states that duplicate strings count as distinct words. Using a set would collapse two identical words into one, causing a valid two-word group to be missed.
Incorrect code:
for word in set(words): # Deduplication loses valid pairs!
if len(word) >= k:
prefix_count[word[:k]] += 1
For example, words = ["abc", "abc"], k = 3 should yield 1 group, but deduplicating reduces the count to 0.
Solution: Iterate over the original list directly and increment the counter for every occurrence, so identical strings are each tallied separately.
for word in words: # Preserve every occurrence
if len(word) >= k:
prefix_count[word[:k]] += 1
Ready to land your dream job?
Unlock your dream job with a 5-minute quiz for a personalized study roadmap!
Get My RoadmapWhich data structure is used in a depth first search?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion If you prefer videos here's a video that explains recursion in a fun and easy way Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask him
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity describes how the time needed
Want a Structured Path to Master System Design Too? Don’t Miss This!