3839. Number of Prefix Connected Groups

MediumArrayHash TableStringCounting

LeetCode ↗

Problem Pattern Idea Example Code Complex Pitfalls

Problem Description

You are given an array of strings words and an integer k.

Two words a and b located at distinct indices are said to be prefix-connected if their first k characters are identical, that is, a[0..k-1] == b[0..k-1].

A connected group is a set of words in which every pair of words is prefix-connected. In other words, all words in the same group share the exact same prefix of length k.

Your task is to return the number of connected groups that contain at least two words.

There are a couple of important points to keep in mind:

Any word whose length is less than k cannot share a length-k prefix with any other word, so it cannot join any group and is simply ignored.
Duplicate strings are treated as separate words. For example, if the same string appears twice and its length is at least k, those two occurrences count as two distinct words that are prefix-connected to each other.

Essentially, you group every word of length at least k by its first k characters, and then count how many of these groups end up containing two or more words.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

How We Pick the Algorithm

Why Hash Table / Counting?

This problem maps to Hash Table / Counting through a short path in the full flowchart.

Group words by prefix using a hash table to count frequencies.

Open in Flowchart

Intuition

The key observation is that two words belong to the same connected group if and only if they share the same first k characters. This means the prefix of length k acts as a unique identifier for each potential group.

Since every word in a group must have an identical length-k prefix, all words sharing the same prefix automatically form one connected group, because each pair among them is prefix-connected. So instead of comparing words pairwise, we can simply group words by their length-k prefix.

This naturally leads to a counting strategy. We iterate over the words, and for each word with length at least k, we extract its prefix w[:k] and tally how many words map to that prefix. Words shorter than k are skipped, since they cannot match any prefix of length k.

Once we have the count of words for every distinct prefix, a group qualifies only when it contains at least two words. Therefore, we count how many prefixes have a tally greater than 1, and that result is exactly the number of connected groups with two or more words.

Solution Approach

Solution 1: Hash Table

We use a hash table $\textit{cnt}$ to count, for each prefix made up of the first $k$ characters, how many words map to it.

The implementation works as follows:

Initialize a counter $\textit{cnt}$ (a hash table) that maps each length- $k$ prefix to the number of words sharing that prefix.
Iterate over every word $w$ in $\textit{words}$ . If the length of $w$ is greater than or equal to $k$ , take its prefix $w[:k]$ and increment its count in $\textit{cnt}$ by $1$ . Words with length less than $k$ are skipped, since they cannot form a length- $k$ prefix.
After processing all words, traverse the values in $\textit{cnt}$ and count how many of them are greater than $1$ . Each such value corresponds to a connected group containing at least two words. The expression sum(v > 1 for v in cnt.values()) performs this count by summing up the boolean results (where True counts as 1).

The time complexity is $O(L)$ , where $L$ is the total number of characters across all words, since extracting each prefix of length $k$ and inserting it into the hash table is proportional to the prefix length. The space complexity is $O(n \times k)$ , where $n$ is the number of words, accounting for the prefixes stored as keys in the hash table.

Example Walkthrough

Let's trace through a small example to see how the hash table approach works.

Input: words = ["apple", "apply", "ape", "bat", "bath", "ba"], k = 3

We want to group words by their first k = 3 characters and count how many groups end up with at least two words.

Step 1: Initialize the counter

We start with an empty hash table:

cnt = {}

Step 2: Iterate over each word

We process each word one at a time. If a word's length is less than k = 3, we skip it. Otherwise, we extract its first 3 characters as the prefix and increment the count.

Word	Length	Length ≥ k?	Prefix `w[:3]`	Action	`cnt` after step
`apple`	5	✅ Yes	`"app"`	increment `"app"`	`{"app": 1}`
`apply`	5	✅ Yes	`"app"`	increment `"app"`	`{"app": 2}`
`ape`	3	✅ Yes	`"ape"`	increment `"ape"`	`{"app": 2, "ape": 1}`
`bat`	3	✅ Yes	`"bat"`	increment `"bat"`	`{"app": 2, "ape": 1, "bat": 1}`
`bath`	4	✅ Yes	`"bat"`	increment `"bat"`	`{"app": 2, "ape": 1, "bat": 2}`
`ba`	2	❌ No	—	skipped (too short)	`{"app": 2, "ape": 1, "bat": 2}`

Notice that "ba" is ignored entirely because its length (2) is less than k = 3, so it cannot share a length-3 prefix with anything.

Step 3: Count groups with at least two words

Now we examine the final counter:

cnt = {"app": 2, "ape": 1, "bat": 2}

We count how many values are greater than 1:

Prefix	Count	Count > 1?
`"app"`	2	✅ True
`"ape"`	1	❌ False
`"bat"`	2	✅ True

The expression sum(v > 1 for v in cnt.values()) evaluates to True + False + True = 1 + 0 + 1 = 2.

Result: 2

There are two connected groups with at least two words:

"app" group: {"apple", "apply"} — both share the prefix "app".
"bat" group: {"bat", "bath"} — both share the prefix "bat".

The "ape" group has only one word, so it does not qualify, and "ba" was excluded for being too short.

Solution Implementation

1from typing import List
2from collections import Counter
3
4
5class Solution:
6    def prefixConnected(self, words: List[str], k: int) -> int:
7        # Counter to track how many words share each length-k prefix
8        prefix_count = Counter()
9
10        for word in words:
11            # Only consider words long enough to have a prefix of length k
12            if len(word) >= k:
13                prefix = word[:k]
14                prefix_count[prefix] += 1
15
16        # Count how many distinct prefixes are shared by more than one word
17        return sum(count > 1 for count in prefix_count.values())
18

1class Solution {
2    public int prefixConnected(String[] words, int k) {
3        // Map to store each length-k prefix and how many times it appears
4        Map<String, Integer> prefixCount = new HashMap<>();
5
6        // Iterate over every word and record its prefix of length k
7        for (String word : words) {
8            // Only consider words that are long enough to have a length-k prefix
9            if (word.length() >= k) {
10                // Extract the prefix of length k and increment its counter
11                prefixCount.merge(word.substring(0, k), 1, Integer::sum);
12            }
13        }
14
15        // Count how many prefixes are shared by more than one word
16        int answer = 0;
17        for (int count : prefixCount.values()) {
18            // A prefix counts only if it is shared by at least two words
19            if (count > 1) {
20                ++answer;
21            }
22        }
23
24        return answer;
25    }
26}
27

1class Solution {
2public:
3    int prefixConnected(vector<string>& words, int k) {
4        // Map to count how many words share each k-length prefix
5        unordered_map<string, int> prefixCount;
6
7        // Iterate over every word in the input
8        for (const auto& word : words) {
9            // Only consider words long enough to have a k-length prefix
10            if (word.size() >= static_cast<size_t>(k)) {
11                // Extract the first k characters and increment its count
12                ++prefixCount[word.substr(0, k)];
13            }
14        }
15
16        // Count how many distinct prefixes are shared by more than one word
17        int answer = 0;
18        for (const auto& [prefix, count] : prefixCount) {
19            if (count > 1) {
20                ++answer;
21            }
22        }
23
24        return answer;
25    }
26};
27

1/**
2 * Counts how many distinct k-length prefixes are shared by more than one word.
3 *
4 * @param words - The list of input words.
5 * @param k - The length of the prefix to consider.
6 * @returns The number of k-length prefixes that appear in at least two words.
7 */
8function prefixConnected(words: string[], k: number): number {
9    // Map each k-length prefix to the number of words that start with it.
10    const prefixCount = new Map<string, number>();
11
12    for (const word of words) {
13        // Only words at least k characters long can have a valid k-length prefix.
14        if (word.length >= k) {
15            const prefix = word.substring(0, k);
16            prefixCount.set(prefix, (prefixCount.get(prefix) ?? 0) + 1);
17        }
18    }
19
20    // Count prefixes that are shared by more than one word.
21    let answer = 0;
22    for (const count of prefixCount.values()) {
23        if (count > 1) {
24            answer++;
25        }
26    }
27
28    return answer;
29}
30

Time and Space Complexity

Time Complexity: O(n × k)

The code iterates over each word in words, which contains n words. For each word w, when len(w) >= k, the slicing operation w[:k] creates a substring of length k, and computing its hash to insert into the Counter takes O(k) time. Therefore, processing all n words costs O(n × k). The final sum(v > 1 for v in cnt.values()) traverses at most n distinct prefixes, costing O(n), which does not dominate. Hence, the overall time complexity is O(n × k).

Space Complexity: O(n)

The Counter object cnt stores at most n distinct prefix keys (one per word). Although each key is a string of length up to k, the reference answer treats the prefix as a unit of storage, so the space used is proportional to the number of words n. Thus, the overall space complexity is O(n).

Pattern Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Forgetting to Skip Words Shorter Than `k`

A frequent mistake is slicing the prefix without first checking the word's length. In Python, slicing word[:k] does not raise an error when len(word) < k; instead, it silently returns the entire (shorter) string. This means a short word like "ab" with k = 4 would produce the prefix "ab", which could then incorrectly match another short word or be treated as a valid length-k prefix.

Incorrect code:

for word in words:
    prefix = word[:k]          # No length check!
    prefix_count[prefix] += 1  # Short words wrongly counted

Solution: Always guard the prefix extraction with an explicit length check, so words too short to form a length-k prefix are ignored entirely.

for word in words:
    if len(word) >= k:         # Guard against short words
        prefix_count[word[:k]] += 1

Pitfall 2: Counting Groups Instead of Members (Off-by-Definition Error)

Another common error is misunderstanding what to count. Some solutions return the total number of distinct prefixes, or the total number of words that belong to some group, rather than the number of groups with at least two words.

Incorrect interpretations:

# Wrong: counts ALL distinct prefixes, including singletons
return len(prefix_count)

# Wrong: counts total words in non-trivial groups, not the group count
return sum(count for count in prefix_count.values() if count > 1)

Solution: Count only the prefixes whose frequency is strictly greater than 1. Each such prefix represents exactly one valid connected group.

return sum(count > 1 for count in prefix_count.values())

Pitfall 3: Mishandling Duplicate Strings

It is tempting to deduplicate the input with a set for "efficiency," but the problem explicitly states that duplicate strings count as distinct words. Using a set would collapse two identical words into one, causing a valid two-word group to be missed.

Incorrect code:

for word in set(words):        # Deduplication loses valid pairs!
    if len(word) >= k:
        prefix_count[word[:k]] += 1

For example, words = ["abc", "abc"], k = 3 should yield 1 group, but deduplicating reduces the count to 0.

Solution: Iterate over the original list directly and increment the counter for every occurrence, so identical strings are each tallied separately.

for word in words:             # Preserve every occurrence
    if len(word) >= k:
        prefix_count[word[:k]] += 1

Ready to land your dream job?

Unlock your dream job with a 5-minute quiz for a personalized study roadmap!

Get My Roadmap

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Get a Personalized Study Roadmap:

Core Learning

LLD/OOP Interview

Live Training

Additional Resources

AI Engineering