Facebook Pixel

1772. Sort Features by Popularity πŸ”’

MediumArrayHash TableStringSorting
Leetcode Link

Problem Description

You're developing a product with multiple features and want to understand which features are most popular based on user feedback.

You have two inputs:

  • features: An array of strings where each string represents a feature name
  • responses: An array of strings where each string contains space-separated words representing a user's response

The goal is to sort the features based on their popularity, which is defined as the number of responses that mention the feature at least once.

Key rules for sorting:

  1. Features should be sorted in descending order by popularity (most popular first)
  2. If two features have the same popularity, maintain their original order from the features array
  3. If a feature appears multiple times in a single response, it only counts as 1 toward that feature's popularity

For example, if:

  • features = ["camera", "battery", "screen"]
  • responses = ["I love the camera and camera quality", "battery life is great", "camera is amazing"]

Then:

  • "camera" appears in 2 responses (popularity = 2)
  • "battery" appears in 1 response (popularity = 1)
  • "screen" appears in 0 responses (popularity = 0)

The output would be ["camera", "battery", "screen"] sorted by popularity.

The solution uses a Counter to track how many responses contain each feature. For each response, it converts the response to a set of words (eliminating duplicates within that response) and increments the count for each word. Finally, it sorts the features list using the negative count as the key to achieve descending order.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The core challenge is counting how many responses mention each feature while ensuring duplicate mentions within the same response don't inflate the count.

The natural approach is to think about what we need to track:

  1. For each feature, we need to know how many different responses mentioned it
  2. We need to handle the duplicate word issue within each response

The key insight is that when processing each response, we should first convert it to a set of unique words. This immediately solves the duplicate problem - if "camera" appears 3 times in one response, the set will only contain it once.

Once we have unique words per response, we can simply count occurrences across all responses. A hash table (Counter in Python) is perfect for this - we iterate through each response, extract unique words, and increment the count for each word we see.

For the sorting step, we realize that Python's sorted() function maintains the original order when elements have equal sort keys (stable sort). This means if we sort the original features array by popularity, features with the same popularity will naturally maintain their original order.

The trick with using -cnt[w] as the sort key is elegant: since Python sorts in ascending order by default, negating the count gives us descending order. Features not in cnt have a count of 0, and -0 is still 0, so they'll appear at the end in their original order.

This approach is efficient because we only scan each response once, and the sorting is done on the typically small features array rather than on all words found in responses.

Learn more about Sorting patterns.

Solution Approach

The implementation follows a straightforward process using a hash table for counting and custom sorting for the final result.

Step 1: Initialize the Counter

cnt = Counter()

We create a Counter object (hash table) to track how many responses contain each word.

Step 2: Process Each Response

for s in responses:
    for w in set(s.split()):
        cnt[w] += 1

For each response string s:

  • Split the response into individual words using s.split()
  • Convert the list of words to a set() to remove duplicates within that response
  • For each unique word w in the set, increment its count in the hash table

The use of set() is crucial here - it ensures that if a word appears multiple times in one response, it only contributes 1 to the popularity count.

Step 3: Sort Features by Popularity

return sorted(features, key=lambda w: -cnt[w])

Sort the original features array using a custom key function:

  • The key function lambda w: -cnt[w] returns the negative count for each feature
  • Using negative values sorts in descending order (most popular first)
  • If a feature doesn't exist in cnt, cnt[w] returns 0, so -cnt[w] is 0
  • Python's sorted() is stable, meaning features with equal popularity maintain their original relative order

Time Complexity: O(N * M + F * log F) where:

  • N is the number of responses
  • M is the average number of unique words per response
  • F is the number of features

Space Complexity: O(W) where W is the total number of unique words across all responses

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example to illustrate the solution approach:

Input:

  • features = ["storage", "battery", "display", "camera"]
  • responses = ["love the storage and battery battery", "camera quality is good", "battery lasts long and storage is huge", "display is broken"]

Step 1: Initialize Counter We start with an empty counter: cnt = {}

Step 2: Process Each Response

Response 1: "love the storage and battery battery"

  • Split into words: ["love", "the", "storage", "and", "battery", "battery"]
  • Convert to set: {"love", "the", "storage", "and", "battery"} (duplicate "battery" removed)
  • Update counter:
    • cnt = {"love": 1, "the": 1, "storage": 1, "and": 1, "battery": 1}

Response 2: "camera quality is good"

  • Split into words: ["camera", "quality", "is", "good"]
  • Convert to set: {"camera", "quality", "is", "good"}
  • Update counter:
    • cnt = {"love": 1, "the": 1, "storage": 1, "and": 1, "battery": 1, "camera": 1, "quality": 1, "is": 1, "good": 1}

Response 3: "battery lasts long and storage is huge"

  • Split into words: ["battery", "lasts", "long", "and", "storage", "is", "huge"]
  • Convert to set: {"battery", "lasts", "long", "and", "storage", "is", "huge"}
  • Update counter (incrementing existing entries):
    • cnt["battery"] becomes 2 (was 1)
    • cnt["storage"] becomes 2 (was 1)
    • cnt["and"] becomes 2 (was 1)
    • cnt["is"] becomes 2 (was 1)
    • New words get count 1
    • cnt = {"love": 1, "the": 1, "storage": 2, "and": 2, "battery": 2, "camera": 1, "quality": 1, "is": 2, "good": 1, "lasts": 1, "long": 1, "huge": 1}

Response 4: "display is broken"

  • Split into words: ["display", "is", "broken"]
  • Convert to set: {"display", "is", "broken"}
  • Update counter:
    • cnt["display"] becomes 1 (new)
    • cnt["is"] becomes 3 (was 2)
    • cnt["broken"] becomes 1 (new)

Step 3: Sort Features by Popularity

Now we sort our original features array using the counts:

  • "storage": cnt["storage"] = 2, sort key = -2
  • "battery": cnt["battery"] = 2, sort key = -2
  • "display": cnt["display"] = 1, sort key = -1
  • "camera": cnt["camera"] = 1, sort key = -1

Sorting by these keys (in ascending order of negative counts = descending order of actual counts):

  1. "storage" and "battery" both have sort key -2 (highest popularity)
  2. Since they have equal popularity, they maintain original order: "storage" before "battery"
  3. "display" and "camera" both have sort key -1 (lower popularity)
  4. They also maintain original order: "display" before "camera"

Final Output: ["storage", "battery", "display", "camera"]

The features are now sorted by popularity (2, 2, 1, 1) while maintaining original order for ties.

Solution Implementation

1from typing import List
2from collections import Counter
3
4class Solution:
5    def sortFeatures(self, features: List[str], responses: List[str]) -> List[str]:
6        # Create a counter to track frequency of each word across all responses
7        word_frequency = Counter()
8      
9        # Process each response
10        for response in responses:
11            # Convert response to set of unique words to count each word only once per response
12            unique_words = set(response.split())
13          
14            # Increment counter for each unique word in this response
15            for word in unique_words:
16                word_frequency[word] += 1
17      
18        # Sort features by their frequency in descending order
19        # Features with higher frequency (more responses mentioning them) come first
20        # Features not mentioned in responses will have frequency 0
21        sorted_features = sorted(features, key=lambda feature: -word_frequency[feature])
22      
23        return sorted_features
24
1class Solution {
2    public String[] sortFeatures(String[] features, String[] responses) {
3        // Map to store the frequency count of each feature word across all responses
4        Map<String, Integer> featureFrequency = new HashMap<>();
5      
6        // Process each response to count feature occurrences
7        for (String response : responses) {
8            // Use a set to track features already seen in current response
9            // (each feature counts only once per response)
10            Set<String> visitedFeatures = new HashSet<>();
11          
12            // Split response into individual words
13            for (String word : response.split(" ")) {
14                // If this word hasn't been counted for this response yet
15                if (visitedFeatures.add(word)) {
16                    // Increment the frequency count for this word
17                    featureFrequency.merge(word, 1, Integer::sum);
18                }
19            }
20        }
21      
22        int featuresCount = features.length;
23      
24        // Create an array of indices to sort features indirectly
25        Integer[] indices = new Integer[featuresCount];
26        for (int i = 0; i < featuresCount; i++) {
27            indices[i] = i;
28        }
29      
30        // Sort indices based on feature frequency (descending) and original order (ascending)
31        Arrays.sort(indices, (indexA, indexB) -> {
32            // Get frequency counts for both features (default to 0 if not found)
33            int frequencyA = featureFrequency.getOrDefault(features[indexA], 0);
34            int frequencyB = featureFrequency.getOrDefault(features[indexB], 0);
35          
36            // If frequencies are equal, maintain original order (stable sort)
37            // Otherwise, sort by frequency in descending order
38            return frequencyA == frequencyB ? indexA - indexB : frequencyB - frequencyA;
39        });
40      
41        // Build the result array using sorted indices
42        String[] sortedFeatures = new String[featuresCount];
43        for (int i = 0; i < featuresCount; i++) {
44            sortedFeatures[i] = features[indices[i]];
45        }
46      
47        return sortedFeatures;
48    }
49}
50
1class Solution {
2public:
3    vector<string> sortFeatures(vector<string>& features, vector<string>& responses) {
4        // Map to store the frequency count of each feature word across all responses
5        unordered_map<string, int> featureFrequency;
6      
7        // Process each response to count feature occurrences
8        for (const auto& response : responses) {
9            // Use string stream to parse words from the response
10            istringstream stringStream(response);
11            string word;
12          
13            // Use set to track unique words in current response (avoid counting duplicates)
14            unordered_set<string> uniqueWords;
15          
16            // Extract each word from the response
17            while (stringStream >> word) {
18                uniqueWords.insert(word);
19            }
20          
21            // Increment frequency count for each unique word in this response
22            for (const auto& uniqueWord : uniqueWords) {
23                ++featureFrequency[uniqueWord];
24            }
25        }
26      
27        // Create index array for sorting features by their indices
28        int featuresCount = features.size();
29        vector<int> indices(featuresCount);
30      
31        // Initialize indices with values [0, 1, 2, ..., n-1]
32        iota(indices.begin(), indices.end(), 0);
33      
34        // Sort indices based on feature frequency (descending) and original order (ascending)
35        sort(indices.begin(), indices.end(), [&](int indexI, int indexJ) {
36            int frequencyI = featureFrequency[features[indexI]];
37            int frequencyJ = featureFrequency[features[indexJ]];
38          
39            // If frequencies are equal, maintain original order; otherwise sort by frequency
40            return frequencyI == frequencyJ ? indexI < indexJ : frequencyI > frequencyJ;
41        });
42      
43        // Build the result array using sorted indices
44        vector<string> sortedFeatures(featuresCount);
45        for (int i = 0; i < featuresCount; ++i) {
46            sortedFeatures[i] = features[indices[i]];
47        }
48      
49        return sortedFeatures;
50    }
51};
52
1/**
2 * Sorts features based on their frequency in responses.
3 * Features mentioned more frequently in responses appear first.
4 * Features with the same frequency maintain their original relative order.
5 * 
6 * @param features - Array of feature names to be sorted
7 * @param responses - Array of response strings containing features separated by spaces
8 * @returns Sorted array of features based on frequency in responses
9 */
10function sortFeatures(features: string[], responses: string[]): string[] {
11    // Map to store the frequency count of each feature across all responses
12    const featureFrequencyMap: Map<string, number> = new Map();
13  
14    // Count occurrences of each feature in responses
15    for (const response of responses) {
16        // Track visited features in current response to avoid double counting
17        const visitedFeaturesInResponse: Set<string> = new Set();
18      
19        // Split response into individual words/features
20        for (const feature of response.split(' ')) {
21            // Skip if this feature was already counted in this response
22            if (visitedFeaturesInResponse.has(feature)) {
23                continue;
24            }
25          
26            // Mark feature as visited for this response
27            visitedFeaturesInResponse.add(feature);
28          
29            // Increment the frequency count for this feature
30            featureFrequencyMap.set(feature, (featureFrequencyMap.get(feature) || 0) + 1);
31        }
32    }
33  
34    // Create an array of indices to track original positions
35    const featuresCount: number = features.length;
36    const indicesArray: number[] = Array.from({ length: featuresCount }, (_, index) => index);
37  
38    // Sort indices based on feature frequency (descending) and original order (ascending for ties)
39    indicesArray.sort((indexA: number, indexB: number) => {
40        const frequencyA: number = featureFrequencyMap.get(features[indexA]) || 0;
41        const frequencyB: number = featureFrequencyMap.get(features[indexB]) || 0;
42      
43        // If frequencies are equal, maintain original order; otherwise sort by frequency (descending)
44        return frequencyA === frequencyB ? indexA - indexB : frequencyB - frequencyA;
45    });
46  
47    // Map sorted indices back to feature names
48    return indicesArray.map(index => features[index]);
49}
50

Time and Space Complexity

Time Complexity: O(m * k + n * log n)

Where:

  • m is the number of responses
  • k is the average number of words per response
  • n is the number of features

Breaking down the operations:

  1. Iterating through all responses: O(m)
  2. For each response, splitting into words: O(k)
  3. Converting to set and iterating: O(k)
  4. Overall counting phase: O(m * k)
  5. Sorting features by count: O(n * log n)

The reference answer simplifies to O(n * log n) by assuming that m * k is typically smaller than n * log n in practical scenarios, or that the counting phase is considered a preprocessing step with the sorting being the dominant operation.

Space Complexity: O(m * k)

The space is used for:

  • Counter dictionary storing at most O(m * k) unique words
  • Temporary set for each response: O(k)
  • Split words list for each response: O(k)

The overall space complexity is dominated by the Counter storage of unique words across all responses.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Counting Multiple Occurrences Within a Single Response

One of the most common mistakes is forgetting to use set() when processing each response, which would incorrectly count a feature multiple times if it appears repeatedly in the same response.

Incorrect Implementation:

# WRONG: This counts "camera" twice if a response says "camera camera"
for response in responses:
    for word in response.split():  # No set() used here
        word_frequency[word] += 1

Correct Implementation:

# CORRECT: Using set() ensures each word counts only once per response
for response in responses:
    for word in set(response.split()):
        word_frequency[word] += 1

2. Case Sensitivity Issues

The current solution is case-sensitive, meaning "Camera" and "camera" would be treated as different features. If the problem requires case-insensitive matching, you need to normalize the case.

Solution for Case-Insensitive Matching:

def sortFeatures(self, features: List[str], responses: List[str]) -> List[str]:
    word_frequency = Counter()
  
    # Create a mapping of lowercase to original feature names
    feature_map = {f.lower(): f for f in features}
  
    for response in responses:
        # Convert response to lowercase before processing
        unique_words = set(response.lower().split())
        for word in unique_words:
            word_frequency[word] += 1
  
    # Sort using lowercase version for lookup
    return sorted(features, key=lambda f: -word_frequency[f.lower()])

3. Partial Word Matching Confusion

The solution uses exact word matching after splitting by spaces. It won't match partial words or substrings. For example, if a feature is "battery" and a response contains "batteries", it won't be counted.

If Partial Matching is Required:

def sortFeatures(self, features: List[str], responses: List[str]) -> List[str]:
    feature_count = {feature: 0 for feature in features}
  
    for response in responses:
        response_lower = response.lower()
        for feature in features:
            # Check if feature appears as substring (not just whole word)
            if feature.lower() in response_lower:
                feature_count[feature] += 1
  
    return sorted(features, key=lambda f: -feature_count[f])

4. Assuming Features Are Always Present in Responses

The solution correctly handles features that never appear in responses (they get a count of 0), but developers might incorrectly assume all features will be mentioned and try to access counts directly without the Counter's default behavior.

Incorrect Assumption:

# WRONG: This would raise KeyError for features not in responses
regular_dict = {}
for response in responses:
    for word in set(response.split()):
        regular_dict[word] = regular_dict[word] + 1  # KeyError if word not in dict

Correct Handling:

# CORRECT: Counter handles missing keys gracefully
word_frequency = Counter()
# Counter returns 0 for missing keys, no KeyError
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Depth first search is equivalent to which of the tree traversal order?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More