1772. Sort Features by Popularity
Problem Description
In the given problem, we have two lists: features
and responses
.
features
is a list of single-word strings where each word represents a unique feature of a product.
responses
is a list of strings where each string may contain several words expressing features liked by a user. These strings represent user feedback from a survey.
The goal is to sort the features array in order of decreasing popularity. Popularity of a feature is defined by the number of responses that mention the feature at least once. One thing to note is that even if a response mentions a feature multiple times, it still only counts once towards the feature's popularity.
Furthermore, if two features have the same popularity, our sorting should prioritize them based on the order in which they appear in the features
list.
Finally, we need to return the features sorted by the rules described above.
Intuition
For the solution, we need to do the following:
-
Count how many times each feature is mentioned in the
responses
list. To ensure that each feature is counted only once per response, we can create a set of words from each response. This will automatically remove duplicates. -
After we have the counts of each feature as per the responses, we need to sort the
features
list. The sorting is not regular alphabetical or numerical sorting; it is based on the popularity of the features.
Therefore, we need a custom sort that:
- Sorts features in descending order of their popularity.
- When features have the same popularity count, they should retain their original relative order as given in the input
features
array.
- To implement this, we use a sorting function with a custom key. In Python, this can be achieved via the
sorted
function with alambda
function as the key. Thelambda
function returns the negative count of how many times a feature occurs, thus enabling sorting in descending order based on popularity.
By following this approach, we can efficiently sort the features according to the problem's requirements.
Learn more about Sorting patterns.
Solution Approach
To implement the solution, the code utilizes a couple of Python-specific techniques and data structures. Here's a step-by-step walkthrough of the solution:
-
Import the
Counter
class from thecollections
module. ACounter
is a dictionary subclass designed for counting hashable objects. It's an unordered collection where elements are stored as dictionary keys and their counts as dictionary values. -
Initialize the
Counter
by the variable namecnt
. It's going to be used to store the number of times each feature appears in the responses. -
Iterate through each response in the
responses
list using afor
loop. -
Inside the loop, split the current response
r
usingr.split()
, converting it into a list of words. Then convert this list into a setws
to ensure that each word is unique and duplicate mentions of a feature in a single response are eliminated. -
Iterate through each word
s
in the setws
. If it's a word that represents a feature, it should be counted. Increment its count in ourCounter
by doingcnt[s] += 1
. -
Now, we need to sort the
features
list by the count of each feature while preserving the order of features with the same popularity. To do this, we use thesorted
function with a customlambda
function as the sorting key:lambda x: -cnt[x]
. This lambda function returns the count from theCounter
for each elementx
infeatures
, negated so that the sorting is in descending order. -
The
sorted
function with the custom lambda function will sort the features by descending popularity, and in case of a tie, it will preserve the original order, as it's stable sort (it maintains the relative order of records with equal values).
Here is the final sorted list:
1sorted_features = sorted(features, key=lambda x: -cnt[x])
This Python code gives us the desired output where the features
are sorted according to the specified rules.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through an example to illustrate the solution approach. Suppose we have the following features
list and responses
list:
1features = ["storage", "color", "battery", "screen"] 2responses = [ 3 "I wish the battery life was longer", 4 "Loving the new screen and high storage capacity", 5 "The color is so vibrant and the screen is the perfect size", 6 "Battery life is great, screen is great, storage just what I needed", 7 "I wish the color options were more vibrant" 8]
Now, let's apply our solution approach step by step:
-
Count Feature Mentions: First, we need to count how many times each feature is mentioned across all
responses
. According to the given approach, for each response, we convert it into a set of words so that duplicates within the same response are eliminated. -
Initialize Counter: We initialize a
Counter
object calledcnt
to keep the counts.1from collections import Counter 2cnt = Counter()
-
Process Responses: Let's loop through each
response
and update theCounter
with words only if they are in ourfeatures
list.1for r in responses: 2 ws = set(r.split()) 3 for s in ws: 4 if s in features: 5 cnt[s] += 1
After processing all responses,
cnt
might look like this:1cnt = Counter({'battery': 2, 'screen': 3, 'storage': 2, 'color': 2})
Here, 'screen' was mentioned in 3 different responses, while 'battery', 'storage', and 'color' were mentioned in 2.
-
Sort Features: Next, we sort the
features
by descending popularity using thesorted
function with alambda
key that uses theCounter
counts.1sorted_features = sorted(features, key=lambda x: -cnt[x])
The
sorted_features
list will look like this:1['screen', 'battery', 'storage', 'color']
The feature 'screen' comes first because it has the highest count. 'battery', 'storage', and 'color' all have the same count. However, 'battery' appears before 'storage' and 'color' in the original list, so it comes next. Likewise, 'storage' comes before 'color', which is why 'color' is last.
-
Result: Finally, we have our features sorted by popularity while preserving the order in which they appear in the
features
list when they have the same popularity:1sorted_features = ["screen", "battery", "storage", "color"]
This sorted list aligns with our problem's requirement of sorting the features by their popularity, ensuring that the ones with the same frequency of mentions maintain the order from the original
features
list.
Solution Implementation
1from typing import List
2from collections import Counter
3
4class Solution:
5 def sortFeatures(self, features: List[str], responses: List[str]) -> List[str]:
6 # Create a counter to keep track of the number of times each word appears in the responses
7 counter = Counter()
8
9 # Iterate through each response in the list of responses
10 for response in responses:
11 # Split the response into words and convert it to a set to remove duplicates
12 unique_words = set(response.split())
13
14 # Update the count for each unique word in the current response
15 for word in unique_words:
16 counter[word] += 1
17
18 # Sort the list of features based on their counts in a descending order
19 # Features that do not appear in the counter will have a default count of 0
20 sorted_features = sorted(features, key=lambda feature: -counter[feature])
21
22 # Return the list of features sorted by their popularity
23 return sorted_features
24
25# Example usage:
26# sol = Solution()
27# features = ["cooler", "lock", "touch"]
28# responses = ["I love the cooler cooler", "lock touch cool", "locker like touch"]
29# print(sol.sortFeatures(features, responses)) # Output: ["touch", "cooler", "lock"]
30
1class Solution {
2 public String[] sortFeatures(String[] features, String[] responses) {
3 // Create a map to count the occurrence of each word in the responses
4 Map<String, Integer> featureCounts = new HashMap<>();
5 for (String response : responses) {
6 // Create a set to store unique words in the current response
7 Set<String> uniqueWords = new HashSet<>();
8 // Split the response into words
9 for (String word : response.split(" ")) {
10 uniqueWords.add(word);
11 }
12 // Increment the count for each unique word
13 for (String word : uniqueWords) {
14 featureCounts.put(word, featureCounts.getOrDefault(word, 0) + 1);
15 }
16 }
17
18 int numFeatures = features.length;
19 // Initialize an array of indexes corresponding to the features array
20 Integer[] indexes = new Integer[numFeatures];
21 for (int i = 0; i < numFeatures; ++i) {
22 indexes[i] = i;
23 }
24
25 // Sort the indexes based on the feature occurrence counts and their original order
26 Arrays.sort(indexes, (index1, index2) -> {
27 int countDifference = featureCounts.getOrDefault(features[index2], 0) -
28 featureCounts.getOrDefault(features[index1], 0);
29 // If counts are equal, sort by the original order
30 return countDifference == 0 ? index1 - index2 : countDifference;
31 });
32
33 // Create an array to hold the sorted features
34 String[] sortedFeatures = new String[numFeatures];
35 for (int i = 0; i < numFeatures; ++i) {
36 sortedFeatures[i] = features[indexes[i]];
37 }
38
39 // Return the sorted array of features
40 return sortedFeatures;
41 }
42}
43
1#include <vector>
2#include <string>
3#include <unordered_map>
4#include <unordered_set>
5#include <sstream>
6#include <algorithm>
7#include <numeric>
8
9class Solution {
10public:
11 std::vector<std::string> sortFeatures(std::vector<std::string>& features, std::vector<std::string>& responses) {
12 // Map to store the count of each feature mentioned in the responses
13 std::unordered_map<std::string, int> featureCounts;
14
15 // Iterate through all the responses
16 for (auto& response : responses) {
17 std::stringstream ss(response);
18 std::string word;
19 std::unordered_set<std::string> uniqueWords;
20
21 // Extract each word from the response and add to a set to ensure uniqueness
22 while (ss >> word) {
23 uniqueWords.insert(word);
24 }
25
26 // Increase the count for each unique word found in the responses that match a feature
27 for (auto& word : uniqueWords) {
28 featureCounts[word]++;
29 }
30 }
31
32 int numFeatures = features.size();
33 std::vector<int> indices(numFeatures);
34 // Initialize the indices sequence
35 std::iota(indices.begin(), indices.end(), 0);
36
37 // Sort the indices based on the counts of the features
38 std::sort(indices.begin(), indices.end(), [&](int index1, int index2) -> bool {
39 int countDifference = featureCounts[features[index1]] - featureCounts[features[index2]];
40 // If counts differ, sort in descending order; if counts are same, sort by index
41 return countDifference > 0 || (countDifference == 0 && index1 < index2);
42 });
43
44 // Prepare the sorted list of features based on the sorted indices
45 std::vector<std::string> sortedFeatures(numFeatures);
46 for (int i = 0; i < numFeatures; ++i) {
47 sortedFeatures[i] = features[indices[i]];
48 }
49
50 // Return the sorted list of features
51 return sortedFeatures;
52 }
53};
54
1// Required imports or type definitions (if any) would be placed here
2
3let featureCounts: { [feature: string]: number } = {};
4
5// Function to sort features based on their frequency in the given responses
6function sortFeatures(features: string[], responses: string[]): string[] {
7 featureCounts = {}; // Resetting the feature counts for a fresh start
8
9 // Iterate through all the responses to count feature occurrences
10 responses.forEach(response => {
11 let uniqueWords: Set<string> = new Set<string>(); // To ensure uniqueness of words in a response
12 response.split(/\s+/).forEach(word => {
13 uniqueWords.add(word); // Add each word to the set
14 });
15
16 // Update the count for each unique word that matches a feature
17 uniqueWords.forEach(word => {
18 if (features.includes(word)) {
19 if (!featureCounts[word]) featureCounts[word] = 0;
20 featureCounts[word]++;
21 }
22 });
23 });
24
25 // Initialize indices for sorting
26 let indices: number[] = features.map((_, index) => index);
27
28 // Sort indices based on feature occurrence count and their original order
29 indices.sort((index1, index2) => {
30 let count1 = featureCounts[features[index1]] || 0; // Fallback to 0 if not found
31 let count2 = featureCounts[features[index2]] || 0;
32 return count2 - count1 || index1 - index2; // Descending by count, then by original index
33 });
34
35 // Prepare the sorted list of features
36 let sortedFeatures: string[] = indices.map(index => features[index]);
37
38 // Return the list of sorted features
39 return sortedFeatures;
40}
41
42// Example use:
43// const sorted = sortFeatures(["feature1", "feature2"], ["I love feature2", "feature1 is great"]);
44// console.log(sorted); // Logs feature names sorted by their frequency and original order
45
Time and Space Complexity
Time Complexity
The time complexity of the given code snippet is O(N + U + FlogF)
, where:
N
is the total number of words across all responses.U
is the number of unique words across all responses.F
is the number of features.
Here's how the time complexity breaks down:
- Splitting each response and adding words to the set has a complexity of
O(N)
, whereN
is the total number of words. This includes the cost of splitting the response into words and processing each word. - Counting occurrences of each unique word involves a constant-time operation per unique word, resulting in a complexity of
O(U)
. - Sorting the features list is
O(FlogF)
because we are sortingF
features using a comparison sort where features are compared based on their counts.
Space Complexity
The space complexity of the given code snippet is O(U + F)
, where:
U
is the number of unique words across all responses. This is because the counter holds counts for each unique word.F
is the number of features, which accounts for the space to store thefeatures
list.
We have to store the counts for all unique words and the final sorted features
list, contributing to the space complexity.
Learn more about how to find time and space complexity quickly using problem constraints.
Which data structure is used in a depth first search?
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first