2512. Reward Top K Students


Problem Description

In this problem, you are dealing with student feedback reports that contain words signifying either positive or negative feedback. Each student starts with zero points. The points a student has are affected by the words in their feedback report: every time a positive word is mentioned, their points are increased by 3, and for each negative word, their points are reduced by 1.

You are given two arrays positive_feedback and negative_feedback which list the words that count as positive and negative feedback, respectively. Alongside this, there is an array report which contains feedback reports for the students, and a corresponding student_id array that links each feedback report to a unique student ID.

Your task is to calculate the final points for each student after analyzing the feedback reports, and then rank the students according to their points. If there's a tie in points, the student with a lower ID should come first. After ranking them, you will return a list of the IDs of the top k students, where k is a given integer.

Intuition

To find a solution to this problem, let's consider how we would approach this task manually:

  • First, we recognize that each word in the feedback is binary; it either has a positive effect or a negative effect on the student's points. So, we will want to quickly determine whether a word is positive or negative. To expedite this word lookup process, we'll use a hash set for both positive_feedback and negative_feedback since hash sets can substantially improve the speed of membership checking.

  • Next, we associate each report with the student that it belongs to. We'll need to calculate the total points for each student by going through each word in their report and adjusting their score according to whether the word is in the positive set or the negative set.

  • Once we have calculated the points for each student, we'll need to rank the students. However, because two students can have the same number of points, we'll need a way to break ties. The problem specifies that in such cases, the student with the lower student ID should rank higher.

  • The final step is to sort the students based on their points, and in the case of ties, their IDs. After sorting, we will just select the top k students as per the sorted order.

To implement this in code, we'll traverse the report array and calculate the points for each student. We'll keep the scores and their respective student IDs in an array, then sort the array by the points in descending order, and by student ID in ascending order to break ties. Lastly, we extract the first k elements from the sorted array which represent the top k students.

Learn more about Sorting and Heap (Priority Queue) patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the following shows the order of node visit in a Breadth-first Search?

Solution Approach

The solution provided uses a combination of data structures (sets for quick word lookup and arrays for storing final scores) and algorithms (iterative score calculation and sorting).

Here's a step-by-step breakdown:

  • Step 1: Convert the lists of positive and negative feedback words into sets, ps and ns respectively. This is done to take advantage of the average constant time complexity for lookup operations in a set. The use of sets ensures that when we check if a word is either positive or negative, the operation is fast and efficient.

  • Step 2: Initialize an empty array arr. This will store tuples, where each tuple consists of a student’s total score and their ID (t, sid).

  • Step 3: Loop through each feedback report paired with the respective student_id. Inside the loop, set a temporary score t to zero; this variable will hold the cumulative score for the student as we iterate through their feedback.

  • Step 4: Iterate over every word w in each feedback report by splitting the report's string. Within this loop, we perform the following checks for each word:

    • If the word is in the positive set ps, increase t by 3.
    • If the word is in the negative set ns, decrease t by 1. As we iterate through the words, we are effectively calculating the net score of a student based on the positive and negative feedback words.
  • Step 5: After calculating the total score t for a student, we append a tuple containing the score and the student's ID to the array arr.

  • Step 6: Sort the array arr first by the score in descending order, and then by student ID in ascending order. The sorting technique used is a custom sort based on a lambda function as the key, which implements the rules for ranking: (-x[0], x[1]). This means we are sorting primarily by the first element of the tuple (the score) in descending order and then by the second element (the student ID) in ascending order.

  • Step 7: Now that we have a sorted array of students by their score and ID, we extract the first k elements from the array. Specifically, we generate a new list that contains just the student IDs of these top k elements.

The overall complexity is governed by the time it takes to go through the reports and calculate the scores (which is linear with respect to the number of words in all reports), and the time it takes to sort the array of scores (which depends on the sorting algorithm used, typically O(n log n), where n is the number of students). Therefore, the solution is efficient and scales well with the number of students and the length of their feedback reports.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Breadth first search can be used to find the shortest path between two nodes in a directed graph.

Example Walkthrough

Let's use a small example to illustrate the solution approach:

Suppose we have the following inputs:

  • positive_feedback = ["excellent", "good", "superb"]
  • negative_feedback = ["poor", "bad", "horrible"]
  • report = ["excellent work but poor attention", "bad results but good effort", "superb participation", "good performance"]
  • student_id = [102, 101, 104, 103]
  • k = 2

Step 1: Convert the positive and negative feedback arrays into sets for faster lookup:

  • ps = {"excellent", "good", "superb"}
  • ns = {"poor", "bad", "horrible"}

Step 2: Initialize an empty array arr to store tuples of the form (score, student_id).

Step 3: Begin iterating over each report and the respective student_id. For example, start with the first report "excellent work but poor attention" and student_id 102.

Step 4: For the first report, set a temporary score t to zero. Then iterate over every word:

  • "excellent" is in ps, so add 3 to t: t = 0 + 3
  • "work" is not in either set, t remains unchanged
  • "but" is not in either set, t remains unchanged
  • "poor" is in ns, so subtract 1 from t: t = 3 - 1
  • "attention" is not in either set, t remains unchanged

Step 5: After processing the first report, t is 2. Append the tuple (2, 102) to arr.

Repeat Steps 3-5 for the remaining reports:

  • For the second report "bad results but good effort" with student_id 101, the final score t will be 1. Append (1, 101).
  • For the third report "superb participation" with student_id 104, the final score t is 3. Append (3, 104).
  • For the fourth report "good performance" with student_id 103, the final score t is 3. Append (3, 103).

Step 6: Now, arr = [(2, 102), (1, 101), (3, 104), (3, 103)]. Sort arr based on the score in descending order and student ID in ascending order. The sorted arr will be [(3, 103), (3, 104), (2, 102), (1, 101)].

Step 7: Extract the top k student IDs from the sorted list, which gives [103, 104] as the top 2 students.

Thus, the solution process allows us to rank the students by their feedback scores efficiently, taking into consideration both the feedback words and the student IDs for tie-breaking.

Solution Implementation

1from typing import List
2
3class Solution:
4    def topStudents(
5        self,
6        positive_feedback: List[str],
7        negative_feedback: List[str],
8        reports: List[str],
9        student_ids: List[int],
10        k: int,
11    ) -> List[int]:
12        # Convert lists of positive and negative feedback into sets for O(1) look-up time
13        positive_set = set(positive_feedback)
14        negative_set = set(negative_feedback)
15
16        scores = []  # Initialize an empty list to store tuples of scores and student IDs
17
18        # Iterate over the students' reports and their corresponding student IDs
19        for student_id, report in zip(student_ids, reports):
20            total_score = 0  # Initialize total_score for the student
21
22            # For every word in a student's report, check if it is in 
23            # positive feedback or negative feedback, and update total_score
24            for word in report.split():
25                if word in positive_set:
26                    total_score += 3
27                elif word in negative_set:
28                    total_score -= 1
29
30            # Append the total_score and student_id as a tuple to the scores list
31            scores.append((total_score, student_id))
32
33        # Sort the scores in descending order by score, and then by student_id for ties
34        scores.sort(key=lambda x: (-x[0], x[1]))
35
36        # Return a list of the top k student_ids with the highest scores
37        return [score[1] for score in scores[:k]]
38
1import java.util.*;
2
3class Solution {
4    public List<Integer> topStudents(String[] positiveFeedback, String[] negativeFeedback,
5                                     String[] reports, int[] studentIds, int k) {
6      
7        // Convert positive and negative feedbacks to sets for quick lookup
8        Set<String> positiveFeedbackSet = new HashSet<>(Arrays.asList(positiveFeedback));
9        Set<String> negativeFeedbackSet = new HashSet<>(Arrays.asList(negativeFeedback));
10      
11        // Initialize the number of reports
12        int numOfReports = reports.length;
13      
14        // Create an array to store scores and corresponding student IDs
15        int[][] scoresAndStudentIds = new int[numOfReports][0];
16      
17        // Iterate over each report
18        for (int i = 0; i < numOfReports; ++i) {
19            // Get the student ID for the current report
20            int studentId = studentIds[i];
21            // Initialize score for the current report
22            int score = 0;
23          
24            // Split the report into words
25            for (var word : reports[i].split(" ")) {
26                // Check if the word is in positive feedback set
27                if (positiveFeedbackSet.contains(word)) {
28                    score += 3; // Add 3 to the score for positive feedback
29                } else if (negativeFeedbackSet.contains(word)) {
30                    score -= 1; // Subtract 1 from the score for negative feedback
31                }
32            }
33            // Assign computed score and student ID to the scores array
34            scoresAndStudentIds[i] = new int[] {score, studentId};
35        }
36      
37        // Sort the array first by scores in descending order, and then by student IDs in ascending order
38        Arrays.sort(scoresAndStudentIds, (a, b) -> {
39            if (a[0] == b[0]) return a[1] - b[1]; // Same score, sort by ID
40            return b[0] - a[0]; // Different scores, sort by score
41        });
42      
43        // Initialize the list to store top k student IDs
44        List<Integer> topStudents = new ArrayList<>();
45      
46        // Extract the top k student IDs
47        for (int i = 0; i < k; ++i) {
48            topStudents.add(scoresAndStudentIds[i][1]);
49        }
50      
51        // Return the list of top k student IDs
52        return topStudents;
53    }
54}
55
1#include <vector>
2#include <string>
3#include <unordered_set>
4#include <sstream>
5#include <algorithm>
6
7class Solution {
8public:
9    std::vector<int> topStudents(std::vector<std::string>& positiveFeedback, std::vector<std::string>& negativeFeedback, 
10                                 std::vector<std::string>& reports, std::vector<int>& studentIds, int k) {
11      
12        // Convert the positive and negative feedback vectors to sets for efficient lookup
13        std::unordered_set<std::string> positiveFeedbackSet(positiveFeedback.begin(), positiveFeedback.end());
14        std::unordered_set<std::string> negativeFeedbackSet(negativeFeedback.begin(), negativeFeedback.end());
15        std::vector<std::pair<int, int>> scoresAndIds; // Vector to store the score and student ID pairs
16
17        int numReports = reports.size();
18        for (int i = 0; i < numReports; ++i) {
19            int studentId = studentIds[i];
20            std::vector<std::string> words = split(reports[i], ' ');
21            int totalScore = 0;
22          
23            // Calculate the score for each word in the report
24            for (auto& word : words) {
25                if (positiveFeedbackSet.count(word)) {
26                    totalScore += 3;   // Add 3 for positive feedback
27                } else if (negativeFeedbackSet.count(word)) {
28                    totalScore -= 1;   // Subtract 1 for negative feedback
29                }
30            }
31          
32            // Store the negative of the score for sorting in ascending order later
33            scoresAndIds.push_back({-totalScore, studentId});
34        }
35
36        // Sort the vector by score and then by student ID
37        std::sort(scoresAndIds.begin(), scoresAndIds.end());
38
39        std::vector<int> topStudentsIds;
40        for (int i = 0; i < k; ++i) {
41            topStudentsIds.emplace_back(scoresAndIds[i].second);
42        }
43        return topStudentsIds; // Return the top k students' IDs
44    }
45
46    // Utility function to split a string by a delimiter
47    std::vector<std::string> split(std::string& str, char delimiter) {
48        std::stringstream ss(str);
49        std::string item;
50        std::vector<std::string> result;
51        while (std::getline(ss, item, delimiter)) {
52            result.emplace_back(item);
53        }
54        return result;
55    }
56};
57
1function topStudents(
2    positiveFeedback: string[],
3    negativeFeedback: string[],
4    reports: string[],
5    studentIds: number[],
6    k: number,
7): number[] {
8    // Get the number of reports (and students as both should be equal)
9    const numStudents = studentIds.length;
10    // Initialize a Map to store the student IDs and their calculated scores
11    const scoresMap = new Map<number, number>();
12  
13    // Create Sets from the positive and negative feedback arrays for efficient lookup
14    const positiveFeedbackSet = new Set(positiveFeedback);
15    const negativeFeedbackSet = new Set(negativeFeedback);
16
17    // Iterate over each student's report
18    for (let i = 0; i < numStudents; i++) {
19        // Calculate the score for the current student's report
20        const score = reports[i]
21          .split(' ') // Split the report string into words
22          .reduce((total, word) => {
23              // If the word is in the positive feedback set, add 3 to the total score
24              if (positiveFeedbackSet.has(word)) {
25                  return total + 3;
26              }
27              // If the word is in the negative feedback set, subtract 1 from the total score
28              if (negativeFeedbackSet.has(word)) {
29                  return total - 1;
30              }
31              // If the word is neither positive nor negative, do not change the score
32              return total;
33          }, 0);
34      
35        // Update the Map with the current student's id and their calculated score
36        scoresMap.set(studentIds[i], score);
37    }
38
39    // Convert the Map into an array, sort it based on scores and then student IDs
40    // in case of ties, and then extract only the student IDs
41    return Array.from(scoresMap.entries())
42        .sort(([studentIdA, scoreA], [studentIdB, scoreB]) => {
43            // If scores are equal, sort by student ID ascendingly
44            if (scoreA === scoreB) {
45                return studentIdA - studentIdB;
46            }
47            // Otherwise, sort by score descendingly
48            return scoreB - scoreA;
49        })
50        .map(([studentId, _]) => studentId) // Extract only the student IDs from the sorted array
51        .slice(0, k); // Get the top 'k' student IDs based on their scores
52}
53
Not Sure What to Study? Take the 2-min Quiz:

Which two pointer technique does Quick Sort use?

Time and Space Complexity

Time Complexity

The time complexity of the provided code is O(n * log n + (|ps| + |ns| + n) * |s|). To break it down:

  • zip(student_id, report) operation is O(n) because it is iterating through the list of student IDs and their corresponding reports.
  • Splitting the report and checking if each word is in the sets positive_feedback (ps) or negative_feedback (ns) contributes O(n * |s|), where n is the number of reports and |s| is the average length of a report since we consider |s| for the average length of words in a report.
  • The presence checks in ps and ns is O(1) on average for hash sets, which is multiplied with n * |s| for all words in all reports.
  • Sorting the arr list of tuples based on a compound key contributes O(n * log n) since Python's sort function uses TimSort, which has this complexity.

Space Complexity

The space complexity of the code is O((|ps|+|ns|) * |s| + n). Here's how this is derived:

  • Space for the sets ps and ns contributes |ps| * |s| and |ns| * |s|, respectively, because each word of average length |s| is stored in these sets.
  • The arr list, which stores tuples of total score and student IDs for n students, contributes O(n) to space.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.

Which of the following method should we use to solve this problem?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫