1086. High Five
Problem Description
In this problem, we are given a list of students' scores in the form items
, where every element items[i] = [ID_i, score_i]
corresponds to a score from a student with ID_i
. Our task is to compute the average of the top five scores for each student. We must then return an array of pairs result
, where result[j] = [ID_j, topFiveAverage_j]
signifies the student with ID_j
and their average of top five scores. It is important to note that if a student has fewer than five scores, we calculate the average of the scores they have. The result
array should be ordered in ascending order by student ID.
To calculate the average, we sum up the top five scores of each student and then perform integer division by 5
. Integer division means that after dividing, if we have a decimal value, we drop the remainder and keep only the integer part.
Intuition
When tackling this problem, we can start by thinking about how we can pair students with their scores and how to efficiently compute the top five average. Here is a step-by-step approach to the problem:
-
Organizing Scores by Students: We can use a dictionary to map each student's ID to a list of their scores. A
defaultdict
from Python'scollections
module is perfect for this because it allows us to append scores to each student's list without initialization. -
Finding Top Scores: We are only interested in the top five scores for each student. The
nlargest
function from Python'sheapq
module can help us to find the top five scores efficiently. -
Computing Averages: Once we have the top five scores, we can simply sum them up and then divide by
5
to get the average. This must be integer division as specified by the problem. -
Creating the Result List: We need the resulting list to be sorted by student ID. We start with the smallest ID and go up to the largest ID we encountered, checking if the ID has any scores mapped to it. If it has, we compute the average and add it to the result list in a
[ID_i, topFiveAverage_i]
format. -
Returning the Result: After processing all student IDs, we end up with a list of ID and average pairs, which we return as the final answer.
By keeping these steps in mind, we can write a Python program that is both efficient and easy to understand, which is what the given solution code is doing.
Learn more about Sorting patterns.
Solution Approach
The implementation of the solution follows the intuition and can be understood in multiple steps:
-
Creating a Dictionary with Default Values: A
defaultdict
is created to store the scores for each student. The choice ofdefaultdict
is crucial as it initializes a new list automatically if a key (student ID) is not found, avoiding key errors or the need for manual initialization.1d = defaultdict(list)
-
Storing Scores by Student IDs: The solution iterates over each score in the
items
list, appending the score to the list associated with the student's ID in the dictionary. During this process, we keep track of the maximum student ID encountered to know the range we need to consider for our final result.1for i, x in items: 2 d[i].append(x) 3 m = max(m, i)
-
Iterating Over Possible Student IDs: We iterate from
1
to the maximum student IDm + 1
to ensure we cover all possible IDs. We check for each ID if there are scores associated with it.1for i in range(1, m + 1): 2 if xs := d[i]:
-
Calculating the Top Five Average: For each student with scores, we use the
nlargest
function to fetch the top five scores efficiently (in O(n log 5) time where n is the number of scores per student). We then sum these scores and perform integer division by5
to get the average.1avg = sum(nlargest(5, xs)) // 5
-
Storing the Results: The calculated averages along with the student IDs are stored in the result list
ans
using theappend
method. This ensures that the results are stored in the ascending order of the student IDs as the loop already iterates in ascending order.1ans.append([i, avg])
-
Returning the Sorted Results: Since our loop iteration ensures IDs are already in ascending order, we can directly return the
ans
list as our final sorted result.1return ans
The provided solution is efficient because it utilizes a dictionary to relate scores with students' IDs, and a heap-based nlargest
method to retrieve the top scores. This method minimizes sorting overhead since it does not require sorting all the scores. Furthermore, by keeping a running maximum of encountered student IDs, the solution only looks at IDs that have scores without unnecessarily processing unused IDs, thereby optimizing the performance.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Assume our input items
is:
1items = [[1, 91], [1, 92], [2, 93], [2, 97], [1, 60], [2, 77], [1, 65], [1, 87], [1, 100], [2, 100], [2, 76]]
This items
list consists of scores for two students with IDs 1 and 2. Let's apply our solution approach step-by-step.
-
Creating a Dictionary with Default Values:
First, we create a
defaultdict
to store the scores indexed by student IDs.1from collections import defaultdict 2d = defaultdict(list)
-
Storing Scores by Student IDs:
We go through each item in the list. We add each score to the list of scores for the corresponding student ID. Suppose the
max_id
variable keeps track of the maximum ID encountered so far.1max_id = 0 2for i, x in items: 3 d[i].append(x) 4 max_id = max(max_id, i) 5# Now, d = {1: [91, 92, 60, 65, 87, 100], 2: [93, 97, 77, 100, 76]}
-
Iterating Over Possible Student IDs:
Since we've seen that the maximum ID is 2, we will iterate over student IDs from 1 to 2.
1for i in range(1, max_id + 1): 2 if xs := d[i]:
-
Calculating the Top Five Average:
If a student has scores, we will select the top five scores, sum them, and divide by 5 to get the average.
1from heapq import nlargest 2ans = [] 3for i in range(1, max_id + 1): 4 if xs := d[i]: 5 avg = sum(nlargest(5, xs)) // 5 6 # For student 1: nlargest(5, [91, 92, 60, 65, 87, 100]) -> [100, 92, 91, 87, 65], sum -> 435, average -> 435//5 = 87 7 # For student 2: nlargest(5, [93, 97, 77, 100, 76]) -> [100, 97, 93, 77, 76], sum -> 443, average -> 443//5 = 88 8 ans.append([i, avg])
-
Storing the Results:
We have calculated averages for both student IDs, so we add them to our answer list.
1ans.append([i, avg]) 2# After iteration, ans = [[1, 87], [2, 88]]
-
Returning the Sorted Results:
Lastly, we return the list
ans
, which is already sorted by student IDs, as our final result.1return ans 2# Result: [[1, 87], [2, 88]]
In this small example, student 1 had six scores and student 2 had five. After taking the top five scores for each, the averages of their top five scores are 87 and 88, respectively. Thus, the final result reflecting each student's ID and their calculated top-five average is:
1[[1, 87], [2, 88]]
Solution Implementation
1from collections import defaultdict
2from heapq import nlargest
3
4class Solution:
5 def highFive(self, items: List[List[int]]) -> List[List[int]]:
6 # Dictionary to store scores for each ID
7 scores_dict = defaultdict(list)
8 # Variable to keep track of the highest ID number seen
9 max_id = 0
10
11 # Iterate through the items to populate the scores dictionary
12 for student_id, score in items:
13 # Append the score to the list of scores for the current student_id
14 scores_dict[student_id].append(score)
15 # Update max_id if the current student_id is greater
16 max_id = max(max_id, student_id)
17
18 # List to store the result
19 result = []
20
21 # Iterate through all possible IDs from 1 to max_id (inclusive)
22 for i in range(1, max_id + 1):
23 # Check if there are scores for the current ID
24 if scores := scores_dict[i]:
25 # Calculate the average of the top 5 scores
26 average = sum(nlargest(5, scores)) // 5
27 # Append [student_id, average_score] to the result list
28 result.append([i, average])
29
30 # Return the final list of averages
31 return result
32
1class Solution {
2
3 public int[][] highFive(int[][] items) {
4 // Initialize an array of lists to store scores for each student
5 List<Integer>[] scoresPerStudent = new List[1001];
6 Arrays.setAll(scoresPerStudent, k -> new ArrayList<>());
7
8 // Populate the lists with scores. Each student is represented at index i, and score is x
9 for (var item : items) {
10 int studentId = item[0];
11 int score = item[1];
12 scoresPerStudent[studentId].add(score);
13 }
14
15 // Sort the scores in descending order for each student
16 for (var scores : scoresPerStudent) {
17 scores.sort((a, b) -> b - a);
18 }
19
20 // Prepare a list to hold the answer
21 List<int[]> averageTopFiveScores = new ArrayList<>();
22
23 // Iterate through each student's list of scores
24 for (int i = 1; i <= 1000; ++i) {
25 var scores = scoresPerStudent[i];
26 // Ensure that the student has at least one score
27 if (!scores.isEmpty()) {
28 int sum = 0;
29 // Get the top five scores, or all scores if fewer than five, and calculate their sum
30 for (int j = 0; j < Math.min(5, scores.size()); ++j) {
31 sum += scores.get(j);
32 }
33 // Compute and add the average of the top five scores for the student to the result list
34 averageTopFiveScores.add(new int[] {i, sum / Math.min(5, scores.size())});
35 }
36 }
37
38 // Convert the list of average scores to a 2D array and return
39 return averageTopFiveScores.toArray(new int[0][]);
40 }
41}
42
1class Solution {
2public:
3 // Function to calculate the average of the top five scores for each student
4 vector<vector<int>> highFive(vector<vector<int>>& items) {
5 vector<int> scores[1001]; // Array of vectors to store scores for each student
6
7 // Iterate through all the score items
8 for (auto& item : items) {
9 int studentId = item[0]; // Extract the student ID
10 int score = item[1]; // Extract the score
11 scores[studentId].push_back(score); // Add score to the corresponding student's list
12 }
13
14 vector<vector<int>> result; // Resultant vector to store the averages of top five scores
15
16 // Process the scores for each student
17 for (int i = 1; i <= 1000; ++i) { // Assuming student IDs are in the range 1 to 1000
18 if (!scores[i].empty()) { // Ensure that the student has scores recorded
19 sort(scores[i].begin(), scores[i].end(), greater<int>()); // Sort scores in descending order
20
21 int sumTopFiveScores = 0; // Variable to store the sum of the top five scores
22 for (int j = 0; j < 5; ++j) { // Calculate the sum of top 5 scores
23 sumTopFiveScores += scores[i][j];
24 }
25 int averageTopFiveScores = sumTopFiveScores / 5; // Compute the average of top five scores
26
27 result.push_back({i, averageTopFiveScores}); // Add the student ID and average to the result
28 }
29 }
30
31 return result; // Return the final result
32 }
33};
34
1// This function calculates the average of the top 5 scores for each unique ID.
2function highFive(items: number[][]): number[][] {
3 // Initialize an array to store the scores grouped by ID.
4 const scoresById: number[][] = Array(1001)
5 .fill(0)
6 .map(() => Array());
7
8 // Loop through each item (id, score) and group the scores by their ID.
9 for (const [id, score] of items) {
10 scoresById[id].push(score);
11 }
12
13 const averages: number[][] = []; // This will hold the final result.
14
15 // Iterate through the potential range of IDs.
16 for (let i = 1; i <= 1000; ++i) {
17 // Check if the current ID has any scores.
18 if (scoresById[i].length > 0) {
19 // Sort the scores for the current ID in descending order.
20 scoresById[i].sort((a, b) => b - a);
21
22 // Calculate the sum of the top 5 scores.
23 const sumTopFive = scoresById[i].slice(0, 5).reduce((a, b) => a + b, 0);
24
25 // Calculate the average, floor it, and push the ID and average to the result array.
26 averages.push([i, Math.floor(sumTopFive / 5)]);
27 }
28 }
29
30 // Return the array containing ID and average of top 5 scores.
31 return averages;
32}
33
Time and Space Complexity
Time Complexity
The time complexity of the function is determined by several factors:
- The time complexity of iterating over
items
, which hasn
elements:O(n)
. - Appending each item's score to the corresponding list in the dictionary:
O(1)
(amortized) for each insertion. - The
max
function inside the loop, which has constant timeO(1)
because it's just comparing two integers. - The final iteration over the range from
1
tom + 1
, which isO(m)
wherem
is the maximum id seen. - For each student, the
nlargest
function is used to obtain the top 5 scores which have a complexity ofO(k * log(m))
for each student (k
is 5 here, andm
is the number of scores for that student). - Summing the scores and calculating the average has constant time complexity
O(1)
.
Since we don't know the relationship between m
and n
exactly, we combine the terms to reflect the worst case. We can assume each student has at least 5 scores and at most n
scores. Thus, the nlargest
step ends up with a time complexity of O(n * log(n))
due to at most n
calls to nlargest
with a list of maximum length n
.
Combining these, our overall time complexity is O(n + n * log(n))
, which simplifies to O(n * log(n))
.
Space Complexity
For space complexity, we have:
- A dictionary that holds up to
m
keys, with each key holding a list of scores. If every score is unique to a student, the space complexity isO(n)
. - An output list
ans
that will holdm
elements:O(m)
.
Combining these, we get O(n) + O(m)
. However, since m
cannot exceed n
(as there cannot be more student IDs than individual scores), the space complexity simplifies to O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following is a min heap?
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Patterns The Shortest Path Algorithm for Coding Interviews The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we can determine the
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Got a question?ย Ask the Monster Assistantย anything you don't understand.
Still not clear? ย Submitย the part you don't understand to our editors. Or join ourย Discord and ask the community.