1619. Mean of Array After Removing Some Elements
Problem Description
You are given an integer array arr. Your task is to calculate the mean (average) of the array after removing the smallest 5% and the largest 5% of elements.
The process works as follows:
- Sort the array in ascending order
- Remove the bottom 5% of elements (the smallest values)
- Remove the top 5% of elements (the largest values)
- Calculate the mean of the remaining 90% of elements
For example, if the array has 20 elements:
- Remove the smallest 5% → remove 1 element from the beginning (0.05 × 20 = 1)
- Remove the largest 5% → remove 1 element from the end
- Calculate the mean of the remaining 18 elements
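The same arithmetic applies to any length. A minimal sketch, assuming a hypothetical helper named trim_count (it is not part of the original solution) and using integer arithmetic so that no rounding is involved:

```python
def trim_count(n: int) -> int:
    """Number of elements removed from each end: 5% of n, rounded down."""
    return n * 5 // 100


for n in (20, 40, 100):
    k = trim_count(n)
    # length, elements removed per side, elements kept
    print(n, k, n - 2 * k)
```

For n = 20 this reports 1 element removed per side and 18 elements kept, matching the example above.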
The solution approach:
- First, sort the array to arrange elements in ascending order
- Calculate the starting index as int(n * 0.05), where n is the array length
- Calculate the ending index as int(n * 0.95)
- Extract the subarray from the start index to the end index (this excludes the smallest and largest 5%)
- Calculate the sum of this subarray and divide by its length to get the mean
- Round the result to 5 decimal places (optional, since any answer within the tolerance is accepted)
The answer will be accepted if it's within 10^-5 of the actual answer, meaning your result should be accurate to at least 5 decimal places.
Intuition
The key insight is that to remove the extreme values (smallest 5% and largest 5%), we need to identify which elements fall into these categories. The most straightforward way to do this is by sorting the array first.
Once sorted, the smallest 5% will be at the beginning of the array, and the largest 5% will be at the end. This makes removal trivial - we just need to figure out how many elements constitute 5% of the total.
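For an array of length n, n * 0.05 elements make up 5%. LeetCode's constraints for this problem guarantee that the length is a multiple of 20, so this value is always a whole number. For other lengths the product can be fractional (e.g., 5% of 30 is 1.5), and int() truncates it toward zero rather than rounding to the nearest integer. Either way, the result is the number of elements to exclude from each end.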
After determining the cutoff points:
- Start index: int(n * 0.05) - this skips the smallest 5%
- End index: int(n * 0.95) - this stops before the largest 5%
The middle 90% of elements (from index start to end-1) represents our trimmed dataset. We can then simply calculate the arithmetic mean of these remaining elements by summing them and dividing by their count.
Rounding to 5 decimal places keeps the answer within the precision requirement specified in the problem (within 10^-5 of the actual answer), because it changes the value by at most 5 × 10^-6.
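To see why trimming matters, the sketch below compares the plain mean with the trimmed mean on made-up data containing one outlier. The helper name middle_slice and the sample data are assumptions chosen for illustration only.

```python
from statistics import mean


def middle_slice(arr):
    """Return the sorted middle 90% of arr (hypothetical helper)."""
    n = len(arr)
    start = int(n * 0.05)  # first index kept
    end = int(n * 0.95)    # one past the last index kept
    return sorted(arr)[start:end]


data = list(range(1, 20)) + [1000]  # 19 ordinary values and one outlier
kept = middle_slice(data)

print(mean(data))  # 59.5, dominated by the outlier
print(mean(kept))  # 10.5, the values 2..19 remain
```

Sorting first is what lets a single slice drop both the smallest value (1) and the outlier (1000).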
Solution Approach
The implementation follows a straightforward approach using sorting and array slicing:
1. Get array length: Store n = len(arr) to calculate the trim percentages.
2. Calculate trim indices:
   - start = int(n * 0.05) gives us the index where the trimmed array should begin (excluding the smallest 5%)
   - end = int(n * 0.95) gives us the index where the trimmed array should end (excluding the largest 5%)
   - Using int() ensures we get whole number indices by truncating any decimal values
3. Sort the array: Call arr.sort() to arrange all elements in ascending order. This positions:
   - The smallest 5% at indices [0, start)
   - The middle 90% at indices [start, end)
   - The largest 5% at indices [end, n)
4. Extract the trimmed portion: Use Python's list slicing t = arr[start:end] to get only the middle 90% of elements. This slice operation creates a new list containing elements from index start (inclusive) to index end (exclusive).
5. Calculate and return the mean:
   - sum(t) computes the sum of all elements in the trimmed array
   - len(t) gives the count of elements in the trimmed array
   - sum(t) / len(t) calculates the arithmetic mean
   - round(..., 5) rounds the result to 5 decimal places to meet the precision requirement
The time complexity is O(n log n) due to sorting, and the space complexity is O(n) for storing the trimmed array slice.
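The three index ranges can be checked directly with slices. This is a small sketch on made-up data; the variable names low, mid, and high are illustrative and do not appear in the solution itself.

```python
arr = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0] * 2  # 20 illustrative values
arr.sort()

n = len(arr)
start, end = int(n * 0.05), int(n * 0.95)

low = arr[:start]      # smallest 5%, indices [0, start)
mid = arr[start:end]   # middle 90%, indices [start, end)
high = arr[end:]       # largest 5%, indices [end, n)

# The three half-open ranges cover the sorted array with no overlap.
assert low + mid + high == arr
print(low, len(mid), high)  # [0] 18 [9]
```

Because the ranges are half-open, the slice arr[start:end] needs no +1 or -1 adjustments.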
Example Walkthrough
Let's walk through a concrete example with a small array to illustrate the solution approach.
Example: arr = [6, 2, 7, 5, 1, 2, 0, 3, 10, 2, 5, 0, 5, 5, 0, 8, 7, 6, 8, 0]
Step 1: Get array length
n = 20
Step 2: Calculate trim indices
- Bottom 5%: start = int(20 * 0.05) = int(1.0) = 1
- Top 5%: end = int(20 * 0.95) = int(19.0) = 19
- This means we'll remove 1 element from the beginning and 1 element from the end
Step 3: Sort the array
- After sorting: [0, 0, 0, 0, 1, 2, 2, 2, 3, 5, 5, 5, 5, 6, 6, 7, 7, 8, 8, 10]
- The smallest value (first 0) will be removed
- The largest value (10) will be removed
Step 4: Extract the trimmed portion
- t = arr[1:19] gives us: [0, 0, 0, 1, 2, 2, 2, 3, 5, 5, 5, 5, 6, 6, 7, 7, 8, 8]
- We've excluded:
  - Smallest 5% (index 0): the value 0
  - Largest 5% (index 19): the value 10
Step 5: Calculate the mean
- Sum of trimmed array: 0+0+0+1+2+2+2+3+5+5+5+5+6+6+7+7+8+8 = 72
- Count of elements: 18
- Mean: 72 / 18 = 4.0
- Rounded to 5 decimal places: 4.00000
The final answer is 4.00000, which represents the mean of the middle 90% of the data after removing the extreme values.
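The walkthrough can be replayed in a few lines of Python. This is only a sketch for checking the intermediate values by machine; the variable names are chosen for illustration.

```python
arr = [6, 2, 7, 5, 1, 2, 0, 3, 10, 2, 5, 0, 5, 5, 0, 8, 7, 6, 8, 0]

n = len(arr)
start, end = int(n * 0.05), int(n * 0.95)
trimmed = sorted(arr)[start:end]
total, count = sum(trimmed), len(trimmed)

print(start, end)               # 1 19
print(total, count)             # 72 18
print(round(total / count, 5))  # 4.0
```

Python prints the float as 4.0; written to five decimal places, that is 4.00000.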
Solution Implementation
from typing import List


class Solution:
    def trimMean(self, arr: List[int]) -> float:
        """
        Calculate the trimmed mean by removing the smallest 5% and largest 5% of elements.

        Args:
            arr: List of integers to calculate trimmed mean from

        Returns:
            The mean of the remaining elements after trimming, rounded to 5 decimal places
        """
        # Get the total number of elements
        n = len(arr)

        # Calculate indices for 5% and 95% positions
        # These represent the start and end of our trimmed range
        trim_start_index = int(n * 0.05)
        trim_end_index = int(n * 0.95)

        # Sort the array to identify smallest and largest elements
        arr.sort()

        # Extract the middle 90% of elements (removing smallest 5% and largest 5%)
        trimmed_array = arr[trim_start_index:trim_end_index]

        # Calculate and return the mean of trimmed elements, rounded to 5 decimal places
        trimmed_mean = sum(trimmed_array) / len(trimmed_array)
        return round(trimmed_mean, 5)

import java.util.Arrays;

class Solution {
    /**
     * Calculates the mean of an array after removing the smallest 5% and largest 5% of elements.
     *
     * @param arr The input integer array
     * @return The trimmed mean value
     */
    public double trimMean(int[] arr) {
        // Sort the array in ascending order
        Arrays.sort(arr);

        // Get the total number of elements
        int arrayLength = arr.length;

        // Calculate the number of elements to remove from each end (5% from each side)
        int elementsToTrim = (int) (arrayLength * 0.05);

        // Initialize sum accumulator for the remaining middle elements
        double sum = 0;

        // Iterate through the middle 90% of elements
        // Start from index after removing bottom 5%, end before top 5%
        for (int i = elementsToTrim; i < arrayLength - elementsToTrim; i++) {
            sum += arr[i];
        }

        // Calculate and return the mean of the middle 90% of elements
        // Total remaining elements = arrayLength - 2 * elementsToTrim
        return sum / (arrayLength - 2 * elementsToTrim);
    }
}

#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    double trimMean(vector<int>& arr) {
        // Sort the array in ascending order
        sort(arr.begin(), arr.end());

        // Get the total number of elements
        int n = arr.size();

        // Calculate the number of elements to remove from each end (5% from each side)
        int elementsToTrim = static_cast<int>(n * 0.05);

        // Initialize sum for the middle 90% of elements
        double sum = 0.0;

        // Sum elements from index [elementsToTrim] to [n - elementsToTrim - 1]
        // This excludes the smallest 5% and largest 5% of values
        for (int i = elementsToTrim; i < n - elementsToTrim; ++i) {
            sum += arr[i];
        }

        // Calculate and return the mean of the middle 90% of elements
        // Total remaining elements = n - 2 * elementsToTrim
        return sum / (n - 2 * elementsToTrim);
    }
};

/**
 * Calculates the trimmed mean of an array by removing the smallest and largest 5% of elements
 * @param arr - The input array of numbers
 * @returns The mean of the remaining 90% of elements
 */
function trimMean(arr: number[]): number {
    // Sort the array in ascending order
    arr.sort((a, b) => a - b);

    // Get the total length of the array
    const arrayLength: number = arr.length;

    // Calculate how many elements to remove from each end (5% from each side)
    // Math.floor keeps the loop index an integer for any array length
    const elementsToRemove: number = Math.floor(arrayLength * 0.05);

    // Initialize sum accumulator
    let sum: number = 0;

    // Sum the middle 90% of elements (excluding the lowest and highest 5%)
    for (let i = elementsToRemove; i < arrayLength - elementsToRemove; i++) {
        sum += arr[i];
    }

    // Calculate and return the mean of the remaining elements
    // The remaining count is arrayLength - 2 * elementsToRemove
    return sum / (arrayLength - 2 * elementsToRemove);
}

Time and Space Complexity
Time Complexity: O(n log n)
- The dominant operation is arr.sort(), which uses Timsort (Python's default sorting algorithm) with time complexity O(n log n), where n is the length of the input array
- Computing the start and end indices takes O(1) time
- Array slicing arr[start:end] takes O(n) time to create a new list
- Computing the sum of the trimmed array takes O(n) time
- Division and rounding operations take O(1) time
- Overall: O(n log n) + O(n) + O(n) + O(1) = O(n log n)
Space Complexity: O(n)
- Timsort sorts the list in place but may need up to O(n) auxiliary space for its temporary merge buffer
- The sliced array t = arr[start:end] creates a new list containing approximately 0.9n elements, which is O(n) space
- Variables start, end, and n use O(1) space
- Overall: O(n) + O(n) + O(1) = O(n)
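The O(n) copy made by the slice is avoidable. The sketch below is a variant, not the solution described above: it sums the middle range lazily with itertools.islice. The function name trim_mean_no_copy is hypothetical, and because the sort may still use auxiliary space, the worst-case space bound is unchanged.

```python
from itertools import islice
from typing import List


def trim_mean_no_copy(arr: List[int]) -> float:
    """Trimmed mean that sums the middle range without building a slice."""
    n = len(arr)
    k = int(n * 0.05)  # elements dropped from each end
    arr.sort()         # sorts in place, like the original solution
    total = sum(islice(arr, k, n - k))  # iterates indices k .. n-k-1
    return total / (n - 2 * k)


print(trim_mean_no_copy([6, 2, 7, 5, 1, 2, 0, 3, 10, 2, 5, 0, 5, 5, 0, 8, 7, 6, 8, 0]))  # 4.0
```

islice still walks past the first k elements one by one, so the running time stays linear after the sort.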
Common Pitfalls
1. Integer Division vs Float Division
One common mistake is forgetting that the mean calculation requires float division. In older Python versions (Python 2.x) or some other languages, dividing two integers might result in integer division, truncating the decimal part.
Incorrect approach:
# Floor division always discards the fractional part
mean = sum(trimmed_array) // len(trimmed_array) # Wrong: uses floor division
Correct approach:
# Ensure float division
mean = sum(trimmed_array) / len(trimmed_array) # Correct: uses true division
# Or explicitly convert to float
mean = float(sum(trimmed_array)) / len(trimmed_array)
2. Rounding vs Truncating When Calculating Indices
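Using round() instead of int() for index calculation can lead to incorrect trimming boundaries when 5% of the length is not a whole number. Python 3 rounds halves to the nearest even integer, so round(0.5) is 0 but round(1.5) is 2, while int() always truncates.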
Incorrect approach:
trim_start_index = round(n * 0.05) # Could round up when we want to round down
trim_end_index = round(n * 0.95) # Could cause off-by-one errors
Correct approach:
trim_start_index = int(n * 0.05) # Always truncates (rounds down)
trim_end_index = int(n * 0.95) # Consistent behavior
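The difference between the two functions is easy to inspect. A small sketch, using a hypothetical helper named compare and a few literal values chosen for illustration:

```python
def compare(x: float):
    """Return (truncated, rounded) versions of x for side-by-side comparison."""
    return int(x), round(x)


for x in (0.5, 1.5, 2.5, 2.95):
    truncated, rounded = compare(x)
    print(x, truncated, rounded)
```

For 1.5 and 2.95 the two results differ (1 versus 2, and 2 versus 3), which would shift a trim boundary by one position. For 0.5 and 2.5 they happen to agree because round() picks the even neighbour.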
3. Modifying Original Array
The sort() method modifies the array in-place, which might cause issues if the original array order needs to be preserved elsewhere.
Problematic if original order matters:
arr.sort() # Modifies the original array
Alternative if preservation needed:
sorted_arr = sorted(arr) # Creates a new sorted array
trimmed_array = sorted_arr[trim_start_index:trim_end_index]
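A quick way to confirm the difference is to print the list before and after each call. The sample values are arbitrary.

```python
original = [3, 1, 2]

ordered = sorted(original)  # returns a new list
print(original, ordered)    # [3, 1, 2] [1, 2, 3]

result = original.sort()    # sorts in place and returns None
print(original, result)     # [1, 2, 3] None
```

Note that list.sort() returns None, so writing arr = arr.sort() would discard the data.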
4. Edge Cases with Small Arrays
When the array has fewer than 20 elements, 5% of its length is below 1, so int(n * 0.05) is 0 and nothing is trimmed from the start. For lengths that are not a multiple of 20, the two indices can also trim the ends unevenly.
Example issue:
# If n = 10: int(10 * 0.05) = int(0.5) = 0, int(10 * 0.95) = int(9.5) = 9
# No elements are trimmed from the start, but 1 element is trimmed from the end
# If n = 19: int(19 * 0.05) = int(0.95) = 0, int(19 * 0.95) = int(18.05) = 18
# Again only 1 element gets trimmed, and only from the end
Solution consideration:
LeetCode's constraints for this problem guarantee that the array length is at least 20 and a multiple of 20, so 5% of the length is always a whole number and both ends are trimmed equally. If the code is reused for arbitrary lengths, compute the trim count once as k = int(n * 0.05) and slice arr[k:n - k], so that both ends always lose the same number of elements.
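The two ways of computing the boundaries can be compared side by side. This is a sketch with hypothetical helper names (naive_bounds and symmetric_bounds); the lengths 10, 19, and 30 fall outside the problem's guaranteed inputs and are included only to show the asymmetry.

```python
from typing import Tuple


def naive_bounds(n: int) -> Tuple[int, int]:
    """Slice bounds computed from two independent truncations."""
    return int(n * 0.05), int(n * 0.95)


def symmetric_bounds(n: int) -> Tuple[int, int]:
    """Slice bounds that remove the same count k from both ends."""
    k = n * 5 // 100  # 5% of n, rounded down, using integer arithmetic
    return k, n - k


for n in (10, 19, 20, 30):
    print(n, naive_bounds(n), symmetric_bounds(n))
```

For n = 20 both helpers agree on (1, 19); for the other lengths the naive version drops one more element from the end than from the start.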
5. Precision Loss in Final Rounding
Rounding to the wrong precision, or truncating instead of rounding, can push the answer toward or beyond the edge of the accepted tolerance.
Risky or incorrect approaches:
return int(trimmed_mean * 100000) / 100000 # Manual truncation: the error can approach 10^-5, leaving no safety margin
return round(trimmed_mean) # Rounds to nearest integer, not 5 decimal places
Acceptable approaches:
return trimmed_mean # No rounding: the full-precision value is already within tolerance
return round(trimmed_mean, 5) # Rounds to 5 decimal places (error at most 5 × 10^-6)
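How far each return style drifts from the exact mean can be measured directly. The sketch below assumes a hypothetical checker, within_tolerance, that mimics the "within 10^-5" acceptance rule, and uses 2/3 as an illustrative mean with a non-terminating decimal expansion.

```python
def within_tolerance(value: float, exact: float, tol: float = 1e-5) -> bool:
    """Mimic the acceptance rule: the answer must be within tol of the exact value."""
    return abs(value - exact) < tol


exact = 2 / 3  # illustrative mean, not taken from the problem's examples

candidates = {
    "unrounded": exact,
    "round(x, 5)": round(exact, 5),
    "truncate to 5 places": int(exact * 100000) / 100000,
    "round(x)": round(exact),
}

for name, value in candidates.items():
    print(name, value, within_tolerance(value, exact))
```

Only round(x) fails here. Truncation passes for this value, but its error (about 6.7 × 10^-6) sits closer to the limit than the error from rounding to five places (about 3.3 × 10^-6).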