275. H-Index II
Problem Description
You are given an array citations where citations[i] represents the number of citations a researcher received for their i-th paper. The array is sorted in non-descending order (ascending order).
Your task is to find the researcher's h-index.
The h-index is defined as the maximum value of h such that the researcher has published at least h papers that have each been cited at least h times.
For example:
- If a researcher has 5 papers with citations [0, 1, 3, 5, 6], their h-index is 3 because:
- They have 3 papers with at least 3 citations each (papers with 3, 5, and 6 citations)
- They don't have 4 papers with at least 4 citations each
The key insight for the binary search approach is that we're looking for the maximum h where at least h papers have h or more citations. Since the array is sorted, if we want to check if there are at least mid papers with at least mid citations, we need to verify that citations[n - mid] >= mid. This works because:
- The last
midpapers in the sorted array are from indexn - midton - 1 - If
citations[n - mid] >= mid, then all papers from indexn - midonwards have at leastmidcitations - This means we have at least
midpapers with at leastmidcitations
The algorithm must run in logarithmic time complexity, which suggests using binary search to find the optimal h-index value.
Intuition
The key observation is that the h-index has a monotonic property that creates a true/false pattern. For any value h, we can ask: "Do we have at least h papers with h or more citations?" As h increases, at some point this condition changes from true to false.
Think of it this way: if you have 5 papers with citations [0, 1, 3, 5, 6]:
- h=1: Do we have 1 paper with ≥1 citation? Yes (papers with 1,3,5,6 citations)
- h=2: Do we have 2 papers with ≥2 citations? Yes (papers with 3,5,6 citations)
- h=3: Do we have 3 papers with ≥3 citations? Yes (papers with 3,5,6 citations)
- h=4: Do we have 4 papers with ≥4 citations? No (only 2 papers have ≥4 citations)
This creates a pattern: true, true, true, false, false, ...
We want to find the last true (maximum valid h-index), which is equivalent to finding the first false and subtracting 1. Alternatively, we can reframe as finding the first position where the condition "NOT valid h-index" becomes true.
Since the array is sorted in ascending order, checking validity is straightforward:
- The last
hpapers start from indexn - h - If
citations[n - h] >= h, we have at leasthpapers withhor more citations - The feasible function:
feasible(h) = citations[n - h] < h(h is TOO LARGE to be valid)
We search for the first h where it's "too large" to be valid, then the answer is h - 1 (or equivalently n - first_true_index).
Solution Implementation
1class Solution:
2 def hIndex(self, citations: List[int]) -> int:
3 """
4 Calculate the H-Index from a sorted list of citations using binary search template.
5
6 Feasible condition: citations[mid] >= n - mid
7 This checks if paper at position mid has enough citations to contribute to h-index.
8 """
9 n = len(citations)
10 if n == 0:
11 return 0
12
13 # Binary search using the template: find first position where feasible
14 left, right = 0, n - 1
15 first_true_index = -1
16
17 while left <= right:
18 mid = (left + right) // 2
19
20 # Feasible: does this paper have enough citations?
21 # citations[mid] >= (n - mid) means papers from mid to end all qualify
22 if citations[mid] >= n - mid:
23 first_true_index = mid
24 right = mid - 1 # Search for earlier feasible position
25 else:
26 left = mid + 1
27
28 # If no position is feasible, h-index is 0
29 # Otherwise, h-index is n - first_true_index
30 return n - first_true_index if first_true_index != -1 else 0
311class Solution {
2 /**
3 * Calculates the H-index using the binary search template.
4 * Feasible condition: citations[mid] >= n - mid
5 */
6 public int hIndex(int[] citations) {
7 int n = citations.length;
8 if (n == 0) {
9 return 0;
10 }
11
12 // Binary search template
13 int left = 0;
14 int right = n - 1;
15 int firstTrueIndex = -1;
16
17 while (left <= right) {
18 int mid = left + (right - left) / 2;
19
20 // Feasible: does paper at mid have enough citations?
21 if (citations[mid] >= n - mid) {
22 firstTrueIndex = mid;
23 right = mid - 1; // Search for earlier feasible position
24 } else {
25 left = mid + 1;
26 }
27 }
28
29 // Return h-index
30 return firstTrueIndex != -1 ? n - firstTrueIndex : 0;
31 }
32}
331class Solution {
2public:
3 int hIndex(vector<int>& citations) {
4 int n = citations.size();
5 if (n == 0) {
6 return 0;
7 }
8
9 // Binary search template
10 int left = 0;
11 int right = n - 1;
12 int firstTrueIndex = -1;
13
14 while (left <= right) {
15 int mid = left + (right - left) / 2;
16
17 // Feasible: does paper at mid have enough citations?
18 if (citations[mid] >= n - mid) {
19 firstTrueIndex = mid;
20 right = mid - 1; // Search for earlier feasible position
21 } else {
22 left = mid + 1;
23 }
24 }
25
26 // Return h-index
27 return firstTrueIndex != -1 ? n - firstTrueIndex : 0;
28 }
29};
301/**
2 * Calculate the H-index using the binary search template.
3 * Feasible condition: citations[mid] >= n - mid
4 */
5function hIndex(citations: number[]): number {
6 const n: number = citations.length;
7 if (n === 0) {
8 return 0;
9 }
10
11 // Binary search template
12 let left: number = 0;
13 let right: number = n - 1;
14 let firstTrueIndex: number = -1;
15
16 while (left <= right) {
17 const mid: number = Math.floor((left + right) / 2);
18
19 // Feasible: does paper at mid have enough citations?
20 if (citations[mid] >= n - mid) {
21 firstTrueIndex = mid;
22 right = mid - 1; // Search for earlier feasible position
23 } else {
24 left = mid + 1;
25 }
26 }
27
28 // Return h-index
29 return firstTrueIndex !== -1 ? n - firstTrueIndex : 0;
30}
31Solution Approach
We use the standard binary search template to find the h-index. The key insight is to define a feasible function that creates a false, false, ..., true, true pattern.
Feasible Function:
We define feasible(index) as: at position index, does citations[index] >= n - index?
This checks if the paper at position index has enough citations to contribute to an h-index of n - index. When this becomes true, all papers from this position onwards contribute to the h-index.
Template Implementation:
left, right = 0, n - 1
first_true_index = -1
while left <= right:
mid = (left + right) // 2
if feasible(mid): # citations[mid] >= n - mid
first_true_index = mid
right = mid - 1 # search for earlier true position
else:
left = mid + 1
return n - first_true_index if first_true_index != -1 else 0
How it works:
- Initialize
left = 0,right = n - 1, andfirst_true_index = -1 - Use
while left <= rightloop (NOTleft < right) - Check if
citations[mid] >= n - mid(feasible condition) - If true, record
first_true_index = midand search left withright = mid - 1 - If false, search right with
left = mid + 1 - The h-index is
n - first_true_index
The algorithm achieves O(log n) time complexity and O(1) space complexity.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with citations = [0, 1, 3, 5, 6] using the binary search template.
Initial Setup:
- Array length
n = 5 left = 0,right = 4(n - 1)first_true_index = -1- Feasible condition:
citations[mid] >= n - mid
Iteration 1:
left = 0,right = 4mid = (0 + 4) // 2 = 2- Check
citations[2] >= 5 - 2: Is3 >= 3? Yes! (feasible) - Update:
first_true_index = 2,right = mid - 1 = 1
Iteration 2:
left = 0,right = 1mid = (0 + 1) // 2 = 0- Check
citations[0] >= 5 - 0: Is0 >= 5? No! (not feasible) - Update:
left = mid + 1 = 1
Iteration 3:
left = 1,right = 1mid = (1 + 1) // 2 = 1- Check
citations[1] >= 5 - 1: Is1 >= 4? No! (not feasible) - Update:
left = mid + 1 = 2
Termination:
left = 2 > right = 1, exit loopfirst_true_index = 2- h-index =
n - first_true_index = 5 - 2 = 3
Verification:
- Position 2 is the first where
citations[index] >= n - index - From index 2 onwards: papers with 3, 5, 6 citations all meet the threshold
- h-index = 3 means 3 papers have at least 3 citations each ✓
Time and Space Complexity
The time complexity is O(log n), where n is the length of the array citations. This is because the algorithm uses binary search to find the h-index. In each iteration of the while loop, the search space is reduced by half (either left = mid or right = mid - 1), resulting in at most log n iterations.
The space complexity is O(1). The algorithm only uses a constant amount of extra space for the variables n, left, right, and mid, regardless of the input size. No additional data structures that scale with the input are created.
Common Pitfalls
1. Using Wrong Binary Search Template Variant
Pitfall: Using while left < right with right = mid instead of the standard template with while left <= right and right = mid - 1.
# WRONG - different template variant while left < right: mid = (left + right) // 2 if citations[mid] >= n - mid: right = mid else: left = mid + 1 return n - left
Solution: Use the standard binary search template with first_true_index:
# CORRECT - standard template left, right = 0, n - 1 first_true_index = -1 while left <= right: mid = (left + right) // 2 if citations[mid] >= n - mid: first_true_index = mid right = mid - 1 else: left = mid + 1 return n - first_true_index if first_true_index != -1 else 0
2. Not Handling Empty Array
Pitfall: Accessing citations[mid] when the array is empty causes an index error.
Solution: Add an early return for empty arrays:
if n == 0: return 0
3. Confusing the Feasible Condition
Pitfall: Using citations[mid] >= mid instead of citations[mid] >= n - mid.
The condition checks if the paper at position mid contributes to an h-index. The h-index from position mid onwards is n - mid papers, so the citation count must be at least n - mid.
Solution: Remember the feasible condition clearly:
# citations[mid] >= n - mid means: # "Paper at mid has enough citations for an h-index of (n - mid)"
4. Forgetting to Handle first_true_index == -1
Pitfall: When no position satisfies the feasible condition (all papers have 0 citations), first_true_index remains -1 and the formula n - first_true_index gives wrong result.
Solution: Check for the not-found case:
return n - first_true_index if first_true_index != -1 else 0
Which of the following uses divide and conquer strategy?
Recommended Readings
https assets algo monster cover_photos Binary_Search svg Binary Search Intuition Binary search is an efficient array search algorithm It works by narrowing down the search range by half each time If you have looked up a word in a physical dictionary you've already used binary search in real life Let's
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion If you prefer videos here's a video that explains recursion in a fun and easy way Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask him
Want a Structured Path to Master System Design Too? Don’t Miss This!