Facebook Pixel

275. H-Index II

Problem Description

You are given an array citations where citations[i] represents the number of citations a researcher received for their i-th paper. The array is sorted in non-descending order (ascending order).

Your task is to find the researcher's h-index.

The h-index is defined as the maximum value of h such that the researcher has published at least h papers that have each been cited at least h times.

For example:

  • If a researcher has 5 papers with citations [0, 1, 3, 5, 6], their h-index is 3 because:
    • They have 3 papers with at least 3 citations each (papers with 3, 5, and 6 citations)
    • They don't have 4 papers with at least 4 citations each

The key insight for the binary search approach is that we're looking for the maximum h where at least h papers have h or more citations. Since the array is sorted, if we want to check if there are at least mid papers with at least mid citations, we need to verify that citations[n - mid] >= mid. This works because:

  • The last mid papers in the sorted array are from index n - mid to n - 1
  • If citations[n - mid] >= mid, then all papers from index n - mid onwards have at least mid citations
  • This means we have at least mid papers with at least mid citations

The algorithm must run in logarithmic time complexity, which suggests using binary search to find the optimal h-index value.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that the h-index has a monotonic property that creates a true/false pattern. For any value h, we can ask: "Do we have at least h papers with h or more citations?" As h increases, at some point this condition changes from true to false.

Think of it this way: if you have 5 papers with citations [0, 1, 3, 5, 6]:

  • h=1: Do we have 1 paper with ≥1 citation? Yes (papers with 1,3,5,6 citations)
  • h=2: Do we have 2 papers with ≥2 citations? Yes (papers with 3,5,6 citations)
  • h=3: Do we have 3 papers with ≥3 citations? Yes (papers with 3,5,6 citations)
  • h=4: Do we have 4 papers with ≥4 citations? No (only 2 papers have ≥4 citations)

This creates a pattern: true, true, true, false, false, ...

We want to find the last true (maximum valid h-index), which is equivalent to finding the first false and subtracting 1. Alternatively, we can reframe as finding the first position where the condition "NOT valid h-index" becomes true.

Since the array is sorted in ascending order, checking validity is straightforward:

  • The last h papers start from index n - h
  • If citations[n - h] >= h, we have at least h papers with h or more citations
  • The feasible function: feasible(h) = citations[n - h] < h (h is TOO LARGE to be valid)

We search for the first h where it's "too large" to be valid, then the answer is h - 1 (or equivalently n - first_true_index).

Solution Implementation

1class Solution:
2    def hIndex(self, citations: List[int]) -> int:
3        """
4        Calculate the H-Index from a sorted list of citations using binary search template.
5
6        Feasible condition: citations[mid] >= n - mid
7        This checks if paper at position mid has enough citations to contribute to h-index.
8        """
9        n = len(citations)
10        if n == 0:
11            return 0
12
13        # Binary search using the template: find first position where feasible
14        left, right = 0, n - 1
15        first_true_index = -1
16
17        while left <= right:
18            mid = (left + right) // 2
19
20            # Feasible: does this paper have enough citations?
21            # citations[mid] >= (n - mid) means papers from mid to end all qualify
22            if citations[mid] >= n - mid:
23                first_true_index = mid
24                right = mid - 1  # Search for earlier feasible position
25            else:
26                left = mid + 1
27
28        # If no position is feasible, h-index is 0
29        # Otherwise, h-index is n - first_true_index
30        return n - first_true_index if first_true_index != -1 else 0
31
1class Solution {
2    /**
3     * Calculates the H-index using the binary search template.
4     * Feasible condition: citations[mid] >= n - mid
5     */
6    public int hIndex(int[] citations) {
7        int n = citations.length;
8        if (n == 0) {
9            return 0;
10        }
11
12        // Binary search template
13        int left = 0;
14        int right = n - 1;
15        int firstTrueIndex = -1;
16
17        while (left <= right) {
18            int mid = left + (right - left) / 2;
19
20            // Feasible: does paper at mid have enough citations?
21            if (citations[mid] >= n - mid) {
22                firstTrueIndex = mid;
23                right = mid - 1;  // Search for earlier feasible position
24            } else {
25                left = mid + 1;
26            }
27        }
28
29        // Return h-index
30        return firstTrueIndex != -1 ? n - firstTrueIndex : 0;
31    }
32}
33
1class Solution {
2public:
3    int hIndex(vector<int>& citations) {
4        int n = citations.size();
5        if (n == 0) {
6            return 0;
7        }
8
9        // Binary search template
10        int left = 0;
11        int right = n - 1;
12        int firstTrueIndex = -1;
13
14        while (left <= right) {
15            int mid = left + (right - left) / 2;
16
17            // Feasible: does paper at mid have enough citations?
18            if (citations[mid] >= n - mid) {
19                firstTrueIndex = mid;
20                right = mid - 1;  // Search for earlier feasible position
21            } else {
22                left = mid + 1;
23            }
24        }
25
26        // Return h-index
27        return firstTrueIndex != -1 ? n - firstTrueIndex : 0;
28    }
29};
30
1/**
2 * Calculate the H-index using the binary search template.
3 * Feasible condition: citations[mid] >= n - mid
4 */
5function hIndex(citations: number[]): number {
6    const n: number = citations.length;
7    if (n === 0) {
8        return 0;
9    }
10
11    // Binary search template
12    let left: number = 0;
13    let right: number = n - 1;
14    let firstTrueIndex: number = -1;
15
16    while (left <= right) {
17        const mid: number = Math.floor((left + right) / 2);
18
19        // Feasible: does paper at mid have enough citations?
20        if (citations[mid] >= n - mid) {
21            firstTrueIndex = mid;
22            right = mid - 1;  // Search for earlier feasible position
23        } else {
24            left = mid + 1;
25        }
26    }
27
28    // Return h-index
29    return firstTrueIndex !== -1 ? n - firstTrueIndex : 0;
30}
31

Solution Approach

We use the standard binary search template to find the h-index. The key insight is to define a feasible function that creates a false, false, ..., true, true pattern.

Feasible Function: We define feasible(index) as: at position index, does citations[index] >= n - index?

This checks if the paper at position index has enough citations to contribute to an h-index of n - index. When this becomes true, all papers from this position onwards contribute to the h-index.

Template Implementation:

left, right = 0, n - 1
first_true_index = -1

while left <= right:
    mid = (left + right) // 2
    if feasible(mid):  # citations[mid] >= n - mid
        first_true_index = mid
        right = mid - 1  # search for earlier true position
    else:
        left = mid + 1

return n - first_true_index if first_true_index != -1 else 0

How it works:

  1. Initialize left = 0, right = n - 1, and first_true_index = -1
  2. Use while left <= right loop (NOT left < right)
  3. Check if citations[mid] >= n - mid (feasible condition)
  4. If true, record first_true_index = mid and search left with right = mid - 1
  5. If false, search right with left = mid + 1
  6. The h-index is n - first_true_index

The algorithm achieves O(log n) time complexity and O(1) space complexity.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with citations = [0, 1, 3, 5, 6] using the binary search template.

Initial Setup:

  • Array length n = 5
  • left = 0, right = 4 (n - 1)
  • first_true_index = -1
  • Feasible condition: citations[mid] >= n - mid

Iteration 1:

  • left = 0, right = 4
  • mid = (0 + 4) // 2 = 2
  • Check citations[2] >= 5 - 2: Is 3 >= 3? Yes! (feasible)
  • Update: first_true_index = 2, right = mid - 1 = 1

Iteration 2:

  • left = 0, right = 1
  • mid = (0 + 1) // 2 = 0
  • Check citations[0] >= 5 - 0: Is 0 >= 5? No! (not feasible)
  • Update: left = mid + 1 = 1

Iteration 3:

  • left = 1, right = 1
  • mid = (1 + 1) // 2 = 1
  • Check citations[1] >= 5 - 1: Is 1 >= 4? No! (not feasible)
  • Update: left = mid + 1 = 2

Termination:

  • left = 2 > right = 1, exit loop
  • first_true_index = 2
  • h-index = n - first_true_index = 5 - 2 = 3

Verification:

  • Position 2 is the first where citations[index] >= n - index
  • From index 2 onwards: papers with 3, 5, 6 citations all meet the threshold
  • h-index = 3 means 3 papers have at least 3 citations each ✓

Time and Space Complexity

The time complexity is O(log n), where n is the length of the array citations. This is because the algorithm uses binary search to find the h-index. In each iteration of the while loop, the search space is reduced by half (either left = mid or right = mid - 1), resulting in at most log n iterations.

The space complexity is O(1). The algorithm only uses a constant amount of extra space for the variables n, left, right, and mid, regardless of the input size. No additional data structures that scale with the input are created.

Common Pitfalls

1. Using Wrong Binary Search Template Variant

Pitfall: Using while left < right with right = mid instead of the standard template with while left <= right and right = mid - 1.

# WRONG - different template variant
while left < right:
    mid = (left + right) // 2
    if citations[mid] >= n - mid:
        right = mid
    else:
        left = mid + 1
return n - left

Solution: Use the standard binary search template with first_true_index:

# CORRECT - standard template
left, right = 0, n - 1
first_true_index = -1

while left <= right:
    mid = (left + right) // 2
    if citations[mid] >= n - mid:
        first_true_index = mid
        right = mid - 1
    else:
        left = mid + 1

return n - first_true_index if first_true_index != -1 else 0

2. Not Handling Empty Array

Pitfall: Accessing citations[mid] when the array is empty causes an index error.

Solution: Add an early return for empty arrays:

if n == 0:
    return 0

3. Confusing the Feasible Condition

Pitfall: Using citations[mid] >= mid instead of citations[mid] >= n - mid.

The condition checks if the paper at position mid contributes to an h-index. The h-index from position mid onwards is n - mid papers, so the citation count must be at least n - mid.

Solution: Remember the feasible condition clearly:

# citations[mid] >= n - mid means:
# "Paper at mid has enough citations for an h-index of (n - mid)"

4. Forgetting to Handle first_true_index == -1

Pitfall: When no position satisfies the feasible condition (all papers have 0 citations), first_true_index remains -1 and the formula n - first_true_index gives wrong result.

Solution: Check for the not-found case:

return n - first_true_index if first_true_index != -1 else 0
Loading...
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of the following uses divide and conquer strategy?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More