Facebook Pixel

4. Median of Two Sorted Arrays

Problem Description

You are given two sorted arrays nums1 and nums2 with sizes m and n respectively. Your task is to find the median of the combined elements from both arrays.

The median is the middle value in an ordered list of numbers. If the total number of elements is odd, the median is the middle element. If the total number of elements is even, the median is the average of the two middle elements.

For example:

  • If the combined sorted array is [1, 2, 3], the median is 2
  • If the combined sorted array is [1, 2, 3, 4], the median is (2 + 3) / 2 = 2.5

The key constraint is that your solution must achieve a time complexity of O(log(m+n)). This means you cannot simply merge the two arrays and find the median directly, as that would take O(m+n) time. Instead, you need to use a more efficient approach like binary search or divide-and-conquer.

The solution uses a divide-and-conquer strategy by implementing a helper function f(i, j, k) that finds the k-th smallest element across both arrays starting from index i in nums1 and index j in nums2. The function cleverly eliminates half of the remaining elements in each recursive call by comparing the middle elements of the current search ranges.

To find the median, the solution calculates:

  • The ((m+n+1)//2)-th smallest element (call it a)
  • The ((m+n+2)//2)-th smallest element (call it b)
  • Returns (a + b) / 2

This formula works for both odd and even total lengths because when the total length is odd, a and b will be the same element (the middle one), and when even, they will be the two middle elements.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight begins with recognizing that finding the median is essentially finding the middle element(s) in the combined sorted array. Since we need O(log(m+n)) time complexity, we can't afford to merge the arrays. This logarithmic requirement immediately suggests binary search or divide-and-conquer.

Let's think about what the median represents. For a combined array of size m+n:

  • If m+n is odd, we need the ((m+n+1)/2)-th smallest element
  • If m+n is even, we need the average of the ((m+n)/2)-th and ((m+n)/2 + 1)-th smallest elements

Here's a clever observation: we can unify both cases! Notice that:

  • ((m+n+1)//2) gives us the lower middle element position
  • ((m+n+2)//2) gives us the upper middle element position
  • When m+n is odd, these two positions are the same
  • When m+n is even, these give us the two middle elements

So finding the median reduces to finding the k-th smallest element in two sorted arrays - twice.

Now, how do we find the k-th smallest element efficiently? The breakthrough comes from realizing we can eliminate elements in chunks rather than one by one. If we're looking for the k-th smallest element, we can:

  1. Look at the k/2-th element in each array
  2. Compare these two elements
  3. The smaller one (and all elements before it in its array) cannot possibly be the k-th element or larger
  4. We can safely discard these k/2 elements and now look for the (k - k/2)-th smallest element in the remaining portions

Why is this safe? Because even if all k/2 elements from one array are among the k smallest overall, we still have at least k/2 elements from the other array that could be smaller. The element at position k/2 in the array with the smaller value can at most be the k/2 + k/2 = k-th smallest element if all elements before it in both arrays are smaller.

This divide-and-conquer approach eliminates half of the remaining elements to find in each step, giving us the required O(log(m+n)) complexity. The recursion naturally handles edge cases where one array is exhausted or we've narrowed down to finding the 1st smallest element (which is just the minimum of the two current elements).

Learn more about Binary Search and Divide and Conquer patterns.

Solution Approach

The solution implements a divide-and-conquer approach through a recursive helper function f(i, j, k) that finds the k-th smallest element between two sorted arrays.

Main Function Structure:

def findMedianSortedArrays(self, nums1, nums2):
    m, n = len(nums1), len(nums2)
    a = f(0, 0, (m + n + 1) // 2)  # Find lower middle element
    b = f(0, 0, (m + n + 2) // 2)  # Find upper middle element
    return (a + b) / 2              # Return average

The median is calculated as the average of two positions:

  • Position (m + n + 1) // 2 - the lower middle element
  • Position (m + n + 2) // 2 - the upper middle element

When total length is odd, both positions point to the same element. When even, they point to the two middle elements.

Helper Function f(i, j, k) Implementation:

The function finds the k-th smallest element starting from index i in nums1 and index j in nums2.

Base Cases:

  1. If i >= m: Array nums1 is exhausted from position i. The k-th element must be in nums2:

    return nums2[j + k - 1]
  2. If j >= n: Array nums2 is exhausted from position j. The k-th element must be in nums1:

    return nums1[i + k - 1]
  3. If k == 1: We need the smallest element between the current positions:

    return min(nums1[i], nums2[j])

Recursive Case:

  1. Calculate p = k // 2 - we'll try to eliminate p elements

  2. Find the p-th element from current positions in both arrays:

    x = nums1[i + p - 1] if i + p - 1 < m else inf
    y = nums2[j + p - 1] if j + p - 1 < n else inf

    If an array doesn't have enough elements, we use inf as a placeholder.

  3. Compare and eliminate:

    • If x < y: Elements nums1[i] through nums1[i + p - 1] are among the smallest and can be eliminated. We advance pointer i by p and look for the (k - p)-th element:

      return f(i + p, j, k - p)
    • If x >= y: Elements nums2[j] through nums2[j + p - 1] are among the smallest and can be eliminated. We advance pointer j by p and look for the (k - p)-th element:

      return f(i, j + p, k - p)

Why This Works:

When we compare the k/2-th elements from both arrays, the smaller one (and all elements before it) cannot be larger than the k-th smallest element overall. This is because:

  • There are at most k/2 - 1 elements before it in its own array
  • There are at most k/2 elements before the compared element in the other array
  • Total: at most (k/2 - 1) + k/2 = k - 1 elements smaller than it

Therefore, we can safely discard these k/2 elements and reduce our problem size by half.

Time Complexity: O(log(m + n)) because we eliminate half of the remaining search space in each recursive call.

Space Complexity: O(log(m + n)) for the recursion stack depth.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through finding the median of nums1 = [1, 3] and nums2 = [2, 4].

Step 1: Initial Setup

  • m = 2, n = 2, total length = 4 (even)
  • We need to find:
    • a = f(0, 0, (4+1)//2) = f(0, 0, 2) → 2nd smallest element
    • b = f(0, 0, (4+2)//2) = f(0, 0, 3) → 3rd smallest element
    • Median = (a + b) / 2

Step 2: Finding a = f(0, 0, 2) (2nd smallest element)

Call f(0, 0, 2):

  • i = 0, j = 0, k = 2
  • Not a base case (both arrays have elements, k > 1)
  • p = k // 2 = 1
  • x = nums1[0 + 1 - 1] = nums1[0] = 1
  • y = nums2[0 + 1 - 1] = nums2[0] = 2
  • Since x < y (1 < 2), eliminate nums1[0] and recurse

Call f(1, 0, 1):

  • i = 1, j = 0, k = 1
  • Base case: k == 1
  • Return min(nums1[1], nums2[0]) = min(3, 2) = 2

So a = 2

Step 3: Finding b = f(0, 0, 3) (3rd smallest element)

Call f(0, 0, 3):

  • i = 0, j = 0, k = 3
  • Not a base case
  • p = k // 2 = 1
  • x = nums1[0] = 1
  • y = nums2[0] = 2
  • Since x < y (1 < 2), eliminate nums1[0] and recurse

Call f(1, 0, 2):

  • i = 1, j = 0, k = 2
  • Not a base case
  • p = k // 2 = 1
  • x = nums1[1 + 1 - 1] = nums1[1] = 3
  • y = nums2[0 + 1 - 1] = nums2[0] = 2
  • Since x >= y (3 >= 2), eliminate nums2[0] and recurse

Call f(1, 1, 1):

  • i = 1, j = 1, k = 1
  • Base case: k == 1
  • Return min(nums1[1], nums2[1]) = min(3, 4) = 3

So b = 3

Step 4: Calculate Median

  • Median = (a + b) / 2 = (2 + 3) / 2 = 2.5

The combined sorted array would be [1, 2, 3, 4], and indeed the median is (2 + 3) / 2 = 2.5.

Notice how the algorithm efficiently found the 2nd and 3rd smallest elements without merging the arrays, eliminating half of the remaining search space at each step.

Solution Implementation

1class Solution:
2    def findMedianSortedArrays(self, nums1: List[int], nums2: List[int]) -> float:
3        """
4        Find the median of two sorted arrays.
5      
6        Args:
7            nums1: First sorted array
8            nums2: Second sorted array
9          
10        Returns:
11            The median value of the combined sorted arrays
12        """
13      
14        def find_kth_element(index1: int, index2: int, k: int) -> int:
15            """
16            Find the k-th smallest element in the union of nums1[index1:] and nums2[index2:].
17            Uses binary search by eliminating k/2 elements at each step.
18          
19            Args:
20                index1: Starting index in nums1
21                index2: Starting index in nums2
22                k: The k-th element to find (1-indexed)
23              
24            Returns:
25                The k-th smallest element
26            """
27            # Base case: if nums1 is exhausted, return from nums2
28            if index1 >= len_nums1:
29                return nums2[index2 + k - 1]
30          
31            # Base case: if nums2 is exhausted, return from nums1
32            if index2 >= len_nums2:
33                return nums1[index1 + k - 1]
34          
35            # Base case: if k is 1, return the minimum of current elements
36            if k == 1:
37                return min(nums1[index1], nums2[index2])
38          
39            # Binary search: compare elements at position k/2 - 1 from current indices
40            half_k = k // 2
41          
42            # Get the element at position (index + k/2 - 1) or infinity if out of bounds
43            nums1_mid_value = nums1[index1 + half_k - 1] if index1 + half_k - 1 < len_nums1 else float('inf')
44            nums2_mid_value = nums2[index2 + half_k - 1] if index2 + half_k - 1 < len_nums2 else float('inf')
45          
46            # Eliminate the smaller half and recursively find the (k - k/2)-th element
47            if nums1_mid_value < nums2_mid_value:
48                # Eliminate first half_k elements from nums1
49                return find_kth_element(index1 + half_k, index2, k - half_k)
50            else:
51                # Eliminate first half_k elements from nums2
52                return find_kth_element(index1, index2 + half_k, k - half_k)
53      
54        # Store array lengths
55        len_nums1, len_nums2 = len(nums1), len(nums2)
56        total_length = len_nums1 + len_nums2
57      
58        # For odd total length: find the middle element
59        # For even total length: find the two middle elements and average them
60        # Using (total + 1) / 2 and (total + 2) / 2 handles both cases elegantly
61        left_median = find_kth_element(0, 0, (total_length + 1) // 2)
62        right_median = find_kth_element(0, 0, (total_length + 2) // 2)
63      
64        # Return the average of the two middle values
65        # For odd length, left_median == right_median
66        return (left_median + right_median) / 2.0
67
1class Solution {
2    // Length of first array
3    private int firstArrayLength;
4    // Length of second array
5    private int secondArrayLength;
6    // Reference to first sorted array
7    private int[] firstArray;
8    // Reference to second sorted array
9    private int[] secondArray;
10
11    /**
12     * Finds the median of two sorted arrays using binary search approach.
13     * The median is found by finding the (n+1)/2-th and (n+2)/2-th smallest elements
14     * and averaging them, which handles both odd and even total lengths.
15     * 
16     * @param nums1 First sorted array
17     * @param nums2 Second sorted array
18     * @return The median value of the combined sorted arrays
19     */
20    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
21        // Initialize instance variables
22        firstArrayLength = nums1.length;
23        secondArrayLength = nums2.length;
24        this.firstArray = nums1;
25        this.secondArray = nums2;
26      
27        // For odd total length: both positions point to the same middle element
28        // For even total length: positions point to the two middle elements
29        int leftMedianElement = findKthSmallest(0, 0, (firstArrayLength + secondArrayLength + 1) / 2);
30        int rightMedianElement = findKthSmallest(0, 0, (firstArrayLength + secondArrayLength + 2) / 2);
31      
32        // Average the two middle elements (same element for odd length)
33        return (leftMedianElement + rightMedianElement) / 2.0;
34    }
35
36    /**
37     * Recursively finds the k-th smallest element in two sorted arrays.
38     * Uses binary search by eliminating k/2 elements at each step.
39     * 
40     * @param firstArrayStartIndex Starting index in the first array
41     * @param secondArrayStartIndex Starting index in the second array
42     * @param k The position (1-indexed) of the element to find
43     * @return The k-th smallest element
44     */
45    private int findKthSmallest(int firstArrayStartIndex, int secondArrayStartIndex, int k) {
46        // Base case: first array is exhausted, return k-th element from second array
47        if (firstArrayStartIndex >= firstArrayLength) {
48            return secondArray[secondArrayStartIndex + k - 1];
49        }
50      
51        // Base case: second array is exhausted, return k-th element from first array
52        if (secondArrayStartIndex >= secondArrayLength) {
53            return firstArray[firstArrayStartIndex + k - 1];
54        }
55      
56        // Base case: finding the 1st smallest element (minimum of current elements)
57        if (k == 1) {
58            return Math.min(firstArray[firstArrayStartIndex], secondArray[secondArrayStartIndex]);
59        }
60      
61        // Number of elements to potentially eliminate (half of k)
62        int halfK = k / 2;
63      
64        // Get the element at position halfK from current start in first array
65        // Use MAX_VALUE as sentinel if index would be out of bounds
66        int firstArrayMidValue = (firstArrayStartIndex + halfK - 1 < firstArrayLength) 
67            ? firstArray[firstArrayStartIndex + halfK - 1] 
68            : Integer.MAX_VALUE;
69      
70        // Get the element at position halfK from current start in second array
71        // Use MAX_VALUE as sentinel if index would be out of bounds
72        int secondArrayMidValue = (secondArrayStartIndex + halfK - 1 < secondArrayLength) 
73            ? secondArray[secondArrayStartIndex + halfK - 1] 
74            : Integer.MAX_VALUE;
75      
76        // Compare middle values and eliminate the smaller half
77        // The array with smaller middle value cannot contain the k-th smallest element
78        // beyond its middle point, so we can safely skip those elements
79        if (firstArrayMidValue < secondArrayMidValue) {
80            // Eliminate halfK elements from first array
81            return findKthSmallest(firstArrayStartIndex + halfK, secondArrayStartIndex, k - halfK);
82        } else {
83            // Eliminate halfK elements from second array
84            return findKthSmallest(firstArrayStartIndex, secondArrayStartIndex + halfK, k - halfK);
85        }
86    }
87}
88
1class Solution {
2public:
3    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
4        int size1 = nums1.size();
5        int size2 = nums2.size();
6      
7        // Lambda function to find the k-th smallest element in two sorted arrays
8        // Parameters: index1 - starting index in nums1
9        //            index2 - starting index in nums2
10        //            k - we want to find the k-th smallest element (1-indexed)
11        function<int(int, int, int)> findKthElement = [&](int index1, int index2, int k) {
12            // If nums1 is exhausted, return the k-th element from nums2
13            if (index1 >= size1) {
14                return nums2[index2 + k - 1];
15            }
16          
17            // If nums2 is exhausted, return the k-th element from nums1
18            if (index2 >= size2) {
19                return nums1[index1 + k - 1];
20            }
21          
22            // Base case: if k is 1, return the minimum of current elements
23            if (k == 1) {
24                return min(nums1[index1], nums2[index2]);
25            }
26          
27            // Binary search approach: eliminate k/2 elements at a time
28            int halfK = k / 2;
29          
30            // Get the element at position (index + k/2 - 1) in each array
31            // If out of bounds, use INT_MAX to ensure we don't choose this array
32            int value1 = (index1 + halfK - 1 < size1) ? nums1[index1 + halfK - 1] : INT_MAX;
33            int value2 = (index2 + halfK - 1 < size2) ? nums2[index2 + halfK - 1] : INT_MAX;
34          
35            // Compare and eliminate the smaller half
36            // If value1 < value2, all elements from nums1[index1] to nums1[index1 + halfK - 1]
37            // are guaranteed to be among the smallest k elements
38            if (value1 < value2) {
39                return findKthElement(index1 + halfK, index2, k - halfK);
40            } else {
41                return findKthElement(index1, index2 + halfK, k - halfK);
42            }
43        };
44      
45        // Find the median elements
46        // For odd total length: both leftMedian and rightMedian are the same (middle element)
47        // For even total length: leftMedian is the left middle, rightMedian is the right middle
48        int totalLength = size1 + size2;
49        int leftMedian = findKthElement(0, 0, (totalLength + 1) / 2);
50        int rightMedian = findKthElement(0, 0, (totalLength + 2) / 2);
51      
52        // Return the average of the two middle elements
53        return (leftMedian + rightMedian) / 2.0;
54    }
55};
56
1/**
2 * Finds the median of two sorted arrays
3 * @param nums1 - First sorted array
4 * @param nums2 - Second sorted array
5 * @returns The median value of the combined arrays
6 */
7function findMedianSortedArrays(nums1: number[], nums2: number[]): number {
8    const firstArrayLength: number = nums1.length;
9    const secondArrayLength: number = nums2.length;
10  
11    /**
12     * Recursively finds the k-th smallest element in two sorted arrays
13     * @param firstIndex - Current index in nums1
14     * @param secondIndex - Current index in nums2
15     * @param kthElement - The k-th element to find (1-indexed)
16     * @returns The k-th smallest element
17     */
18    const findKthElement = (firstIndex: number, secondIndex: number, kthElement: number): number => {
19        // If we've exhausted all elements in nums1, return from nums2
20        if (firstIndex >= firstArrayLength) {
21            return nums2[secondIndex + kthElement - 1];
22        }
23      
24        // If we've exhausted all elements in nums2, return from nums1
25        if (secondIndex >= secondArrayLength) {
26            return nums1[firstIndex + kthElement - 1];
27        }
28      
29        // Base case: if looking for the 1st element, return the minimum
30        if (kthElement === 1) {
31            return Math.min(nums1[firstIndex], nums2[secondIndex]);
32        }
33      
34        // Calculate the midpoint for binary search
35        const halfK: number = Math.floor(kthElement / 2);
36      
37        // Get the element at position (currentIndex + halfK - 1) in each array
38        // Use a large value (1 << 30) if index is out of bounds
39        const midValueInFirst: number = firstIndex + halfK - 1 < firstArrayLength 
40            ? nums1[firstIndex + halfK - 1] 
41            : 1 << 30;
42      
43        const midValueInSecond: number = secondIndex + halfK - 1 < secondArrayLength 
44            ? nums2[secondIndex + halfK - 1] 
45            : 1 << 30;
46      
47        // Eliminate halfK elements from the array with smaller mid value
48        if (midValueInFirst < midValueInSecond) {
49            return findKthElement(firstIndex + halfK, secondIndex, kthElement - halfK);
50        } else {
51            return findKthElement(firstIndex, secondIndex + halfK, kthElement - halfK);
52        }
53    };
54  
55    // Calculate the median position(s)
56    const totalLength: number = firstArrayLength + secondArrayLength;
57  
58    // For odd total length: find the middle element
59    // For even total length: find the two middle elements
60    const leftMedian: number = findKthElement(0, 0, Math.floor((totalLength + 1) / 2));
61    const rightMedian: number = findKthElement(0, 0, Math.floor((totalLength + 2) / 2));
62  
63    // Return the average of the two middle values
64    return (leftMedian + rightMedian) / 2;
65}
66

Time and Space Complexity

Time Complexity: O(log(m + n))

The algorithm uses a recursive approach to find the k-th smallest element in two sorted arrays. The key insight is that in each recursive call, the function f eliminates half of the remaining elements to search (specifically k//2 elements).

  • In each recursive call, we compare elements at positions i + p - 1 and j + p - 1 where p = k // 2
  • Based on the comparison, we eliminate p elements from either nums1 or nums2
  • The value of k reduces by p in each recursive call, effectively halving the search space
  • Since we're reducing k by half in each iteration and k starts at most at (m + n + 2) // 2, the maximum recursion depth is O(log(m + n))
  • Each recursive call performs O(1) operations (comparisons and index calculations)

Therefore, the total time complexity is O(log(m + n)).

Space Complexity: O(log(m + n))

The space complexity is determined by the recursion call stack:

  • The function f is called recursively without using any additional data structures
  • The maximum recursion depth is O(log(m + n)) as explained above
  • Each recursive call uses O(1) space for its parameters (i, j, k) and local variables (p, x, y)
  • The call stack will have at most O(log(m + n)) frames

Therefore, the space complexity is O(log(m + n)) due to the recursion stack.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Off-by-One Errors in Index Calculation

One of the most common mistakes is incorrectly calculating indices when finding the k-th element or when checking bounds.

Pitfall Example:

# Incorrect: forgetting that k is 1-indexed
return nums2[j + k]  # Should be nums2[j + k - 1]

# Incorrect: wrong boundary check
if index1 + half_k < len_nums1:  # Should be index1 + half_k - 1
    nums1_mid_value = nums1[index1 + half_k]

Solution: Always remember that:

  • The k-th element (1-indexed) is at position index + k - 1 (0-indexed)
  • When comparing elements at k/2 position, use index + k/2 - 1

2. Integer Overflow When Using Infinity

When one array doesn't have enough elements, using a very large number as infinity can cause issues in some languages or edge cases.

Pitfall Example:

# Potential issue with integer operations
nums1_mid_value = 10**9  # Might not be large enough for all cases

Solution: Use float('inf') in Python, which handles comparisons correctly:

nums1_mid_value = nums1[index1 + half_k - 1] if index1 + half_k - 1 < len_nums1 else float('inf')

3. Incorrect Handling of k = 0 or k/2 = 0

When k becomes 1, dividing by 2 gives 0, which can lead to infinite recursion if not handled properly.

Pitfall Example:

def find_kth_element(index1, index2, k):
    half_k = k // 2  # When k=1, half_k=0
    # Without the k==1 base case, this would try to eliminate 0 elements
    return find_kth_element(index1 + half_k, index2, k - half_k)  # Infinite loop!

Solution: Always include the k == 1 base case before the recursive logic:

if k == 1:
    return min(nums1[index1], nums2[index2])

4. Wrong Median Position Calculation

A subtle but critical error is using the wrong formula for finding median positions.

Pitfall Example:

# Incorrect for even-length arrays
left_median = find_kth_element(0, 0, total_length // 2)
right_median = find_kth_element(0, 0, total_length // 2 + 1)
# This finds positions (n/2) and (n/2 + 1) instead of (n/2) and (n/2 + 1) for 1-indexed

Solution: Use the formula (n + 1) // 2 and (n + 2) // 2:

left_median = find_kth_element(0, 0, (total_length + 1) // 2)
right_median = find_kth_element(0, 0, (total_length + 2) // 2)

This correctly handles both odd and even lengths:

  • Odd (n=5): positions 3 and 3 (same element)
  • Even (n=4): positions 2 and 3 (two middle elements)

5. Not Handling Empty Arrays

The code might fail if one or both arrays are empty.

Pitfall Example:

# Without proper checks, this could fail
if nums1[index1] < nums2[index2]:  # IndexError if either array is empty

Solution: The base cases should handle empty arrays naturally:

if index1 >= len_nums1:  # Handles empty nums1 when len_nums1 = 0
    return nums2[index2 + k - 1]
if index2 >= len_nums2:  # Handles empty nums2 when len_nums2 = 0
    return nums1[index1 + k - 1]

6. Forgetting to Update k After Elimination

When eliminating elements, it's crucial to reduce k by the number of eliminated elements.

Pitfall Example:

# Incorrect: not updating k
return find_kth_element(index1 + half_k, index2, k)  # Should be k - half_k

Solution: Always subtract the number of eliminated elements from k:

return find_kth_element(index1 + half_k, index2, k - half_k)
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which algorithm is best for finding the shortest distance between two points in an unweighted graph?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More