Facebook Pixel

3641. Longest Semi-Repeating Subarray 🔒

Problem Description

You are given an integer array nums of length n and an integer k.

A semi-repeating subarray is defined as a contiguous subarray where at most k elements appear more than once (i.e., at most k elements are repeating).

Your task is to find and return the length of the longest semi-repeating subarray in nums.

For example, if you have an array [1, 2, 3, 2, 4, 3, 5] and k = 2, the subarray [1, 2, 3, 2, 4, 3] is valid because exactly 2 elements (2 and 3) appear more than once. However, if we tried to extend it to include 5, making it [1, 2, 3, 2, 4, 3, 5], it would still be valid since we still have only 2 repeating elements.

The key constraint is that you can have at most k distinct values that appear multiple times within any subarray you consider. Elements that appear exactly once in the subarray don't count toward this limit.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we need to find the longest subarray with certain properties, a sliding window approach often works well. The key insight here is that we need to track how many elements are "repeating" (appearing more than once) in our current window.

Think about expanding a window from left to right. As we add new elements, we need to know:

  1. Has this element appeared before in our window?
  2. If it has appeared exactly once before, adding it now makes it a "repeating" element

The critical observation is that we only care about the transition point - when an element goes from appearing once to appearing twice. That's when it becomes a "repeating" element and contributes to our count.

Similarly, when we shrink the window from the left:

  • If removing an element causes its count to drop from 2 to 1, it's no longer "repeating"
  • We need to decrease our repeating element counter accordingly

This leads us to maintain:

  • A frequency map (cnt) to track occurrences of each element
  • A counter (cur) for the number of repeating elements
  • When cnt[x] becomes 2, we increment cur
  • When cnt[x] drops to 1, we decrement cur

The window remains valid as long as cur <= k. When cur > k, we have too many repeating elements and must shrink the window from the left until the condition is satisfied again.

This two-pointer sliding window technique ensures we explore all possible valid subarrays while maintaining O(n) time complexity, as each element is visited at most twice (once by right pointer, once by left pointer).

Solution Approach

We implement a sliding window technique using two pointers to efficiently find the longest semi-repeating subarray.

Data Structures Used:

  • cnt: A hash table (defaultdict) to track the frequency of each element in the current window
  • cur: Counter for the number of repeating elements (elements appearing more than once)
  • l and r: Left and right pointers defining our sliding window
  • ans: Stores the maximum length found so far

Algorithm Steps:

  1. Initialize variables:

    • cnt = defaultdict(int) for frequency tracking
    • ans = cur = l = 0 to start with empty window and zero repeating elements
  2. Expand the window by moving right pointer:

    for r, x in enumerate(nums):
        cnt[x] += 1
        cur += cnt[x] == 2
    • For each element x at position r, increment its count in cnt
    • If cnt[x] becomes exactly 2, this element just became a repeating element, so increment cur
    • The expression cnt[x] == 2 returns True (1) or False (0), elegantly updating the counter
  3. Shrink window when constraint is violated:

    while cur > k:
        cnt[nums[l]] -= 1
        cur -= cnt[nums[l]] == 1
        l += 1
    • When we have more than k repeating elements, shrink from the left
    • Decrement the count of the leftmost element
    • If its count drops to exactly 1, it's no longer repeating, so decrement cur
    • Move the left pointer forward
  4. Update the answer:

    ans = max(ans, r - l + 1)
    • After ensuring the window is valid, update ans with the current window size
  5. Return the result:

    • After processing all elements, return the maximum length found

Time Complexity: O(n) - Each element is visited at most twice (once by each pointer)

Space Complexity: O(min(n, m)) where m is the number of distinct elements in the array

The beauty of this solution lies in tracking state transitions (1→2 for becoming repeating, 2→1 for stopping repetition) rather than constantly recounting repeating elements, making it highly efficient.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with nums = [1, 2, 3, 2, 4, 3, 5] and k = 2.

We'll track our sliding window with left pointer l and right pointer r, maintaining:

  • cnt: frequency map of elements in current window
  • cur: count of repeating elements (those appearing > 1 time)
  • ans: maximum valid subarray length found

Initial State: l = 0, cur = 0, ans = 0, cnt = {}

Step 1: r = 0, element = 1

  • Add 1 to window: cnt = {1: 1}
  • cnt[1] becomes 1 (not 2), so cur stays 0
  • Window [1] is valid, ans = max(0, 0-0+1) = 1

Step 2: r = 1, element = 2

  • Add 2 to window: cnt = {1: 1, 2: 1}
  • cnt[2] becomes 1 (not 2), so cur stays 0
  • Window [1, 2] is valid, ans = max(1, 1-0+1) = 2

Step 3: r = 2, element = 3

  • Add 3 to window: cnt = {1: 1, 2: 1, 3: 1}
  • cnt[3] becomes 1 (not 2), so cur stays 0
  • Window [1, 2, 3] is valid, ans = max(2, 2-0+1) = 3

Step 4: r = 3, element = 2

  • Add 2 to window: cnt = {1: 1, 2: 2, 3: 1}
  • cnt[2] becomes 2, so cur = 1 (2 is now repeating)
  • Window [1, 2, 3, 2] is valid (cur ≤ k), ans = max(3, 3-0+1) = 4

Step 5: r = 4, element = 4

  • Add 4 to window: cnt = {1: 1, 2: 2, 3: 1, 4: 1}
  • cnt[4] becomes 1 (not 2), so cur stays 1
  • Window [1, 2, 3, 2, 4] is valid, ans = max(4, 4-0+1) = 5

Step 6: r = 5, element = 3

  • Add 3 to window: cnt = {1: 1, 2: 2, 3: 2, 4: 1}
  • cnt[3] becomes 2, so cur = 2 (3 is now also repeating)
  • Window [1, 2, 3, 2, 4, 3] is valid (cur = k), ans = max(5, 5-0+1) = 6

Step 7: r = 6, element = 5

  • Add 5 to window: cnt = {1: 1, 2: 2, 3: 2, 4: 1, 5: 1}
  • cnt[5] becomes 1 (not 2), so cur stays 2
  • Window [1, 2, 3, 2, 4, 3, 5] is valid (cur ≤ k), ans = max(6, 6-0+1) = 7

Final Result: The longest semi-repeating subarray has length 7 (the entire array).

In this example, we have exactly 2 repeating elements (2 and 3), which satisfies our constraint of k = 2. The sliding window never needed to shrink because we never exceeded k repeating elements.

Solution Implementation

1from collections import defaultdict
2from typing import List
3
4class Solution:
5    def longestSubarray(self, nums: List[int], k: int) -> int:
6        # Dictionary to track frequency of each element in current window
7        frequency_map = defaultdict(int)
8      
9        # Initialize variables
10        max_length = 0  # Maximum length of valid subarray found
11        duplicate_count = 0  # Count of elements that appear more than once
12        left_pointer = 0  # Left boundary of sliding window
13      
14        # Iterate through array with right pointer
15        for right_pointer, current_num in enumerate(nums):
16            # Add current element to window
17            frequency_map[current_num] += 1
18          
19            # If element now appears exactly twice, increment duplicate count
20            if frequency_map[current_num] == 2:
21                duplicate_count += 1
22          
23            # Shrink window from left while we have too many duplicates
24            while duplicate_count > k:
25                left_num = nums[left_pointer]
26                frequency_map[left_num] -= 1
27              
28                # If element was appearing twice and now appears once,
29                # decrement duplicate count
30                if frequency_map[left_num] == 1:
31                    duplicate_count -= 1
32              
33                left_pointer += 1
34          
35            # Update maximum length with current valid window size
36            current_window_size = right_pointer - left_pointer + 1
37            max_length = max(max_length, current_window_size)
38      
39        return max_length
40
1class Solution {
2    public int longestSubarray(int[] nums, int k) {
3        // HashMap to store the frequency count of each element in the current window
4        Map<Integer, Integer> frequencyMap = new HashMap<>();
5      
6        // Variable to track the maximum length of valid subarray found
7        int maxLength = 0;
8      
9        // Variable to track the count of elements that appear at least twice in current window
10        int elementsWithDuplicates = 0;
11      
12        // Left pointer for the sliding window
13        int left = 0;
14      
15        // Iterate through the array using right pointer
16        for (int right = 0; right < nums.length; right++) {
17            // Add current element to the window and update its frequency
18            // merge() returns the new value after merging
19            int newFrequency = frequencyMap.merge(nums[right], 1, Integer::sum);
20          
21            // If this element now appears exactly twice, increment the duplicate counter
22            if (newFrequency == 2) {
23                elementsWithDuplicates++;
24            }
25          
26            // Shrink the window from left while we have more than k elements with duplicates
27            while (elementsWithDuplicates > k) {
28                // Decrease frequency of the leftmost element
29                int updatedFrequency = frequencyMap.merge(nums[left], -1, Integer::sum);
30              
31                // If this element's frequency dropped from 2 to 1, decrement duplicate counter
32                if (updatedFrequency == 1) {
33                    elementsWithDuplicates--;
34                }
35              
36                // Move left pointer forward
37                left++;
38            }
39          
40            // Update maximum length with current valid window size
41            maxLength = Math.max(maxLength, right - left + 1);
42        }
43      
44        return maxLength;
45    }
46}
47
1class Solution {
2public:
3    int longestSubarray(vector<int>& nums, int k) {
4        // Hash map to store the frequency of each element in the current window
5        unordered_map<int, int> frequencyMap;
6      
7        // Variable to track the maximum length of valid subarray found
8        int maxLength = 0;
9      
10        // Variable to count elements that appear at least twice in the current window
11        int elementsWithDuplicates = 0;
12      
13        // Left pointer for the sliding window
14        int left = 0;
15      
16        // Iterate through the array with right pointer
17        for (int right = 0; right < nums.size(); ++right) {
18            // Add current element to the window and increment its frequency
19            frequencyMap[nums[right]]++;
20          
21            // If this element now appears exactly twice, increment the duplicate counter
22            if (frequencyMap[nums[right]] == 2) {
23                elementsWithDuplicates++;
24            }
25          
26            // Shrink the window from the left while we have too many elements with duplicates
27            while (elementsWithDuplicates > k) {
28                // Decrement the frequency of the leftmost element
29                frequencyMap[nums[left]]--;
30              
31                // If this element's frequency drops from 2 to 1, decrement the duplicate counter
32                if (frequencyMap[nums[left]] == 1) {
33                    elementsWithDuplicates--;
34                }
35              
36                // Move the left pointer forward
37                left++;
38            }
39          
40            // Update the maximum length with the current valid window size
41            maxLength = max(maxLength, right - left + 1);
42        }
43      
44        return maxLength;
45    }
46};
47
1/**
2 * Finds the longest subarray where at most k elements appear more than once
3 * @param nums - The input array of numbers
4 * @param k - Maximum number of elements that can appear more than once
5 * @returns The length of the longest valid subarray
6 */
7function longestSubarray(nums: number[], k: number): number {
8    // Map to track frequency of each element in current window
9    const frequencyMap: Map<number, number> = new Map();
10  
11    // maxLength: maximum subarray length found so far
12    // duplicateCount: count of elements appearing more than once in current window
13    // leftPointer: left boundary of sliding window
14    let maxLength: number = 0;
15    let duplicateCount: number = 0;
16    let leftPointer: number = 0;
17  
18    // Iterate through array with right pointer
19    for (let rightPointer = 0; rightPointer < nums.length; rightPointer++) {
20        // Add current element to frequency map
21        const currentElement: number = nums[rightPointer];
22        frequencyMap.set(currentElement, (frequencyMap.get(currentElement) || 0) + 1);
23      
24        // If element now appears exactly twice, increment duplicate count
25        if (frequencyMap.get(currentElement) === 2) {
26            duplicateCount++;
27        }
28      
29        // Shrink window from left while duplicate count exceeds k
30        while (duplicateCount > k) {
31            const leftElement: number = nums[leftPointer];
32            frequencyMap.set(leftElement, frequencyMap.get(leftElement)! - 1);
33          
34            // If element count drops to 1, it's no longer a duplicate
35            if (frequencyMap.get(leftElement) === 1) {
36                duplicateCount--;
37            }
38          
39            leftPointer++;
40        }
41      
42        // Update maximum length with current valid window size
43        maxLength = Math.max(maxLength, rightPointer - leftPointer + 1);
44    }
45  
46    return maxLength;
47}
48

Time and Space Complexity

Time Complexity: O(n)

The algorithm uses a sliding window approach with two pointers (l and r). The right pointer r iterates through the array exactly once via the enumerate function. The left pointer l only moves forward and never backwards, meaning each element is visited at most twice (once by r and once by l). Despite the nested while loop, the total number of operations across all iterations is bounded by 2n, resulting in O(n) time complexity where n is the length of the array nums.

Space Complexity: O(n)

The space complexity is dominated by the defaultdict(int) data structure cnt, which stores the frequency count of elements in the current window. In the worst case, when all elements in the array are distinct, the dictionary could contain up to n unique keys, where n is the length of the array nums. The other variables (ans, cur, l, r, x) use constant space O(1). Therefore, the overall space complexity is O(n).

Common Pitfalls

Pitfall 1: Incorrectly Tracking State Transitions

The Problem: A common mistake is to increment/decrement the duplicate counter at the wrong frequency thresholds. Developers often write:

# INCORRECT - increments for every occurrence beyond the first
if frequency_map[current_num] > 1:
    duplicate_count += 1

This would incorrectly increment duplicate_count every time we see a duplicate, not just when an element transitions from appearing once to appearing twice.

Why It Fails: If we have the array [1, 2, 2, 2], the element 2 would incorrectly contribute 2 to duplicate_count (when going from 1→2 and 2→3), even though it should only count as one repeating element.

The Correct Approach:

# CORRECT - only increments when transitioning from 1 to 2 occurrences
if frequency_map[current_num] == 2:
    duplicate_count += 1

Pitfall 2: Forgetting to Clean Up Zero-Frequency Elements

The Problem: When using a defaultdict, elements with zero frequency remain in the dictionary, potentially causing memory issues with large inputs or when the window slides extensively.

The Fix:

while duplicate_count > k:
    left_num = nums[left_pointer]
    frequency_map[left_num] -= 1
  
    if frequency_map[left_num] == 1:
        duplicate_count -= 1
    # Clean up elements with zero frequency
    elif frequency_map[left_num] == 0:
        del frequency_map[left_num]
  
    left_pointer += 1

Pitfall 3: Off-by-One Error in Window Size Calculation

The Problem: Calculating window size as right_pointer - left_pointer instead of right_pointer - left_pointer + 1.

Why It Matters: For a window from index 2 to 5, the size should be 4 elements (indices 2, 3, 4, 5), not 3.

The Correct Formula:

current_window_size = right_pointer - left_pointer + 1

Pitfall 4: Misunderstanding the Problem Constraint

The Problem: Confusing "at most k elements appear more than once" with:

  • "Each element can appear at most k times"
  • "The total number of duplicates is at most k"
  • "At most k pairs of duplicates"

Clarification: The constraint means we can have at most k distinct values that have frequency > 1 in our window. For example, with k=2:

  • Valid: [1, 2, 2, 3, 3, 4] - only elements 2 and 3 repeat
  • Invalid: [1, 1, 2, 2, 3, 3] - three elements (1, 2, 3) repeat

Pitfall 5: Not Handling Edge Cases

The Problem: Failing to consider special cases like:

  • Empty array or single element array
  • k = 0 (no repeating elements allowed)
  • All elements are unique
  • All elements are the same

Robust Solution Structure:

def longestSubarray(self, nums: List[int], k: int) -> int:
    # Handle edge cases
    if not nums:
        return 0
    if len(nums) == 1:
        return 1
  
    # Main sliding window logic...

These pitfalls highlight the importance of carefully tracking state transitions and understanding exactly what the problem is asking for, rather than making assumptions about the constraint interpretation.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which of the following shows the order of node visit in a Breadth-first Search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More