830. Positions of Large Groups
Problem Description
You are given a string s
consisting of lowercase letters. In this string, consecutive identical characters form groups.
For example, in the string s = "abbxxxxzyy"
:
"a"
forms a group at position[0, 0]
"bb"
forms a group at position[1, 2]
"xxxx"
forms a group at position[3, 6]
"z"
forms a group at position[7, 7]
"yy"
forms a group at position[8, 9]
Each group is represented by an interval [start, end]
where:
start
is the index of the first character in the groupend
is the index of the last character in the group (inclusive)
A group is considered large if it contains 3 or more characters.
Your task is to find all the large groups in the string and return their intervals as a list of [start, end]
pairs. The result should be sorted in increasing order by the start index.
For the example above, only "xxxx"
has 3 or more characters, so the output would be [[3, 6]]
.
Intuition
The key insight is that we need to identify consecutive sequences of the same character and track their boundaries. Think of scanning through the string and grouping identical adjacent characters together.
We can use a two-pointer approach where one pointer marks the start of a group and another pointer extends forward as long as it encounters the same character. Once we hit a different character, we've found the complete group.
For each group we identify, we need to:
- Record where it starts (position
i
) - Find where it ends by moving forward while characters remain the same (position
j-1
) - Check if the group size
(j - i)
is at least 3 - If it's large enough, save the interval
[i, j-1]
After processing one group, we move to the next unprocessed character and repeat. This naturally gives us the groups in order since we're scanning left to right through the string.
The beauty of this approach is its simplicity - we make a single pass through the string, and for each position, we either extend the current group or start a new one. Since we process characters sequentially from left to right, the resulting intervals are automatically sorted by start position.
Solution Approach
The solution implements a two-pointer technique to identify and measure consecutive character groups:
Algorithm Steps:
-
Initialize pointers and result list: Start with pointer
i = 0
to track the beginning of each group, and create an empty listans
to store the intervals of large groups. -
Outer loop - Process each group: While
i < n
(wheren
is the string length):- Set
j = i
to start exploring the current group - Inner loop - Find group boundary: While
j < n
ands[j] == s[i]
, incrementj
. This extendsj
to find all consecutive characters that matchs[i]
- After the inner loop,
j
points to the first character that differs froms[i]
(or the end of string)
- Set
-
Check group size: Calculate the group size as
j - i
:- If
j - i >= 3
, the group is large, so append[i, j-1]
to the result - Note that we use
j-1
as the end position becausej
points one position past the last matching character
- If
-
Move to next group: Set
i = j
to start processing the next group -
Return result: The list
ans
contains all large group intervals in order
Example Walkthrough:
For s = "abbxxxxzyy"
:
i=0, j=0→1
: Group "a" has size 1 (not large)i=1, j=1→3
: Group "bb" has size 2 (not large)i=3, j=3→7
: Group "xxxx" has size 4 (large), add[3, 6]
i=7, j=7→8
: Group "z" has size 1 (not large)i=8, j=8→10
: Group "yy" has size 2 (not large)
The time complexity is O(n)
since each character is visited exactly once, and space complexity is O(1)
excluding the output array.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's trace through the algorithm with s = "aabbbbaaca"
:
Initial Setup:
- String:
"aabbbbaaca"
(length n = 10) - Initialize
i = 0
,ans = []
Step 1: Process first group starting at i=0
- Set
j = 0
- Inner loop:
s[0] = 'a'
, so extend while we see 'a':j=0
:s[0] = 'a'
equalss[0]
, increment j → 1j=1
:s[1] = 'a'
equalss[0]
, increment j → 2j=2
:s[2] = 'b'
differs froms[0]
, stop
- Group size:
j - i = 2 - 0 = 2
(not large) - Move to next:
i = j = 2
Step 2: Process group starting at i=2
- Set
j = 2
- Inner loop:
s[2] = 'b'
, extend while we see 'b':j=2
:s[2] = 'b'
equalss[2]
, increment j → 3j=3
:s[3] = 'b'
equalss[2]
, increment j → 4j=4
:s[4] = 'b'
equalss[2]
, increment j → 5j=5
:s[5] = 'b'
equalss[2]
, increment j → 6j=6
:s[6] = 'a'
differs froms[2]
, stop
- Group size:
j - i = 6 - 2 = 4
(large!) - Add interval
[2, 5]
to ans (note: end index is j-1) - Move to next:
i = j = 6
Step 3: Process group starting at i=6
- Set
j = 6
- Inner loop:
s[6] = 'a'
, extend while we see 'a':j=6
:s[6] = 'a'
equalss[6]
, increment j → 7j=7
:s[7] = 'a'
equalss[6]
, increment j → 8j=8
:s[8] = 'c'
differs froms[6]
, stop
- Group size:
j - i = 8 - 6 = 2
(not large) - Move to next:
i = j = 8
Step 4: Process group starting at i=8
- Set
j = 8
- Inner loop:
s[8] = 'c'
, extend while we see 'c':j=8
:s[8] = 'c'
equalss[8]
, increment j → 9j=9
:s[9] = 'a'
differs froms[8]
, stop
- Group size:
j - i = 9 - 8 = 1
(not large) - Move to next:
i = j = 9
Step 5: Process final group starting at i=9
- Set
j = 9
- Inner loop:
s[9] = 'a'
, extend while we see 'a':j=9
:s[9] = 'a'
equalss[9]
, increment j → 10j=10
: reached end of string, stop
- Group size:
j - i = 10 - 9 = 1
(not large) - Move to next:
i = j = 10
Termination: i = 10 >= n
, exit outer loop
Final Result: ans = [[2, 5]]
The algorithm correctly identified "bbbb" at positions 2-5 as the only large group with 4 consecutive characters.
Solution Implementation
1class Solution:
2 def largeGroupPositions(self, s: str) -> List[List[int]]:
3 """
4 Find all positions of large groups in string s.
5 A large group is a group of 3 or more consecutive identical characters.
6
7 Args:
8 s: Input string to search for large groups
9
10 Returns:
11 List of [start, end] positions for each large group
12 """
13 # Initialize pointer and get string length
14 current_index = 0
15 string_length = len(s)
16 result = []
17
18 # Iterate through the string
19 while current_index < string_length:
20 # Start of current group
21 group_start = current_index
22
23 # Find the end of current group of identical characters
24 while current_index < string_length and s[current_index] == s[group_start]:
25 current_index += 1
26
27 # Check if group is large (3 or more characters)
28 group_size = current_index - group_start
29 if group_size >= 3:
30 # Add [start, end] position to result (end is inclusive)
31 result.append([group_start, current_index - 1])
32
33 # Move to the next group (current_index already points to next different character)
34
35 return result
36
1class Solution {
2 public List<List<Integer>> largeGroupPositions(String s) {
3 int stringLength = s.length();
4 int currentIndex = 0;
5 List<List<Integer>> result = new ArrayList<>();
6
7 // Iterate through the string to find groups of consecutive characters
8 while (currentIndex < stringLength) {
9 // Mark the start position of current group
10 int groupEndIndex = currentIndex;
11
12 // Extend the group while characters are the same
13 while (groupEndIndex < stringLength &&
14 s.charAt(groupEndIndex) == s.charAt(currentIndex)) {
15 groupEndIndex++;
16 }
17
18 // Check if the group size is at least 3 (large group)
19 int groupSize = groupEndIndex - currentIndex;
20 if (groupSize >= 3) {
21 // Add the interval [start, end] to result
22 // Note: groupEndIndex is one past the last matching character
23 result.add(List.of(currentIndex, groupEndIndex - 1));
24 }
25
26 // Move to the next group
27 currentIndex = groupEndIndex;
28 }
29
30 return result;
31 }
32}
33
1class Solution {
2public:
3 vector<vector<int>> largeGroupPositions(string s) {
4 int stringLength = s.size();
5 int currentIndex = 0;
6 vector<vector<int>> result;
7
8 // Iterate through the string to find groups of consecutive identical characters
9 while (currentIndex < stringLength) {
10 int groupEndIndex = currentIndex;
11
12 // Extend the group while characters are the same
13 while (groupEndIndex < stringLength && s[groupEndIndex] == s[currentIndex]) {
14 ++groupEndIndex;
15 }
16
17 // Check if the group size is at least 3 (large group)
18 int groupSize = groupEndIndex - currentIndex;
19 if (groupSize >= 3) {
20 // Add the start and end indices of the large group
21 result.push_back({currentIndex, groupEndIndex - 1});
22 }
23
24 // Move to the next group
25 currentIndex = groupEndIndex;
26 }
27
28 return result;
29 }
30};
31
1/**
2 * Finds all positions of large groups in a string.
3 * A large group is a group of 3 or more consecutive identical characters.
4 * @param s - The input string to search for large groups
5 * @returns An array of [start, end] positions for each large group
6 */
7function largeGroupPositions(s: string): number[][] {
8 const stringLength: number = s.length;
9 const result: number[][] = [];
10
11 // Iterate through the string to find groups of identical characters
12 for (let startIndex: number = 0; startIndex < stringLength; ) {
13 let endIndex: number = startIndex;
14
15 // Extend endIndex while characters are the same as the character at startIndex
16 while (endIndex < stringLength && s[endIndex] === s[startIndex]) {
17 endIndex++;
18 }
19
20 // Check if the group size is 3 or more (large group)
21 const groupSize: number = endIndex - startIndex;
22 if (groupSize >= 3) {
23 // Add the group's start and end positions (end position is inclusive)
24 result.push([startIndex, endIndex - 1]);
25 }
26
27 // Move to the next different character
28 startIndex = endIndex;
29 }
30
31 return result;
32}
33
Time and Space Complexity
Time Complexity: O(n)
, where n
is the length of the string s
.
The algorithm uses a two-pointer approach with variables i
and j
. The outer while loop iterates through the string with pointer i
, and the inner while loop advances pointer j
to find consecutive identical characters. Although there are nested loops, each character in the string is visited exactly once by pointer j
. Once j
moves forward, it never goes back, making the total number of operations proportional to n
. Therefore, the overall time complexity is linear.
Space Complexity: O(1)
for auxiliary space, or O(k)
for the output space, where k
is the number of large groups found.
The algorithm uses only a constant amount of extra space for variables i
, j
, and n
. However, the output list ans
stores the results, which in the worst case could contain O(n/3)
groups (when the string consists of alternating groups of exactly 3 identical characters). Since we typically don't count the space required for the output in space complexity analysis, the auxiliary space complexity is O(1)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Off-by-One Error in End Position
One of the most common mistakes is incorrectly calculating the end position of a large group. After the inner loop, the pointer j
(or current_index
) points to the first character that differs from the group or one position past the string's end. Many developers forget to subtract 1 when recording the end position.
Incorrect Implementation:
# Wrong: Using current_index directly as the end position if group_size >= 3: result.append([group_start, current_index]) # ❌ Should be current_index - 1
Why it's wrong: If we have "xxxx" at positions [3, 6], after the inner loop, current_index
would be 7 (pointing to the next different character). Using 7 as the end position would incorrectly suggest the group extends to index 7.
Correct Implementation:
# Correct: Subtracting 1 from current_index for the actual end position if group_size >= 3: result.append([group_start, current_index - 1]) # ✓ Correct
2. Infinite Loop Due to Missing Pointer Update
Another pitfall occurs when developers use a separate variable for the inner loop but forget to update the main loop pointer, causing an infinite loop.
Incorrect Implementation:
i = 0
while i < len(s):
j = i
while j < len(s) and s[j] == s[i]:
j += 1
if j - i >= 3:
result.append([i, j - 1])
# Forgot to update i! This causes infinite loop
# i = j # ❌ Missing this line
Correct Implementation:
i = 0
while i < len(s):
j = i
while j < len(s) and s[j] == s[i]:
j += 1
if j - i >= 3:
result.append([i, j - 1])
i = j # ✓ Essential: Move to the next group
3. Edge Case: Empty String or Single Character
While the algorithm handles these cases correctly by default, developers sometimes add unnecessary special case handling that can introduce bugs.
Unnecessary and Error-Prone:
# Overcomplicated with unnecessary checks
if not s or len(s) < 3: # ❌ Unnecessary special casing
return []
# Rest of the algorithm...
Better Approach: The main algorithm naturally handles edge cases:
- Empty string: The while loop never executes
- Single character: Group size is 1, which is less than 3
- Two characters: Maximum group size is 2, still less than 3
The clean implementation without special cases is more maintainable and less error-prone.
How does quick sort divide the problem into subproblems?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!