Facebook Pixel

604. Design Compressed String Iterator 🔒

EasyDesignArrayStringIterator
Leetcode Link

Problem Description

You need to design a data structure that can iterate through a compressed string. The compressed string follows a specific format where each letter is immediately followed by a positive integer indicating how many times that letter appears.

For example, if you have a compressed string like "L1e2t1C1o1d1e1", it represents the uncompressed string "LeetCode" where:

  • L appears 1 time
  • e appears 2 times
  • t appears 1 time
  • C appears 1 time
  • o appears 1 time
  • d appears 1 time
  • e appears 1 time

Your StringIterator class should support two operations:

  1. next(): Returns the next character from the uncompressed sequence. If all characters have been consumed, it returns a white space ' '.

  2. hasNext(): Returns true if there are still characters left to be returned from the uncompressed string, false otherwise.

The solution parses the compressed string during initialization by:

  • Extracting each character c and its count x
  • Storing these pairs in a list d as [character, count]
  • Using a pointer p to track which character-count pair is currently being processed

When next() is called:

  • It returns the current character if available
  • Decrements the count for that character
  • Moves to the next character-count pair when the current count reaches zero

The hasNext() method checks if there are any remaining characters by verifying if the pointer is within bounds and the current character still has a positive count.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we look at this problem, we need to think about how to efficiently handle a compressed string where we might need to return the same character multiple times. The key insight is that we don't want to actually decompress the entire string upfront - that would be wasteful in terms of space, especially if the compressed string represents millions of characters (like "a1000000").

Instead, we can think of the compressed string as a series of "chunks" where each chunk consists of a character and how many times it should be repeated. For example, "L1e2t1" can be thought of as three chunks: [L, 1], [e, 2], and [t, 1].

The natural approach is to parse these chunks once during initialization and store them in a structured format. This way, we only need to keep track of:

  1. Which chunk we're currently reading from (using a pointer p)
  2. How many characters from the current chunk we've already consumed

When someone calls next(), we don't need to search through the string again - we just:

  • Look at the current chunk
  • Return its character
  • Decrease the count by 1
  • If the count reaches 0, move to the next chunk

This approach gives us O(1) time complexity for both next() and hasNext() operations after the initial parsing. The parsing itself takes O(n) where n is the length of the compressed string, but this only happens once during initialization.

The beauty of this solution is that it maintains the compression benefits - we're not expanding "a1000000" into a million 'a's in memory. We're just storing ['a', 1000000] and keeping track of how many we've used.

Solution Approach

The implementation follows a parsing and storing strategy with efficient iteration tracking.

Initialization Phase:

During initialization of StringIterator, we parse the compressed string into a structured format:

  1. Create an empty list self.d to store character-count pairs
  2. Initialize a pointer self.p = 0 to track the current position in our parsed data
  3. Iterate through the compressed string with index i:
    • Extract the current character c = compressedString[i]
    • Move to the next position and start building the count x = 0
    • Continue reading digits and build the number: x = x * 10 + int(compressedString[i])
    • Once we've read all digits for this count, store the pair [c, x] in our list
    • Repeat until the entire string is parsed

For example, parsing "L1e2t1" would produce: d = [['L', 1], ['e', 2], ['t', 1]]

Next Operation:

The next() method retrieves characters one at a time:

  1. First check if there are any characters left using hasNext()
  2. If no characters remain, return a space character ' '
  3. Otherwise:
    • Get the character from the current position: ans = self.d[self.p][0]
    • Decrement the count for this character: self.d[self.p][1] -= 1
    • If the count reaches 0, move the pointer to the next character-count pair: self.p += 1
    • Return the character

HasNext Operation:

The hasNext() method checks availability:

  • Return True if:
    • The pointer p is within the bounds of our list (p < len(self.d))
    • AND the current character still has a positive count (self.d[self.p][1] > 0)
  • Otherwise return False

This design ensures that we maintain the compressed representation in memory while providing efficient O(1) access to each character. The space complexity is O(k) where k is the number of character-count pairs in the compressed string, which is much better than storing the fully decompressed string.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with the compressed string "a2b1c3", which represents "aabccc".

Initialization:

  • Start with compressedString = "a2b1c3"
  • Create empty list d = [] and set pointer p = 0
  • Parse the string:
    • i=0: Read 'a', then read '2' → Store ['a', 2]
    • i=2: Read 'b', then read '1' → Store ['b', 1]
    • i=4: Read 'c', then read '3' → Store ['c', 3]
  • Final state: d = [['a', 2], ['b', 1], ['c', 3]], p = 0

Operation Sequence:

  1. Call next():

    • Check hasNext() → True (p=0 < 3 and d[0][1]=2 > 0)
    • Return 'a' from d[0][0]
    • Decrement: d[0][1] = 2-1 = 1
    • State: d = [['a', 1], ['b', 1], ['c', 3]], p = 0
  2. Call next():

    • Check hasNext() → True (p=0 < 3 and d[0][1]=1 > 0)
    • Return 'a' from d[0][0]
    • Decrement: d[0][1] = 1-1 = 0
    • Count is 0, so increment p: p = 1
    • State: d = [['a', 0], ['b', 1], ['c', 3]], p = 1
  3. Call hasNext():

    • Check: p=1 < 3 and d[1][1]=1 > 0
    • Return True
  4. Call next():

    • Check hasNext() → True
    • Return 'b' from d[1][0]
    • Decrement: d[1][1] = 1-1 = 0
    • Count is 0, so increment p: p = 2
    • State: d = [['a', 0], ['b', 0], ['c', 3]], p = 2
  5. Call next() three times for 'c':

    • Each call returns 'c' and decrements d[2][1]
    • After third call: d[2][1] = 0, p increments to 3
    • State: d = [['a', 0], ['b', 0], ['c', 0]], p = 3
  6. Call hasNext():

    • Check: p=3 is NOT < 3
    • Return False
  7. Call next():

    • Check hasNext() → False
    • Return ' ' (space character)

This walkthrough demonstrates how the solution efficiently tracks position without decompressing the entire string, maintaining O(1) operations for both next() and hasNext().

Solution Implementation

1class StringIterator:
2    def __init__(self, compressedString: str):
3        """
4        Initialize the iterator with a compressed string.
5        The compressed string format is: character followed by its count.
6        Example: "L1e2t1" represents "Leet"
7        """
8        # Store parsed character-count pairs
9        self.char_count_pairs = []
10        # Current position in the pairs list
11        self.current_index = 0
12      
13        # Parse the compressed string
14        string_length = len(compressedString)
15        i = 0
16      
17        while i < string_length:
18            # Extract the character
19            character = compressedString[i]
20            i += 1
21          
22            # Extract the count (which may be multiple digits)
23            count = 0
24            while i < string_length and compressedString[i].isdigit():
25                count = count * 10 + int(compressedString[i])
26                i += 1
27          
28            # Store the character-count pair
29            self.char_count_pairs.append([character, count])
30
31    def next(self) -> str:
32        """
33        Return the next character in the uncompressed sequence.
34        Returns ' ' if there are no more characters.
35        """
36        # Check if there are more characters available
37        if not self.hasNext():
38            return ' '
39      
40        # Get the current character
41        result_char = self.char_count_pairs[self.current_index][0]
42      
43        # Decrement the count for this character
44        self.char_count_pairs[self.current_index][1] -= 1
45      
46        # Move to next character-count pair if current count reaches 0
47        if self.char_count_pairs[self.current_index][1] == 0:
48            self.current_index += 1
49      
50        return result_char
51
52    def hasNext(self) -> bool:
53        """
54        Check if there are more characters to iterate.
55        Returns True if there are more characters, False otherwise.
56        """
57        # Check if current index is valid and current character has remaining count
58        return (self.current_index < len(self.char_count_pairs) and 
59                self.char_count_pairs[self.current_index][1] > 0)
60
61
62# Your StringIterator object will be instantiated and called as such:
63# obj = StringIterator(compressedString)
64# param_1 = obj.next()
65# param_2 = obj.hasNext()
66
1/**
2 * Iterator for compressed strings where each letter is followed by its count.
3 * Example: "L1e2t1" represents "Leet"
4 */
5class StringIterator {
6    // List to store character-count pairs from the compressed string
7    private List<CharCountPair> charCountPairs = new ArrayList<>();
8    // Current position/index in the list of character-count pairs
9    private int currentIndex;
10
11    /**
12     * Constructor that parses the compressed string into character-count pairs.
13     * @param compressedString The compressed string in format like "L1e2t1"
14     */
15    public StringIterator(String compressedString) {
16        int length = compressedString.length();
17        int i = 0;
18      
19        // Parse the compressed string
20        while (i < length) {
21            // Extract the character
22            char character = compressedString.charAt(i);
23          
24            // Extract the count number following the character
25            int count = 0;
26            i++; // Move to the first digit
27            while (i < length && Character.isDigit(compressedString.charAt(i))) {
28                count = count * 10 + (compressedString.charAt(i) - '0');
29                i++;
30            }
31          
32            // Store the character-count pair
33            charCountPairs.add(new CharCountPair(character, count));
34        }
35      
36        currentIndex = 0;
37    }
38
39    /**
40     * Returns the next character in the uncompressed sequence.
41     * @return The next character, or ' ' if no more characters exist
42     */
43    public char next() {
44        // Check if there are more characters available
45        if (!hasNext()) {
46            return ' ';
47        }
48      
49        // Get the current character
50        char result = charCountPairs.get(currentIndex).character;
51      
52        // Decrement the count for this character
53        charCountPairs.get(currentIndex).count--;
54      
55        // Move to next character-count pair if current count reaches 0
56        if (charCountPairs.get(currentIndex).count == 0) {
57            currentIndex++;
58        }
59      
60        return result;
61    }
62
63    /**
64     * Checks if there are more characters to iterate.
65     * @return true if more characters exist, false otherwise
66     */
67    public boolean hasNext() {
68        return currentIndex < charCountPairs.size() && 
69               charCountPairs.get(currentIndex).count > 0;
70    }
71}
72
73/**
74 * Helper class to store a character and its occurrence count.
75 */
76class CharCountPair {
77    char character;  // The character
78    int count;        // Number of times the character appears
79  
80    /**
81     * Constructor for CharCountPair.
82     * @param character The character to store
83     * @param count The number of occurrences of the character
84     */
85    CharCountPair(char character, int count) {
86        this.character = character;
87        this.count = count;
88    }
89}
90
91/**
92 * Your StringIterator object will be instantiated and called as such:
93 * StringIterator obj = new StringIterator(compressedString);
94 * char param_1 = obj.next();
95 * boolean param_2 = obj.hasNext();
96 */
97
1class StringIterator {
2public:
3    /**
4     * Constructor: Decompresses the input string into character-count pairs
5     * @param compressedString: A string in format "char1count1char2count2..."
6     *                          e.g., "L1e2t1C1o1d1e1" represents "LeetCode"
7     */
8    StringIterator(string compressedString) {
9        int stringLength = compressedString.size();
10        int index = 0;
11      
12        // Parse the compressed string
13        while (index < stringLength) {
14            // Extract the character
15            char character = compressedString[index];
16          
17            // Extract the count (may be multi-digit)
18            int count = 0;
19            index++; // Move past the character
20            while (index < stringLength && isdigit(compressedString[index])) {
21                count = count * 10 + (compressedString[index] - '0');
22                index++;
23            }
24          
25            // Store the character-count pair
26            charCountPairs.push_back({character, count});
27        }
28    }
29  
30    /**
31     * Returns the next character in the uncompressed sequence
32     * @return: The next character, or ' ' if no more characters exist
33     */
34    char next() {
35        // Check if there are remaining characters
36        if (!hasNext()) {
37            return ' ';
38        }
39      
40        // Get the current character
41        char result = charCountPairs[currentPairIndex].first;
42      
43        // Decrement the count for current character
44        charCountPairs[currentPairIndex].second--;
45      
46        // Move to next pair if current count reaches 0
47        if (charCountPairs[currentPairIndex].second == 0) {
48            currentPairIndex++;
49        }
50      
51        return result;
52    }
53  
54    /**
55     * Checks if there are more characters to iterate
56     * @return: true if more characters exist, false otherwise
57     */
58    bool hasNext() {
59        // Check if current index is valid and has remaining count
60        return currentPairIndex < charCountPairs.size() && 
61               charCountPairs[currentPairIndex].second > 0;
62    }
63
64private:
65    // Vector storing pairs of (character, remaining count)
66    vector<pair<char, int>> charCountPairs;
67  
68    // Index of the current character-count pair being processed
69    int currentPairIndex = 0;
70};
71
72/**
73 * Your StringIterator object will be instantiated and called as such:
74 * StringIterator* obj = new StringIterator(compressedString);
75 * char param_1 = obj->next();
76 * bool param_2 = obj->hasNext();
77 */
78
1// Array to store character-count pairs from the compressed string
2let characterCountPairs: [string, number][] = [];
3// Current position pointer in the characterCountPairs array
4let currentPosition: number = 0;
5
6/**
7 * Initializes the iterator with a compressed string
8 * @param compressedString - String in format like "L1e2t1" where each letter is followed by its count
9 */
10function StringIterator(compressedString: string): void {
11    // Reset state for new initialization
12    characterCountPairs = [];
13    currentPosition = 0;
14  
15    const stringLength = compressedString.length;
16    let index = 0;
17  
18    // Parse the compressed string into character-count pairs
19    while (index < stringLength) {
20        // Extract the character
21        const character = compressedString[index];
22        index++;
23      
24        // Extract the count number that follows the character
25        let count = 0;
26        while (index < stringLength && !isNaN(Number(compressedString[index]))) {
27            count = count * 10 + Number(compressedString[index]);
28            index++;
29        }
30      
31        // Store the character-count pair
32        characterCountPairs.push([character, count]);
33    }
34}
35
36/**
37 * Returns the next character from the iterator
38 * @returns The next character, or ' ' if no more characters are available
39 */
40function next(): string {
41    // Check if there are more characters available
42    if (!hasNext()) {
43        return ' ';
44    }
45  
46    // Get the character at current position
47    const nextCharacter = characterCountPairs[currentPosition][0];
48  
49    // Decrement the count for this character
50    characterCountPairs[currentPosition][1]--;
51  
52    // Move to next character-count pair if current count reaches zero
53    if (characterCountPairs[currentPosition][1] === 0) {
54        currentPosition++;
55    }
56  
57    return nextCharacter;
58}
59
60/**
61 * Checks if there are more characters available in the iterator
62 * @returns true if more characters are available, false otherwise
63 */
64function hasNext(): boolean {
65    // Check if current position is valid and has remaining count
66    return currentPosition < characterCountPairs.length && 
67           characterCountPairs[currentPosition][1] > 0;
68}
69

Time and Space Complexity

Time Complexity:

  • __init__: O(n) where n is the length of compressedString. The initialization parses the entire compressed string once, extracting each character and its count. Even though there's a nested while loop for parsing digits, each character in the string is visited exactly once.
  • next(): O(1). This method performs constant time operations - checking conditions, accessing array elements by index, decrementing a counter, and potentially incrementing the pointer.
  • hasNext(): O(1). This method only performs a simple comparison and array access by index.

Space Complexity:

  • O(k) where k is the number of character-count pairs in the compressed string. The self.d list stores pairs of [character, count]. In the worst case where every character appears once (like "a1b1c1..."), k could be n/2. In the best case with long runs (like "a1000"), k would be much smaller than n. The space used is proportional to the number of unique character segments, not the expanded string length.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Not Handling Multi-Digit Counts Correctly

One of the most common mistakes is assuming each count is a single digit. The compressed string can have counts like "a15b20" where 'a' appears 15 times and 'b' appears 20 times.

Incorrect Implementation:

# Wrong: Only reads single digit
count = int(compressedString[i])
i += 1

Correct Implementation:

# Correct: Reads all consecutive digits
count = 0
while i < string_length and compressedString[i].isdigit():
    count = count * 10 + int(compressedString[i])
    i += 1

2. Modifying Original Data Structure During Iteration

Another pitfall is directly modifying the count in place without considering that multiple iterator instances might exist or that the modification affects the original structure permanently.

Better Alternative Approach:

class StringIterator:
    def __init__(self, compressedString: str):
        self.char_count_pairs = []
        # ... parsing logic ...
      
        # Keep track of remaining count separately
        self.remaining_count = 0
        self.current_char = None
        if self.char_count_pairs:
            self.current_char = self.char_count_pairs[0][0]
            self.remaining_count = self.char_count_pairs[0][1]
  
    def next(self) -> str:
        if not self.hasNext():
            return ' '
      
        result = self.current_char
        self.remaining_count -= 1
      
        # Move to next pair if needed
        if self.remaining_count == 0:
            self.current_index += 1
            if self.current_index < len(self.char_count_pairs):
                self.current_char = self.char_count_pairs[self.current_index][0]
                self.remaining_count = self.char_count_pairs[self.current_index][1]
      
        return result

3. Index Out of Bounds When Checking hasNext()

Failing to properly check bounds before accessing array elements can cause runtime errors.

Problematic Pattern:

def hasNext(self) -> bool:
    # Wrong: May cause index error if current_index >= len(self.char_count_pairs)
    return self.char_count_pairs[self.current_index][1] > 0

Safe Implementation:

def hasNext(self) -> bool:
    # Correct: Check bounds first
    return (self.current_index < len(self.char_count_pairs) and 
            self.char_count_pairs[self.current_index][1] > 0)

4. Not Handling Edge Cases

Empty strings or strings with zero counts need special consideration:

def __init__(self, compressedString: str):
    self.char_count_pairs = []
    self.current_index = 0
  
    # Handle empty string
    if not compressedString:
        return
  
    # ... parsing logic ...
  
    # Skip any pairs with 0 count during parsing
    if count > 0:
        self.char_count_pairs.append([character, count])

5. Memory Inefficiency with Large Counts

While the current solution is memory-efficient, an alternative lazy parsing approach could be even better for very large compressed strings:

class StringIterator:
    def __init__(self, compressedString: str):
        self.compressed = compressedString
        self.pos = 0  # Position in compressed string
        self.current_char = None
        self.current_count = 0
        self._advance()
  
    def _advance(self):
        """Parse next character-count pair on demand"""
        if self.pos >= len(self.compressed):
            self.current_char = None
            self.current_count = 0
            return
      
        self.current_char = self.compressed[self.pos]
        self.pos += 1
      
        count = 0
        while self.pos < len(self.compressed) and self.compressed[self.pos].isdigit():
            count = count * 10 + int(self.compressed[self.pos])
            self.pos += 1
      
        self.current_count = count

This lazy approach only parses what's needed, saving memory when dealing with very long compressed strings where only a small portion might be accessed.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More