Facebook Pixel

722. Remove Comments

MediumArrayString
Leetcode Link

Problem Description

You need to remove all comments from a C++ program. The program is given as an array of strings source, where each string represents one line of the source code.

There are two types of comments in C++:

  1. Line comments: Starting with "//", everything from this point to the end of the current line should be ignored.

  2. Block comments: Starting with "/*" and ending with "*/", all characters between these markers should be ignored, even if they span multiple lines. Note that "/*/" does not end a block comment because the ending would overlap with the beginning.

Important rules:

  • The first comment marker encountered takes precedence. If "//" appears inside a block comment, it's ignored. Similarly, if "/*" appears inside any comment, it's also ignored.
  • After removing comments, empty lines should not be included in the output.
  • Every string in the result must be non-empty.
  • Block comments can span multiple lines and effectively join parts of different lines together.
  • Every "/*" is guaranteed to have a matching "*/".

The algorithm processes the source code line by line, maintaining a flag blockComment to track whether we're currently inside a block comment. For each line:

  • If inside a block comment, look for "*/" to end it
  • If not in a block comment, check for "/*" to start one or "//" to ignore the rest of the line
  • Valid characters outside comments are collected and form the output lines

The solution returns the source code with all comments removed, preserving the original format but excluding empty lines.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to process the source code character by character while maintaining state about whether we're currently inside a block comment. This is necessary because block comments can span multiple lines, so we can't process each line independently.

Think of it like reading through the code with a highlighter - we need to know at any point whether we should be "highlighting" (ignoring) text or not. The challenge is that comment markers have different behaviors:

  • "/*" starts a highlighting region that continues until we find "*/"
  • "//" only highlights until the end of the current line
  • Once we're highlighting (in a comment), other comment markers don't matter

This naturally leads to a state machine approach where we track whether we're in a block comment (blockComment flag). As we scan through each character:

  • If we're in a block comment, we only care about finding "*/" to exit
  • If we're not in a block comment, we need to check for both "/*" and "//"

The tricky part is handling how block comments can merge lines. When a block comment spans multiple lines like:

line1 /* comment starts
comment continues */ line2

The result should be "line1 line2" on a single line. This is why we use a temporary buffer t to accumulate valid characters. We only create a new output line when:

  1. We finish processing a source line
  2. We're not currently in a block comment
  3. We have accumulated some valid characters

By processing character by character and maintaining these states, we can correctly handle all the edge cases of nested appearances and multi-line comments.

Solution Approach

We implement a character-by-character scanning approach with state tracking. Here's how the solution works:

Data Structures Used:

  • ans: List to store the final result (lines with comments removed)
  • t: Temporary list to accumulate valid characters for the current line being built
  • block_comment: Boolean flag to track if we're currently inside a block comment

Algorithm Steps:

  1. Initialize State: Start with block_comment = False and an empty temporary buffer t.

  2. Process Each Line: For each line s in the source:

    • Set index i = 0 and get line length m = len(s)
    • Process characters one by one using a while loop
  3. Character Processing Logic:

    Case 1: Inside Block Comment (block_comment == True):

    • Check if current and next characters form "*/"
    • If yes: Set block_comment = False and skip both characters (i += 1 extra)
    • If no: Just move to next character (ignore current character)

    Case 2: Outside Block Comment (block_comment == False):

    • Check if current and next characters form "/*":
      • If yes: Set block_comment = True and skip both characters
    • Check if current and next characters form "//":
      • If yes: Break from current line (rest of line is commented)
    • Otherwise:
      • Current character is valid, append to t
  4. Line Completion: After processing all characters in a line:

    • If block_comment == False and t is not empty:
      • Join characters in t to form a string and add to ans
      • Clear t for next line
    • If still in block comment or t is empty, continue to next line without outputting
  5. Return Result: Return ans containing all non-empty lines after comment removal

Key Implementation Details:

  • We check i + 1 < m before accessing two-character sequences to avoid index out of bounds
  • The i += 1 at the end of the main loop moves to the next character
  • When we find a two-character comment marker, we do an additional i += 1 to skip both characters
  • The temporary buffer t allows us to build lines that may span multiple source lines (when block comments join them)

This approach handles all edge cases correctly:

  • Multi-line block comments that join parts of different lines
  • Nested comment markers that should be ignored
  • Empty lines after comment removal that shouldn't appear in output

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate how the solution handles different comment types:

Input:

source = ["int main() {", "  /* block comment", "  continues */ int x = 5;", "  // line comment", "  return 0;", "}"]

Processing Step by Step:

Line 0: "int main() {"

  • block_comment = False, t = []
  • Scan each character: No comment markers found
  • Add all characters to t: ['i','n','t',' ','m','a','i','n','(',')',' ','{']
  • End of line: Not in block comment and t is not empty
  • Output: "int main() {" added to ans
  • Clear t for next line

Line 1: " /* block comment"

  • block_comment = False, t = []
  • Characters ' ',' ' added to t
  • At index 2-3: Found "/*" → Set block_comment = True, skip both characters
  • Continue scanning: remaining characters are ignored (we're in block comment)
  • End of line: Still in block comment, so don't output anything
  • Keep t = [' ',' '] for potential joining with next line

Line 2: " continues */ int x = 5;"

  • block_comment = True, t = [' ',' '] (from previous line)
  • Scan for "*/": Found at index 12-13
  • Set block_comment = False, skip the "*/"
  • Continue from index 14: Add ' ','i','n','t',' ','x',' ','=',' ','5',';' to t
  • End of line: Not in block comment and t is not empty
  • Output: " int x = 5;" added to ans (notice the two spaces from line 1 are preserved)
  • Clear t

Line 3: " // line comment"

  • block_comment = False, t = []
  • Characters ' ',' ' added to t
  • At index 2-3: Found "//" → Break (rest of line is commented)
  • End of line: Not in block comment but t only has spaces
  • Output: " " added to ans
  • Clear t

Line 4: " return 0;"

  • block_comment = False, t = []
  • Scan each character: No comment markers found
  • Add all characters to t
  • End of line: Output " return 0;" added to ans

Line 5: "}"

  • block_comment = False, t = []
  • Single character '}' added to t
  • End of line: Output "}" added to ans

Final Result:

ans = ["int main() {", "  int x = 5;", "  ", "  return 0;", "}"]

This example demonstrates:

  1. How block comments spanning multiple lines join text from different lines
  2. How line comments cause the rest of the line to be ignored
  3. How the state (block_comment flag) carries across lines
  4. How the temporary buffer t accumulates characters until a valid line is complete

Solution Implementation

1class Solution:
2    def removeComments(self, source: List[str]) -> List[str]:
3        """
4        Remove all comments from source code.
5        Handles both single-line comments (//) and multi-line comments (/* */).
6      
7        Args:
8            source: List of strings representing lines of source code
9          
10        Returns:
11            List of strings with all comments removed
12        """
13        result = []
14        current_line = []
15        in_block_comment = False
16      
17        for line in source:
18            index = 0
19            line_length = len(line)
20          
21            while index < line_length:
22                if in_block_comment:
23                    # Currently inside a block comment, look for closing */
24                    if index + 1 < line_length and line[index:index + 2] == "*/":
25                        in_block_comment = False
26                        index += 1  # Skip the closing */ by advancing one extra position
27                else:
28                    # Not in a block comment, check for comment starts
29                    if index + 1 < line_length and line[index:index + 2] == "/*":
30                        # Start of block comment
31                        in_block_comment = True
32                        index += 1  # Skip the opening /* by advancing one extra position
33                    elif index + 1 < line_length and line[index:index + 2] == "//":
34                        # Single-line comment starts, ignore rest of the line
35                        break
36                    else:
37                        # Regular character, add to current line
38                        current_line.append(line[index])
39              
40                index += 1
41          
42            # If not in a block comment and we have accumulated characters,
43            # add the line to results
44            if not in_block_comment and current_line:
45                result.append("".join(current_line))
46                current_line.clear()
47      
48        return result
49
1class Solution {
2    public List<String> removeComments(String[] source) {
3        // List to store the final result after removing comments
4        List<String> result = new ArrayList<>();
5      
6        // StringBuilder to accumulate characters for the current line
7        StringBuilder currentLine = new StringBuilder();
8      
9        // Flag to track if we're currently inside a block comment
10        boolean insideBlockComment = false;
11      
12        // Process each line in the source code
13        for (String line : source) {
14            int lineLength = line.length();
15          
16            // Process each character in the current line
17            for (int i = 0; i < lineLength; i++) {
18              
19                if (insideBlockComment) {
20                    // Check for end of block comment "*/"
21                    if (i + 1 < lineLength && line.charAt(i) == '*' && line.charAt(i + 1) == '/') {
22                        insideBlockComment = false;
23                        i++; // Skip the next character '/'
24                    }
25                    // Otherwise, skip this character since we're inside a block comment
26                  
27                } else {
28                    // Check for start of block comment "/*"
29                    if (i + 1 < lineLength && line.charAt(i) == '/' && line.charAt(i + 1) == '*') {
30                        insideBlockComment = true;
31                        i++; // Skip the next character '*'
32                      
33                    // Check for line comment "//"
34                    } else if (i + 1 < lineLength && line.charAt(i) == '/' && line.charAt(i + 1) == '/') {
35                        break; // Ignore the rest of the line
36                      
37                    // Regular character, add to current line
38                    } else {
39                        currentLine.append(line.charAt(i));
40                    }
41                }
42            }
43          
44            // If we're not inside a block comment and have accumulated characters,
45            // add the line to result and reset the StringBuilder
46            if (!insideBlockComment && currentLine.length() > 0) {
47                result.add(currentLine.toString());
48                currentLine.setLength(0); // Clear the StringBuilder for next line
49            }
50        }
51      
52        return result;
53    }
54}
55
1class Solution {
2public:
3    vector<string> removeComments(vector<string>& source) {
4        vector<string> result;
5        string currentLine;
6        bool inBlockComment = false;
7      
8        // Process each line in the source code
9        for (const auto& line : source) {
10            int lineLength = line.size();
11          
12            // Process each character in the current line
13            for (int i = 0; i < lineLength; ++i) {
14                if (inBlockComment) {
15                    // Look for end of block comment "*/"
16                    if (i + 1 < lineLength && line[i] == '*' && line[i + 1] == '/') {
17                        inBlockComment = false;
18                        ++i;  // Skip the '/' character
19                    }
20                    // Otherwise, skip characters inside block comment
21                } else {
22                    // Check for start of block comment "/*"
23                    if (i + 1 < lineLength && line[i] == '/' && line[i + 1] == '*') {
24                        inBlockComment = true;
25                        ++i;  // Skip the '*' character
26                    }
27                    // Check for line comment "//"
28                    else if (i + 1 < lineLength && line[i] == '/' && line[i + 1] == '/') {
29                        break;  // Ignore rest of the line
30                    }
31                    // Regular character, add to current line
32                    else {
33                        currentLine.push_back(line[i]);
34                    }
35                }
36            }
37          
38            // If not in block comment and current line has content, add to result
39            if (!inBlockComment && !currentLine.empty()) {
40                result.emplace_back(currentLine);
41                currentLine.clear();
42            }
43        }
44      
45        return result;
46    }
47};
48
1/**
2 * Removes comments from source code lines
3 * Handles both single-line (//) and multi-line (/* */) comments
4 * @param source - Array of source code lines
5 * @returns Array of source code lines with comments removed
6 */
7function removeComments(source: string[]): string[] {
8    // Result array to store processed lines
9    const result: string[] = [];
10  
11    // Buffer to accumulate characters for the current line being processed
12    const currentLineBuffer: string[] = [];
13  
14    // Flag to track if we're currently inside a block comment
15    let isInsideBlockComment = false;
16  
17    // Process each line of source code
18    for (const line of source) {
19        const lineLength = line.length;
20      
21        // Process each character in the current line
22        for (let charIndex = 0; charIndex < lineLength; charIndex++) {
23            if (isInsideBlockComment) {
24                // Check for end of block comment (*/)
25                if (charIndex + 1 < lineLength && line.slice(charIndex, charIndex + 2) === '*/') {
26                    isInsideBlockComment = false;
27                    charIndex++; // Skip the next character as it's part of */
28                }
29                // Skip all other characters while inside block comment
30            } else {
31                // Check for start of block comment (/*) 
32                if (charIndex + 1 < lineLength && line.slice(charIndex, charIndex + 2) === '/*') {
33                    isInsideBlockComment = true;
34                    charIndex++; // Skip the next character as it's part of /*
35                } 
36                // Check for single-line comment (//)
37                else if (charIndex + 1 < lineLength && line.slice(charIndex, charIndex + 2) === '//') {
38                    break; // Skip the rest of the line
39                } 
40                // Regular character - add to buffer
41                else {
42                    currentLineBuffer.push(line[charIndex]);
43                }
44            }
45        }
46      
47        // If not inside a block comment and buffer has content, add line to result
48        if (!isInsideBlockComment && currentLineBuffer.length > 0) {
49            result.push(currentLineBuffer.join(''));
50            currentLineBuffer.length = 0; // Clear the buffer for next line
51        }
52    }
53  
54    return result;
55}
56

Time and Space Complexity

The time complexity is O(L), where L is the total length of all source code strings combined. The algorithm processes each character in the source code exactly once. Even though there's a nested loop structure (outer loop iterating through lines and inner while loop iterating through characters in each line), each character is visited only once during the entire execution. The operations performed for each character (checking for comment markers, appending to list) are all O(1) operations.

The space complexity is O(L), where L is the total length of the source code. In the worst case, when there are no comments to remove, the algorithm needs to store all the original characters. The temporary list t can hold up to L characters across all iterations before being joined and added to the answer list. The answer list ans also stores the processed strings, which in the worst case could be nearly as long as the original input. Therefore, the total space used is proportional to the input size.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Incorrect Handling of Comment Marker Boundaries

The Problem: A common mistake is checking for comment markers without verifying that we have enough characters remaining in the line. This leads to index out of bounds errors when a line ends with a single / or *.

Incorrect Implementation:

# WRONG - will crash if line ends with single '/'
if line[index] == '/' and line[index + 1] == '/':
    break

The Solution: Always check that index + 1 < line_length before accessing two-character sequences:

# CORRECT - safely checks bounds first
if index + 1 < line_length and line[index:index + 2] == "//":
    break

Pitfall 2: Forgetting to Skip Both Characters of Comment Markers

The Problem: When encountering a two-character comment marker (/*, */, or //), developers often forget to skip both characters, leading to incorrect parsing where the second character of the marker gets treated as regular code.

Incorrect Implementation:

# WRONG - only increments index once at the end of loop
if index + 1 < line_length and line[index:index + 2] == "/*":
    in_block_comment = True
    # Missing: index += 1

The Solution: Add an extra index += 1 when processing two-character markers:

# CORRECT - skips both characters of the comment marker
if index + 1 < line_length and line[index:index + 2] == "/*":
    in_block_comment = True
    index += 1  # Extra increment to skip both characters

Pitfall 3: Creating New Line Buffer for Each Source Line

The Problem: Creating a new line buffer for each source line breaks the ability to join parts of different lines when a block comment spans multiple lines. This results in incorrect output when code continues after a multi-line block comment.

Incorrect Implementation:

# WRONG - resets buffer each iteration
for line in source:
    current_line = []  # Creates new buffer each time
    # ... processing ...

Example that breaks:

Input:  ["int x = 1; /* comment", "continues */ int y = 2;"]
Wrong:  ["int x = 1; ", "int y = 2;"]  
Right:  ["int x = 1;  int y = 2;"]

The Solution: Maintain the buffer across iterations and only clear it after adding a complete line to results:

# CORRECT - maintains buffer across lines
current_line = []  # Created once outside the loop

for line in source:
    # ... processing ...
    if not in_block_comment and current_line:
        result.append("".join(current_line))
        current_line.clear()  # Only clear after outputting

Pitfall 4: Outputting Empty Lines

The Problem: After removing comments, some lines may become empty, but the problem requires excluding empty lines from the output. Forgetting to check if the accumulated buffer is non-empty before adding to results violates this requirement.

Incorrect Implementation:

# WRONG - might add empty strings
if not in_block_comment:
    result.append("".join(current_line))  # No check if current_line is empty

The Solution: Always check that the buffer contains characters before adding to results:

# CORRECT - only adds non-empty lines
if not in_block_comment and current_line:
    result.append("".join(current_line))
    current_line.clear()
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

The three-steps of Depth First Search are:

  1. Identify states;
  2. Draw the state-space tree;
  3. DFS on the state-space tree.

Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More