1805. Number of Different Integers in a String

EasyHash TableString
Leetcode Link

Problem Description

This problem involves a string word that is composed of both digits and lowercase English letters. The goal is to transform this string in a way that all non-digit characters are replaced by spaces, which effectively segments any clusters of digits into separate numbers. After this replacement, you'll have a collection of separate numbers that might include leading zeroes.

The task is now to count how many different integers you get from these numbers. The important point to note here is that integers with leading zeros should be considered the same as those without leading zeros. For instance, "001" and "1" should be counted as the same integer. The output should be the total number of unique integers you can find after performing the operations specified.

Intuition

When attacking this problem, we must focus on two main tasks:

  1. Segregating the numbers from the mix of digits and letters.
  2. Ensuring that any leading zeros are ignored to avoid counting the same number multiple times.

The intuitive approach here is to iterate through the string, identifying digits and collecting consecutive digits into whole numbers. Whenever we encounter a non-digit character, we treat that as a delimiter indicating the end of a number.

While iterating over the string and finding these numbers, we also need to deal with leading zeros. The solution handles this efficiently by first going through any '0's before finishing collecting the number and only then adding it to a set. A set is the ideal data structure here because it naturally avoids duplication. Whenever we encounter a sequence of digits, we slice that part out of the string and add it to our set, and the set ensures that it only contains unique numbers.

After the iteration, the size of the set corresponds to the number of different integers present in the modified string because each entry in the set is unique. Thus, the last step of the solution is simply to return this count.

Solution Approach

The solution to the problem uses a simple yet effective algorithm that employs a set data structure and a single pass through the string to identify and store unique integers. Here's step-by-step implementation details:

  1. Initialize a Set: A set named s is created to store integers uniquely encountered during the iteration. In Python, a set automatically handles duplicates, only storing unique values.

  2. Iterate Over the String: The algorithm uses a while loop to iterate over the entire string, indexed by i. Using while is more flexible here than a for loop because we may need to increment i by more than one to skip over the digits or leading zeros.

  3. Find Integers: Inside the loop, whenever a digit is encountered, the following steps are taken:

    • Skip over Leading Zeros: First, while the current character is '0', increment i to remove leading zeros from consideration.
    • Collect the Number: Then set a secondary index j to i and increment j until a non-digit character is found, which signifies the end of the current integer.
    • Add the Integer to the Set: With the start and end of an integer determined, a slice of word from i to j is taken (which is our integer without leading zeros), and added to the set s.
    • Update the Primary Index: Lastly, set i equal to j, as we've finished processing the current number and need to continue scanning the rest of the string.
  4. Increment the Index: If the current character is not a digit, just increment i by one to move to the next character.

  5. Return Unique Count: After the while loop is done, the algorithm returns the size of the set s (len(s)), which represents the count of unique integers found in the string.

By using a set and carefully iterating over the string, the solution efficiently filters out non-digit characters, handles leading zeros, and ensures that each integer is considered only once, despite how many times it appears in the string.

Example Walkthrough

Let's take a simple example string word equal to "a10b00100c01". We aim to find out how many different integers are in the string after converting non-digit characters to spaces and considering integers with leading zeros as the same.

Here is how the solution approach applies to this example:

  1. Initialize a Set: We create an empty set s to store unique integers.

  2. Iterate Over the String: We start with i = 0. Since word[0] is 'a', a letter, we just increment i to 1.

  3. Find Integers: When i = 1, we find '1', which is a digit. At this point:

    • Skip over Leading Zeros: There are no leading zeros, so we do nothing.
    • Collect the Number: We set j to 1 and increment j until we find a non-digit character. In this case, j becomes 3 when it reaches 'b'.
    • Add the Integer to the Set: We add word[1:3] (which is "10") to set s.
    • Update the Primary Index: We set i equal to j (i.e., i = 3).
  4. Increment the Index: At i = 3, we have 'b', a non-digit. We increment i by one to 4.

  5. Find Integers: Now, i = 4, and we find '0', a digit:

    • Skip over Leading Zeros: We increment i while the character is '0'. Now i = 6, at the next non-zero digit.
    • Collect the Number: We set j to 6 and increment j until we find a non-digit character. This happens immediately as 'c' at index 6 is a letter.
    • No number to add to the Set: Since i and j are the same, we don't have a number slice to add to the set s.
    • Update the Primary Index: i remains equal to j (i.e., i = 6).
  6. Increment the Index: As i = 6 and word[i] is 'c', we increment i by one to 7.

  7. Find Integers: At i = 7, we find '0':

    • Skip over Leading Zeros: We increment i to skip over zeros and reach the non-zero digit '1' at i = 9.
    • Collect the Number: We set j to 9 and increment j. Since the end of the string is reached, j becomes the end of the string.
    • Add the Integer to the Set: We add word[9:end] (which is "1") to the set s.
    • Update the Primary Index: We set i to the end of the string because we've reached the end.
  8. Return Unique Count: The iteration is finished. Our set s contains {"10", "1"}. The count of unique integers is the size of the set s, which is 2.

The different integers we have found are 10 and 1. Despite the multiple representations of the number one, such as "01", "001", "0001" etc., they are all counted as a single unique integer 1.

Thus, the algorithm successfully arrives at the correct answer: there are 2 different integers in the string "a10b00100c01".

Python Solution

1class Solution:
2    def numDifferentIntegers(self, word: str) -> int:
3        # Initialize a set to store unique integers found in the string
4        unique_integers = set()
5
6        # Initialize index and get the length of the word
7        index, length = 0, len(word)
8
9        # Iterate over each character in the word
10        while index < length:
11            # Check if the current character is a digit
12            if word[index].isdigit():
13                # Skip leading zeros
14                while index < length and word[index] == '0':
15                    index += 1
16                # Mark the start of the non-zero sequence
17                start = index
18                # Find the end of the continuous digit sequence
19                while index < length and word[index].isdigit():
20                    index += 1
21                # Add the non-zero starting sequence of digits to the set
22                unique_integers.add(word[start:index])
23            # Move to the next character
24            index += 1
25
26        # Return the count of unique integers found
27        return len(unique_integers)
28

Java Solution

1class Solution {
2
3    // Method to count the number of different integers in a string
4    public int numDifferentIntegers(String word) {
5        // Set to store unique integers represented as strings
6        Set<String> uniqueNumbers = new HashSet<>();
7        int length = word.length(); // Length of the input string
8
9        // Loop through each character in the string
10        for (int i = 0; i < length; ++i) {
11            // Check if the current character is a digit
12            if (Character.isDigit(word.charAt(i))) {
13                // Skip leading zeros for the current number
14                while (i < length && word.charAt(i) == '0') {
15                    ++i;
16                }
17              
18                // Starting index of the non-zero digit
19                int startIndex = i;
20                // Find the end index of the contiguous digits
21                while (startIndex < length && Character.isDigit(word.charAt(startIndex))) {
22                    ++startIndex;
23                }
24              
25                // Add the integer (without leading zeros) to the set
26                uniqueNumbers.add(word.substring(i, startIndex)); 
27              
28                // Move to the position after the current number
29                i = startIndex;
30            }
31        }
32
33        // Return the size of the set, which is the number of unique integers
34        return uniqueNumbers.size();
35    }
36}
37

C++ Solution

1#include <string>
2#include <unordered_set>
3
4class Solution {
5public:
6    // Function to count the number of distinct integers in a string
7    int numDifferentIntegers(std::string word) {
8        std::unordered_set<std::string> distinctNumbers; // Set to store unique numbers as strings
9        int length = word.size(); // Size of the input string
10
11        for (int i = 0; i < length; ++i) {
12            // If the current character is a digit, start processing the number
13            if (isdigit(word[i])) {
14                // Skip leading zeros
15                while (i < length && word[i] == '0') ++i;
16
17                // Find the end of the current number
18                int start = i;
19                while (start < length && isdigit(word[start])) ++start;
20
21                // Insert the number as a substring into the set, ensuring uniqueness
22                distinctNumbers.insert(word.substr(i, start - i));
23
24                // Move to the end of the current number to continue the outer loop
25                i = start;
26            }
27        }
28
29        // Return the count of unique numbers found in the string
30        return distinctNumbers.size();
31    }
32};
33

Typescript Solution

1function numDifferentIntegers(word: string): number {
2    // Use a regular expression to replace all non-digit characters with a single space
3    const wordWithSpaces = word.replace(/\D+/g, ' ');
4
5    // Trim leading and trailing spaces and split the string into an array on spaces
6    const rawNumbers = wordWithSpaces.trim().split(' ');
7
8    // Filter out any empty strings that may result from multiple adjacent non-digit characters
9    const filteredNumbers = rawNumbers.filter(value => value !== '');
10
11    // Remove leading zeros from each number string
12    const cleanedNumbers = filteredNumbers.map(value => value.replace(/^0+/g, ''));
13
14    // Create a Set to hold unique numbers and determine its size
15    const uniqueNumbers = new Set(cleanedNumbers);
16
17    return uniqueNumbers.size;
18}
19

Time and Space Complexity

The given Python code defines a method to count the number of distinct integers in a string. Let's analyze both time and space complexity:

Time Complexity

The time complexity of the code can be broken down as follows:

  1. Loop through each character in the string: O(n), where n is the length of the string word.
  2. Nested loop to skip leading zeros: In the worst case, this occurs once for each digit, contributing O(n).
  3. Nested loop to identify the digits of the integer: This also occurs once for each digit, contributing O(n).

Although there are nested loops, each character is processed only once. Thus, the overall time complexity is O(n).

Space Complexity

The space complexity is determined by the extra space used:

  1. A set s is used to store distinct integers. In the worst case, each character could be a different non-zero digit, resulting in a set size of O(n).
  2. Substrings representing integers are added to the set, but the total length of all unique integer substrings cannot exceed n.

Hence, the space complexity is O(n).

Overall, both the time complexity and the space complexity of the method are O(n), where n is the length of the input string word.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫