Facebook Pixel

3758. Convert Number Words to Digits 🔒

MediumTrieString
LeetCode ↗

Problem Description

You are given a string s consisting of lowercase English letters. The string s may contain valid concatenated English words representing the digits 0 to 9, written without any spaces between them.

Your task is to extract each valid number word in order and convert it to its corresponding digit, producing a string of digits.

The parsing rule works from left to right. At each position in the string:

  • If a valid number word (one of "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine") starts at the current position, append its corresponding digit to the result and advance the position by the length of that matched word.
  • Otherwise, skip exactly one character and continue parsing from the next position.

Return the resulting digit string formed by concatenating all the matched digits in order. If no number words are found in s, return an empty string.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

How We Pick the Algorithm

Why Trie / String Matching / Rolling Hash?

This problem maps to Trie / String Matching / Rolling Hash through a short path in the full flowchart.

Stringpatternmatching?yesSubstringsordictionary?yesTrie / StringMatching /Rolling Hash

Matching number words (zero, one, ..., nine) as substrings in a concatenated string is a dictionary-based string matching problem.

Open in Flowchart

Intuition

The problem asks us to scan the string and recognize number words wherever they appear, converting each into its corresponding digit. Since the rule explicitly tells us to process the string from left to right and either match a word or skip a single character, this naturally suggests a straightforward linear scan of the string.

The key observation is that at each position i, we only need to check whether any of the ten known number words begins exactly at that position. If we build a list d where d[j] stores the word for digit j, then for each position we can try matching the substring starting at i against every word in d.

When a word d[j] matches the substring s[i : i + len(d[j])], we know a valid number word starts here. We record the digit j and jump forward by the length of that word, because those characters have been consumed. If no word matches at position i, we simply move forward by one character, following the "skip exactly one character" rule.

By repeating this matching-and-advancing process until we reach the end of the string, we collect all digits in the order they appear. Concatenating these digits gives the final answer. This approach works directly because the problem's parsing rule is itself a precise description of the procedure, so we just need to faithfully implement it.

Pattern Learn more about Trie patterns.

Solution Approach

Solution 1: Enumeration

We first establish a mapping relationship between number words and their corresponding digits, recorded in array d, where d[i] represents the word corresponding to digit i. For example, d[0] = "zero", d[1] = "one", and so on up to d[9] = "nine".

Then we traverse the string s from left to right using a pointer i, where n is the length of s. For each position i, we enumerate the number words d[j] in order and check whether the substring starting at position i matches d[j]. Concretely, letting m = len(d[j]), we verify that i + m <= n (so the substring does not run past the end) and that s[i : i + m] == d[j].

If a match is found, we append the digit j to the result list ans and advance position i forward by the length of the matched word. In the code, this is done by setting i += m - 1 and then breaking out of the inner loop, followed by an i += 1 after the loop — together these move i forward by exactly m positions. If no word matches at the current position, the inner loop finishes without breaking, and the trailing i += 1 moves i forward by just one character, implementing the "skip exactly one character" rule.

We repeat this process until we have traversed the entire string s. Finally, we concatenate the digits stored in ans into a single string with "".join(ans) and return it. If no number words were ever matched, ans is empty and the returned string is empty as well.

The time complexity is O(n * L), where n is the length of the string s and L is the total length of all number words (a constant, since there are always 10 fixed words). The space complexity is O(n) for storing the result, ignoring the constant space used by the mapping array d.

Example Walkthrough

Let's trace through a small example with s = "xtwone".

Setup: We build the mapping array d where:

  • d[0] = "zero", d[1] = "one", d[2] = "two", d[3] = "three", d[4] = "four"
  • d[5] = "five", d[6] = "six", d[7] = "seven", d[8] = "eight", d[9] = "nine"

We have n = 6 (length of "xtwone"), pointer i = 0, and an empty result list ans = [].

Step 1: i = 0, character 'x'

We try to match each word d[j] against the substring starting at index 0.

  • None of the words ("zero", "one", ..., "nine") match a substring beginning with 'x'.

No match found, so the inner loop finishes without breaking. The trailing i += 1 moves us forward by exactly one character (the "skip one character" rule).

  • ans = [], i = 1

Step 2: i = 1, substring "twone"

We try matching each word against the substring starting at index 1.

  • Checking d[2] = "two": with m = 3, we verify 1 + 3 <= 6 ✓ and s[1:4] == "two" ✓ — match!

We append digit 2 to ans, then set i += m - 1 (so i = 1 + 2 = 3) and break. After the loop, i += 1 brings us to i = 4, advancing by exactly 3 positions total (the length of "two").

  • ans = ['2'], i = 4

Step 3: i = 4, substring "ne"

We try matching each word against the substring starting at index 4.

  • Checking d[1] = "one": with m = 3, we verify 4 + 3 <= 6 → 7 <= 6 ✗ — fails the bounds check, so no match.
  • No other word matches "ne" either.

No match found, so i += 1 skips one character.

  • ans = ['2'], i = 5

Step 4: i = 5, character 'e'

We try matching each word against the substring starting at index 5.

  • The only remaining characters are "e", which is too short for any word — none match.

No match, so i += 1.

  • ans = ['2'], i = 6

Termination: Now i = 6 = n, so we stop traversing.

Result: We concatenate ans with "".join(ans), producing "2".

This illustrates the core mechanic: at each position we either consume a full matched word and jump past it, or skip a single character. Note how the partially-overlapping "one" inside "twone" was not captured, because after matching "two" the pointer had already advanced past the 'o', leaving only "ne" — a faithful reflection of the left-to-right greedy parsing rule.

Solution Implementation

1class Solution:
2    def convertNumber(self, s: str) -> str:
3        # Mapping from index to the English word of each digit (0-9)
4        digit_words = [
5            "zero",
6            "one",
7            "two",
8            "three",
9            "four",
10            "five",
11            "six",
12            "seven",
13            "eight",
14            "nine",
15        ]
16
17        index, length = 0, len(s)
18        result = []
19
20        # Scan the string from left to right
21        while index < length:
22            # Try to match a digit word starting at the current position
23            for digit, word in enumerate(digit_words):
24                word_len = len(word)
25                # Check that the word fits and matches the substring
26                if index + word_len <= length and s[index : index + word_len] == word:
27                    result.append(str(digit))
28                    # Advance past the matched word (minus 1, because the
29                    # outer loop will add 1 more below)
30                    index += word_len - 1
31                    break
32            # Move to the next position
33            index += 1
34
35        # Join all matched digits into the final numeric string
36        return "".join(result)
37
1class Solution {
2    /**
3     * Converts a string composed of concatenated English number words
4     * (e.g. "onetwothree") into its corresponding digit string (e.g. "123").
5     *
6     * @param s the input string consisting of spelled-out digit words
7     * @return the resulting string of digits
8     */
9    public String convertNumber(String s) {
10        // Mapping from a digit value (index) to its English word representation.
11        String[] words = {
12            "zero", "one", "two", "three", "four",
13            "five", "six", "seven", "eight", "nine"
14        };
15
16        int length = s.length();
17        StringBuilder ans = new StringBuilder();
18
19        // Scan the input string from left to right.
20        for (int i = 0; i < length; ++i) {
21            // Try to match a number word starting at the current position.
22            for (int digit = 0; digit < words.length; ++digit) {
23                String word = words[digit];
24                int wordLength = word.length();
25
26                // Check that the substring fits within bounds and equals the word.
27                if (i + wordLength <= length
28                        && s.substring(i, i + wordLength).equals(word)) {
29                    // Append the matched digit.
30                    ans.append(digit);
31                    // Advance the index past this word.
32                    // (-1 because the outer loop's ++i will move forward by one.)
33                    i += wordLength - 1;
34                    break;
35                }
36            }
37        }
38
39        return ans.toString();
40    }
41}
42
1class Solution {
2public:
3    string convertNumber(string s) {
4        // Map each digit (0-9) to its English word representation
5        vector<string> digitWords = {
6            "zero", "one", "two", "three", "four",
7            "five", "six", "seven", "eight", "nine"
8        };
9
10        int length = s.length();
11        string result;
12
13        // Scan through the input string character by character
14        for (int i = 0; i < length; ++i) {
15            // Try to match each digit word starting at position i
16            for (int digit = 0; digit < digitWords.size(); ++digit) {
17                const string& word = digitWords[digit];
18                int wordLength = word.length();
19
20                // Check if the substring starting at i matches the current word
21                if (i + wordLength <= length && s.substr(i, wordLength) == word) {
22                    // Append the corresponding digit to the result
23                    result += to_string(digit);
24                    // Advance i past the matched word (minus 1, since the loop also increments)
25                    i += wordLength - 1;
26                    break;
27                }
28            }
29        }
30
31        return result;
32    }
33};
34
1/**
2 * Converts a string composed of concatenated English number words
3 * (e.g. "onetwothree") into its corresponding digit string ("123").
4 *
5 * @param s - The input string made up of lowercase English number words.
6 * @returns A string of digits corresponding to the parsed number words.
7 */
8function convertNumber(s: string): string {
9    // Lookup table where the index represents the digit
10    // and the value is its English word representation.
11    const words: string[] = [
12        'zero',
13        'one',
14        'two',
15        'three',
16        'four',
17        'five',
18        'six',
19        'seven',
20        'eight',
21        'nine',
22    ];
23
24    const length = s.length;
25
26    // Collects the resulting digits as strings.
27    const result: string[] = [];
28
29    // Scan the input string from left to right.
30    for (let i = 0; i < length; ++i) {
31        // Try to match each number word at the current position.
32        for (let digit = 0; digit < words.length; ++digit) {
33            const word = words[digit];
34            const wordLength = word.length;
35
36            // Check that the word fits within the remaining string
37            // and matches the substring starting at index i.
38            if (i + wordLength <= length && s.substring(i, i + wordLength) === word) {
39                // Record the matched digit.
40                result.push(digit.toString());
41
42                // Advance i past the matched word
43                // (minus 1 because the loop will increment i).
44                i += wordLength - 1;
45                break;
46            }
47        }
48    }
49
50    // Join all collected digits into the final number string.
51    return result.join('');
52}
53

Time and Space Complexity

  • Time Complexity: O(n × |d|), where n is the length of string s and |d| is the number of digit words. The outer while loop iterates over the string, advancing the pointer i across all n characters. For each position, the inner for loop iterates over all |d| digit words, and for each word it performs a substring comparison s[i : i + m] == t that costs O(m) time, where m is the word length. Since the word lengths are bounded by a small constant, the comparison cost is treated as constant, yielding an overall time complexity of O(n × |d|).

  • Space Complexity: O(|d|). The list d stores all |d| digit words, requiring O(|d|) space. The result list ans and the joined string depend on the number of matched digits, which is bounded by the input, but the dominant auxiliary space contributed by the fixed dictionary is O(|d|).

Pattern Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Mishandling the Pointer Advancement After a Match

The most common mistake lies in how the pointer index is advanced after matching a word. The code uses a two-step advancement:

index += word_len - 1   # inside the inner loop, before break
...
index += 1              # after the inner loop

Together these move the pointer forward by exactly word_len. Developers frequently get this wrong in one of two ways:

  1. Writing index += word_len inside the loop AND keeping the trailing index += 1. This causes the pointer to over-advance by one position, silently skipping a character right after each matched word. For example, parsing "oneone" would match "one", then jump past the first character of the second "one", failing to recognize it.

  2. Forgetting the break. Without break, after matching a word the inner loop keeps iterating. Even though no other word will match the same already-consumed substring in most cases, the logic becomes fragile and the index += word_len - 1 may execute multiple times, corrupting the position.

Solution: Make the advancement explicit and unambiguous by using continue instead of the split logic. This avoids the subtle "minus 1 then plus 1" trick entirely:

class Solution:
    def convertNumber(self, s: str) -> str:
        digit_words = [
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine",
        ]

        index, length = 0, len(s)
        result = []

        while index < length:
            matched = False
            for digit, word in enumerate(digit_words):
                word_len = len(word)
                if s.startswith(word, index):  # cleaner & bounds-safe
                    result.append(str(digit))
                    index += word_len           # advance by full length
                    matched = True
                    break
            if not matched:
                index += 1                       # skip exactly one char

        return "".join(result)

Pitfall 2: Index Out-of-Bounds When Checking the Substring

A second pitfall is checking s[index : index + word_len] == word without first verifying index + word_len <= length. While Python slicing does not raise an error when the end index exceeds the length (it just returns a shorter slice), relying on this behavior can mask bugs and produces a subtle dependency:

  • In Python, "on"[0:4] == "one" is False, so it happens to work correctly.
  • However, if you port this logic to a language like Java or C++ where substring throws on out-of-range indices, the missing bounds check causes a runtime exception.

Solution: Always pair the comparison with the bounds check, as the original code does (index + word_len <= length and ...), or use a bounds-safe helper like s.startswith(word, index) which never raises and never over-reads.

Pitfall 3: Assuming a "Greedy Longest Match" Is Required

One might assume that overlapping words require choosing the longest match. In this problem, no two digit words share a prefix that creates ambiguity at the same starting position (e.g., there is no word that is a proper prefix of another). Because of this, matching in numerical order 0..9 and breaking on the first hit is always correct.

Solution: Trust the first-match-and-break strategy. There is no need to sort words by length or implement longest-match logic. Adding such complexity would be unnecessary overhead and could even introduce bugs if implemented incorrectly.

Ready to land your dream job?

Unlock your dream job with a 5-minute quiz for a personalized study roadmap!

Get My Roadmap
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Get a Personalized Study Roadmap:

What is the best way of checking if an element exists in a sorted array once in terms of time complexity? Select the best that applies.


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More