Facebook Pixel

2047. Number of Valid Words in a Sentence

EasyString
Leetcode Link

Problem Description

You are given a sentence that contains only lowercase letters ('a' to 'z'), digits ('0' to '9'), hyphens ('-'), punctuation marks ('!', '.', and ','), and spaces (' '). The sentence can be split into one or more tokens using spaces as separators.

A token is considered a valid word if it satisfies all three of the following conditions:

  1. No digits allowed: The token can only contain lowercase letters, hyphens, and/or punctuation marks. It cannot contain any digits.

  2. Hyphen rules:

    • There can be at most one hyphen '-' in the token
    • If a hyphen is present, it must be surrounded by lowercase letters on both sides
    • Examples: "a-b" is valid, but "-ab" and "ab-" are not valid
  3. Punctuation rules:

    • There can be at most one punctuation mark ('!', '.', or ',') in the token
    • If a punctuation mark is present, it must be at the end of the token
    • Examples: "ab,", "cd!", and "." are valid, but "a!b" and "c.," are not valid

Valid word examples include: "a-b.", "afad", "ba-c", "a!", and "!".

Given a string sentence, you need to count and return the total number of valid words in the sentence.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The problem asks us to validate each word in a sentence according to specific rules. The natural approach is to break down the problem into smaller, manageable parts.

First, we need to separate the sentence into individual tokens since spaces are the delimiters. Python's split() method handles this perfectly, automatically dealing with multiple consecutive spaces.

For each token, we need to validate it against three rules. The key insight is that we can check these rules character by character in a single pass through each token:

  1. Digit check: If we encounter any digit, we can immediately reject the token.

  2. Hyphen validation: We need to track if we've already seen a hyphen (using a boolean flag st). When we encounter a hyphen, we check:

    • Is this the first hyphen? (if st is already True, it's invalid)
    • Is it at the beginning or end? (positions 0 or len(s)-1 are invalid)
    • Are the characters immediately before and after it letters? (we can check s[i-1] and s[i+1])
  3. Punctuation validation: Punctuation marks are only valid at the end of the token. So if we find a punctuation mark at any position other than the last (i < len(s) - 1), the token is invalid.

By combining these checks in a single traversal of each token, we can efficiently determine validity. The beauty of this approach is that we can return False as soon as we find any violation, making it efficient. If we complete the traversal without finding any violations, the token is valid.

Finally, we count all valid tokens using a simple sum comprehension: sum(check(s) for s in sentence.split()), where the boolean results (True for valid, False for invalid) are automatically converted to 1s and 0s for counting.

Solution Approach

The solution uses a simulation approach where we validate each word according to the given rules. Here's the step-by-step implementation:

Step 1: Split the sentence into tokens

sentence.split()

This automatically handles multiple spaces and gives us individual tokens to validate.

Step 2: Create a validation function for each token

We define a helper function check(s: str) that validates a single token. The function uses a boolean variable st to track whether we've already encountered a hyphen.

Step 3: Character-by-character validation

For each character c at position i in the token s, we perform the following checks:

  1. Digit check:

    if c.isdigit():
        return False

    If any character is a digit, the token is invalid.

  2. Punctuation position check:

    if c in "!.," and i < len(s) - 1:
        return False

    Punctuation marks can only appear at the end of the token. If we find one before the last position, the token is invalid.

  3. Hyphen validation:

    if c == "-":
        if (st or 
            i in (0, len(s) - 1) or 
            not s[i - 1].isalpha() or 
            not s[i + 1].isalpha()):
            return False
        st = True

    For a hyphen, we check multiple conditions:

    • st: Has a hyphen already appeared? (only one allowed)
    • i in (0, len(s) - 1): Is it at the beginning or end? (not allowed)
    • not s[i - 1].isalpha(): Is the character before it a letter? (required)
    • not s[i + 1].isalpha(): Is the character after it a letter? (required)

    If all conditions pass, we set st = True to mark that we've seen a hyphen.

Step 4: Count valid words

Finally, we use a generator expression with sum() to count all valid tokens:

return sum(check(s) for s in sentence.split())

The check() function returns True (counted as 1) for valid tokens and False (counted as 0) for invalid ones. The sum gives us the total count of valid words.

Time Complexity: O(n) where n is the length of the sentence, as we traverse each character once.

Space Complexity: O(m) where m is the number of tokens after splitting, needed to store the split results.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with the sentence: "cat and-mice- d0g!"

Step 1: Split into tokens After splitting by spaces, we get: ["cat", "and-mice-", "d0g!"]

Step 2: Validate each token

Token 1: "cat"

  • Initialize st = False (no hyphen seen yet)
  • Check each character:
    • 'c': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'a': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 't': Not a digit ✓, not punctuation ✓, not hyphen ✓
  • Result: Valid (returns True)

Token 2: "and-mice-"

  • Initialize st = False
  • Check each character:
    • 'a': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'n': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'd': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • '-' (at index 3):
      • st is False ✓ (first hyphen)
      • Not at position 0 or 8 (last position) ✓
      • s[2] = 'd' is a letter ✓
      • s[4] = 'm' is a letter ✓
      • Set st = True
    • 'm': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'i': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'c': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • 'e': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • '-' (at index 8):
      • This is at the last position (index 8 = len("and-mice-") - 1)
      • Hyphen at the end is invalid ✗
  • Result: Invalid (returns False)

Token 3: "d0g!"

  • Initialize st = False
  • Check each character:
    • 'd': Not a digit ✓, not punctuation ✓, not hyphen ✓
    • '0': This is a digit
  • Result: Invalid (returns False immediately)

Step 3: Count valid words sum([True, False, False]) = sum([1, 0, 0]) = 1

The sentence contains 1 valid word.

Solution Implementation

1class Solution:
2    def countValidWords(self, sentence: str) -> int:
3        """
4        Count the number of valid words in a sentence.
5        A valid word contains:
6        - No digits
7        - At most one hyphen, which must be surrounded by lowercase letters
8        - At most one punctuation mark (!,.) which must be at the end
9        """
10      
11        def is_valid_word(word: str) -> bool:
12            """
13            Check if a word is valid according to the rules.
14          
15            Args:
16                word: A string to validate
17              
18            Returns:
19                True if the word is valid, False otherwise
20            """
21            hyphen_seen = False
22          
23            for index, char in enumerate(word):
24                # Check for invalid digit or punctuation not at end
25                if char.isdigit() or (char in "!.," and index < len(word) - 1):
26                    return False
27              
28                # Handle hyphen validation
29                if char == "-":
30                    # Check hyphen conditions:
31                    # 1. Only one hyphen allowed
32                    # 2. Cannot be at start or end
33                    # 3. Must be surrounded by letters
34                    if (hyphen_seen or 
35                        index in (0, len(word) - 1) or 
36                        not word[index - 1].isalpha() or 
37                        not word[index + 1].isalpha()):
38                        return False
39                    hyphen_seen = True
40          
41            return True
42      
43        # Split sentence into words and count valid ones
44        return sum(is_valid_word(word) for word in sentence.split())
45
1class Solution {
2    /**
3     * Counts the number of valid words in a sentence.
4     * A valid word follows specific rules regarding digits, punctuation, and hyphens.
5     * 
6     * @param sentence the input sentence to be processed
7     * @return the count of valid words in the sentence
8     */
9    public int countValidWords(String sentence) {
10        int validWordCount = 0;
11      
12        // Split sentence by spaces and check each token
13        for (String token : sentence.split(" ")) {
14            validWordCount += checkIfValidWord(token.toCharArray());
15        }
16      
17        return validWordCount;
18    }
19
20    /**
21     * Checks if a character array represents a valid word.
22     * Valid word rules:
23     * - No digits allowed
24     * - Punctuation marks (! . ,) can only appear at the end
25     * - At most one hyphen allowed, and it must be surrounded by letters
26     * 
27     * @param characters the character array to validate
28     * @return 1 if valid word, 0 otherwise
29     */
30    private int checkIfValidWord(char[] characters) {
31        // Empty strings are not valid words
32        if (characters.length == 0) {
33            return 0;
34        }
35      
36        boolean hyphenFound = false;
37      
38        for (int i = 0; i < characters.length; ++i) {
39            char currentChar = characters[i];
40          
41            // Rule: No digits allowed in valid words
42            if (Character.isDigit(currentChar)) {
43                return 0;
44            }
45          
46            // Rule: Punctuation marks can only appear at the end of the word
47            if ((currentChar == '!' || currentChar == '.' || currentChar == ',') 
48                && i < characters.length - 1) {
49                return 0;
50            }
51          
52            // Rule: Handle hyphen validation
53            if (currentChar == '-') {
54                // Check if hyphen already found (only one allowed)
55                // or if hyphen is at the beginning or end (not allowed)
56                if (hyphenFound || i == 0 || i == characters.length - 1) {
57                    return 0;
58                }
59              
60                // Check if hyphen is surrounded by letters
61                if (!Character.isAlphabetic(characters[i - 1]) 
62                    || !Character.isAlphabetic(characters[i + 1])) {
63                    return 0;
64                }
65              
66                hyphenFound = true;
67            }
68        }
69      
70        // Word is valid
71        return 1;
72    }
73}
74
1class Solution {
2public:
3    int countValidWords(string sentence) {
4        // Lambda function to validate if a word meets the criteria
5        auto isValidWord = [](const string& word) -> int {
6            bool hasHyphen = false;  // Track if we've already seen a hyphen
7          
8            for (int i = 0; i < word.length(); ++i) {
9                // Rule 1: Word cannot contain any digits
10                if (isdigit(word[i])) {
11                    return 0;
12                }
13              
14                // Rule 2: Punctuation marks (!, ., ,) can only appear at the end
15                if ((word[i] == '!' || word[i] == '.' || word[i] == ',') && 
16                    i < word.length() - 1) {
17                    return 0;
18                }
19              
20                // Rule 3: Hyphen validation
21                if (word[i] == '-') {
22                    // Only one hyphen allowed, and it cannot be at start or end
23                    if (hasHyphen || i == 0 || i == word.length() - 1) {
24                        return 0;
25                    }
26                    // Hyphen must be surrounded by lowercase letters
27                    if (!isalpha(word[i - 1]) || !isalpha(word[i + 1])) {
28                        return 0;
29                    }
30                    hasHyphen = true;
31                }
32            }
33          
34            return 1;  // Word is valid
35        };
36
37        int validWordCount = 0;
38        stringstream tokenizer(sentence);
39        string currentWord;
40      
41        // Parse the sentence word by word (space-separated)
42        while (tokenizer >> currentWord) {
43            validWordCount += isValidWord(currentWord);
44        }
45      
46        return validWordCount;
47    }
48};
49
1/**
2 * Counts the number of valid words in a sentence based on specific validation rules
3 * @param sentence - The input sentence to validate
4 * @returns The number of valid words in the sentence
5 */
6function countValidWords(sentence: string): number {
7    /**
8     * Validates if a single word meets the criteria for being valid
9     * Rules:
10     * - No digits allowed
11     * - Punctuation marks (! . ,) can only appear at the end
12     * - Hyphen can appear at most once and must be between two letters
13     * @param word - The word to validate
14     * @returns 1 if valid, 0 if invalid
15     */
16    const validateWord = (word: string): number => {
17        // Empty strings are not valid words
18        if (word.length === 0) {
19            return 0;
20        }
21      
22        // Track if we've already seen a hyphen
23        let hasHyphen: boolean = false;
24      
25        for (let i = 0; i < word.length; i++) {
26            const currentChar: string = word[i];
27          
28            // Check for digits - invalid if any digit is found
29            if (/\d/.test(currentChar)) {
30                return 0;
31            }
32          
33            // Check punctuation - must only be at the end of word
34            const punctuationMarks: string[] = ['!', '.', ','];
35            if (punctuationMarks.includes(currentChar) && i < word.length - 1) {
36                return 0;
37            }
38          
39            // Check hyphen rules
40            if (currentChar === '-') {
41                // Multiple hyphens or hyphen at start/end is invalid
42                if (hasHyphen || i === 0 || i === word.length - 1) {
43                    return 0;
44                }
45              
46                // Hyphen must be between two letters
47                const prevChar: string = word[i - 1];
48                const nextChar: string = word[i + 1];
49                if (!/[a-zA-Z]/.test(prevChar) || !/[a-zA-Z]/.test(nextChar)) {
50                    return 0;
51                }
52              
53                hasHyphen = true;
54            }
55        }
56      
57        // Word passes all validation rules
58        return 1;
59    };
60  
61    // Split sentence by whitespace and count valid words
62    const words: string[] = sentence.split(/\s+/);
63    const validWordCount: number = words.reduce(
64        (accumulator: number, word: string) => accumulator + validateWord(word), 
65        0
66    );
67  
68    return validWordCount;
69}
70

Time and Space Complexity

The time complexity is O(n), where n is the length of the sentence string. This is because:

  • The split() method traverses the entire sentence once to split it into words, which takes O(n) time
  • For each word generated, the check() function examines each character exactly once
  • Since the total number of characters across all words cannot exceed n (the original sentence length), the overall character examination also takes O(n) time
  • Therefore, the total time complexity is O(n) + O(n) = O(n)

The space complexity is O(n), where n is the length of the sentence string. This is because:

  • The split() method creates a list of words that, in the worst case (sentence with no spaces), could store the entire sentence, requiring O(n) space
  • The check() function uses only a constant amount of extra space with variables like st, i, and c, which is O(1)
  • Therefore, the total space complexity is dominated by the split operation, resulting in O(n)

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Mishandling Empty Tokens

One common pitfall is not considering that split() automatically filters out empty strings when splitting by spaces. However, if you were to use a different splitting method or manual parsing, you might encounter empty strings that could cause index errors.

Example problematic scenario:

# If manually parsing with split(' ') instead of split()
words = sentence.split(' ')  # This keeps empty strings from multiple spaces
# "hello  world" → ['hello', '', 'world']

Solution: Always use split() without arguments, which automatically handles multiple spaces and filters empty strings.

2. Index Out of Bounds When Checking Hyphen

The most critical pitfall is accessing word[index - 1] when index = 0 or word[index + 1] when index = len(word) - 1. The current solution handles this correctly by checking boundary conditions first, but a common mistake is to check the surrounding characters before verifying the position.

Incorrect approach:

if char == "-":
    # WRONG: Checking surrounding chars before boundary check
    if not word[index - 1].isalpha() or not word[index + 1].isalpha():
        return False
    if hyphen_seen or index in (0, len(word) - 1):
        return False

Correct approach (as in solution):

if char == "-":
    # RIGHT: Check boundaries FIRST using short-circuit evaluation
    if (hyphen_seen or 
        index in (0, len(word) - 1) or 
        not word[index - 1].isalpha() or 
        not word[index + 1].isalpha()):
        return False

3. Misunderstanding Punctuation Rules

A subtle pitfall is misinterpreting "at most one punctuation mark" to mean checking for duplicate punctuation marks separately, rather than understanding that the position check inherently enforces this rule.

Overcomplicated approach:

punctuation_count = 0
for char in word:
    if char in "!.,":
        punctuation_count += 1
        if punctuation_count > 1:
            return False

Better approach (as in solution): Since punctuation can only be at the end, checking index < len(word) - 1 automatically ensures at most one punctuation mark exists (multiple punctuation marks would mean at least one is not at the end).

4. Forgetting Edge Cases

Not handling single-character tokens properly, especially:

  • Single punctuation mark like "!" or "," (should be valid)
  • Single hyphen "-" (should be invalid)
  • Single letter "a" (should be valid)

Solution: The current implementation handles these correctly through the logical flow, but it's important to test these edge cases explicitly.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used in a depth first search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More