2047. Number of Valid Words in a Sentence
Problem Description
You are given a sentence that contains only lowercase letters ('a'
to 'z'
), digits ('0'
to '9'
), hyphens ('-'
), punctuation marks ('!'
, '.'
, and ','
), and spaces (' '
). The sentence can be split into one or more tokens using spaces as separators.
A token is considered a valid word if it satisfies all three of the following conditions:
-
No digits allowed: The token can only contain lowercase letters, hyphens, and/or punctuation marks. It cannot contain any digits.
-
Hyphen rules:
- There can be at most one hyphen
'-'
in the token - If a hyphen is present, it must be surrounded by lowercase letters on both sides
- Examples:
"a-b"
is valid, but"-ab"
and"ab-"
are not valid
- There can be at most one hyphen
-
Punctuation rules:
- There can be at most one punctuation mark (
'!'
,'.'
, or','
) in the token - If a punctuation mark is present, it must be at the end of the token
- Examples:
"ab,"
,"cd!"
, and"."
are valid, but"a!b"
and"c.,"
are not valid
- There can be at most one punctuation mark (
Valid word examples include: "a-b."
, "afad"
, "ba-c"
, "a!"
, and "!"
.
Given a string sentence
, you need to count and return the total number of valid words in the sentence.
Intuition
The problem asks us to validate each word in a sentence according to specific rules. The natural approach is to break down the problem into smaller, manageable parts.
First, we need to separate the sentence into individual tokens since spaces are the delimiters. Python's split()
method handles this perfectly, automatically dealing with multiple consecutive spaces.
For each token, we need to validate it against three rules. The key insight is that we can check these rules character by character in a single pass through each token:
-
Digit check: If we encounter any digit, we can immediately reject the token.
-
Hyphen validation: We need to track if we've already seen a hyphen (using a boolean flag
st
). When we encounter a hyphen, we check:- Is this the first hyphen? (if
st
is alreadyTrue
, it's invalid) - Is it at the beginning or end? (positions
0
orlen(s)-1
are invalid) - Are the characters immediately before and after it letters? (we can check
s[i-1]
ands[i+1]
)
- Is this the first hyphen? (if
-
Punctuation validation: Punctuation marks are only valid at the end of the token. So if we find a punctuation mark at any position other than the last (
i < len(s) - 1
), the token is invalid.
By combining these checks in a single traversal of each token, we can efficiently determine validity. The beauty of this approach is that we can return False
as soon as we find any violation, making it efficient. If we complete the traversal without finding any violations, the token is valid.
Finally, we count all valid tokens using a simple sum comprehension: sum(check(s) for s in sentence.split())
, where the boolean results (True
for valid, False
for invalid) are automatically converted to 1s and 0s for counting.
Solution Approach
The solution uses a simulation approach where we validate each word according to the given rules. Here's the step-by-step implementation:
Step 1: Split the sentence into tokens
sentence.split()
This automatically handles multiple spaces and gives us individual tokens to validate.
Step 2: Create a validation function for each token
We define a helper function check(s: str)
that validates a single token. The function uses a boolean variable st
to track whether we've already encountered a hyphen.
Step 3: Character-by-character validation
For each character c
at position i
in the token s
, we perform the following checks:
-
Digit check:
if c.isdigit(): return False
If any character is a digit, the token is invalid.
-
Punctuation position check:
if c in "!.," and i < len(s) - 1: return False
Punctuation marks can only appear at the end of the token. If we find one before the last position, the token is invalid.
-
Hyphen validation:
if c == "-": if (st or i in (0, len(s) - 1) or not s[i - 1].isalpha() or not s[i + 1].isalpha()): return False st = True
For a hyphen, we check multiple conditions:
st
: Has a hyphen already appeared? (only one allowed)i in (0, len(s) - 1)
: Is it at the beginning or end? (not allowed)not s[i - 1].isalpha()
: Is the character before it a letter? (required)not s[i + 1].isalpha()
: Is the character after it a letter? (required)
If all conditions pass, we set
st = True
to mark that we've seen a hyphen.
Step 4: Count valid words
Finally, we use a generator expression with sum()
to count all valid tokens:
return sum(check(s) for s in sentence.split())
The check()
function returns True
(counted as 1) for valid tokens and False
(counted as 0) for invalid ones. The sum gives us the total count of valid words.
Time Complexity: O(n)
where n
is the length of the sentence, as we traverse each character once.
Space Complexity: O(m)
where m
is the number of tokens after splitting, needed to store the split results.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with the sentence: "cat and-mice- d0g!"
Step 1: Split into tokens
After splitting by spaces, we get: ["cat", "and-mice-", "d0g!"]
Step 2: Validate each token
Token 1: "cat"
- Initialize
st = False
(no hyphen seen yet) - Check each character:
- 'c': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'a': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 't': Not a digit ✓, not punctuation ✓, not hyphen ✓
- Result: Valid (returns
True
)
Token 2: "and-mice-"
- Initialize
st = False
- Check each character:
- 'a': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'n': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'd': Not a digit ✓, not punctuation ✓, not hyphen ✓
- '-' (at index 3):
st
isFalse
✓ (first hyphen)- Not at position 0 or 8 (last position) ✓
s[2]
= 'd' is a letter ✓s[4]
= 'm' is a letter ✓- Set
st = True
- 'm': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'i': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'c': Not a digit ✓, not punctuation ✓, not hyphen ✓
- 'e': Not a digit ✓, not punctuation ✓, not hyphen ✓
- '-' (at index 8):
- This is at the last position (index 8 = len("and-mice-") - 1)
- Hyphen at the end is invalid ✗
- Result: Invalid (returns
False
)
Token 3: "d0g!"
- Initialize
st = False
- Check each character:
- 'd': Not a digit ✓, not punctuation ✓, not hyphen ✓
- '0': This is a digit ✗
- Result: Invalid (returns
False
immediately)
Step 3: Count valid words
sum([True, False, False])
= sum([1, 0, 0])
= 1
The sentence contains 1 valid word.
Solution Implementation
1class Solution:
2 def countValidWords(self, sentence: str) -> int:
3 """
4 Count the number of valid words in a sentence.
5 A valid word contains:
6 - No digits
7 - At most one hyphen, which must be surrounded by lowercase letters
8 - At most one punctuation mark (!,.) which must be at the end
9 """
10
11 def is_valid_word(word: str) -> bool:
12 """
13 Check if a word is valid according to the rules.
14
15 Args:
16 word: A string to validate
17
18 Returns:
19 True if the word is valid, False otherwise
20 """
21 hyphen_seen = False
22
23 for index, char in enumerate(word):
24 # Check for invalid digit or punctuation not at end
25 if char.isdigit() or (char in "!.," and index < len(word) - 1):
26 return False
27
28 # Handle hyphen validation
29 if char == "-":
30 # Check hyphen conditions:
31 # 1. Only one hyphen allowed
32 # 2. Cannot be at start or end
33 # 3. Must be surrounded by letters
34 if (hyphen_seen or
35 index in (0, len(word) - 1) or
36 not word[index - 1].isalpha() or
37 not word[index + 1].isalpha()):
38 return False
39 hyphen_seen = True
40
41 return True
42
43 # Split sentence into words and count valid ones
44 return sum(is_valid_word(word) for word in sentence.split())
45
1class Solution {
2 /**
3 * Counts the number of valid words in a sentence.
4 * A valid word follows specific rules regarding digits, punctuation, and hyphens.
5 *
6 * @param sentence the input sentence to be processed
7 * @return the count of valid words in the sentence
8 */
9 public int countValidWords(String sentence) {
10 int validWordCount = 0;
11
12 // Split sentence by spaces and check each token
13 for (String token : sentence.split(" ")) {
14 validWordCount += checkIfValidWord(token.toCharArray());
15 }
16
17 return validWordCount;
18 }
19
20 /**
21 * Checks if a character array represents a valid word.
22 * Valid word rules:
23 * - No digits allowed
24 * - Punctuation marks (! . ,) can only appear at the end
25 * - At most one hyphen allowed, and it must be surrounded by letters
26 *
27 * @param characters the character array to validate
28 * @return 1 if valid word, 0 otherwise
29 */
30 private int checkIfValidWord(char[] characters) {
31 // Empty strings are not valid words
32 if (characters.length == 0) {
33 return 0;
34 }
35
36 boolean hyphenFound = false;
37
38 for (int i = 0; i < characters.length; ++i) {
39 char currentChar = characters[i];
40
41 // Rule: No digits allowed in valid words
42 if (Character.isDigit(currentChar)) {
43 return 0;
44 }
45
46 // Rule: Punctuation marks can only appear at the end of the word
47 if ((currentChar == '!' || currentChar == '.' || currentChar == ',')
48 && i < characters.length - 1) {
49 return 0;
50 }
51
52 // Rule: Handle hyphen validation
53 if (currentChar == '-') {
54 // Check if hyphen already found (only one allowed)
55 // or if hyphen is at the beginning or end (not allowed)
56 if (hyphenFound || i == 0 || i == characters.length - 1) {
57 return 0;
58 }
59
60 // Check if hyphen is surrounded by letters
61 if (!Character.isAlphabetic(characters[i - 1])
62 || !Character.isAlphabetic(characters[i + 1])) {
63 return 0;
64 }
65
66 hyphenFound = true;
67 }
68 }
69
70 // Word is valid
71 return 1;
72 }
73}
74
1class Solution {
2public:
3 int countValidWords(string sentence) {
4 // Lambda function to validate if a word meets the criteria
5 auto isValidWord = [](const string& word) -> int {
6 bool hasHyphen = false; // Track if we've already seen a hyphen
7
8 for (int i = 0; i < word.length(); ++i) {
9 // Rule 1: Word cannot contain any digits
10 if (isdigit(word[i])) {
11 return 0;
12 }
13
14 // Rule 2: Punctuation marks (!, ., ,) can only appear at the end
15 if ((word[i] == '!' || word[i] == '.' || word[i] == ',') &&
16 i < word.length() - 1) {
17 return 0;
18 }
19
20 // Rule 3: Hyphen validation
21 if (word[i] == '-') {
22 // Only one hyphen allowed, and it cannot be at start or end
23 if (hasHyphen || i == 0 || i == word.length() - 1) {
24 return 0;
25 }
26 // Hyphen must be surrounded by lowercase letters
27 if (!isalpha(word[i - 1]) || !isalpha(word[i + 1])) {
28 return 0;
29 }
30 hasHyphen = true;
31 }
32 }
33
34 return 1; // Word is valid
35 };
36
37 int validWordCount = 0;
38 stringstream tokenizer(sentence);
39 string currentWord;
40
41 // Parse the sentence word by word (space-separated)
42 while (tokenizer >> currentWord) {
43 validWordCount += isValidWord(currentWord);
44 }
45
46 return validWordCount;
47 }
48};
49
1/**
2 * Counts the number of valid words in a sentence based on specific validation rules
3 * @param sentence - The input sentence to validate
4 * @returns The number of valid words in the sentence
5 */
6function countValidWords(sentence: string): number {
7 /**
8 * Validates if a single word meets the criteria for being valid
9 * Rules:
10 * - No digits allowed
11 * - Punctuation marks (! . ,) can only appear at the end
12 * - Hyphen can appear at most once and must be between two letters
13 * @param word - The word to validate
14 * @returns 1 if valid, 0 if invalid
15 */
16 const validateWord = (word: string): number => {
17 // Empty strings are not valid words
18 if (word.length === 0) {
19 return 0;
20 }
21
22 // Track if we've already seen a hyphen
23 let hasHyphen: boolean = false;
24
25 for (let i = 0; i < word.length; i++) {
26 const currentChar: string = word[i];
27
28 // Check for digits - invalid if any digit is found
29 if (/\d/.test(currentChar)) {
30 return 0;
31 }
32
33 // Check punctuation - must only be at the end of word
34 const punctuationMarks: string[] = ['!', '.', ','];
35 if (punctuationMarks.includes(currentChar) && i < word.length - 1) {
36 return 0;
37 }
38
39 // Check hyphen rules
40 if (currentChar === '-') {
41 // Multiple hyphens or hyphen at start/end is invalid
42 if (hasHyphen || i === 0 || i === word.length - 1) {
43 return 0;
44 }
45
46 // Hyphen must be between two letters
47 const prevChar: string = word[i - 1];
48 const nextChar: string = word[i + 1];
49 if (!/[a-zA-Z]/.test(prevChar) || !/[a-zA-Z]/.test(nextChar)) {
50 return 0;
51 }
52
53 hasHyphen = true;
54 }
55 }
56
57 // Word passes all validation rules
58 return 1;
59 };
60
61 // Split sentence by whitespace and count valid words
62 const words: string[] = sentence.split(/\s+/);
63 const validWordCount: number = words.reduce(
64 (accumulator: number, word: string) => accumulator + validateWord(word),
65 0
66 );
67
68 return validWordCount;
69}
70
Time and Space Complexity
The time complexity is O(n)
, where n
is the length of the sentence string. This is because:
- The
split()
method traverses the entire sentence once to split it into words, which takesO(n)
time - For each word generated, the
check()
function examines each character exactly once - Since the total number of characters across all words cannot exceed
n
(the original sentence length), the overall character examination also takesO(n)
time - Therefore, the total time complexity is
O(n) + O(n) = O(n)
The space complexity is O(n)
, where n
is the length of the sentence string. This is because:
- The
split()
method creates a list of words that, in the worst case (sentence with no spaces), could store the entire sentence, requiringO(n)
space - The
check()
function uses only a constant amount of extra space with variables likest
,i
, andc
, which isO(1)
- Therefore, the total space complexity is dominated by the split operation, resulting in
O(n)
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Mishandling Empty Tokens
One common pitfall is not considering that split()
automatically filters out empty strings when splitting by spaces. However, if you were to use a different splitting method or manual parsing, you might encounter empty strings that could cause index errors.
Example problematic scenario:
# If manually parsing with split(' ') instead of split() words = sentence.split(' ') # This keeps empty strings from multiple spaces # "hello world" → ['hello', '', 'world']
Solution: Always use split()
without arguments, which automatically handles multiple spaces and filters empty strings.
2. Index Out of Bounds When Checking Hyphen
The most critical pitfall is accessing word[index - 1]
when index = 0
or word[index + 1]
when index = len(word) - 1
. The current solution handles this correctly by checking boundary conditions first, but a common mistake is to check the surrounding characters before verifying the position.
Incorrect approach:
if char == "-":
# WRONG: Checking surrounding chars before boundary check
if not word[index - 1].isalpha() or not word[index + 1].isalpha():
return False
if hyphen_seen or index in (0, len(word) - 1):
return False
Correct approach (as in solution):
if char == "-":
# RIGHT: Check boundaries FIRST using short-circuit evaluation
if (hyphen_seen or
index in (0, len(word) - 1) or
not word[index - 1].isalpha() or
not word[index + 1].isalpha()):
return False
3. Misunderstanding Punctuation Rules
A subtle pitfall is misinterpreting "at most one punctuation mark" to mean checking for duplicate punctuation marks separately, rather than understanding that the position check inherently enforces this rule.
Overcomplicated approach:
punctuation_count = 0 for char in word: if char in "!.,": punctuation_count += 1 if punctuation_count > 1: return False
Better approach (as in solution):
Since punctuation can only be at the end, checking index < len(word) - 1
automatically ensures at most one punctuation mark exists (multiple punctuation marks would mean at least one is not at the end).
4. Forgetting Edge Cases
Not handling single-character tokens properly, especially:
- Single punctuation mark like
"!"
or","
(should be valid) - Single hyphen
"-"
(should be invalid) - Single letter
"a"
(should be valid)
Solution: The current implementation handles these correctly through the logical flow, but it's important to test these edge cases explicitly.
Which data structure is used in a depth first search?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!