Amazon Online Assessment (OA) - Most Common Word with Exclusion List

Given a paragraph and a list of ignored_keywords, find the most frequently used word in the paragraph that is not in ignored_keywords. The paragraph is a single line of words that may contain punctuation and a mix of uppercase and lowercase letters. Word comparisons are case-insensitive, and the output is expected to be lowercase.
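
The normalization described above can be sketched roughly as follows. This is only an illustration: the helper name `tokenize` is hypothetical, and it assumes a "word" is a maximal run of ASCII letters, which the problem statement does not define precisely.

```python
import re
from typing import List


def tokenize(paragraph: str) -> List[str]:
    """Split a paragraph into lowercase words, dropping punctuation.

    Assumption: a word is a maximal run of ASCII letters.
    """
    return [match.group(0).lower() for match in re.finditer(r"[A-Za-z]+", paragraph)]


print(tokenize("But, this BOOK was way ahead of its time."))
# ['but', 'this', 'book', 'was', 'way', 'ahead', 'of', 'its', 'time']
```

Lowercasing at tokenization time means every later comparison against the exclusion list is automatically case-insensitive.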

Examples

Example 1:

Input:

paragraph = "If this book was written today in the midst of the slew of dystopian novels that come out, it may not have stood out. But, this book was way ahead of its time."
ignored_keywords = ["of", "was", "the"]

Output: "book"
Explanation:

"of" appears three times, and "was" and "the" each appear twice, but all three are in the ignored_keywords list. The next most popular word is "book", which appears twice.
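
The counts quoted in this explanation can be checked with a short script. It is a sketch only, assuming that words are runs of letters and that counting is done after lowercasing:

```python
import re
from collections import Counter

paragraph = (
    "If this book was written today in the midst of the slew of dystopian "
    "novels that come out, it may not have stood out. But, this book was "
    "way ahead of its time."
)

# Assumption: words are runs of letters; comparison is done in lowercase.
counts = Counter(re.findall(r"[a-z]+", paragraph.lower()))

# Print the counts for the words mentioned in the explanation.
for word in ["of", "was", "the", "book"]:
    print(word, counts[word])
# of 3
# was 2
# the 2
# book 2
```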

Constraints

There is always at least one word in the paragraph and at least one word in the ignored_keywords list. The most frequent word that is not ignored is always unique, so there are no ties for the highest count. Keywords in the ignored_keywords list consist only of lowercase alphabetical characters.
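
These constraints simplify an implementation: since the top count is unique, no tie-breaking rule is needed, and since the ignored keywords are already lowercase, they can be compared directly against lowercased words. Here is a minimal sketch of that idea using a plain dictionary. The function name `most_common_allowed` is hypothetical, and the sketch assumes that words are runs of letters and that at least one non-ignored word exists.

```python
import re
from typing import Dict, List


def most_common_allowed(paragraph: str, ignored_keywords: List[str]) -> str:
    # Hypothetical helper for illustration only.
    ignored = set(ignored_keywords)  # already lowercase per the constraints
    counts: Dict[str, int] = {}
    # Assumption: words are runs of letters, compared in lowercase.
    for word in re.findall(r"[a-z]+", paragraph.lower()):
        if word not in ignored:
            counts[word] = counts.get(word, 0) + 1
    # Assumes counts is non-empty; the unique-maximum constraint means
    # max() has exactly one candidate with the highest count.
    return max(counts, key=counts.get)


print(most_common_allowed("Bob hit a ball, the hit BALL flew far after it was hit.", ["hit"]))
# ball
```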

Solution

import re
from collections import Counter
from typing import List

def most_common_word(paragraph: str, banned: List[str]) -> str:
    # Use a set for fast membership checks against the exclusion list.
    ban = set(banned)
    # Count every lowercased word that is not excluded.
    # Note: \w+ matches runs of letters, digits, and underscores.
    counts = Counter(
        word
        for m in re.finditer(r'\w+', paragraph)
        for word in [m[0].lower()]
        if word not in ban
    )
    # The top count is guaranteed unique, so take the single most common word.
    [(word, _)] = counts.most_common(1)
    return word

if __name__ == '__main__':
    # First input line: the paragraph; second line: space-separated excluded words.
    paragraph = input()
    banned = input().split()
    res = most_common_word(paragraph, banned)
    print(res)