831. Masking Personal Information

MediumString
Leetcode Link

Problem Description

This problem involves designing an algorithm to mask personal information. Personal information can be an email address or a phone number, and each has specific rules for masking.

Email Address Masking Rules:

  • Transform all uppercase letters in the name and domain parts of the email address to lowercase.
  • Mask all but the first and last letter of the name part with five asterisks (*****).

Phone Number Masking Rules:

  • Remove all separation characters (which may include +, -, (, ), and spaces) to only retain digits.
  • Format the number based on the leftover digit count after the local number (last 10 digits). The formatting is as follows:
    • If there is no country code, the format is ***-***-XXXX, where XXXX is the last four digits of the local number.
    • If there is a country code, it should be represented by the same number of asterisks as digits, followed by the formatted local number ***-***-XXXX.

The algorithm must identify whether the provided string is an email or a phone number and mask it accordingly.

Intuition

The intuition behind solving this problem starts with identifying if the given s is an email address or a phone number. This can usually be identified by the presence of alphabetic characters and the special character @. If s[0] is an alphabetic character, it is an email, otherwise it's a phone number.

For an email address:

  1. Convert the entire string to lowercase to satisfy the case-insensitivity requirement.
  2. Replace the middle part of the name with five asterisks, but maintain the first character, the character before the @, and the complete domain after the @.

For a phone number:

  1. Remove all separation characters by filtering out non-digit characters, leaving only the digits.
  2. The number of characters remaining after the last 10 gives us the number of digits in the country code. Depending on this count, mask the country code with asterisks.
  3. The last 4 digits of the local number remain unmasked, while the others are replaced by asterisks, keeping consistent with the ***-***- prefix.

The provided solution uses Python comprehensions to filter and the string methods to replace and format according to the rules. The code is compact and only uses conditional logic to distinguish the masking process between an email address and a phone number.

Solution Approach

The implementation of the masking algorithm can be broken down into several steps and patterns used throughout the code:

  1. Determining the Type of Personal Information: The algorithm first checks if the given string s represents an email address or a phone number. This is done by checking the first character of the string:

    1if s[0].isalpha():

    If the first character is an alphabetical letter, it is an email address; otherwise, it is a phone number.

  2. Email Masking:

    • All characters of the email address are converted to lowercase using lower() to ensure case insensitivity.
    • The first and the last character of the name part, along with the domain, are preserved.
    • The middle part of the name is replaced with five asterisks (*****).
    1return s[0] + '*****' + s[s.find('@') - 1 :]
  3. Phone Number Masking:

    • All non-digit characters are filtered out, leaving only digits in the string:
      1s = ''.join(c for c in s if c.isdigit())
    • The algorithm calculates the length of the country code by subtracting 10 from the total length of the filtered digits. This is because the local number always consists of the last 10 digits.
    • A switch-like statement, using if-elif, is avoided. Instead, a more elegant formatting via an f-string and repetition of asterisk characters is used to handle different lengths of country codes:
      1cnt = len(s) - 10
      2suf = '***-***-' + s[-4:]
      3return suf if cnt == 0 else f'+{"*" * cnt}-{suf}'

In summary, the solution makes use of string methods (like lower() and find()), list comprehensions (to filter characters), and string concatenation to assemble the masked personal information. It uses the compactness of Python's string formatting to keep the code concise and readable. There is no explicit usage of complex data structures as the operations are basic string manipulations. This approach is efficient because it goes through the string at most two times: once to either lower the case of the characters (for emails) or filter digits (for phone numbers), and once to construct the masked string. This is essentially O(n) complexity, where n is the length of the input string.

💪
Level Up Your
Algo Skills

Example Walkthrough

Let's walk through a small example illustrating the solution approach for both an email and a phone number.

Email Example

Given the email [email protected], let's apply the solution approach:

  1. Since the first character 'E' is an alphabetical letter, we identify the string as an email address.
  2. Convert the email to lowercase: [email protected].
  3. Preserve the first and the last character of the name part and the entire domain: 'e' and 'l' are the two characters we preserve from exampleemail.
  4. Replace the middle part of the name with five asterisks: e*****[email protected].

After following these steps, we get the masked email address: e*****[email protected].

Phone Number Example

Given the phone number +1 (234) 567-8901, let's apply the solution approach:

  1. The first character '+' indicates it's a phone number.
  2. Remove all non-digit characters, leaving only the digits: 12345678901.
  3. Identify the country code, which in this case is '1', by counting the digits beyond the last 10, which gives us 1 digit here.
  4. Replace the country code with an asterisk and apply the standard mask for the remaining number, preserving the last four digits: * ***-***-8901.

After following these steps, we get the masked phone number: * ***-***-8901.

These two examples demonstrate how the algorithm distinguishes between email addresses and phone numbers and how it applies the respective masking rules to conceal personal information while following the specified format.

Python Solution

1class Solution:
2    def maskPII(self, string: str) -> str:
3        # Check if the string contains an email address by looking for an alphabetic character at the start
4        if string[0].isalpha():
5            string = string.lower() # Convert the email to lowercase
6            # Build the masked email with the first character, five asterisks, and the domain part
7            masked_email = string[0] + '*****' + string[string.find('@') - 1:]
8            return masked_email
9        # If it's not an email, assume it's a phone number
10        else:
11            # Remove all non-digit characters from the string
12            digits_only = ''.join(c for c in string if c.isdigit())
13            # Count the number of digits beyond the local 10-digit format
14            extra_digits_count = len(digits_only) - 10
15            # Create a suffix for the masked phone number
16            phone_suffix = '***-***-' + digits_only[-4:]
17            # If there are no extra digits, return the standard format
18            if extra_digits_count == 0:
19                return phone_suffix
20            # If there are extra digits (international format), add the country code mask
21            else:
22                return f'+{"*" * extra_digits_count}-{phone_suffix}'
23
24# Example usage:
25# solution = Solution()
26# print(solution.maskPII("[email protected]"))  # Output: "j*****[email protected]"
27# print(solution.maskPII("1(234)567-890"))         # Output: "***-***-7890"
28# print(solution.maskPII("+1(234)567-890"))        # Output: "+*-***-***-7890"
29

Java Solution

1class Solution {
2
3    public String maskPII(String input) {
4        // Check if the first character is a letter to determine if it's an email
5        if (Character.isLetter(input.charAt(0))) {
6            // Convert the entire input string to lower case for uniformity
7            input = input.toLowerCase();
8            // Find the index of the '@' symbol in the email
9            int atIndex = input.indexOf('@');
10            // Create a masked email with the first character, five stars, and the domain part
11            return input.charAt(0) + "*****" + input.substring(atIndex - 1);
12        }
13      
14        // StringBuilder to hold only the digits from the input (phone number)
15        StringBuilder digitBuilder = new StringBuilder();
16        // Loop through the characters in the input
17        for (char c : input.toCharArray()) {
18            // Append only if the character is a digit
19            if (Character.isDigit(c)) {
20                digitBuilder.append(c);
21            }
22        }
23      
24        // Convert the StringBuilder to a string containing only digits
25        String digits = digitBuilder.toString();
26        // Calculate the number of digits that are in the international code
27        int internationalCodeLength = digits.length() - 10;
28        // Create a masked string for the last 7 digits of the phone number
29        String maskedSuffix = "***-***-" + digits.substring(digits.length() - 4);
30      
31        // Depending on whether there is an international code, mask the phone number appropriately
32        if (internationalCodeLength == 0) {
33            // If there's no international code, return just the masked U.S. number
34            return maskedSuffix;
35        } else {
36            // If there's an international code, mask it and append the U.S. number
37            String starsForInternational = "+";
38            for(int i = 0; i < internationalCodeLength; i++) {
39                starsForInternational += "*";
40            }
41            return starsForInternational + "-" + maskedSuffix;
42        }
43    }
44}
45

C++ Solution

1#include <cctype> // include ctype for using tolower, isdigit functions
2#include <string> // include string for using string class
3using namespace std;
4
5class Solution {
6public:
7    /**
8     * @param s: The original string, either an email or a phone number.
9     * @return: The masked string with PIIs (Personally Identifiable Information) hidden.
10     */
11    string maskPII(string s) {
12        // Find the position of '@', which indicates an email.
13        size_t atPosition = s.find('@');
14        // If "@" is found, mask it as an email.
15        if (atPosition != string::npos) {
16            string maskedEmail;
17            // Convert first character to lowercase and append.
18            maskedEmail += tolower(s[0]);
19            // Append 5 asterisks to hide part of the local name.
20            maskedEmail += "*****";
21            // Convert from before '@' to lowercase and append the domain.
22            for (size_t j = atPosition - 1; j < s.size(); ++j) {
23                maskedEmail += tolower(s[j]);
24            }
25            return maskedEmail;
26        }
27
28        // If "@" is not found, it's a phone number.
29        // Extract all digits from the string.
30        string digits;
31        for (char c : s) {
32            if (isdigit(c)) {
33                digits += c;
34            }
35        }
36
37        // Determine how many digits are in the country code.
38        int countryCodeLength = digits.size() - 10;
39        // Create the suffix for the standard 10-digit mask.
40        string suffix = "***-***-" + digits.substr(digits.size() - 4);
41
42        // If there's no country code, return just the suffix.
43        if (countryCodeLength == 0) {
44            return suffix;
45        } else {
46            // If there's a country code, prepend it with masks.
47            string countryCodeMask = "+" + string(countryCodeLength, '*') + "-";
48            return countryCodeMask + suffix; // Full masked string with country code.
49        }
50    }
51};
52

Typescript Solution

1function maskPII(s: string): string {
2    // Find the index of '@' to determine if it's an email.
3    const atIndex: number = s.indexOf('@');
4    if (atIndex !== -1) {
5        // If it's an email, mask the characters between the first character and the '@' symbol.
6        let maskedEmail: string = s[0].toLowerCase() + '*****';
7        for (let j: number = atIndex - 1; j < s.length; j++) {
8            maskedEmail += s.charAt(j).toLowerCase();
9        }
10        return maskedEmail;
11    }
12
13    // If it's not an email, build a string containing only the digits from the input.
14    let numericString: string = '';
15    for (const char of s) {
16        if (/\d/.test(char)) { // Check if the character is a digit.
17            numericString += char;
18        }
19    }
20
21    // Compute the count of digits to be replaced by '*' in the case there are country codes.
22    const countryCodeStarsCount: number = numericString.length - 10;
23
24    // The suffix for the phone number hiding the first 6 digits, revealing only the last 4.
25    const maskedPhoneSuffix: string = `***-***-${numericString.substring(numericString.length - 4)}`;
26
27    // Return the masked phone number with country code as '*' if there is a country code.
28    return countryCodeStarsCount === 0 ? maskedPhoneSuffix : `+${'*'.repeat(countryCodeStarsCount)}-${maskedPhoneSuffix}`;
29}
30

Time and Space Complexity

Time Complexity:

The time complexity of the maskPII function is determined by several operations that the function performs:

  1. Checking whether the first character is alphabetical: O(1), since it's a constant-time check on a single character.

  2. Converting the string to lower case if it's an email: O(n), where n is the length of the string s.

  3. Finding the position of '@' in the string: O(n) in the worst case, since it requires scanning the string in case of an email address.

  4. Stripping non-digit characters and joining digits into a new string: O(n), because each character of the original string s is checked once.

  5. Constructing the masked string: The actual construction is O(1) because we are appending fixed strings and characters to a masked result.

Combining these, the time complexity is dictated by the string operations, which are linear in the length of the input string s. Therefore, the time complexity is O(n).

Space Complexity:

The space complexity is determined by the additional space used by the function:

  1. A new lower case email id string if s[0].isalpha(): O(n) space for the new string.

  2. The list comprehension that filters digits: O(n) space for the new list of characters before it is joined into a string.

  3. The final masked string has a constant size related to the format of the phone number or email (O(1) space), not dependent on the input size.

Thus, the dominating factor is the creation of new strings, which takes O(n) space. Hence, the space complexity is O(n).

😈
Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫