3805. Count Caesar Cipher Pairs
Problem Description
You are given an array words containing n strings. Every string in this array has the same length m and is made up of only lowercase English letters.
We say two strings s and t are similar if we can make them equal by applying the following operation any number of times (including zero times):
- Pick either
sort. - Replace every letter in the chosen string with the next letter in the alphabet, wrapping around cyclically. After
'z', the next letter is'a'.
The key idea here is that the operation shifts all characters of a string by the same amount at once. So two strings are similar if one can be obtained from the other by shifting every character by some common offset (cyclically).
Your task is to count the number of index pairs (i, j) that satisfy both of these conditions:
i < jwords[i]andwords[j]are similar.
Return an integer representing the total number of such pairs.
How We Pick the Algorithm
Why Hash Table / Counting?
This problem maps to Hash Table / Counting through a short path in the full flowchart.
Storing seen values in a hash table provides constant-time lookup and counting.
Open in FlowchartIntuition
The operation always shifts every character of a string by the same amount at the same time. This means that no matter how many times we apply the operation, the relative differences between consecutive characters of a string never change. Only the absolute position of all characters changes together.
For example, if we have the string "abc", applying the operation once gives "bcd", then "cde", and so on. Notice that in all of these, the gap from the first character to the second is always 1, and from the second to the third is always 1. These gaps stay fixed forever.
This gives us a powerful insight: two strings are similar if and only if they have the same pattern of differences between characters. The actual starting letter does not matter, because we can always shift one string until its starting letter matches the other.
So the trick is to convert each string into a canonical form, a single representative form that is identical for all strings similar to each other. The natural way to do this is to "anchor" the first character to a fixed letter (here we choose 'z'), and then shift all the other characters by the same offset k = 'z' - s[0]. After this transformation, any two similar strings collapse into exactly the same string.
Once every string is reduced to its canonical form, the problem becomes simple counting. We use a hash table to count how many strings map to each canonical form. If a particular form appears v times, then any two of those strings form a valid similar pair. The number of such pairs is the combination C(v, 2) = v * (v - 1) / 2. Summing this value over all distinct forms gives the final answer.
Pattern Learn more about Math patterns.
Solution Approach
We use String Transformation + Counting to solve this problem efficiently.
Data structure used: a hash table cnt (implemented with defaultdict(int)) that maps each canonical form of a string to the number of times it appears.
Step-by-step walkthrough:
-
Iterate over every string
sinwords. For each one, we build its canonical form so that all strings similar to it become identical. -
Compute the offset. We decide to anchor the first character to
'z'. The amount we need to shift the first character isk = ord('z') - ord(t[0]). This same offsetkmust be applied to every character, because the operation always shifts all characters by the same value. -
Shift the remaining characters. For each index
ifrom1tolen(t) - 1, we apply the offset cyclically:t[i] = chr((ord(t[i]) - ord('a') + k) % 26 + ord('a'))Here we map the character into the range
0..25, add the offsetk, take modulo26to wrap around the alphabet, and convert it back to a letter. -
Anchor the first character. Since the first character was always meant to become
'z', we directly sett[0] = 'z'. This avoids recomputing it and guarantees the anchor is consistent. -
Record the canonical form. We join the list back into a string with
"".join(t)and increment its count in the hash table:cnt["".join(t)] += 1. -
Count the pairs. After processing all strings, each distinct canonical form with count
vcontributesv * (v - 1) / 2pairs, since any two of thosevstrings are similar to each other. We sum this over all values in the hash table:sum(v * (v - 1) // 2 for v in cnt.values())
Complexity analysis:
- Time complexity:
O(n * m), wherenis the number of strings andmis the length of each string. We process each character of every string exactly once when building the canonical forms. - Space complexity:
O(n * m), used by the hash table to store the canonical form of each string.
Example Walkthrough
Let's trace through a small example to see how the solution approach works.
Input: words = ["abc", "bcd", "xyz", "acb"]
We want to count pairs (i, j) with i < j where the two strings are similar.
We'll process each string by anchoring its first character to 'z', then shifting every other character by the same offset k.
String 0: "abc"
- First character is
'a'. Offset:k = ord('z') - ord('a') = 25 - 0 = 25. - Shift index 1 (
'b'):(1 + 25) % 26 = 26 % 26 = 0→'a'. - Shift index 2 (
'c'):(2 + 25) % 26 = 27 % 26 = 1→'b'. - Anchor first character to
'z'. - Canonical form:
"zab"→cnt = {"zab": 1}
String 1: "bcd"
- First character is
'b'. Offset:k = 25 - 1 = 24. - Shift index 1 (
'c'):(2 + 24) % 26 = 26 % 26 = 0→'a'. - Shift index 2 (
'd'):(3 + 24) % 26 = 27 % 26 = 1→'b'. - Anchor first character to
'z'. - Canonical form:
"zab"→cnt = {"zab": 2}
Notice "abc" and "bcd" collapse to the same form "zab". That makes sense: shifting "abc" once gives "bcd", so they are similar.
String 2: "xyz"
- First character is
'x'. Offset:k = 25 - 23 = 2. - Shift index 1 (
'y'):(24 + 2) % 26 = 26 % 26 = 0→'a'. - Shift index 2 (
'z'):(25 + 2) % 26 = 27 % 26 = 1→'b'. - Anchor first character to
'z'. - Canonical form:
"zab"→cnt = {"zab": 3}
"xyz" also has consecutive +1 gaps, so it joins the same group.
String 3: "acb"
- First character is
'a'. Offset:k = 25 - 0 = 25. - Shift index 1 (
'c'):(2 + 25) % 26 = 27 % 26 = 1→'b'. - Shift index 2 (
'b'):(1 + 25) % 26 = 26 % 26 = 0→'a'. - Anchor first character to
'z'. - Canonical form:
"zba"→cnt = {"zab": 3, "zba": 1}
The gaps here are different (+2, then -1), so it lands in its own group.
Counting the pairs
Now we apply C(v, 2) = v * (v - 1) / 2 to each group:
| Canonical form | Count v | Pairs v(v-1)/2 |
|---|---|---|
"zab" | 3 | 3 * 2 / 2 = 3 |
"zba" | 1 | 1 * 0 / 2 = 0 |
Total = 3 + 0 = 3
Verifying the answer
The group {"abc", "bcd", "xyz"} at indices 0, 1, 2 produces these pairs:
(0, 1)→"abc"&"bcd"✓(0, 2)→"abc"&"xyz"✓(1, 2)→"bcd"&"xyz"✓
That's exactly 3 pairs, matching our computed result. The string "acb" is similar to none of the others, so it contributes nothing — consistent with its singleton group.
Solution Implementation
1from collections import defaultdict
2from typing import List
3
4
5class Solution:
6 def countPairs(self, words: List[str]) -> int:
7 # Map each word's canonical (shift-normalized) form to its frequency.
8 canonical_count: defaultdict[str, int] = defaultdict(int)
9
10 for word in words:
11 chars = list(word)
12
13 # Offset needed to shift the first character up to 'z'.
14 # This normalizes every word relative to its first letter, so that
15 # words equal under a uniform cyclic alphabet shift collapse together.
16 shift = ord("z") - ord(chars[0])
17
18 # Apply the same cyclic shift to the remaining characters.
19 for i in range(1, len(chars)):
20 shifted_value = (ord(chars[i]) - ord("a") + shift) % 26
21 chars[i] = chr(shifted_value + ord("a"))
22
23 # Anchor the first character to 'z' as the fixed reference point.
24 chars[0] = "z"
25
26 # Accumulate the count for this canonical form.
27 canonical_count["".join(chars)] += 1
28
29 # Within each group of size v, the number of valid pairs is C(v, 2).
30 return sum(group_size * (group_size - 1) // 2 for group_size in canonical_count.values())
311class Solution {
2 public long countPairs(String[] words) {
3 // Map to store the count of each normalized word form
4 Map<String, Integer> countMap = new HashMap<>();
5
6 // Total number of matching pairs
7 long answer = 0;
8
9 // Normalize every word and accumulate its occurrence count
10 for (String word : words) {
11 char[] chars = word.toCharArray();
12
13 // Compute the shift that maps the first character to 'z'
14 int shift = 'z' - chars[0];
15
16 // Apply the same circular shift to all remaining characters,
17 // so that words differing only by a uniform shift collapse
18 // to the same canonical (normalized) representation
19 for (int i = 1; i < chars.length; i++) {
20 chars[i] = (char) ('a' + (chars[i] - 'a' + shift) % 26);
21 }
22
23 // Fix the first character to 'z' as the anchor of the normalized form
24 chars[0] = 'z';
25
26 // Increase the count for this normalized word
27 countMap.merge(new String(chars), 1, Integer::sum);
28 }
29
30 // For each group of identical normalized words of size v,
31 // the number of pairs is C(v, 2) = v * (v - 1) / 2
32 for (int value : countMap.values()) {
33 answer += 1L * value * (value - 1) / 2;
34 }
35
36 return answer;
37 }
38}
391class Solution {
2public:
3 long long countPairs(vector<string>& words) {
4 // Map from a normalized string form to the number of words sharing it
5 unordered_map<string, int> groupCount;
6 long long ans = 0;
7
8 for (auto& word : words) {
9 string normalized = word;
10
11 // Compute the shift that maps the first character to 'z'
12 int shift = 'z' - normalized[0];
13
14 // Apply the same circular shift to every subsequent character,
15 // preserving the relative pattern between characters
16 for (int i = 1; i < static_cast<int>(normalized.size()); i++) {
17 normalized[i] = 'a' + (normalized[i] - 'a' + shift) % 26;
18 }
19
20 // Anchor the first character to 'z' for all words
21 normalized[0] = 'z';
22
23 // Group words by their normalized representation
24 groupCount[normalized]++;
25 }
26
27 // For each group of size v, the number of valid pairs is v * (v - 1) / 2
28 for (auto& [key, count] : groupCount) {
29 ans += 1LL * count * (count - 1) / 2;
30 }
31
32 return ans;
33 }
34};
351/**
2 * Counts the number of pairs of words that share the same "shift pattern".
3 *
4 * Two words are considered a matching pair if one can be transformed into the
5 * other by applying a single uniform Caesar shift (mod 26) to every character.
6 * To detect this, each word is normalized: a shift `k` is chosen so the first
7 * character maps to 'z', and the same shift is applied to the remaining
8 * characters. Words producing the same normalized key belong to the same group.
9 *
10 * @param words - The list of input words.
11 * @returns The number of valid pairs.
12 */
13function countPairs(words: string[]): number {
14 // Maps each normalized key to the count of words producing that key.
15 const keyCount = new Map<string, number>();
16 let answer = 0;
17
18 for (const word of words) {
19 const chars = word.split('');
20
21 // Shift amount needed to move the first character to 'z'.
22 const shift = 'z'.charCodeAt(0) - chars[0].charCodeAt(0);
23
24 // Apply the same shift (mod 26) to every character after the first.
25 for (let i = 1; i < chars.length; i++) {
26 const shifted = (chars[i].charCodeAt(0) - 97 + shift) % 26;
27 chars[i] = String.fromCharCode(97 + shifted);
28 }
29
30 // The first character always normalizes to 'z'.
31 chars[0] = 'z';
32
33 // Build the normalized key and increment its occurrence count.
34 const key = chars.join('');
35 keyCount.set(key, (keyCount.get(key) || 0) + 1);
36 }
37
38 // For each group of size v, there are C(v, 2) = v * (v - 1) / 2 pairs.
39 for (const count of keyCount.values()) {
40 answer += (count * (count - 1)) / 2;
41 }
42
43 return answer;
44}
45Time and Space Complexity
-
Time Complexity:
O(n × m), wherenis the length of the string arraywords, andmis the maximum length of the strings. For each of thenstrings, we iterate over all its characters to construct the normalized formt(the loopfor i in range(1, len(t))and the"".join(t)operation both costO(m)). Computing the dictionary key and updating the count viacnt[...] += 1requires hashing the string of lengthm, which also costsO(m). The final summationsum(v * (v - 1) // 2 for v in cnt.values())iterates over at mostnentries, costingO(n). Therefore the overall time complexity isO(n × m). -
Space Complexity:
O(n × m). The dictionarycntmay store up tondistinct keys, each being a normalized string of length up tom, requiringO(n × m)space in the worst case. The temporary listtand the joined string occupyO(m)space per iteration, which is dominated by the dictionary. Hence the overall space complexity isO(n × m).
Pattern Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall: Computing the shift offset based on the wrong anchor or mixing up the shift direction
The most frequent mistake in this solution is inconsistently choosing the anchor, which breaks the whole normalization scheme. The canonical-form approach only works if every word is normalized using the same rule. A subtle bug appears when you anchor the first character to 'z' but then forget that the offset shift must be the same value applied to all characters, including the first.
Consider this buggy variation:
for word in words:
chars = list(word)
shift = ord("z") - ord(chars[0])
# BUG: applying shift starting from index 0 but using a different formula,
# or computing shift relative to 'a' instead of the first char
for i in range(len(chars)):
shifted_value = (ord(chars[i]) - ord("a") + shift) % 26
chars[i] = chr(shifted_value + ord("a"))
canonical_count["".join(chars)] += 1
This actually works and is arguably cleaner — but the danger is when people mix the two styles: they compute the loop over range(1, len) (skipping index 0) and also forget to set chars[0] = "z". In that case index 0 keeps its original (un-shifted) value, so two genuinely similar words map to different canonical forms and pairs are undercounted.
Why it happens
The original code splits the work into two parts:
- Loop shifts indices
1..m-1. - Index
0is set directly withchars[0] = "z".
If a refactor drops step 2 (a very common edit), the first character is left untouched and the canonical form is corrupted.
Concrete failing example
Take words = ["ab", "bc"]. These are similar (shift "ab" by 1 → "bc").
- Correct: both normalize to
"zb"-style anchored form → 1 pair. - Buggy (missing
chars[0] = "z"):"ab"→chars[0]='a', and"bc"→chars[0]='b', giving different canonical strings → 0 pairs (wrong).
Solution
Make the normalization a single, self-consistent transformation so there is no separate step that can be forgotten. Shift all characters (including index 0) with one uniform loop:
from collections import defaultdict
from typing import List
class Solution:
def countPairs(self, words: List[str]) -> int:
canonical_count: defaultdict[str, int] = defaultdict(int)
for word in words:
# Offset that maps the first character to 'a' (any fixed anchor works,
# as long as it is identical for every word).
shift = (ord("a") - ord(word[0])) % 26
# One uniform loop over ALL characters — no special-casing index 0.
canonical = "".join(
chr((ord(ch) - ord("a") + shift) % 26 + ord("a"))
for ch in word
)
canonical_count[canonical] += 1
return sum(v * (v - 1) // 2 for v in canonical_count.values())
Why this is safer
- No special case: Index
0is handled by the same formula as every other character, so there is no separate assignment to forget. - Uniform offset:
shiftis computed once and applied to all positions, exactly mirroring the problem's "shift every letter by the same amount" rule. - Anchor choice is irrelevant: Whether you anchor the first char to
'a'or'z', the grouping is identical — what matters is that the relative differences between characters are preserved, and the same anchor is used for every word.
Related pitfall: Integer overflow on the pair count
In languages like C++ or Java, v * (v - 1) / 2 can overflow a 32-bit integer when many identical words exist (e.g., v ≈ 10^5 gives ~5 * 10^9). Always use a 64-bit type (long/long long) for the accumulator. In Python this is a non-issue since integers are arbitrary precision, but it is worth flagging when porting the solution.
Ready to land your dream job?
Unlock your dream job with a 5-minute quiz for a personalized study roadmap!
Get My RoadmapWhat's the output of running the following function using input 56?
1KEYBOARD = {
2 '2': 'abc',
3 '3': 'def',
4 '4': 'ghi',
5 '5': 'jkl',
6 '6': 'mno',
7 '7': 'pqrs',
8 '8': 'tuv',
9 '9': 'wxyz',
10}
11
12def letter_combinations_of_phone_number(digits):
13 def dfs(path, res):
14 if len(path) == len(digits):
15 res.append(''.join(path))
16 return
17
18 next_number = digits[len(path)]
19 for letter in KEYBOARD[next_number]:
20 path.append(letter)
21 dfs(path, res)
22 path.pop()
23
24 res = []
25 dfs([], res)
26 return res
271private static final Map<Character, char[]> KEYBOARD = Map.of(
2 '2', "abc".toCharArray(),
3 '3', "def".toCharArray(),
4 '4', "ghi".toCharArray(),
5 '5', "jkl".toCharArray(),
6 '6', "mno".toCharArray(),
7 '7', "pqrs".toCharArray(),
8 '8', "tuv".toCharArray(),
9 '9', "wxyz".toCharArray()
10);
11
12public static List<String> letterCombinationsOfPhoneNumber(String digits) {
13 List<String> res = new ArrayList<>();
14 dfs(new StringBuilder(), res, digits.toCharArray());
15 return res;
16}
17
18private static void dfs(StringBuilder path, List<String> res, char[] digits) {
19 if (path.length() == digits.length) {
20 res.add(path.toString());
21 return;
22 }
23 char next_digit = digits[path.length()];
24 for (char letter : KEYBOARD.get(next_digit)) {
25 path.append(letter);
26 dfs(path, res, digits);
27 path.deleteCharAt(path.length() - 1);
28 }
29}
301const KEYBOARD = {
2 '2': 'abc',
3 '3': 'def',
4 '4': 'ghi',
5 '5': 'jkl',
6 '6': 'mno',
7 '7': 'pqrs',
8 '8': 'tuv',
9 '9': 'wxyz',
10}
11
12function letter_combinations_of_phone_number(digits) {
13 let res = [];
14 dfs(digits, [], res);
15 return res;
16}
17
18function dfs(digits, path, res) {
19 if (path.length === digits.length) {
20 res.push(path.join(''));
21 return;
22 }
23 let next_number = digits.charAt(path.length);
24 for (let letter of KEYBOARD[next_number]) {
25 path.push(letter);
26 dfs(digits, path, res);
27 path.pop();
28 }
29}
30Recommended Readings
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion If you prefer videos here's a video that explains recursion in a fun and easy way Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask him
Want a Structured Path to Master System Design Too? Don’t Miss This!