269. Alien Dictionary
Problem Description
In the given problem, we have a set of strings that are words from an alien language that uses the same characters as the English alphabet, but with an unknown order. We are tasked with determining the order of characters in the alien language. The input is a list of words from the alien language. These words are supposed to be sorted lexicographically according to the rules of the alien language. We need to verify if this claim of sorted order is true, and if so, deduce the order of the characters in the alien language based on the words provided.
If we cannot determine a valid character order that would result in the words being sorted lexicographically, we return an empty string. Otherwise, we should return any valid string that represents the characters in the alien language in lexicographically increasing order according to the rules of the language. There may be more than one correct order, in which case any one of them is an acceptable answer.
Flowchart Walkthrough
Let's analyze the approach for solving leetcode 269. Alien Dictionary using the Flowchart. Here's a detailed breakdown:
Is it a graph?
- Yes: The problem involves determining the order of characters based on a sequence of words, where dependencies between characters form a graph.
Is it a tree?
- No: The relationship between characters forms a graph but not necessarily a tree, as it may have multiple "roots" or entry points, based on the first differing characters in words.
Is the problem related to directed acyclic graphs (DAGs)?
- Yes: The character sequence dependencies form a DAG, where an edge from character
u
to characterv
meansu
must come beforev
in the alien language.
Does the graph involve topological sorting?
- Yes: Since the task is to sort characters according to their dependencies (i.e., the order dictated by the alien language), it perfectly suits a topological sort approach, which is a common method to order elements in a DAG.
Conclusion: Following the flowchart points us towards using Topological Sort, which is implemented using DFS for this kind of problem. Hence, DFS along with methods to check for cycles and order dependencies are needed to solve the Alien Dictionary problem.
Intuition
The solution to this problem lies in treating the problem as a graph problem. Each character can be considered as a node in a graph, and a directed edge between two nodes indicates that one character precedes another in the alien language's lexicographical order. This can be derived from two consecutive words in the list: if the first differing characters between the two words are c1
and c2
, then c1
must come before c2
in the alien language order.
Our goal is to build this directed graph and then find a topological ordering of the characters that satisfies all the constraints imposed by the consecutive words. This problem, therefore, boils down to the detection of cycles in a directed graph, and if no cycles are present, finding a topological sort of the graph's nodes.
The steps in arriving at the solution are as follows:
- Initialize a graph data structure to represent the ordering rules between characters.
- Populate the graph with nodes and directed edges by comparing consecutive words from the provided dictionary. If words are sorted lexicographically, the first different character from word[i] and word[i+1] sets an ordering rule.
- Detect if there is any contradiction in the ordering rules, such as a cycle in the graph which would make it impossible to determine the order, or an improper ordering like a word being a prefix of a previous word but appearing later in the list.
- Assuming the ordering rules do not contain contradictions, perform a topological sort on the graph to find the alien language character order.
- If we can complete the topological sort with all characters accounted for, we have successfully identified a possible alien language order. Otherwise, we return an empty string.
Learn more about Depth-First Search, Breadth-First Search, Graph and Topological Sort patterns.
Solution Approach
The implementation of the solution starts by constructing a directed graph and initializing some auxiliary data structures.
-
We begin by creating a graph
g
which is represented as a 2D boolean list of size 26x26, initializing every element toFalse
. Here,g[i][j]
beingTrue
represents there is a directed edge fromi
toj
, implying letteri
comes before letterj
in the alien language. -
An array
s
of size 26 is created to keep track of which characters are present in the givenwords
, so we don't have to consider characters not in the language. Initially, every element is set toFalse
. -
A variable
cnt
is used to count the unique letters that have appeared so far as we process the list of words. -
We process every word. Within each word, we mark the characters in
s
asTrue
to signify their existence and incrementcnt
appropriately. -
The comparison between consecutive words is done character by character. We look for the first non-matching character pair
c1
andc2
between wordsi
andi+1
. Ifc1
andc2
differ, it meansc1
must come beforec2
in the alphabet order. We encode this by settingg[ord(c1) - ord('a')][ord(c2) - ord('a')] = True
. -
If
c2
appears beforec1
in any subsequent words, it contradicts the previously established order, and in such a case, we immediately return an empty string''
, indicating that there's no valid order that can sort the words lexicographically. -
An array
indegree
of size 26 is maintained to keep track of the number of incoming edges for each node in the graph, which is crucial for topological sorting. -
We iterate through the graph to populate
indegree
by counting how many times a letter has been placed after another letter. -
We employ a queue to perform a topological sort. We start by adding all characters with zero
indegree
(no incoming edges) to the queue, which means these characters could appear at the beginning of the alien language's order. -
While the queue is not empty, we continuously pop elements from the queue, append their corresponding character to the answer, and reduce the
indegree
of their connected nodes by 1. If any node'sindegree
becomes zero, we add it to the queue, as it could now come next in order. -
Finally, we check if the length of our answer matches the count of unique letters (
cnt
). If it does not, that means it was impossible to establish a topological sort due to a cycle or other issues, and we return an empty string''
. Otherwise, we join the answer list into a string and return it as our character order.
Here's an example of how this works:
Consider the words ["wrt", "wrf", "er", "ett", "rftt"]
. By comparing consecutive words, we conclude the order: w
before e
, t
before f
and r
before t
. When we try to find a topological order of the nodes, we correctly deduce the order to be wertf
(or any order that follows the deduced rules, such as wetrf
).
This approach is essentially a graph creation followed by a topological sort - a classic pattern for deducing order-based relationships between entities, often used in scheduling, course prerequisite structures, and similarly structured problems.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach. We are provided with the following alien words:
["z", "x", "z"]
According to the problem, these words are sorted according to the alien language's lexicographical order. We need to determine if this order is correct and, if so, deduce the character order in the alien language.
-
We create a graph
g
as a 2D boolean list with a size of 26x26 representing the charactersa
toz
. We also create an arrays
of size 26 to track present characters and a variablecnt
to count unique letters. -
As we process our first word, "z", we mark
z
as present by settings[ord('z') - ord('a')]
toTrue
. Then we do the same for "x" in the next word. At this point,cnt
has been incremented twice, forz
andx
. -
Next, we compare consecutive words to establish ordering rules. We try to compare "z" and "x", but since there is only one letter in each word, we cannot define any order here and move on to the next pair of words.
-
Comparing "x" and "z" from the second and third words, we would expect
x
to come beforez
if the words are sorted correctly. However, here we have the same situation as in the previous step: with only one letter, no relative order can be determined. -
At this point, we realize that according to the given words,
z
was supposed to come beforex
based on the first and the last word. However, there is no such rule established between them since the second word is 'x', which shows that we cannot determine a valid character order, and thus we need to return an empty string. -
For contradictions in the case where
c2
appears beforec1
, like in our example ("z" appears before and after "x"), we would return an empty string''
.
Since this example demonstrates that the given words are not sorted lexicographically (the first and the last word are the same, "z", yet there's a word "x" in between), our function would return an empty string, indicating an invalid character order. There is no need to complete a topological sort because a contradiction in the order has already been detected.
In conclusion, this example illustrates how the given solution approach can process each word and compare consecutive words to establish a directed graph representing the alien language order. However, if a contradiction is found—such as the given words are not sorted or a word acts as both predecessor and successor of another word—the algorithm detects it and returns an empty string without proceeding to topological sorting.
Solution Implementation
1from typing import List
2from collections import deque
3
4class Solution:
5 def alienOrder(self, words: List[str]) -> str:
6 # Graph representation - 26 characters for 26 lowercase letters
7 graph = [[False] * 26 for _ in range(26)]
8
9 # Seen array to keep track of which characters are present in the words list
10 seen = [False] * 26
11
12 # Counter for the number of unique characters
13 unique_char_count = 0
14
15 # Length of the words list
16 words_count = len(words)
17
18 # Process all the words for character presence and ordering
19 for i in range(words_count - 1):
20 # Record the characters in current word as seen
21 for char in words[i]:
22 if unique_char_count == 26:
23 break
24 index = ord(char) - ord('a')
25 if not seen[index]:
26 unique_char_count += 1
27 seen[index] = True
28 word_length = len(words[i])
29
30 # Compare characters and establish ordering
31 for j in range(word_length):
32 # If the next word is shorter than the current word, there is an issue with the order
33 if j >= len(words[i + 1]):
34 return ''
35
36 char1, char2 = words[i][j], words[i + 1][j]
37 if char1 == char2:
38 continue
39 index1, index2 = ord(char1) - ord('a'), ord(char2) - ord('a')
40 if graph[index2][index1]: # If there is a cycle, return empty string
41 return ''
42 graph[index1][index2] = True
43 break
44
45 # Check the last word for new characters
46 for char in words[words_count - 1]:
47 if unique_char_count == 26:
48 break
49 index = ord(char) - ord('a')
50 if not seen[index]:
51 unique_char_count += 1
52 seen[index] = True
53
54 # Calculate in-degree for each character that is seen
55 in_degree = [0] * 26
56 for i in range(26):
57 for j in range(26):
58 if i != j and seen[i] and seen[j] and graph[i][j]:
59 in_degree[j] += 1
60
61 # Initialize the queue for topological sort
62 queue = deque()
63 # Hold characters of the alien language in order
64 ordered_chars = []
65
66 # Find characters with in-degree of 0 to start topological sort
67 for i in range(26):
68 if seen[i] and in_degree[i] == 0:
69 queue.append(i)
70
71 # Run topological sort
72 while queue:
73 current_char_index = queue.popleft()
74 ordered_chars.append(chr(current_char_index + ord('a')))
75
76 # Decrease the in-degree of adjacent characters and add new 0 in-degree characters to the queue
77 for i in range(26):
78 if seen[i] and i != current_char_index and graph[current_char_index][i]:
79 in_degree[i] -= 1
80 if in_degree[i] == 0:
81 queue.append(i)
82
83 # If the length of the ordered characters is less than total unique characters, ordering is not possible
84 return '' if len(ordered_chars) < unique_char_count else ''.join(ordered_chars)
85
1class Solution {
2 public String alienOrder(String[] words) {
3 // Graph represented as a 2D boolean array; g represents the edges.
4 boolean[][] graph = new boolean[26][26];
5 // Seen array to keep track of the characters encountered in the words.
6 boolean[] seen = new boolean[26];
7 // Counter for the number of unique letters.
8 int uniqueLettersCount = 0;
9 int totalWords = words.length;
10
11 // First, process all the words except the last one for building the graph.
12 for (int i = 0; i < totalWords - 1; ++i) {
13 for (char c : words[i].toCharArray()) {
14 if (uniqueLettersCount == 26) {
15 break;
16 }
17 int index = c - 'a';
18 if (!seen[index]) {
19 ++uniqueLettersCount;
20 seen[index] = true;
21 }
22 }
23
24 // Build the graph based on the given dictionary (words array).
25 int wordLength = words[i].length();
26 for (int j = 0; j < wordLength; ++j) {
27 if (j >= words[i + 1].length()) {
28 return ""; // If the next word is a prefix of the current word, it is invalid.
29 }
30
31 char currentChar = words[i].charAt(j), nextChar = words[i + 1].charAt(j);
32 if (currentChar == nextChar) {
33 continue; // No specific order is extracted from this character comparison.
34 }
35
36 // Detect cycle or invalid sequence.
37 if (graph[nextChar - 'a'][currentChar - 'a']) {
38 return "";
39 }
40 // Establish the edge relationship in the graph.
41 graph[currentChar - 'a'][nextChar - 'a'] = true;
42 break;
43 }
44 }
45
46 // Process the last word to account for its letters.
47 for (char c : words[totalWords - 1].toCharArray()) {
48 if (uniqueLettersCount == 26) {
49 break;
50 }
51 int index = c - 'a';
52 if (!seen[index]) {
53 ++uniqueLettersCount;
54 seen[index] = true;
55 }
56 }
57
58 // Compute the in-degree for each node/character in the graph.
59 int[] inDegree = new int[26];
60 for (int i = 0; i < 26; ++i) {
61 for (int j = 0; j < 26; ++j) {
62 if (i != j && seen[i] && seen[j] && graph[i][j]) {
63 ++inDegree[j];
64 }
65 }
66 }
67
68 // Queue for topological sorting (Kahn's algorithm).
69 Deque<Integer> queue = new LinkedList<>();
70 for (int i = 0; i < 26; ++i) {
71 if (seen[i] && inDegree[i] == 0) {
72 queue.offerLast(i); // Enqueue characters with in-degree of 0.
73 }
74 }
75
76 // Perform topological sort to determine the alien language order.
77 StringBuilder alienOrder = new StringBuilder();
78 while (!queue.isEmpty()) {
79 int current = queue.pollFirst();
80 alienOrder.append((char) (current + 'a'));
81 for (int i = 0; i < 26; ++i) {
82 if (i != current && seen[i] && graph[current][i]) {
83 // Decrement in-degree and enqueue if in-degree becomes 0.
84 if (--inDegree[i] == 0) {
85 queue.offerLast(i);
86 }
87 }
88 }
89 }
90 // If the final sorted string has a length less than the count of unique letters, there is an inconsistency.
91 return alienOrder.length() < uniqueLettersCount ? "" : alienOrder.toString();
92 }
93}
94
1class Solution {
2public:
3 string alienOrder(vector<string>& words) {
4 // Use a graph (represented as a 2D array) to represent the letter precedence
5 vector<vector<bool>> graph(26, vector<bool>(26, false));
6 // An array indicating whether each letter is present in the dictionary or not.
7 vector<bool> seen(26, false);
8 // Total number of unique characters seen so far
9 int uniqueCharsCount = 0;
10 int wordCount = words.size();
11
12 // Build the graph
13 for (int i = 0; i < wordCount - 1; ++i) {
14 for (char ch : words[i]) {
15 int charIndex = ch - 'a';
16 if (!seen[charIndex]) {
17 ++uniqueCharsCount;
18 seen[charIndex] = true;
19 if (uniqueCharsCount == 26) break;
20 }
21 }
22 int wordLength = words[i].size();
23 for (int j = 0; j < wordLength; ++j) {
24 if (j >= words[i + 1].size()) {
25 return ""; // Next word cannot be prefix of the current word
26 }
27 char char1 = words[i][j], char2 = words[i + 1][j];
28 if (char1 == char2) continue;
29 if (graph[char2 - 'a'][char1 - 'a']) {
30 return ""; // Invalid ordering found
31 }
32 graph[char1 - 'a'][char2 - 'a'] = true;
33 break; // Found the first difference, so move on to next word
34 }
35 }
36
37 // Register the characters seen in the last word
38 for (char ch : words[wordCount - 1]) {
39 int charIndex = ch - 'a';
40 if (!seen[charIndex]) {
41 ++uniqueCharsCount;
42 seen[charIndex] = true;
43 if (uniqueCharsCount == 26) break;
44 }
45 }
46
47 // Count the indegrees of each node/letter
48 vector<int> indegree(26, 0);
49 for (int i = 0; i < 26; ++i) {
50 for (int j = 0; j < 26; ++j) {
51 if (i != j && seen[i] && seen[j] && graph[i][j]) {
52 ++indegree[j];
53 }
54 }
55 }
56
57 // Start with all nodes/letters with 0 indegree
58 queue<int> queue;
59 for (int i = 0; i < 26; ++i) {
60 if (seen[i] && indegree[i] == 0) {
61 queue.push(i);
62 }
63 }
64
65 // Perform topological sort
66 string result;
67 while (!queue.empty()) {
68 int current = queue.front();
69 result += (current + 'a');
70 queue.pop();
71 for (int i = 0; i < 26; ++i) {
72 // Reduce the indegree of the neighbors and add to queue if indegree becomes 0
73 if (i != current && seen[i] && graph[current][i]) {
74 if (--indegree[i] == 0) {
75 queue.push(i);
76 }
77 }
78 }
79 }
80
81 // If the number of characters in the result does not match the unique count
82 // it means there are some nodes left in the graph due to a cycle,
83 // hence no valid ordering
84 if (result.size() < uniqueCharsCount) {
85 return "";
86 }
87
88 return result;
89 }
90};
91
1// Initialize type alias for readability
2type AlienGraph = boolean[][];
3type SeenArray = boolean[];
4
5// Abstracting the letter-to-index conversion into a function
6function charToIndex(ch: string): number {
7 return ch.charCodeAt(0) - 'a'.charCodeAt(0);
8}
9
10// Build the graph from the given list of words
11function buildGraph(words: string[], graph: AlienGraph, seen: SeenArray): number {
12 let uniqueCharsCount = 0;
13 const wordCount = words.length;
14
15 for (let i = 0; i < wordCount - 1; ++i) {
16 for (const ch of words[i]) {
17 const charIndex = charToIndex(ch);
18 if (!seen[charIndex]) {
19 ++uniqueCharsCount;
20 seen[charIndex] = true;
21 if (uniqueCharsCount === 26) break;
22 }
23 }
24 const wordLength = words[i].length;
25 for (let j = 0; j < wordLength; ++j) {
26 if (j >= words[i + 1].length) return -1; // Next word cannot be prefix of the current word
27
28 const char1 = words[i][j], char2 = words[i + 1][j];
29 if (char1 === char2) continue;
30 if (graph[charToIndex(char2)][charToIndex(char1)]) return -1; // Invalid ordering found
31
32 graph[charToIndex(char1)][charToIndex(char2)] = true;
33 break; // Found the first difference, so move on to next word
34 }
35 }
36
37 for (const ch of words[wordCount - 1]) {
38 const charIndex = charToIndex(ch);
39 if (!seen[charIndex]) {
40 ++uniqueCharsCount;
41 seen[charIndex] = true;
42 if (uniqueCharsCount === 26) break;
43 }
44 }
45
46 return uniqueCharsCount;
47}
48
49// Perform topological sort to get alien language order
50function alienOrder(words: string[]): string {
51 const graph: AlienGraph = Array(26).fill(null).map(() => Array(26).fill(false));
52 const seen: SeenArray = Array(26).fill(false);
53 let uniqueCharsCount = buildGraph(words, graph, seen);
54
55 // Return empty string if invalid prefix or ordering found
56 if (uniqueCharsCount === -1) return "";
57
58 // Initialize indegree array and calculate indegrees
59 const indegree: number[] = Array(26).fill(0);
60 for (let i = 0; i < 26; ++i) {
61 for (let j = 0; j < 26; ++j) {
62 if (i !== j && seen[i] && seen[j] && graph[i][j]) {
63 ++indegree[j];
64 }
65 }
66 }
67
68 // Create a queue and add all the characters with 0 indegree
69 const queue: number[] = [];
70 for (let i = 0; i < 26; ++i) {
71 if (seen[i] && indegree[i] === 0) {
72 queue.push(i);
73 }
74 }
75
76 // Perform the topological sort
77 let result = "";
78 while (queue.length) {
79 const current = queue.shift();
80 result += String.fromCharCode(current + 'a'.charCodeAt(0));
81
82 for (let i = 0; i < 26; ++i) {
83 if (i !== current && seen[i] && graph[current][i]) {
84 if (--indegree[i] === 0) {
85 queue.push(i);
86 }
87 }
88 }
89 }
90
91 // Check for cycles (if result size doesn't match unique char count)
92 if (result.length < uniqueCharsCount) {
93 return "";
94 }
95
96 return result;
97}
98
Time and Space Complexity
Time Complexity
The main time complexity of the provided code comes from multiple nested loops, iterating over the given words and then characters within those words, as well as operations related to constructing and traversing the graph representation of the alien dictionary. Analyzing the various components:
-
Constructing the Character Graph: The code iterates through each pair of consecutive words and compares their characters until it finds a differing pair. This takes at most
O(n * l)
time, wheren
is the number of words andl
is the average length of the words since it stops comparing once there is a discrepancy or the end of the word is reached. -
Populating visit and graph arrays: Similarly, the loop goes through each word's characters, and so does so in
O(n * l)
time as well. This also includes the check for early termination if an invalid order is detected. -
Calculating In-degrees: The nested loop has a fixed size of 26, so it takes
O(26^2)
which is essentiallyO(1)
since it's a constant not based on input size. -
The Breadth-First Search (BFS) on the character graph involves dequeuing and enqueuing each character once. Since every character is enqueued and dequeued exactly once, and there are at most 26 characters, this step takes
O(26)
time, which is again constant.
Overall, the primary time complexity driver is determined by the comparison of characters in the given words, which is O(n * l)
. The rest of the steps are either constant time or linear time with a fixed upper bound unrelated to the input size. Hence, the time complexity can be summarized as O(n * l)
.
Space Complexity
For space complexity, we account for the data structures used in relation to the input size:
-
The graph
g
is a 26x26 matrix that takes upO(26^2)
which simplifies toO(1)
space since it's constant space, irrespective of the input. -
The visit array
s
takes upO(26)
space, again a constantO(1)
. -
The indegree array is also of size 26, hence
O(1)
. -
The queue and answer list will at most hold all 26 characters, so they both take
O(1)
space.
Like the time complexity, the largest space consideration is the character graph, which is fixed and not dependent on input size. Thus, the space complexity is O(1)
- constant space complexity.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the tree traversal order can be used to obtain elements in a binary search tree in sorted order?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS
https algomonster s3 us east 2 amazonaws com cover_photos bfs svg Breadth First Search on Trees Hopefully by this time you've drunk enough DFS Kool Aid to understand its immense power and seen enough visualization to create a call stack in your mind Now let me introduce the companion spell
https algomonster s3 us east 2 amazonaws com cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could
Want a Structured Path to Master System Design Too? Don’t Miss This!