Leetcode 433. Minimum Genetic Mutation

This problem is about determining the minimum number of mutations needed to change one gene string to another. A gene string is represented as an 8-character long string, which each character being either "A", "C", "G", or "T". A mutation is defined as changing one character in a gene string. We are given a "start" gene string, an "end" gene string, and a bank of valid gene strings. To make it a valid mutation, the gene must be found in the bank. Our aim is to determine the minimum number of mutations needed to change the "start" gene string to the "end" gene string. We return -1 if no such mutation chain can exist.

Let's walk through an example.

Given: start: "AACCGGTT" end: "AAACGGTA" bank: ["AACCGGTA", "AACCGCTA", "AAACGGTA"]

In this case, we can mutate "AACCGGTT" to "AACCGGTA" which is 1 mutation, then from "AACCGGTA" to "AAACGGTA" which is another mutation. Thus, the minimum number of mutations in this case will be 2 mutations.

The solution uses a breadth-first search algorithm for this problem. We use a queue to store the intermediate gene strings and an unordered set to store the bank for quick access. We loop through each character of the current gene string, and try to mutate it to any one of the 4 possible charaters. If the new mutated gene is in the bank, we add it to the queue. We also remove it from the bank to avoid making multiple mutations to the same gene string.

If we find the target gene in the bank, we return the number of mutations so far. If no mutation path is found, we return -1.

C++ Solution

1
2cpp
3class Solution {
4public:
5    int minMutation(string start, string end, vector<string>& bank) {
6        unordered_set<string> bankSet{bank.begin(), bank.end()};
7        if (!bankSet.count(end))
8            return -1;
9
10        int ans = 0;
11        queue<string> q{{start}};
12
13        while (!q.empty()) {
14            ++ans;
15            for (int sz = q.size(); sz > 0; --sz) {
16                string word = q.front();
17                q.pop();
18                for (int j = 0; j < word.length(); ++j) {
19                    const char cache = word[j];  
20                    for (const char c : {'A', 'C', 'G', 'T'}) {
21                        word[j] = c; 
22                        if (word == end)
23                            return ans; 
24                        if (bankSet.count(word)) {
25                            bankSet.erase(word);
26                            q.push(word);
27                        }
28                    }
29                    word[j] = cache;     
30                }
31            }
32        }
33
34        return -1;
35    }
36};

A similar approach can be applied to other languages like Java, Python, Javascript and C#.

Note that the answer variable is incremented at the start of each level of BFS. This counts the depth level, or in this case, the number of mutations. We cache the original character before making a mutation to remember it for future mutations. If the mutated word is the target gene, we return the number of mutations immediately. Otherwise, if the word is in the bank, we add it to the queue to consider its mutations in the next level of BFS and remove it from the bank.## Java Solution

1
2java
3class Solution {
4    public int minMutation(String start, String end, String[] bank) {
5        HashSet<String> bankSet = new HashSet<>(Arrays.asList(bank));
6        
7        if (!bankSet.contains(end))
8            return -1;
9        
10        Queue<String> queue = new LinkedList<>();
11        queue.offer(start);
12        int mutations = 0;
13        
14        while (!queue.isEmpty()) {
15            int size = queue.size();
16            mutations++;
17            
18            while (size-- > 0) {
19                String current = queue.poll();
20                char[] characters = current.toCharArray();
21                
22                for (int i = 0; i<characters.length; i++) {
23                    char old = characters[i];
24                    for (char c : new char[]{'A', 'C', 'G', 'T'}) {
25                        characters[i] = c;
26                        String next = new String(characters);
27                        if (end.equals(next))
28                            return mutations;
29                        if (bankSet.contains(next)){
30                            queue.offer(next);
31                            bankSet.remove(next);
32                        }
33                    }
34                    characters[i] = old;
35                }
36            }
37        }
38        
39        return -1;
40    }
41}

Python Solution

1
2python
3class Solution:
4    def minMutation(self, start: str, end: str, bank: List[str]) -> int:
5        bank = set(bank)
6        if end not in bank:
7            return -1
8
9        queue = [(start, 0)]
10        while queue:
11            current, mutations = queue.pop(0)
12            if current == end:
13                return mutations
14
15            for i in range(len(current)):
16                for c in 'ACGT':
17                    mutation = current[:i] + c + current[i+1:]
18                    if mutation in bank:
19                        bank.remove(mutation)
20                        queue.append((mutation, mutations + 1))
21                        
22        return -1

JavaScript Solution

1
2javascript
3var minMutation = function(start, end, bank) {
4    let mutations = 0, bankSet = new Set(bank), queue = [start];
5    
6    if (!bankSet.has(end)) return -1;
7    
8    while (queue.length) {
9        let nextQ = [];
10        mutations++;
11        for (let gene of queue) {
12            for (let i=0; i<gene.length; i++) {
13                for (let c of "ACGT") {
14                    let mutation = gene.slice(0,i) + c + gene.slice(i+1);
15                    if (mutation === end) return mutations;
16                    if (bankSet.has(mutation)) {
17                        nextQ.push(mutation);
18                        bankSet.delete(mutation);
19                    }
20                }
21            }
22        }
23        queue = nextQ;
24    }
25    
26    return -1;
27};

In the above code snippets, we have used same logic in all languages with the use of queue and set for BFS and maintaining the bank of gene strings.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫