PROBLEM DESCRIPTION
A gene string can be represented by an 8-character long string, with choices from ‘A’, ‘C’, ‘G’, and ‘T’. Suppose we need to investigate a mutation from a gene string startGene to a gene string endGene where one mutation is defined as one single character changed in the gene string.
For example, “AACCGGTT” –> “AACCGGTA” is one mutation.
There is also a gene bank bank that records all the valid gene mutations. A gene must be in bank to make it a valid gene string.
Given the two gene strings startGene and endGene and the gene bank bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such a mutation, return -1.
Note that the starting point is assumed to be valid, so it might not be included in the bank.
SOLUTION
Starting from startGene, we will try to find other “reachable” genes. We can make them as visited and remove them from the list/set so that we do not visit them again. Keep doing this, until we have reached endGene out exhausted all the genes which are reachable. This is basically BFS. The answer is the shorted path to the destination, number of edges. So, we can keep a count of the “levels” and return it as our answer.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
class Solution {
//method to check if two string a and b representing two genes are valid mutations
//they must have 1 character different at least and at most.
public boolean valid(String a, String b){
if(a.equals(b)) return false;
int diff = 0;
for(int i=0; i<a.length(); i++){
if(a.charAt(i) != b.charAt(i)) diff++;
if(diff >= 2) return false;
}
return true;
}
public int minMutation(String startGene, String endGene, String[] bank) {
//add all possible genes to a hashset
Set<String> possibleMutations = new HashSet<>();
for(String s: bank)
possibleMutations.add(s);
//startGene is already visited, so we can remove it from the next possible set of mutations possible
possibleMutations.remove(startGene);
//queue for BFS
//add the start gene
Queue<String> q = new LinkedList<>();
q.offer(startGene);
//number of steps to reach endGene
int mutations = 0;
//while there are no more genes left to check
while(!q.isEmpty()){
//increase number of steps
mutations++;
//get the number of elements/genes in the current "level"
int elements = q.size();
//get the genes in the current level and find their next possible mutations
for(int i=0; i<elements; i++){
//get current gene from the queue
String currentGene = q.poll();
//create a set of visited genes
//we will use this to remove it from the previous hashset to avoid going back to the same string gene again
Set<String> visited = new HashSet<>();
//go through the list of genes and look for valid mutations
for(String s: possibleMutations){
//if the mutation is valid, add it to the queue
//mark it as visited
//if the next gene is found to be the endGene, we have can return the number of mutations here
if(valid(currentGene, s)){
q.offer(s);
visited.add(s);
if(s.equals(endGene)) return mutations;
}
}
//remove the visited genes from the list of genes
possibleMutations.removeAll(visited);
}
}
return -1;
}
}