LeetCode - Repeated DNA Sequences
Problem description
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
1
2
3
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]
Analysis
The first idea is to use hashmap and substring to iterate the whole string for all substring with length of 10 and get the answer.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
class Solution {
public List<String> findRepeatedDnaSequences(String s) {
HashMap<String, Integer> map = new HashMap<>();
for (int i = 0; i<=s.length() -10;i++){
String str = s.substring(i, i+10);
map.put(str, map.getOrDefault(str, 0) + 1);
}
LinkedList<String> res = new LinkedList<>();
for (Map.Entry<String,Integer> entry:map.entrySet()){
if (entry.getValue() > 1){
res.add(entry.getKey());
}
}
return res;
}
}