I have a list of coordinates:
coordinates = [[1,5], [10,15], [25, 35]]
I have a string as follows:
line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'
I want to replace each interval in coordinates, given as a [start, end) pair, with the character 'N'.
The only way I can think of is the following:
for element in coordinates:
    length = element[1] - element[0]
    line = line.replace(line[element[0]:element[1]], 'N'*length)
The desired output would be:
line = 'ANNNNGTGTGNNNNNACGTACGTGTNNNNNNNNNNGTGKWSGTGAAAAAKCT'
where the intervals [1,5), [10,15) and [25,35) are replaced with N in line.
This requires me to loop through the coordinate list and update my string line on every iteration. Is there another way to replace a list of intervals in a string?
Note: There is a problem with the original solution in this question. In line.replace(line[element[0]:element[1]], 'N'*length), replace will also replace every other occurrence in the sequence of the substring found at line[element[0]:element[1]]. For people working with DNA, this is definitely not what you want! However, I am keeping the solution as it is so as not to disturb the flow of the comments and discussion that follow.
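To make the problem described in the note concrete, here is a minimal sketch. The toy sequence and the variable names are made up for illustration and are not part of the original question; the sequence is chosen so that its first four bases occur twice.

```python
# Hypothetical toy sequence in which the first four bases repeat.
seq = "ACGTACGT"
start, end = 0, 4

# str.replace matches by content, so both copies of "ACGT" are masked.
by_replace = seq.replace(seq[start:end], "N" * (end - start))

# Slicing works by position, so only the interval [0, 4) is masked.
by_slicing = seq[:start] + "N" * (end - start) + seq[end:]

print(by_replace)  # NNNNNNNN
print(by_slicing)  # NNNNACGT
```

With the sequence from the question the two approaches happen to agree, which is why the bug is easy to miss.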
for start, end in coordinates: line = line[:start] + "N" * (end - start) + line[end:] (if I've understood correctly).

.replace replaces all occurrences of the substring, so it might not only replace the indices you give it.

If the substring between 1 and 5 (TCAC) appears somewhere else in the string, it will be replaced as well. That might not be what you want.
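The position-based idea from the comments can be sketched as a small helper. This is one possible approach, not necessarily what the commenters had in mind: the function name mask_intervals and its mask_char parameter are hypothetical, and it edits a list of characters so the string is rebuilt only once rather than on every iteration.

```python
def mask_intervals(sequence, intervals, mask_char="N"):
    """Return a copy of sequence with every [start, end) interval masked."""
    chars = list(sequence)  # strings are immutable, so edit a list instead
    for start, end in intervals:
        # Clamp to the sequence bounds so the length never changes.
        start = max(start, 0)
        end = min(end, len(chars))
        if start < end:
            chars[start:end] = mask_char * (end - start)
    return "".join(chars)  # build the final string once


coordinates = [[1, 5], [10, 15], [25, 35]]
line = 'ATCACGTGTGTGTACACGTACGTGTGNGTNGTTGAGTGKWSGTGAAAAAKCT'

masked = mask_intervals(line, coordinates)
print(masked)
```

For the data in the question this should reproduce the desired output. It still loops over the coordinates, but each replacement is by position, so repeated substrings elsewhere in the sequence are left alone.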