Replace multiple elements in string with str methods

Question

I am trying to write a function that takes a string of DNA and returns its complement. I have been working on this for a while and have looked through the Python documentation, but I couldn't work it out. I have written the docstring for the function so you can see what the answer should look like. I have seen a similar question asked on this forum, but I could not understand the answers. I would be grateful if someone could explain this using only str methods and loops / if statements, as I have not yet studied dictionaries or lists in detail.

I tried str.replace but could not get it to work for multiple elements. I also tried nested if statements, which didn't work either, and then four separate for loops, to no avail.

def get_complementary_sequence(dna):

    """ (str) -> str

    Return the DNA sequence that is complementary 
    to the given DNA sequence.

    >>> get_complementary_sequence('AT')
    'TA'
    >>> get_complementary_sequence('GCTTAA')
    'CGAATT'

    """

    for char in dna:
        if char == 'A':
            dna = dna.replace('A', 'T')
        elif char == 'T':
            dna = dna.replace('T', 'A')
        # ...and so on

It's supposed to find the complementary strand of a DNA sequence. There are 4 nucleotides on a DNA strand: A on one strand pairs with T on the other strand, T with A, C with G, and G with C.
– Hom Bahrani, Commented Dec 26, 2014 at 16:53

nneonneo · Accepted Answer · 2014-12-26 16:58:41Z

5

For a problem like this, you can use string.maketrans (str.maketrans in Python 3) combined with str.translate:

import string
# Python 2 syntax: string.maketrans and the print statement
table = string.maketrans('CGAT', 'GCTA')  # C->G, G->C, A->T, T->A
print 'GCTTAA'.translate(table)
# outputs CGAATT
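
The snippet above is Python 2. As a minimal sketch of the Python 3 form mentioned in the answer, the same table can be built with str.maketrans and wrapped in the function from the question. The module-level name `table` and the one-line docstring are just illustrative choices.

```python
# Python 3: maketrans is a static method on str, so no import is needed.
# Each character in the first string maps to the character at the same
# position in the second string: C->G, G->C, A->T, T->A.
table = str.maketrans('CGAT', 'GCTA')


def get_complementary_sequence(dna):
    """Return the complement of dna using a translation table."""
    return dna.translate(table)


print(get_complementary_sequence('GCTTAA'))  # CGAATT
```

Characters that are not in the table are left unchanged by translate.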

edited Dec 26, 2014 at 16:58

answered Dec 26, 2014 at 16:51

nneonneo


2 Comments

Hom Bahrani Over a year ago

Will this work on any DNA strand, say something much longer with a random sequence?

nneonneo Over a year ago

@DataScienceAcademy: Yes. This just translates each character according to the translation table. See the documentation for more details.
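
For readers who, like the asker, want to stay with loops and if statements, the per-character translation described above can be sketched by hand. This is not the answerer's code: it reuses the function name from the question and builds a new string instead of calling dna.replace repeatedly, because chained replace calls undo each other. Leaving unknown characters unchanged is an assumption on my part.

```python
def get_complementary_sequence(dna):
    """Return the complement of dna, built one character at a time."""
    result = ''
    for char in dna:
        if char == 'A':
            result += 'T'
        elif char == 'T':
            result += 'A'
        elif char == 'C':
            result += 'G'
        elif char == 'G':
            result += 'C'
        else:
            # Assumption: leave unexpected characters unchanged.
            result += char
    return result


# Chained replace calls overwrite each other's work:
# 'AT' -> 'TT' after the first replace, then 'AA' after the second.
print('AT'.replace('A', 'T').replace('T', 'A'))  # AA

print(get_complementary_sequence('AT'))          # TA
print(get_complementary_sequence('GCTTAA'))      # CGAATT
```

Each input character is examined exactly once and the result goes into a separate string, so a letter that has already been swapped is never swapped back.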

Shahriar · Accepted Answer · 2014-12-26 17:24:22Z

1

You can map each letter to another letter.

You probably don't need to build a translation table with every possible combination.

>>> M = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
>>> STR = 'CGAATT'
>>> S = "".join([M.get(c,c) for c in STR])
>>> S
'GCTTAA'

How this works:

# this returns a list of char according to your dict M
>>> L = [M.get(c,c) for c in STR]  
>>> L
['G', 'C', 'T', 'T', 'A', 'A']

str.join() returns a string in which the string elements of a sequence are joined together, with the string it is called on used as the separator.

>>> sep = "-"
>>> L = ['a','b','c']
>>> sep.join(L)
'a-b-c'
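
Putting the pieces together, here is a small sketch of the dict-and-join approach wrapped in the function from the question. The name `PAIRS` is my own (the answer calls the dict `M`), and the second call uses a made-up input containing 'N' to show what the second argument of dict.get does.

```python
PAIRS = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}


def get_complementary_sequence(dna):
    """Return the complement of dna using a dict lookup per character."""
    # PAIRS.get(c, c) returns the partner of c, or c itself when c is
    # not a key in the dict. "".join then glues the resulting
    # one-character strings together with nothing between them.
    return ''.join([PAIRS.get(c, c) for c in dna])


print(get_complementary_sequence('CGAATT'))  # GCTTAA
print(get_complementary_sequence('ACNGT'))   # TGNCA
```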

edited Dec 26, 2014 at 17:24

answered Dec 26, 2014 at 16:59

Shahriar


3 Comments

Hom Bahrani Over a year ago

Thank you, this works. Just for my own learning, though: why do you have "".join in line 3?

martineau Over a year ago

Using str.translate() is simpler and would be much faster at doing the replacements -- the creation of translation tables is trivial with string.maketrans() (or str.maketrans() in Python 3).

martineau Over a year ago

Since there are 4 nucleotides, one would need to pass two 4 letter strings to string.maketrans() to create a translation table that could be used to complement any sequence -- see @nneonneo's answer.
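
The speed claim in the comments above can be checked on your own machine with a rough sketch like the following (Python 3). The function names, the test input, and the repeat count are all arbitrary choices of mine, and the timings will vary by machine and Python version, so no numbers are shown here.

```python
import timeit

TABLE = str.maketrans('CGAT', 'GCTA')
PAIRS = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}


def complement_translate(dna):
    """Complement via str.translate and a prebuilt table."""
    return dna.translate(TABLE)


def complement_join(dna):
    """Complement via a dict lookup per character and str.join."""
    return ''.join([PAIRS.get(c, c) for c in dna])


dna = 'GCTTAA' * 1000  # arbitrary test input

# Both approaches must agree before the timings mean anything.
assert complement_translate(dna) == complement_join(dna)

for func in (complement_translate, complement_join):
    seconds = timeit.timeit(lambda: func(dna), number=200)
    print(func.__name__, round(seconds, 4))
```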
