Regular expression in Python

Question

I have a text file (seq.fasta) which contains a sequence as follows:

M1

MPMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKFKLGLDFPNL
PYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVENQVMDTRMQLIMLCYNPDF
EKQKPEFLKTIPEKMKLYSEFLGKRPWFAGDKVTYVDFLAYDILDQYRMFEPKCLDAFPN
LRDFLARFEGLKKISAYMKSSRYIATPIFSKMAHWSNK

I have to extract the motif PXXP, which is exactly 4 characters long (XX can be any two characters).

I tried the following code:

import re

infile=open("seq.fasta",'r')
out=open("out.csv",'w')

for line in infile:
   line = line.strip("\n")
   if line.startswith('>'):
      name=line
   else:
      motif = re.compile(r"(\bP{2}P\b)")
      c = line.count('motif')
      print '%s:%s' %(name,c)
      out.write('%s:%s\n' %(name,c))

But it does not find the motif.

There is no string P..P in the provided input above (here . stands for "any character"). I don't understand the question; please update it with the expected output. Your regexp looks for a word boundary, followed by two Ps, followed by a P, and then a word boundary.
– Fredrik Pihl, Commented Sep 8, 2011 at 9:52
@Fredrik The P..P string appears split across the first two lines. Presumably, the entire string is intended to represent a line from the file.
– Michael J. Barber, Commented Sep 8, 2011 at 11:06
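The point made in these comments can be checked directly. The sketch below is written for Python 3 and copies the first two sequence lines from the question. It assumes that they belong to the same record and that P..P is the intended pattern. Searching each line separately finds nothing, while searching the joined text finds the motif that straddles the line break.

```python
import re

# First two sequence lines, copied from the question's data.
line1 = "MPMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKFKLGLDFPNL"
line2 = "PYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVENQVMDTRMQLIMLCYNPDF"

# "P", any two characters, "P".
motif = re.compile(r"P..P")

# Searching line by line misses a motif that crosses the line break.
per_line = [motif.findall(line) for line in (line1, line2)]

# Joining the lines of one record first lets the regex see the whole sequence.
joined = motif.findall(line1 + line2)

print(per_line)  # [[], []]
print(joined)    # ['PNLP']
```

If the file really is wrapped like this, a line-by-line loop such as the one in the question would need to collect all lines of a record before searching.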

Arnaud Le Blanc · Accepted Answer · 2011-09-08 09:54:23Z

5

Try with this one:

 re.compile(r"(P..P)")

. matches any character (except a newline, by default).

{2} means that the preceding token must be repeated exactly twice (in your regex, P{2} matches PP, so P{2}P matches PPP).

\b matches a word boundary.
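To make the difference concrete, here is a minimal Python 3 sketch that runs both patterns on a short fragment of the question's sequence. The variable names are illustrative only. It also shows a second problem in the question's code: line.count('motif') counts occurrences of the literal text "motif" and never uses the compiled regex.

```python
import re

# Fragment around the motif, taken from the question's sequence.
seq = "GLDFPNLPYLID"

# Pattern from the question: word boundary, "PP", "P", word boundary.
original = re.compile(r"(\bP{2}P\b)")
# Pattern from this answer: "P", any two characters, "P".
fixed = re.compile(r"(P..P)")

original_hits = original.findall(seq)      # the fragment contains no "PPP"
fixed_hits = fixed.findall(seq)
standalone_ppp = original.findall("PPP")   # the original only matches a lone "PPP"

# str.count('motif') looks for the literal text "motif", not for regex matches.
literal_count = seq.count('motif')
regex_count = len(fixed_hits)

print(original_hits, fixed_hits, standalone_ppp)  # [] ['PNLP'] ['PPP']
print(literal_count, regex_count)                 # 0 1
```

Counting with len(motif.findall(line)) is one way to replace the line.count('motif') call. Note that findall reports non-overlapping matches only.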

Artem O. · Accepted Answer · 2011-09-08 10:00:05Z

3

You can use this:

re.compile( r"(P[\w]{2}P)" )

or

re.compile( r"(P[A-Z]{2}P)" )

The metacharacter \w matches any word character; for ASCII text it is equivalent to [a-zA-Z0-9_].
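The three candidate patterns differ only in what they accept between the two Ps. The Python 3 sketch below compares them on a made-up test string (not data from the question), so the choice can be judged against what a sequence file may contain.

```python
import re

# Made-up test string: letters, digits, punctuation and lowercase between Ps.
text = "PABP P12P P--P pxyp"

any_char = re.findall(r"P..P", text)          # any two characters
word_char = re.findall(r"P\w{2}P", text)      # letters, digits or underscore
upper_only = re.findall(r"P[A-Z]{2}P", text)  # uppercase ASCII letters only

# re.IGNORECASE makes the letter-only pattern accept lowercase sequences too.
either_case = re.findall(r"P[A-Z]{2}P", text, re.IGNORECASE)

print(any_char)     # ['PABP', 'P12P', 'P--P']
print(word_char)    # ['PABP', 'P12P']
print(upper_only)   # ['PABP']
print(either_case)  # ['PABP', 'pxyp']
```

For protein sequences written in uppercase letters, P[A-Z]{2}P is the strictest of the three, since it rejects digits, gaps and other symbols between the two Ps.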
