Python Regex "object has no attribute"

Question

I've been putting together a list of pages that we need to update with new content (we're switching media formats). In the process I'm cataloging pages that correctly have the new content.

Here's the general idea of what I'm doing:

Iterate through a file structure and get a list of files
For each file, read it into a buffer and, using a regex search, match a specific tag
If it matches, test 2 more regex matches
Write the resulting match (one or the other) into a database

Everything works fine up until the 3rd regex pattern match, where I get the following:

'NoneType' object has no attribute 'group'

# only interested in embedded content
pattern = "(<embed .*?</embed>)"

# matches content pointing to our old root
pattern2 = 'data="(http://.*?/media/.*?")'

# matches content pointing to our new root
pattern3 = 'data="(http://.*?/content/.*?")'

matches = re.findall(pattern, filebuffer)
for match in matches:
    if len(match) > 0:

        urla = re.search(pattern2, match)
        if urla.group(1) is not None:
            print filename, urla.group(1)

        urlb = re.search(pattern3, match)
        if urlb.group(1) is not None:
            print filename, urlb.group(1)

thank you.

oggy · Accepted Answer · 2009-09-29 09:13:49Z

18

Your exception means that urla has a value of None. Since urla's value is determined by the re.search call, it follows that re.search returns None. And this happens when the string doesn't match the pattern.

So basically you should use:

urla = re.search(pattern2, match)
if urla is not None:
    print filename, urla.group(1)

instead of what you have now.
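
For reference, here is a minimal Python 3 sketch of the whole loop with that check applied to both searches. The three patterns are the ones from the question; the `classify_embeds` helper, the sample `filebuffer`, the `page.html` filename and the example.com URLs are made up for illustration.

```python
import re

# Patterns taken from the question. The closing quote sits inside the
# parentheses, so it ends up as part of group(1).
EMBED = r"(<embed .*?</embed>)"
OLD_ROOT = r'data="(http://.*?/media/.*?")'
NEW_ROOT = r'data="(http://.*?/content/.*?")'


def classify_embeds(filebuffer):
    """Return (kind, url) pairs for every <embed> found in filebuffer."""
    results = []
    for embed in re.findall(EMBED, filebuffer):
        old = re.search(OLD_ROOT, embed)
        if old is not None:  # re.search returns None when nothing matches
            results.append(("old", old.group(1)))
        new = re.search(NEW_ROOT, embed)
        if new is not None:
            results.append(("new", new.group(1)))
    return results


# Hypothetical sample input: one embed per root.
filebuffer = (
    '<p><embed data="http://example.com/media/a.swf"></embed></p>'
    '<p><embed data="http://example.com/content/b.mp4"></embed></p>'
)
for kind, url in classify_embeds(filebuffer):
    print("page.html", kind, url)
```

Each embed normally matches only one of the two roots, so testing the match object itself (rather than `group(1)`) is what lets the loop skip the pattern that does not apply.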


cblab · Accepted Answer · 2012-07-25 17:53:37Z

3

The reason for the AttributeError is that search and match return either a MatchObject or None. Only one of these has a group method, and it's not None. So you need to do:

url = re.search(pattern2, match)
if url is not None:
    print(filename, url.group(0))
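
A small Python 3 sketch of the two possible return values, and of how `group(0)` differs from `group(1)`. It assumes `pattern2` from the question; the two sample strings are invented for illustration.

```python
import re

pattern2 = r'data="(http://.*?/media/.*?")'  # pattern from the question

# Hypothetical sample strings: one matches the old root, one does not.
hit = re.search(pattern2, '<embed data="http://example.com/media/a.swf"></embed>')
miss = re.search(pattern2, '<embed data="http://example.com/content/b.mp4"></embed>')

print(miss is None)   # no match, so search returned None
print(hit.group(0))   # the whole matched text, including the leading data="
print(hit.group(1))   # only the first parenthesised group

try:
    miss.group(1)     # None has no group method
except AttributeError as exc:
    print(type(exc).__name__, exc)
```

Note that `group(0)` is the entire match, so the question's intent (printing only the URL) corresponds to `group(1)`.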

P.S. PEP-8 suggests using 4 spaces for indentation. It's not just an opinion, it's a good practice. Your code is fairly hard to read.

answered Sep 29, 2009 at 9:14 by SilentGhost; edited Jul 25, 2012 at 17:53 by cblab

1 Comment

ives Over a year ago

ah. thank you. i use tabs in the code, which got reformatted / reinterpreted by the formatting engine for this site. "url is not None" fixed it

antonjs · Accepted Answer · 2011-04-17 09:06:03Z

2

I had the same problem.

Using Python 2.6, you can solve it this way:

for match in matches:
    if len(match) > 0:

        urla = re.search(pattern2, match)
        try:
            # raises AttributeError when re.search returned None
            print filename, urla.group(1)
        except AttributeError:
            print "Problem with", pattern2

        urlb = re.search(pattern3, match)
        try:
            print filename, urlb.group(1)
        except AttributeError:
            print "Problem with", pattern3
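
The same try/except idea can be folded into a small helper. This is only a sketch in Python 3: `first_group` is a hypothetical name, not something from the `re` module, and it catches `AttributeError` alone so that unrelated failures (an invalid pattern, for example) still surface.

```python
import re


def first_group(pattern, text, default=None):
    """Return group(1) of the first match, or default when there is no match."""
    try:
        return re.search(pattern, text).group(1)
    except AttributeError:  # re.search returned None
        return default


# Hypothetical usage with a simplified pattern.
print(first_group(r'data="(.*?)"', '<embed data="clip.swf"></embed>'))
print(first_group(r'data="(.*?)"', "<p>no embed here</p>"))
```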


1 Comment

Jean-Francois T. Over a year ago

Small typo: except: instead of "excpet:" for urla block.

holdenweb · Accepted Answer · 2009-09-30 03:50:00Z

0

Please also note your mistaken assumption that the error was in the third match, when it was in fact in the second. This seems to have led to the mistaken assumption that the second match was doing something to invalidate the third, sending you way off track.
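
To illustrate why the earlier search can be the one that fails: for an embed that points at the new root, `pattern2` finds nothing, so `urla.group(1)` raises before `pattern3` is ever tried. A Python 3 sketch, assuming the question's patterns; `first_unmatched` and the sample embeds are hypothetical.

```python
import re

pattern2 = r'data="(http://.*?/media/.*?")'
pattern3 = r'data="(http://.*?/content/.*?")'


def first_unmatched(embed):
    """Name of the first pattern, in the question's order, that does not match."""
    for name, pattern in (("pattern2", pattern2), ("pattern3", pattern3)):
        if re.search(pattern, embed) is None:
            return name
    return None


# Hypothetical embeds, one per root.
new_root_embed = '<embed data="http://example.com/content/b.mp4"></embed>'
old_root_embed = '<embed data="http://example.com/media/a.swf"></embed>'
print(first_unmatched(new_root_embed))
print(first_unmatched(old_root_embed))
```

The line number in the traceback tells you which of the two `group` calls actually raised.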

