I need a regex in Python that matches and returns the integer following the string "id": in a text file.

The text file contains the following:

{"page":1,"results": [{"adult":false,"backdrop_path":"/ba4CpvnaxvAgff2jHiaqJrVpZJ5.jpg","id":807,"original_title":"Se7en","release_date":"1995-09-22","p

I need to get the 807 after the "id", using a regular expression.

2 Answers

Is this what you mean?

#!/usr/bin/env python
import re

subject = '{"page":1,"results": [{"adult":false,"backdrop_path":"/ba4CpvnaxvAgff2jHiaqJrVpZJ5.jpg","id":807,"original_title":"Se7en","release_date":"1995-09-22","p'

# Capture everything after "id": up to the next comma
match = re.search(r'"id":([^,]+)', subject)
if match:
    result = match.group(1)
else:
    result = "no result"
print(result)

Output: 807

Edit:

In response to your comment, here is one simple way to skip the first match. To try it, first add something like "id":809,"etc to your subject, so that the loop skips 807 and prints 809.

n = 1
for match in re.finditer(r'"id":([^,]+)', subject):
    if n == 1:
        print("ignoring the first match")
    else:
        print(match.group(1))
    n += 1
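
As a variation on the loop above, here is a minimal sketch that skips the first match without a manual counter. The function name ids_after_first and the sample string are made up for illustration, and the pattern uses \d+ instead of [^,]+ so that a trailing brace is not captured along with the number.

```python
import re
from itertools import islice

# Assumption: ids are plain non-negative integers, so \d+ is enough.
ID_PATTERN = re.compile(r'"id":(\d+)')

def ids_after_first(text, skip=1):
    """Return every id in text as an int, skipping the first `skip` matches."""
    matches = ID_PATTERN.finditer(text)
    # islice drops the first `skip` items of the iterator and yields the rest.
    return [int(m.group(1)) for m in islice(matches, skip, None)]

# Hypothetical sample with three ids.
sample = '{"id":807,"title":"Se7en"},{"id":809,"title":"etc"},{"id":811}'
print(ids_after_first(sample))  # [809, 811]
```

Passing a different skip value ignores more leading matches, and skip=0 returns all of them.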

9 Comments

works perfectly. how would i make it find the 2nd instance onward and ignore the first instance?
@user3552978 One way is to iterate through the matches: for match in re.finditer('"id":([^,]+)', subject): then disregard the first one.
thanks. i'm having some issues in finding how to ignore the first instance. i have the following and it works fine, but returns everything including the first instance:
f = open('temp.txt', 'r') subject = f.read() for match in re.finditer('"id":([^,]+)', subject): print match.group(1) f.close()
@user3552978 I'm going to add that bit of code to the answer so it formats properly. By the way, I see you're recent on the site, so in case you are not aware of this, if you find someone's answer useful, you can upvote it. You can even upvote several answers in one question. Of course you're under no obligation to do so.
Assuming the file contains more than that snippet and is valid JSON, you can parse it instead of using a regex:

import json

with open('/path/to/file.txt') as f:
    data = json.load(f)  # parse the whole file as JSON
    print(data['results'][0]['id'])  # id of the first result
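
The same idea extends to collecting every id rather than only the first. This is a rough sketch that parses a string instead of a file; the text below is a hypothetical, completed version of the truncated sample from the question, and the second entry is invented.

```python
import json

# Hypothetical complete JSON (the sample in the question is cut off).
text = (
    '{"page": 1, "results": ['
    '{"id": 807, "original_title": "Se7en"}, '
    '{"id": 809, "original_title": "Example"}]}'
)

data = json.loads(text)
# One id per entry in the "results" list, already parsed as int.
ids = [item["id"] for item in data["results"]]
print(ids)  # [807, 809]
```

Slicing the list, for example ids[1:], drops the first id.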

If the file is not valid JSON, then you can get the value of id with:

from re import compile, IGNORECASE

r = compile(r'"id"\s*:\s*(\d+)', IGNORECASE)

with open('/path/to/file.txt') as f:
    # With one capture group, findall() returns the captured strings directly
    for id_value in r.findall(f.read()):
        print(id_value)
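
The two approaches can also be combined: try JSON first and fall back to the regex only when parsing fails. The sketch below assumes the layout from the question (a top-level "results" list whose entries carry an "id"); the function name extract_ids is made up for illustration.

```python
import json
import re

ID_PATTERN = re.compile(r'"id"\s*:\s*(\d+)')

def extract_ids(text):
    """Return the ids found in text, preferring real JSON parsing."""
    try:
        data = json.loads(text)
        return [item["id"] for item in data["results"]]
    except (ValueError, KeyError, TypeError):
        # Not valid JSON, or not the assumed layout: scan the raw text instead.
        return [int(m) for m in ID_PATTERN.findall(text)]

# Truncated text, as in the question, so json.loads() fails and the regex is used.
truncated = '{"page":1,"results": [{"adult":false,"id":807,"original_title":"Se7en","p'
print(extract_ids(truncated))  # [807]
```

Note that the regex fallback picks up every "id" in the text, including ones nested deeper than the results list, so the two paths can differ on unusual input.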
