
I am using a small utility to find the type of a file under Windows.

TrID/32 - File Identifier v2.10 - (C) 2003-11 By M.Pontello
Definitions found:  5295
Analyzing...

Collecting data from file: april_error.wmv

94.1% (.WMV/WMA) Windows Media (generic) (16018/6)
 5.8% (.CAT) Microsoft Security Catalog (1000/1)

In Python, how can I capture the (.WMV/WMA) part? I currently get the wrong matching group: for instance, re.search('\((.*?)\)', stdout).group(1) returns 'C'.

Thanks in advance.

  • Do you only want to capture this: (.WMV/WMA)? Commented Apr 23, 2014 at 9:06

2 Answers

Try using findall instead:

a = re.findall(r'\((.*?)\)', stdout)

>>> print(a)
['C', '.WMV/WMA', 'generic', '16018/6', '.CAT', '1000/1']

>>> print(a[1])
.WMV/WMA

Or, as @tobias_k suggested, require a leading dot (\.) inside the parentheses so that only the file extension groups are captured:

a = re.findall(r'\((\..*?)\)', stdout)

>>> print(a)
['.WMV/WMA', '.CAT']
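If only the top-ranked extension is needed, the same leading-dot idea also works with re.search, since the first parenthesized group that starts with a dot is the highest-scoring match. The following is a minimal sketch, assuming stdout holds TrID's text output as shown in the question; the helper name first_extension is hypothetical and not part of the original answer.

```python
import re

# Sample TrID output, as shown in the question. In practice this would be
# the captured stdout of the TrID process.
stdout = """TrID/32 - File Identifier v2.10 - (C) 2003-11 By M.Pontello
Definitions found:  5295
Analyzing...

Collecting data from file: april_error.wmv

94.1% (.WMV/WMA) Windows Media (generic) (16018/6)
 5.8% (.CAT) Microsoft Security Catalog (1000/1)
"""


def first_extension(text):
    """Return the first parenthesized group starting with a dot, or None.

    Hypothetical helper: the '(C)' in the banner is skipped because the
    pattern requires a literal dot right after the opening parenthesis.
    """
    m = re.search(r'\((\.[^)]*)\)', text)
    return m.group(1) if m else None


best = first_extension(stdout)
print(best)
```

Returning None when nothing matches avoids the AttributeError that calling .group(1) directly on a failed re.search would raise.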

3 Comments

+1 Maybe add a \. to find only the groups with actual file extensions.
@sshashank124 I believe you've got to cut out the .group part, because I get the following error: AttributeError: 'list' object has no attribute 'group'
@sshashank124 Yes. It does. Thank you. I'll accept your answer in 5'.

Based on your comment above, this is what you need:

match = re.search(r"% \([a-z.]+/[a-z.]+\)", subject, re.IGNORECASE)
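Note that the pattern above has no capturing group, so match.group(0) returns the whole matched text, including the leading "% " and the parentheses. It also requires a slash, so it would match (.WMV/WMA) but not (.CAT). A possible variant, sketched here as an assumption rather than as the answerer's intent, anchors on the percentage and captures both the score and the extension list. The names subject, pattern and pairs are illustrative.

```python
import re

# The two result lines from the TrID output in the question.
subject = (
    "94.1% (.WMV/WMA) Windows Media (generic) (16018/6)\n"
    " 5.8% (.CAT) Microsoft Security Catalog (1000/1)\n"
)

# Group 1: the percentage; group 2: the dot-prefixed extension list that
# directly follows it. Anchoring on "% (" skips "(generic)" and "(16018/6)".
pattern = re.compile(r"([\d.]+)% \((\.[^)]+)\)")

# findall returns one (percentage, extensions) tuple per matching line.
pairs = [(float(pct), ext) for pct, ext in pattern.findall(subject)]
print(pairs)

# Split a combined entry such as ".WMV/WMA" into individual extensions.
extensions = pairs[0][1].split("/") if pairs else []
print(extensions)
```

Keeping the percentage alongside the extension makes it possible to apply a confidence threshold later instead of always trusting the first line.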
