0

I have this as my input

content = 'abc.zip'\n

I want to extract abc from it. How do I do this using a regex in Python?

Edit:

No, this is not a homework question. I am trying to automate a task and am stuck at one point: I want the automation to work for any zip file I have.

os.system('python unzip.py -z data/ABC.zip -o data/')

After I receive the zip file, I unzip it. I plan to make this generic by getting the file name from the directory the zip file was placed in, and then passing that file name to the command above to unzip it.

8
  • 1
    More details please. Is it always the string 'abc' you want removed? Or is it everything before the dot? Is this a filename, or just a random string? The best tool for the job (almost certainly not regular expressions) depends on these details. Commented Jun 12, 2011 at 23:13
  • If this is a homework assignment, you should tag it as homework. Commented Jun 12, 2011 at 23:25
  • To be clear, is it content = 'abc.zip'\n (with that newline outside the quotes) or input = r"content = 'abc.zip'\n"? Commented Jun 12, 2011 at 23:50
  • @Blair I have edited my question. A non-regex approach would also work. Commented Jun 12, 2011 at 23:56
  • 1
    @Blair's answer solves the problem of extracting the filename. Calling ZipFile.extractall() sounds like a better plan! Commented Jun 13, 2011 at 0:21

3 Answers

4

As I implied in my comment, regular expressions are unlikely to be the best tool for the job (unless there is some artificial restriction on the problem, or it is far more complex than your example). The standard string and/or path libraries provide functions which should do what you are after. To better illustrate how these work, I'll use the following definition of content instead:

>>> content = 'abc.def.zip'

If it's a filename, and you want the name and extension:

>>> import os.path
>>> filename, extension = os.path.splitext(content)
>>> print(filename)
abc.def
>>> print(extension)
.zip
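
The path in the question includes a directory (data/ABC.zip), so splitext alone would return 'data/ABC'. A minimal sketch of stripping the directory first, assuming the path uses ordinary separators; the helper name zip_stem is made up for illustration:

```python
import os.path


def zip_stem(path):
    """Return the file name without its directory or final extension."""
    base = os.path.basename(path)              # 'data/ABC.zip' -> 'ABC.zip'
    stem, _extension = os.path.splitext(base)  # 'ABC.zip' -> ('ABC', '.zip')
    return stem


print(zip_stem('data/ABC.zip'))      # ABC
print(zip_stem('data/abc.def.zip'))  # abc.def
```

Only the last extension is removed, which matches the splitext behaviour shown above for 'abc.def.zip'.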

If it is a string, and you want to remove the substring 'abc':

>>> noabc = content.replace('abc', '')
>>> print(noabc)
.def.zip

If you want to break it up on each occurrence of a period:

>>> broken = content.split('.')
>>> print(broken)
['abc', 'def', 'zip']

If it has multiple periods, and you want to break it on the first or last one:

>>> broken = content.split('.', 1)
>>> print(broken)
['abc', 'def.zip']
>>> broken = content.rsplit('.', 1)
>>> print(broken)
['abc.def', 'zip']

1 Comment

How do I insert the filename into this statement: os.system('python unzip.py -z data/Abc.zip -o data/')? I want that file name in place of Abc.zip.
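
One possible way to do this is to build the command as a list of arguments and pass it to subprocess instead of formatting a string for os.system; that way a file name containing spaces needs no quoting. This is only a sketch: unzip.py and its -z and -o flags are taken from the question as-is, and the helper names build_unzip_command and run_unzip are hypothetical.

```python
import subprocess
import sys


def build_unzip_command(zip_path, out_dir):
    """Build the argument list for the asker's unzip.py script."""
    # unzip.py and the -z / -o flags come from the question; they are not
    # part of any standard tool.
    return [sys.executable, 'unzip.py', '-z', zip_path, '-o', out_dir]


def run_unzip(zip_path, out_dir):
    """Run unzip.py; raises CalledProcessError if the script fails."""
    subprocess.check_call(build_unzip_command(zip_path, out_dir))


command = build_unzip_command('data/Abc.zip', 'data/')
print(command[1:])  # ['unzip.py', '-z', 'data/Abc.zip', '-o', 'data/']
```

Calling run_unzip('data/Abc.zip', 'data/') would then replace the os.system line, provided unzip.py is in the working directory.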
1

Edit: Changed the regexp to also match "content = 'abc.zip\n'", in addition to the plain string "abc.zip".

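import re

# Matching for "content = 'abc.zip\n'"
# re.search is used because the filename does not start at the beginning of the string.
matches = re.search(r"'(?P<filename>[^']*)\.zip\n'$", "content = 'abc.zip\n'")
matches = matches.groupdict()
print(matches)

# Matching for "abc.zip" (the dot is escaped so it only matches a literal '.')
matches = re.match(r"(?P<filename>.*)\.zip$", "abc.zip")
matches = matches.groupdict()
print(matches)

Output:

{'filename': 'abc'}
{'filename': 'abc'}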

Each pattern captures everything before .zip in the named group filename. groupdict() returns the named groups as a regular dictionary, so you can read the result with matches['filename'].

2 Comments

I thought that content was a variable and that the string was "abc.zip". I'll edit my answer now that I know the string is what you wrote.
@Johnysweb Yeah, it's very unclear :) I'll have to edit my answer to match for both.
0

If you're trying to break up parts of a path, you may find the os.path module to be useful. It has nice abstractions with clear semantics that are easy to use.

2 Comments

OK, then let me describe the bigger problem. My file structure is data/Abc.zip. My script doesn't know the name of the zip file. How do I unzip the file using os.path and also extract the name of the zip file?
@user794916: Please update your question to include all the pertinent information.
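
For the situation described in the comment above, a rough sketch that combines glob (to discover the archive), os.path (to get its name) and the ZipFile.extractall() call suggested in the question's comments. It assumes every *.zip in the directory should be extracted into that same directory; the function name extract_all_zips is hypothetical.

```python
import glob
import os.path
import zipfile


def extract_all_zips(directory):
    """Extract every .zip in `directory` into it; return the archive names."""
    stems = []
    # sorted() only makes the order of the returned names predictable.
    for zip_path in sorted(glob.glob(os.path.join(directory, '*.zip'))):
        # 'data/Abc.zip' -> 'Abc'
        stem = os.path.splitext(os.path.basename(zip_path))[0]
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(directory)
        stems.append(stem)
    return stems


# Usage, assuming a data/ directory that contains Abc.zip:
# names = extract_all_zips('data')
```

This avoids calling a separate unzip.py through os.system, since the script never needs to know the archive name in advance.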
