I have a list of paths in a .txt file and I'm trying to parse out one folder from each path using Python.

9999\New_folder\A\23818\files\  
9999\New_folder\A\18283_HO\files\  
...

What I'm interested in doing is pulling the string between 9999\New_folder\A\ and \files\ so that I end up with:

23818  
18283_HO

Any help would be appreciated!

EDIT: Thanks a lot everyone! Came up with the following code with your input.

# Read each path and write its fourth backslash-separated component.
with open('C:\\Python\\textintolist\\Document1.txt', 'r') as input_text, \
        open('output.txt', 'w') as output_text:
    for line in input_text:
        path = line.strip()
        if path:  # skip blank lines
            output_text.write(path.split('\\')[3] + "\n")
1
  • use regex regex Commented Aug 13, 2012 at 21:12

4 Answers

1
>>> s = '9999\\New_folder\\A\\23818\\files\\'
>>> s.split('9999\\New_folder\\A\\')[1].split('\\')[0]
'23818'

0

If your paths are always in this format:

>>> paths
['9999\\New_folder\\A\\23818\\files\\', '9999\\New_folder\\A\\18283_HO\\files']
>>> for path in paths:
...     print(path.split('\\')[3])
...
23818
18283_HO


0

There are many solutions. If all paths look like 9999\New_folder\A\#number#\files\ then you could simply take a substring by finding the third-to-last and second-to-last "\". You can do that with rfind() (http://docs.python.org/library/string.html#string.rfind).
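
A minimal sketch of the rfind() idea, assuming every path ends with a trailing backslash as in the question's examples. The function name folder_between_slashes is made up for illustration:

```python
def folder_between_slashes(path):
    # Assumption: the path ends with a backslash, e.g. ...\23818\files\
    last = path.rfind('\\')                        # trailing backslash
    second_last = path.rfind('\\', 0, last)        # backslash before "files"
    third_last = path.rfind('\\', 0, second_last)  # backslash before the folder
    # The wanted folder sits between the third-to-last and second-to-last backslash.
    return path[third_last + 1:second_last]


print(folder_between_slashes('9999\\New_folder\\A\\23818\\files\\'))
print(folder_between_slashes('9999\\New_folder\\A\\18283_HO\\files\\'))
```

If a line has no trailing backslash (or still carries a newline), the indices shift, so strip the line and normalise the ending first.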

Another, more general way is to use regular expressions: http://docs.python.org/library/re.html
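
One possible regex version, anchored on the exact prefix and suffix from the question. The names PREFIX, SUFFIX and extract_folder are illustrative assumptions, and re.escape() is used so the backslashes are matched literally:

```python
import re

# Assumption: every path of interest starts with this prefix and is
# followed by \files\ right after the folder name.
PREFIX = '9999\\New_folder\\A\\'
SUFFIX = '\\files\\'
PATTERN = re.compile(re.escape(PREFIX) + r'([^\\]+)' + re.escape(SUFFIX))


def extract_folder(path):
    # Return the folder name, or None when the path does not match.
    match = PATTERN.search(path)
    return match.group(1) if match else None


print(extract_folder('9999\\New_folder\\A\\23818\\files\\'))
print(extract_folder('9999\\New_folder\\A\\18283_HO\\files\\'))
```

Returning None for non-matching lines lets the caller decide whether to skip them or report them.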


0
# Something like this should work:
import re
with open("file path") as file_handler:
    for line in file_handler:
        match = re.search(r'\\([^\\]+)\\files', line)
        if match:  # skip lines that do not contain \<folder>\files
            print(match.group(1))
