I am new to Python and have been stuck on some work. I have a .csv file with 7 columns. The first column contains paths to certain files, e.g.:

Name
a/b/c.xyz
m/n/o/p.sad
p/q/r/s/t/u.asas

I need to separate the directory path from each file, and extract the owner, which is the first component of the path: 'a', 'm' and 'p' for their respective rows. I have imported the .csv file using pandas, and I have read that os.path could be of some help. Any suggestions would be much appreciated. Also, the data I'm working on is pretty big, so I need to keep the overhead of running the script low.

Thanks.

2
  • Do you want to split owners and paths into two columns? Commented Apr 7, 2016 at 13:23
  • Yes sir, that is what I intend to do. Commented Apr 7, 2016 at 13:34

3 Answers

1

The os module you mentioned provides three different split variants:

os.path.split
os.path.splitdrive
os.path.splitext
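
As a rough illustration of what each of the three variants returns for one of the sample paths, here is a minimal sketch. It uses posixpath rather than os.path only so that the results are the same on every platform (on Linux and macOS, os.path is posixpath); that substitution is an assumption of this example, not part of the answer.

```python
import posixpath  # the POSIX flavour of os.path, used directly for portable results

path = 'p/q/r/s/t/u.asas'

# split: (everything before the last '/', the last component)
head, tail = posixpath.split(path)

# splitext: (path without the extension, the extension including the dot)
root, ext = posixpath.splitext(path)

# splitdrive: (drive, rest); the drive part is always empty for POSIX paths
drive, rest = posixpath.splitdrive(path)

print(head, tail)
print(root, ext)
print(repr(drive), rest)
```

Note that none of the three returns the first component of the path on its own.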

If you just want the first part of your string, use <str>.split('/')[0]:

>>> 'p/q/r/s/t/u.asas'.split('/')[0]
'p'

I'd also recommend using the built-in csv module to read your file. pandas seems like overkill.

Here is a good source on how to use the module. I especially like the csv.DictReader class.
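
A minimal sketch of how csv.DictReader might be used for this task. The in-memory sample below is a hypothetical stand-in for the real file and contains only the Name column from the question (the real file has 7 columns).

```python
import csv
import io

# Hypothetical in-memory stand-in for the real .csv file (only the 'Name' column)
sample = io.StringIO("Name\na/b/c.xyz\nm/n/o/p.sad\np/q/r/s/t/u.asas\n")

owners = []
for row in csv.DictReader(sample):
    # Each row is a dict keyed by the header line, so the column is looked up by name
    owners.append(row['Name'].split('/')[0])

print(owners)
```

With a real file, the io.StringIO object would be replaced by a file opened with open(..., newline=''). DictReader consumes the header line itself, so it does not need to be skipped by hand.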


2 Comments

os.path.dirname and os.path.basename could also be useful.
Thanks for the help :)
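
The os.path.dirname and os.path.basename functions mentioned in the comments above could be used roughly as follows. This is a sketch only; posixpath is used instead of os.path so that the result does not depend on the platform, which is an assumption of this example.

```python
import posixpath  # the POSIX flavour of os.path

path = 'p/q/r/s/t/u.asas'

directory = posixpath.dirname(path)    # everything before the last '/'
filename = posixpath.basename(path)    # the last component
owner = directory.split('/')[0]        # first component of the directory part

print(directory, filename, owner)
```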
0

If you just want to find the owner and the file name, it can be done with split:

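import csv

owner, file_name = [], []
# filePath is the path to your .csv file. In Python 3, open it in text mode
# with newline='' (the 'rb' mode is only correct for the csv module in Python 2).
with open(filePath, newline='') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the header row ("Name")
    for row in reader:
        # each row is a list of column values; the path is in the first column
        parts = row[0].split('/')
        owner.append(parts[0])
        file_name.append(parts[-1])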

If you need the file path with the owner removed, that can be achieved with split and os.path.join:

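import csv
import os

owner, rel_path = [], []
# filePath is the path to your .csv file (opened in text mode for Python 3)
with open(filePath, newline='') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the header row ("Name")
    for row in reader:
        parts = row[0].split('/')  # the path is in the first column
        owner.append(parts[0])
        # join the remaining components; assumes the path contains at least one '/'
        rel_path.append(os.path.join(*parts[1:]))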

os.path.join example (the output shown is from Windows; on Linux or macOS the separator is '/'):

string = 'p/q/r/s/t/u.asas'
os.path.join(*string.split('/')[1:])
# output:
'q\\r\\s\\t\\u.asas'
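
The backslashes in that output come from os.path.join using the platform's separator. If the rest of the path should keep its original '/' separators on every platform, one possible alternative is str.partition, sketched below. The helper name split_owner is hypothetical and not part of the answer.

```python
def split_owner(path):
    """Return (owner, rest), where owner is the first component of a '/'-separated path."""
    # partition splits at the first '/' only and leaves the remainder untouched
    owner, _, rest = path.partition('/')
    return owner, rest


for p in ['a/b/c.xyz', 'm/n/o/p.sad', 'p/q/r/s/t/u.asas']:
    print(split_owner(p))
```

Unlike os.path.join(*parts[1:]), this does not raise an error for a path without any '/'; the rest is simply an empty string in that case.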

0

Is this what you want?

In [64]: df
Out[64]:
               Name
0         a/b/c.xyz
1       m/n/o/p.sad
2  p/q/r/s/t/u.asas

In [66]: df.Name.str.extract(r'(?P<owner>.)(?P<path>/.*)', expand=True)
Out[66]:
  owner             path
0     a         /b/c.xyz
1     m       /n/o/p.sad
2     p  /q/r/s/t/u.asas
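
The pattern above captures exactly one character as the owner, which fits the sample data. If owner names can be longer, a variant along the following lines might work. This is a sketch: the sample frame and the owner name 'alice' are made up for illustration, and the '/' is placed outside the groups so the path comes back without a leading slash.

```python
import pandas as pd

# Hypothetical sample frame; 'alice' is added to show a multi-character owner
df = pd.DataFrame({'Name': ['alice/b/c.xyz', 'm/n/o/p.sad', 'p/q/r/s/t/u.asas']})

# [^/]+ matches everything up to the first '/', so owners of any length are captured
parts = df['Name'].str.extract(r'(?P<owner>[^/]+)/(?P<path>.*)', expand=True)

print(parts)
```

The result is a new DataFrame with the columns owner and path, which can be joined back onto df if both the original and the split columns are needed.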
