I have a variable that reads in a data file:

dfPort = pd.read_csv(r"E:...\Portfolios\ConsDisc_20160701_Q.csv")

I was hoping to create three variables, portName, inceptionDate, and frequency, by parsing the "E:..." path string above and extracting the wanted parts of the file name, using the underscore as the delimiter between one variable and the next. Example after parsing the string:

portName = "ConsDisc"
inceptionDate = "2016-07-01"
frequency = "Q"

Any tips would be appreciated!

    parts = r"E:...\Portfolios\ConsDisc_20160701_Q.csv".split("\\")[-1].split(".")[0].split("_"); portName, inceptionDate, frequency = parts Commented May 21, 2018 at 13:51

3 Answers

2

You can use os.path.basename, os.path.splitext and str.split:

import os

filename = r'E:...\Portfolios\ConsDisc_20160701_Q.csv'
parts = os.path.splitext(os.path.basename(filename.replace('\\', os.sep)))[0].split('_')
print(parts)

This outputs ['ConsDisc', '20160701', 'Q']. You can then manipulate the list as you like, for example by unpacking it into variables with port_name, inception_date, frequency = parts.

The .replace('\\', os.sep) call "normalizes" Windows-style backslash-separated paths into whatever convention the system running the code uses (i.e. forward slashes on anything but Windows).
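The question asks for the date as "2016-07-01" rather than the raw '20160701'. A minimal sketch of that last step, assuming the date part is always eight digits in YYYYMMDD order (the name raw_date is only illustrative):

```python
from datetime import datetime

# The list produced by the splitting step above.
parts = ['ConsDisc', '20160701', 'Q']
port_name, raw_date, frequency = parts

# Assumption: the date is always YYYYMMDD; strptime raises ValueError otherwise.
inception_date = datetime.strptime(raw_date, '%Y%m%d').date().isoformat()
print(inception_date)  # 2016-07-01
```

Parsing through datetime rather than slicing the string also rejects impossible dates such as '20161345'.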


2 Comments

As a next step, is there a way to do this if the path only goes as far as "E:...\Portfolios", a folder that holds several files named like 'ConsDisc_20160701_Q.csv' (with different values), so that it iterates over however many files there are? (My program processes one file, then moves on to the next.) Thank you so much for your help!
Yes, that'd be the glob function from the similarly named module: import glob, then for file in glob.glob(r'e:\portfolios\*.csv'): ...
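To spell out the glob suggestion, here is a rough sketch that walks a folder and parses each file name. The helper name iter_portfolios and the choice to skip files that do not have exactly three underscore-separated parts are assumptions, not something from the thread:

```python
import glob
import os

def iter_portfolios(folder):
    """Yield (path, port_name, inception_date, frequency) for each CSV in folder."""
    for path in sorted(glob.glob(os.path.join(folder, '*.csv'))):
        stem = os.path.splitext(os.path.basename(path))[0]
        parts = stem.split('_')
        if len(parts) != 3:
            continue  # assumption: ignore files that don't follow the naming pattern
        yield (path, *parts)

# Hypothetical usage, with the folder path left elided as in the question:
# for path, port_name, inception_date, frequency in iter_portfolios(r'E:...\Portfolios'):
#     dfPort = pd.read_csv(path)
```

Because glob.glob returns full paths, each one can be passed straight to pd.read_csv inside the loop.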
1
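import os

def parse_filename(path):
    # Drop the directory part, then the extension, then split on underscores.
    # Note: os.path only treats "\" as a separator when run on Windows.
    filename = os.path.basename(path)
    filename_no_ext = os.path.splitext(filename)[0]
    return filename_no_ext.split("_")

path = r"Portfolios\ConsDisc_20160701_Q.csv"
portName, inceptionDate, frequency = parse_filename(path)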
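The function above depends on os.path, which splits on backslashes only when the code runs on Windows. One possible variant, swapping in pathlib.PureWindowsPath so the same backslash-separated path parses identically on any OS, might look like this:

```python
from pathlib import PureWindowsPath

def parse_filename(path):
    # PureWindowsPath accepts both "\" and "/" as separators on every platform;
    # .stem is the final path component without its extension.
    return PureWindowsPath(path).stem.split("_")

path = r"Portfolios\ConsDisc_20160701_Q.csv"
portName, inceptionDate, frequency = parse_filename(path)
print(portName, inceptionDate, frequency)  # ConsDisc 20160701 Q
```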


0

Here is an alternative, in case you want to store the parts in a dictionary and use them from there:

import re

str1 = r"E:...\Portfolios\ConsDisc_20160701_Q.csv"

re.search(r'Portfolios\\(?P<portName>.*)_(?P<inceptionDate>.*)_(?P<frequency>.)', str1).groupdict()

# result
# {'portName': 'ConsDisc', 'inceptionDate': '20160701', 'frequency': 'Q'}
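The pattern above is tied to the "Portfolios" folder name, and calling .groupdict() fails with an AttributeError when nothing matches. A somewhat stricter sketch follows, assuming the name contains no underscores, the date is eight digits, and the frequency is one or more letters (the names PATTERN and parse are hypothetical):

```python
import re

# Assumed layout: <name>_<8-digit date>_<frequency letters>.csv at the end of the path.
PATTERN = re.compile(
    r'(?P<portName>[^\\/_]+)_(?P<inceptionDate>\d{8})_(?P<frequency>[A-Za-z]+)\.csv$'
)

def parse(path):
    """Return a dict of the three parts, or None if the name doesn't fit the layout."""
    match = PATTERN.search(path)
    return match.groupdict() if match else None

print(parse(r"E:...\Portfolios\ConsDisc_20160701_Q.csv"))
# {'portName': 'ConsDisc', 'inceptionDate': '20160701', 'frequency': 'Q'}
```

Returning None for non-matching names lets the caller decide whether to skip the file or raise an error.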

