
I want to load multiple CSV files into one DataFrame. Each CSV contains stock data with 6 columns ('Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'). I managed to load the CSV files, but I'm missing the top-level column name (each ticker, taken from the CSV filename).

import os
import pandas as pd

sp500 = os.listdir(os.path.join(os.getcwd(), 'spy500'))

combined = pd.concat([pd.read_csv('spy500/' + i, parse_dates=True, index_col='Date')
                      for i in sp500], axis=1)

Output:

Open | High | Low | Close | Adj Close | Volume | Open | High | Low | Close | Adj Close | Volume

Desired output:

                     AAPL                      |                     GOOG
Open | High | Low | Close | Adj Close | Volume | Open | High | Low | Close | Adj Close | Volume

The values in the output are correct; the only thing I need is to add a multi-level column header (the result is 5986 rows × 3030 columns).

  • stackoverflow.com/questions/52289386/… — does this help? Commented Sep 23, 2019 at 11:23
  • Can you put an example of the columns in the different CSVs and in the expected output, please? Commented Sep 23, 2019 at 11:23
  • What is print(sp500[:5])? Commented Sep 23, 2019 at 11:53
  • ['A.csv', 'AAL.csv', 'AAP.csv', 'AAPL.csv', 'ABBV.csv'] Commented Sep 23, 2019 at 11:54

1 Answer


Use a dictionary comprehension — the dictionary keys become the top level of the column MultiIndex:

# Map each ticker (the filename without its extension) to its DataFrame.
comp = {i.split('.')[0]:
        pd.read_csv('spy500/' + i, parse_dates=True, index_col='Date')
        for i in sp500}
combined = pd.concat(comp, axis=1)
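To illustrate why this works: when pd.concat is given a dict, the dict keys are used as the outer level of a column MultiIndex. A minimal, self-contained sketch with two in-memory frames standing in for the per-ticker CSVs (made-up numbers):

```python
import pandas as pd

# Two small frames standing in for the per-ticker CSVs (made-up values).
aapl = pd.DataFrame({'Open': [1.0, 2.0], 'Close': [1.5, 2.5]})
goog = pd.DataFrame({'Open': [10.0, 20.0], 'Close': [15.0, 25.0]})

# Dict keys become the outer column level, just like the comp dict above.
combined = pd.concat({'AAPL': aapl, 'GOOG': goog}, axis=1)

print(combined.columns.nlevels)               # 2 — a MultiIndex
print(combined['AAPL'].columns.tolist())      # ['Open', 'Close']
```

Selecting combined['AAPL'] then returns just that ticker's six columns, which is exactly the desired-output layout from the question.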

1 Comment

Can you suggest a solution like this that does the same job but with parallel computation? I know it is possible, for example, to use read_csv("/*.csv"), which reads those files into a single dataframe using multiple cores (speaking about Dask specifically).
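One stdlib-only way to parallelize the reads (without Dask) is a ThreadPoolExecutor: the file I/O overlaps, and the dict-of-frames still feeds pd.concat exactly as in the answer. This is a hedged sketch, not the answerer's code; it writes two throwaway CSVs into a temp directory first so the snippet is self-contained:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Throwaway directory with two tiny CSVs standing in for the ticker files.
tmp = tempfile.mkdtemp()
for name in ('AAPL', 'GOOG'):
    pd.DataFrame({'Date': ['2019-09-23'], 'Open': [1.0], 'Close': [2.0]}) \
      .to_csv(os.path.join(tmp, name + '.csv'), index=False)

def load(fname):
    # Return (ticker, frame) so the results can be collected into a dict.
    ticker = os.path.splitext(fname)[0]
    df = pd.read_csv(os.path.join(tmp, fname), parse_dates=True, index_col='Date')
    return ticker, df

files = sorted(os.listdir(tmp))
with ThreadPoolExecutor() as pool:
    comp = dict(pool.map(load, files))

combined = pd.concat(comp, axis=1)
print(combined.columns.tolist())
```

For truly CPU-bound parsing, a ProcessPoolExecutor (or Dask itself) may scale better; the dict-then-concat pattern is unchanged either way.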
