
I have several CSV files (tables) in a directory (all tables have different schemas) and want to loop over the files and read each table into a separate dataframe.

Is there any way to do this in Python/Pandas, i.e. to read the different tables into a collection of dataframes? How can multiple tables (with different schemas) be imported into multiple separate dataframes?

  • Do you want one dataframe or several dataframes? Commented Aug 8, 2013 at 12:09
  • Ideally separate dataframes... Commented Aug 8, 2013 at 12:20
  • Guys, it's not really a duplicate, since the other question says '# Assemble all of the data files into a single DataFrame & add a year field', which is not what this question asks... Commented Aug 8, 2013 at 13:07

1 Answer


Try this:

import glob
import os

import pandas as pd

os.chdir("E:/")  # change this to the directory where your CSV files are stored
csv_files = {}  # maps each file name to its own DataFrame
for filename in glob.glob("*.csv"):  # every path in the current directory ending in .csv
    csv_files[filename] = pd.read_csv(filename)

for dataframe in csv_files.values():
    print(dataframe)
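
For anyone wondering what the glob step does: it expands a shell-style wildcard pattern such as *.csv into the matching paths, so each file can be read on its own. Here is a minimal sketch of the same idea wrapped in a function that does not change the working directory. The function name read_csv_dir and the choice of keying the dictionary by file name without extension are my own assumptions, not something from the question.

```python
from pathlib import Path

import pandas as pd


def read_csv_dir(directory):
    """Read every *.csv file in `directory` into its own DataFrame.

    Returns a dict mapping each file name (without extension) to its DataFrame.
    read_csv_dir is a hypothetical helper name used only for this illustration.
    """
    frames = {}
    # Path.glob("*.csv") yields the entries directly inside `directory` whose
    # names match the wildcard pattern; sorted() makes the order deterministic.
    for path in sorted(Path(directory).glob("*.csv")):
        frames[path.stem] = pd.read_csv(path)
    return frames
```

You would call it as frames = read_csv_dir("E:/") and then look up a single table by name, e.g. frames["orders"] for a (hypothetical) orders.csv. Because every file gets its own DataFrame, the schemas never have to match.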

4 Comments

Thanks for this. How does this work exactly? It seems to read a variable into the df and then overwrite it? I am probably not reading it correctly (while looping over the CSV files in the dir).
I've just removed df = 'data' + str(i). It didn't make sense there. Correct, the df was being overwritten.
I think this is a good solution. Would be nicer without the df variable (and the printing). Some explanation of glob.glob would make this excellent :)
Thanks for this @Richie - I get the error 'Error tokenizing data. C error: Expected 24 fields in line 6927, saw 26' - how do I overcome this?
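
On the tokenizing error: it generally means that one row has more delimiter-separated fields than the parser expected from the earlier rows. One possible workaround, assuming pandas 1.3 or newer, is the on_bad_lines parameter of read_csv. The sketch below uses a small made-up sample rather than the actual file from the comment.

```python
import io

import pandas as pd

# Made-up sample for illustration: the header has 2 fields, the third line has 3.
raw = "a,b\n1,2\n3,4,5\n6,7\n"

# With default settings read_csv raises pandas.errors.ParserError on this input.
# on_bad_lines="skip" (pandas >= 1.3) drops rows with too many fields instead.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

Skipping silently discards data, so it is worth opening the file at the reported line first. The cause is often an unquoted delimiter inside a value or a separator other than a comma, in which case passing the right sep or fixing the quoting is the better fix. On older pandas versions the equivalent option was error_bad_lines=False.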
