I would like to write scalable code that imports multiple CSV files, standardizes the order of the columns based on the column names, and rewrites the CSV files.

import glob
import pandas as pd

# Get a list of all the csv files
csv_files = glob.glob('*.csv')

# List comprehension that loads all of the files
dfs = [pd.read_csv(x,delimiter=";") for x in csv_files]

A=pd.DataFrame(dfs[0])
B=pd.DataFrame(dfs[1])
alpha=A.columns.values.tolist()
print([pd.DataFrame(x[alpha]) for x in dfs])    

I would like to be able to split this object and write one CSV per file, keeping the original file names. Is that easily possible with Python? Thanks for your help.

  • Do all the files have the same columns, just in a different order? Or do you want to rename all the column names to match one of the CSVs? Commented May 31, 2019 at 22:56

1 Answer

If you want to put the columns in a consistent order, and all the CSVs have the same column names but in a different order, you can sort the column names of one file and then reorder every file by that sorted list. Each file is then written back under its original name. Using your example:

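import glob
import pandas as pd

csv_files = glob.glob('*.csv')
sorted_columns = []
for e, x in enumerate(csv_files):
    df = pd.read_csv(x, delimiter=";")
    if e == 0:
        # Use the first file's column names, sorted alphabetically, as the reference order
        sorted_columns = sorted(df.columns.values.tolist())
    # index=False keeps pandas from adding an extra index column on each rewrite
    df[sorted_columns].to_csv(x, sep=";", index=False)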
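
If you would rather keep the first file's own column order (as in the question's `alpha` list) and avoid overwriting the inputs, a variant along these lines might work. This is only a sketch under assumptions: the function name `standardize_csv_columns` and the `input_dir`/`output_dir` parameters are made up for illustration, and it assumes every file uses `;` as the separator. It uses `DataFrame.reindex`, so a file that is missing a column gets an empty column instead of raising a `KeyError`.

```python
from pathlib import Path

import pandas as pd


def standardize_csv_columns(input_dir, output_dir, sep=";"):
    """Rewrite every CSV in input_dir using the column order of the first file.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the list of paths that were written.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    reference_columns = None
    written = []
    # Sort the paths so the "first" file is deterministic (alphabetical order).
    for path in sorted(input_dir.glob("*.csv")):
        df = pd.read_csv(path, sep=sep)
        if reference_columns is None:
            # The first file defines the column order for all the others.
            reference_columns = df.columns.tolist()
        # reindex fills missing columns with NaN and drops unexpected ones.
        df = df.reindex(columns=reference_columns)
        out_path = output_dir / path.name  # keep the original file name
        df.to_csv(out_path, sep=sep, index=False)
        written.append(out_path)
    return written
```

Calling `standardize_csv_columns(".", "standardized")` would then leave the original files untouched and write the reordered copies, with the same names, into a `standardized` folder.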