Suppose these dataframes:
import pandas as pd
df_one = pd.DataFrame({'col_1':[1, 2, 3, 4], 'col_2':[5,6,7,8], 'col_3':[9,10,11,12]})
df_two = pd.DataFrame({'col_1':[1, 2, 3, 4], 'col_3': [9,10,11,12], '2_col':[5, 6, 7, 8]})
In reality these dataframes come from different txt files, so each column represents the same concept in both, but the column order differs and some of the columns have slightly different names. Both datasets have 33 columns representing the same concepts but in a different order.
How can I give the second df the same structure as the first df, meaning the same column order and the same column names as df_one?
The final objective is to merge both df into a single consolidated one.
I have tried this:
cols = df_one.columns.to_list() # get columns names from df_one
df_two = df_two.reindex(columns=cols)
but this gives NaN values in 'col_2':
   col_1  col_2  col_3
0      1    NaN      9
1      2    NaN     10
2      3    NaN     11
3      4    NaN     12
I also tried to first change col names in df_two and then reorder:
df_two.columns = cols
df_two = df_two.reindex(columns=cols)
but this is also wrong (col_2 now has the values of col_3):
   col_1  col_2  col_3
0      1      9      5
1      2     10      6
2      3     11      7
3      4     12      8
Thanks for your suggestions.
EDIT BASED ON COMMENTS:
The actual column names are more like 'Date' & 'iDate', 'Contract' & 'nContract', 'Premium' & 'iPremium'. I used numbers in the example (probably a bad idea), but matching numbers are not part of the real names.
How can I map the order of the columns in df_two? (Say, the 1st column of df_one is the same as the 1st column of df_two, the 2nd column of df_one is the 3rd column of df_two, and the 3rd column of df_one is the 2nd column of df_two.) Then I would rename the columns in df_two to match df_one.
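A minimal sketch of the positional mapping described above, using the example frames from the question. The `positions` list is a hypothetical, hand-written mapping (it is not derived automatically), and `df_two_aligned` is just an illustrative name:

```python
import pandas as pd

df_one = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_2': [5, 6, 7, 8], 'col_3': [9, 10, 11, 12]})
df_two = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_3': [9, 10, 11, 12], '2_col': [5, 6, 7, 8]})

# Hypothetical hand-written mapping: entry i is the position in df_two
# of the column that corresponds to column i of df_one.
positions = [0, 2, 1]

# Reorder df_two by position first, then take over df_one's column names.
df_two_aligned = df_two.iloc[:, positions].copy()
df_two_aligned.columns = df_one.columns

print(df_two_aligned)
```

Reordering before renaming matters here: assigning `df_one`'s names while the columns are still in `df_two`'s order is what put the wrong values under 'col_2' in the second attempt.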
Make sure df_one and df_two have the same column names (using df_one.rename(columns={'col_one':'col_two', ...})). Then df_one[df_two.columns] will do the job. To combine them afterwards, see pd.concat.
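For illustration, the rename-then-select idea could look roughly like this on the example frames. This sketch goes in the direction the question asks for (renaming df_two to match df_one rather than the other way round); `rename_map` is a hypothetical mapping that has to be written by hand, and the `missing` check is an added safeguard, not part of the suggestion itself:

```python
import pandas as pd

df_one = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_2': [5, 6, 7, 8], 'col_3': [9, 10, 11, 12]})
df_two = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_3': [9, 10, 11, 12], '2_col': [5, 6, 7, 8]})

# Hypothetical mapping from df_two's names to df_one's names; only the
# columns whose names differ need an entry.
rename_map = {'2_col': 'col_2'}

df_two_renamed = df_two.rename(columns=rename_map)

# Added safeguard: fail early if a column of df_one is still unmatched.
missing = set(df_one.columns) - set(df_two_renamed.columns)
assert not missing, f"unmapped columns: {missing}"

# Selecting with df_one's column index puts the columns in df_one's order.
df_two_renamed = df_two_renamed[df_one.columns]

# Stack the rows of both frames into one consolidated frame.
combined = pd.concat([df_one, df_two_renamed], ignore_index=True)
print(combined)
```

With the real names, the mapping would presumably pair entries such as 'iDate' with 'Date' and 'nContract' with 'Contract'; which file uses which spelling is not stated, so the direction of each pair is an assumption to check against the data.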