I need to merge columns in a DataFrame.
The headers share a base name and differ only in their suffix, e.g.
A1 | A2 | A3 | B1 | B2 | B3
I want to end up with all of them merged:
A | B
I have this line that successfully merges a defined set of columns into a single column:
df['A'] = df[['A1','A2','A3']].apply(' '.join, axis=1)
The problem is that the headers are inconsistent: any combination of the suffixes '1', '2', or '3' might be present, e.g.
A1 | A2 | A3 | B2 | C1 | C2
From the solutions I've looked at, pandas raises a KeyError when you reference columns that don't exist, so I can't use the apply statement as a blanket command.
I'm having trouble visualizing a solution beyond a series of nested try/except blocks. If anyone has an idea, I would appreciate it!
Update
Thanks for the solutions!!! If anyone is interested, here's what worked for me:
Solution 1
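for h in headers:
    # Columns whose name before '[' matches the header
    cols = [col for col in df.columns if col.split('[')[0] == h]
    if not cols:
        # Fall back to an exact match when the header itself contains '['
        cols = [col for col in df.columns if col == h and col.split('[')[0] not in headers]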
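If anyone wants to try Solution 1 end to end, here's a minimal sketch. The sample data, the `headers` list, and the `merged` frame are all made up for illustration, and I'm assuming the real headers carry a `[n]` suffix (which is what the `split('[')` implies). It reuses the `apply(' '.join, axis=1)` line from the question and simply skips any header that has no matching columns, so no try/except is needed.

```python
import pandas as pd

# Hypothetical sample data; headers use a '[n]' suffix to match split('[')
df = pd.DataFrame({
    'A[1]': ['a', 'b'],
    'A[2]': ['c', 'd'],
    'B[2]': ['e', 'f'],
    'C': ['g', 'h'],
})
headers = ['A', 'B', 'C', 'D']  # 'D' has no matching column at all

merged = pd.DataFrame(index=df.index)
for h in headers:
    # Only columns that actually exist are ever selected
    cols = [col for col in df.columns if col.split('[')[0] == h]
    if not cols:
        continue  # nothing to merge for this header
    # Join the string values of each row across the matching columns
    merged[h] = df[cols].apply(' '.join, axis=1)

print(merged)
```

Because the column list is built from `df.columns`, a missing suffix just means a shorter list rather than a KeyError.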
Solution 2
df.groupby(df.columns.str.split('[').str[0], axis=1).agg(lambda x: ' '.join(x.values.tolist()))
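One note on Solution 2: as far as I know, the `axis=1` argument to `groupby` is deprecated in newer pandas releases (2.1 and later). Here's a rough sketch of the same idea that avoids it by transposing first. The sample data and the `keys` and `merged` names are made up for illustration, and it again assumes `[n]`-style suffixes and string-only columns.

```python
import pandas as pd

# Hypothetical sample data with '[n]' suffixes
df = pd.DataFrame({
    'A[1]': ['a', 'b'],
    'A[2]': ['c', 'd'],
    'B[2]': ['e', 'f'],
})

# Group label for each column: the part of the name before '['
keys = df.columns.str.split('[').str[0]

# Transpose so the columns become rows, group those rows by label,
# join the strings in each group, then transpose back
merged = df.T.groupby(keys).agg(' '.join).T

print(merged)
```

This only groups columns that are actually present, so it doesn't matter which suffixes exist for a given header.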