I have a DataFrame with multiple columns, some of which are equivalent: their names end in the same key (e.g. column1 = 'a/first', column2 = 'b/first'). I want to merge such columns into one. How can I do this?

My DataFrame looks like this:

name   g1/column1  g1/column2 g1/g2/column1  g2/column2
AAAA   10             20          nan           nan
AAAA   nan            nan         30            40

The expected result is:

name   g1/column1  g1/column2
AAAA   10             20          
AAAA   30             40      

Thanks in advance

2
  • What if both columns have a value in the same row? Commented Dec 6, 2018 at 5:02
  • That is not possible. Only one of them has a value; the others are NaN. Commented Dec 6, 2018 at 5:05
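
The assumption stated in the comment above (at most one non-NaN value per row among columns sharing a trailing key) can be checked before merging. A minimal sketch, assuming the sample frame from the question; the variable names `values`, `suffix`, `filled` and `conflicts` are illustrative only:

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "name": ["AAAA", "AAAA"],
    "g1/column1": [10, np.nan],
    "g1/column2": [20, np.nan],
    "g1/g2/column1": [np.nan, 30],
    "g2/column2": [np.nan, 40],
})

values = df.drop(columns="name")
# Key each column by the text after its last '/'.
suffix = values.columns.str.rsplit("/", n=1).str[-1]
# Count the non-NaN cells per row within each suffix group.
filled = values.notna().astype(int).T.groupby(suffix.to_numpy()).sum().T
# A row conflicts if any suffix group holds more than one value in it.
conflicts = filled.gt(1).any(axis=1)
print(conflicts.any())
```

If `conflicts.any()` is true, the affected rows need a rule for which value wins before any of the merging approaches below can be trusted.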

4 Answers

2

Use:

# move 'name' into the index so that only the columns to merge remain
df = df.set_index('name')
# build a MultiIndex by splitting each label at its last '/'
df.columns = df.columns.str.rsplit('/', n=1, expand=True)
# for each second-level key, take the first non-NaN value in every row
df = df.groupby(level=1, axis=1).first()
print(df)
      column1  column2
name                  
AAAA     10.0     20.0
AAAA     30.0     40.0
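
To unpack the steps: the split turns a label such as 'g1/g2/column1' into the pair ('g1/g2', 'column1'), and the groupby then collapses every column that shares the second element, keeping the first non-NaN value per row. The sketch below is an assumption-laden variant, not the answer's original call: it groups the transposed frame instead of passing axis=1 (which is deprecated in recent pandas releases), and it keeps name as an ordinary column so that the transposed frame has unique column labels.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "name": ["AAAA", "AAAA"],
    "g1/column1": [10, np.nan],
    "g1/column2": [20, np.nan],
    "g1/g2/column1": [np.nan, 30],
    "g2/column2": [np.nan, 40],
})

values = df.drop(columns="name")

# Step 1: split each label once, from the right:
# 'g1/g2/column1' -> ('g1/g2', 'column1')
values.columns = values.columns.str.rsplit("/", n=1, expand=True)

# Step 2: transpose so the labels become the row index, group on the
# second level, take the first non-NaN entry per group, transpose back.
merged = values.T.groupby(level=1).first().T

# Step 3: put the identifying column back in front.
merged.insert(0, "name", df["name"])
print(merged)
```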

5 Comments

Nice, compact solution. It would be even better with more explanation, e.g. of df = df.groupby(level=1, axis=1).first()
Columns that have no separator ('/') are ignored. How can I avoid this?
@MOHAMEDAZARUDEEN - Is there a rule for the grouping? What does print (df.columns) show?
[u'Additional_type_of_g_business_enterprise', u'version', u'_attachments', u'formhub/uuid', u'group_bf8zc97/Female', u'group_bf8zc97/Female', u'group_bf8zc97/To', u'group_bf8zc97/CIG_expenses', u'group_gu8hn21/CIG_expenses', u'group_gu8hn22/group_gu8hn20/CIG_expenses']
@MOHAMEDAZARUDEEN - Thank you. So which columns need to be merged together? I cannot see your real data, so I need a mapping like u'village',u'county', ...=> u'village_code',u'sub_county' ... (maybe I am wrong about this mapping; please correct it if necessary, and also add all column names for both sides)
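
One possible way to keep the columns that have no '/' is to compute the grouping key as "the text after the last '/', or the whole label if there is none", and merge on that key. This is a sketch under assumptions: the frame below is hypothetical (loosely modelled on the column list posted in the comments), and it swaps the groupby for a row-wise bfill, which is a different technique from the one in the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing plain labels with '/'-separated ones.
df = pd.DataFrame({
    "version": ["v1", "v2"],
    "group_bf8zc97/CIG_expenses": [5.0, np.nan],
    "group_gu8hn21/CIG_expenses": [np.nan, 7.0],
    "group_bf8zc97/To": [1.0, 2.0],
})

# Last path component; a label without '/' is returned unchanged.
keys = df.columns.str.rsplit("/", n=1).str[-1]

# For each distinct key (in first-seen order), back-fill across the
# matching columns so the first one holds the first non-NaN value per row.
merged = pd.DataFrame({
    key: df.loc[:, keys == key].bfill(axis=1).iloc[:, 0]
    for key in dict.fromkeys(keys)
})
print(merged)
```

Because the key for 'version' is the label itself, that column forms a group of one and passes through untouched.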
1

You need DataFrame.combine_first:

col1=['g1/column1', 'g1/column2']
col2=['g1/g2/column1', 'g2/column2']

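# reuse df's own index so the two frames align row by row even if the index is not 0..n-1
df[col1]=df[col1].combine_first(pd.DataFrame(df[col2].values,index=df.index,columns=col1))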

df=df.drop(col2,axis=1)

print(df)
#   name  g1/column1    g1/column2
#0  AAAA  10.0      20.0
#1  AAAA  30.0      40.0

1 Comment

If I have another column, g1/g2/g3/column1, it won't be added under g1/column1.
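
The combine_first idea can be extended to any number of columns with the same trailing key by folding it over them. A minimal sketch, assuming a hypothetical third column g1/g2/g3/column1 as described in the comment above; `target` and `others` are illustrative names:

```python
from functools import reduce

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["AAAA", "AAAA", "AAAA"],
    "g1/column1": [10, np.nan, np.nan],
    "g1/g2/column1": [np.nan, 30, np.nan],
    "g1/g2/g3/column1": [np.nan, np.nan, 50],  # hypothetical extra column
})

target = "g1/column1"
# Every other column whose label ends in '/column1'.
others = [c for c in df.columns if c != target and c.endswith("/column1")]

# Fold combine_first over the columns: each step fills the NaNs that
# are still left in the accumulated Series.
df[target] = reduce(lambda acc, col: acc.combine_first(df[col]), others, df[target])
df = df.drop(columns=others)
print(df)
```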
0

One possible solution:

import numpy as np
import pandas as pd

df = pd.DataFrame([[10, 20, np.nan, np.nan],
                  [np.nan, np.nan, 30, 40]],
                 columns=['g1/column1', 'g1/column2', 'g1/g2/column1', 'g2/column2'])
df

   g1/column1   g1/column2  g1/g2/column1   g2/column2
0   10.0        20.0        NaN             NaN
1   NaN         NaN         30.0            40.0

df = df.fillna(0)  # <- replacing all NaN with 0

ndf = pd.DataFrame() 

unique_cols = ['column1', 'column2']

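for col in unique_cols:
    # all columns whose label contains this key, e.g. 'g1/column1' and 'g1/g2/column1'
    val = df.columns[df.columns.str.contains(col)]
    # row-wise sum; only one column per row is non-zero after fillna(0)
    ndf[val[0]] = df.loc[:, val].sum(axis=1)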

ndf  # <- You can add index if you need (AAAA, AAAA)

    g1/column1  g1/column2
0   10.0        20.0
1   30.0        40.0
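
One caveat with filling NaN by 0 before summing is that a row with no value at all becomes indistinguishable from a genuine 0. A possible alternative, sketched here under the assumption that each row has at most one value per key, is to sum row-wise with min_count=1 and skip the fillna step. The third row in the sample below is hypothetical and added only to show the difference:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[10, 20, np.nan, np.nan],
     [np.nan, np.nan, 30, 40],
     [np.nan, 0, np.nan, np.nan]],  # hypothetical row: column1 missing everywhere
    columns=["g1/column1", "g1/column2", "g1/g2/column1", "g2/column2"],
)

ndf = pd.DataFrame(index=df.index)
for suffix in ["column1", "column2"]:
    cols = df.columns[df.columns.str.endswith("/" + suffix)]
    # min_count=1: the row sum is NaN unless at least one value is present.
    ndf[cols[0]] = df[cols].sum(axis=1, min_count=1)
print(ndf)
```

In the last row, g1/column1 stays NaN while g1/column2 keeps its real 0.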

0
import pandas as pd
import numpy as np

g1 = [20, np.nan, 30, np.nan]
g1_2 = [10, np.nan, 20, np.nan]
g2 = [np.nan, 30, np.nan, 40]
g2_2 = [np.nan, 10, np.nan, 30]

dataList = list(zip(g1, g1_2, g2, g2_2))
df = pd.DataFrame(data = dataList, columns=['g1/column1', 'g1/column2', 'g1/g2/column1', 'g2/column2'])

df.fillna(0, inplace=True)

df['g1Combined'] = df['g1/column1'] + df['g1/g2/column1']
df['g2Combined'] = df['g1/column2'] + df['g2/column2']
# drop the four source columns in a single call
df.drop(['g1/column1', 'g1/column2', 'g1/g2/column1', 'g2/column2'], axis=1, inplace=True)
df
