3

I have the following dataframe:-

import pandas as pd
df = pd.read_csv('filename.csv')

print(df)

date        organic         paid        source_type
4/1/2018    39911909.19     38575924.75 Search
4/1/2018    5085939.952     882.608927  Social
4/1/2018    16227439.73     0           Mail
4/1/2018    0               5671871.24  Display Ads
4/1/2018    91215520.23     0           Direct
4/1/2018    15743479.56     0           Referrals

I want to add a column total_sum for all the source types except when source type is "Search". If the source_type is search I want to break down the single row into two and source type becomes organic search and paid search. Inshort a df like below. The summing part is easy to handle i am just stuck with the breaking of rows and conditional column prefix part. Dataframe I need:-

date        source_type     Total Sum
4/1/2018    Organic Search  39911909.19
4/1/2018    Paid Search     38575924.75
4/1/2018    Social          5086822.561
4/1/2018    Mail            16227439.73
4/1/2018    Display Ads     5671871.24
4/1/2018    Direct          91215520.23
4/1/2018    Referrals       15743479.56
2
  • What have u tried so far ? Commented May 26, 2020 at 6:48
  • As i said i can sum it up easily. df['total_sum'] = df['organic'] + df['paid'] But I am struck with the other part. Commented May 26, 2020 at 6:51

1 Answer 1

3

You can split DataFrame by boolean indexing with Series.eq for ==, then reshape first by DataFrame.melt with new column with Series.str.capitalize, filter second by invert mask by ~, sum values with DataFrame.pop for remove column after and last use concat:

mask = df['source_type'].eq('Search')

df1 = df[mask].melt(['date','source_type'], value_name='Total Sum')
df1['source_type'] =  df1.pop('variable').str.capitalize() + ' Search'

df2 = df[~mask].copy()
df2['Total Sum'] = df2.pop('organic').add(df2.pop('paid'))
df = pd.concat([df1, df2], ignore_index=True)
print (df)
       date     source_type     Total Sum
0  4/1/2018  Organic Search  3.991191e+07
1  4/1/2018     Paid Search  3.857592e+07
2  4/1/2018          Social  5.086823e+06
3  4/1/2018            Mail  1.622744e+07
4  4/1/2018     Display Ads  5.671871e+06
5  4/1/2018          Direct  9.121552e+07
6  4/1/2018       Referrals  1.574348e+07
Sign up to request clarification or add additional context in comments.

2 Comments

this is nice answer.
this is brilliant thank you!! I was struggling for so much time!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.