Add column to Pandas dataframe, merging repeated rows

Question

I have the some data that looks like this:

City         1994 1995  1996  1997                           
Rome            X  NaN   NaN   NaN
New York      NaN  NaN   NaN   NaN
Paris         NaN    X   NaN   NaN
Sidney        NaN  NaN   NaN   NaN

The crosses represent that an event happened in that city in that specific year. I want to add new years to my data frame, where the data looks like this:

City         1998                           
Rome            X
Sidney          X
London          X
Madrid          X

Basically the new data indicates where events happened in a new different year (so it'll always have 'X'). The result I'm looking for should look like this:

City         1994 1995  1996  1997   1998                          
Rome            X  NaN   NaN   NaN      X
New York      NaN  NaN   NaN   NaN    NaN
Paris         NaN    X   NaN   NaN    NaN
Sidney        NaN  NaN   NaN   NaN      X
London        NaN  NaN   NaN   NaN      X
Madrid        NaN  NaN   NaN   NaN      X

So it should add the new non existing cities to the city list and put NaN for the already present years for that city and also put an X in the newly added year for the cities where the event occurred.

I tried using df.append() but I get repeaded values

City         1994 1995  1996  1997   1998                           
Rome            X  NaN   NaN   NaN    NaN
New York      NaN  NaN   NaN   NaN    NaN
Paris         NaN    X   NaN   NaN    NaN
Sidney        NaN  NaN   NaN   NaN    NaN
Rome          NaN  NaN   NaN   NaN      X
Sidney        NaN  NaN   NaN   NaN      X
London        NaN  NaN   NaN   NaN      X
Madrid        NaN  NaN   NaN   NaN      X

I also tried using df.merge() but I didn't get any better results. Any suggestions on how to do it using pandas? I can write some code to manually does it but it defeats the purpose of using such a powerful tool.

Irfanuddin · Accepted Answer · 2022-01-09 13:33:14Z

1

You can use df.merge() to achieve this.

import pandas as pd

data = [['Rome','X',  None,   None,   None],['New York' ,None,  None,   None,   None], ['Paris',None,  'X',   None,   None]]
df = pd.DataFrame(data, columns = ['City','1994', '1995',  '1996' , '1997' ])

df.head()

       City  1994  1995  1996  1997
0      Rome     X  None  None  None
1  New York  None  None  None  None
2     Paris  None     X  None  None

df2 = pd.DataFrame([['Rome', 'X', ],['Sydney','X'], ['London','X']], columns = ['City',         '1998'])

     City 1998
0    Rome    X
1  Sydney    X
2  London    X

Then you can merge the two DataFrame

df.merge(df2, how='outer')

       City  1994  1995  1996  1997 1998
0      Rome     X  None  None  None    X
1  New York  None  None  None  None  NaN
2     Paris  None     X  None  None  NaN
3    Sydney   NaN   NaN   NaN   NaN    X
4    London   NaN   NaN   NaN   NaN    X

You need to specify outer as type of merge to be performed.

outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.

edited Jan 9, 2022 at 13:33

answered Jan 9, 2022 at 13:30

Irfanuddin

2,6353 gold badges18 silver badges29 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Irfanuddin Over a year ago

@MR I have updated the answer, can you check? If that is giving you desired output, you can mark is as accepted answer.

M R Over a year ago

That does the trick, thank you very much!

Lucas Barbosa Over a year ago

This is a good answer. Just to add a little bit more context to it. merge is a versatile option to work with pandas, so if both dataframes have the same index/structure you could merge both dataframes based on the same reference index/column (like a left join). This is also an important remark. (external reference ) @lrfanuddin

Collectives™ on Stack Overflow

Add column to Pandas dataframe, merging repeated rows

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related