How to change values in a dataframe to values from another dataframe, with closest match?

Question

I have a dataframe with real estate object parameters ('Rooms', 'Square' etc). I want to replace rows where 'Rooms' parameter equals to 0 to the corresponding number of rooms from a ('Rooms' - 'mean Square') dataframe I created from the same initial dataset.

I would use the .replace method, but the problem is - the actual Square values for 0 Room rows don't exactly match the mean values.

I'm new to pandas, so all solutions I would try are based on putting the column values into python lists and using cycles, which is a nightmare. All other similar topics I've seen on stackoverflow are good only for exact matches.

This is the slice and part of the initial dataframe where I want the values ('Rooms') to be changed:

data.loc[data['Rooms'] == 0][['Rooms', 'Square']]

      Rooms Square
1397  0.0   138.427694
1981  0.0   212.932361
2269  0.0   41.790881
3911  0.0   49.483501
4366  0.0   81.491446
4853  0.0   2.377248
6149  0.0   38.697117
8834  0.0   87.762616

This is the code that creates the 'Rooms' - 'mean Square' dataframe:

mean_square = data.loc[(data['Rooms'] < 6) & (data['Rooms'] > 0)].groupby('Rooms', as_index=False)['Square'].mean()

This is the result:

    Rooms   Square
0   1.0     41.323277
1   2.0     56.788214
2   3.0     76.903234
3   4.0     98.377544
4   5.0     122.614941

For example, for item 1397 I would expect 0.0 changed to 5.0 (~138 sqm is closest to ~122 sq mean for 4 rooms).

ALollz · Accepted Answer · 2019-05-02 18:40:04Z

2

This is pd.merge_asof. Taking your first slice as df1 and the second as df2, we can find the nearest match, 'Rooms_y', and align the result on the original index.

import pandas as pd

df1['Rooms'] = (pd.merge_asof(df1.reset_index().sort_values('Square'), 
                              df2.sort_values('Square'), 
                              on='Square', direction='nearest')
                   .set_index('index')['Rooms_y']
                )

Output:

      Rooms      Square
1397    5.0  138.427694
1981    5.0  212.932361
2269    1.0   41.790881
3911    2.0   49.483501
4366    3.0   81.491446
4853    1.0    2.377248
6149    1.0   38.697117
8834    4.0   87.762616

edited May 2, 2019 at 18:40

answered May 2, 2019 at 18:18

ALollz

59.7k7 gold badges73 silver badges97 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

How to change values in a dataframe to values from another dataframe, with closest match?

This is the slice and part of the initial dataframe where I want the values ('Rooms') to be changed:

This is the code that creates the 'Rooms' - 'mean Square' dataframe:

This is the result:

1 Answer 1

Output:

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

This is the slice and part of the initial dataframe where I want the values ('Rooms') to be changed:

This is the code that creates the 'Rooms' - 'mean Square' dataframe:

This is the result:

1 Answer 1

Output:

Comments

Your Answer

Sign up or log in

Post as a guest

Related