I have a DataFrame and I am trying to fill a new column in it by looking up values from another DataFrame.

I use:

df['term_code'] = df.search_term.map(rep_term.set_index('search_term')['code_action'])

But I get an error:

File "C:/Users/����� �����������/Desktop/projects/find_time_before_buy/graph (2).py", line 36, in <module>
df['term_code'] = df.search_term.map(rep_term.set_index('search_term')['code_action'])
 File "C:\Python27\lib\site-packages\pandas\core\series.py", line 2101, in map
indexer = arg.index.get_indexer(values)
 File "C:\Python27\lib\site-packages\pandas\indexes\base.py", line 2082, in get_indexer
   raise InvalidIndexError('Reindexing only valid with uniquely'
pandas.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

What should I change? Here is what search_term looks like:

729948                               None  
729949                               None  
729950                               None  
729951  пансионат джемете отдых 2016 цены  
729952                               None  
729953                               None  
729954                               купить телефон  
729955                               None  
729956                               вк  
729957                               None  
729958                               яндекс  

And rep_term looks like this:

search_term code_action
авито   6
вк  9
яндекс  12
мтс 7
связной 8
ситилинк    8

1 Answer

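The problem is duplicate values in the search_term column of the rep_term DataFrame. set_index turns that column into the index, and map can only look values up in a uniquely valued index.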

Here is a minimal reproduction:

import pandas as pd

df = pd.DataFrame({'search_term':[1,2,3]})

print (df)
   search_term
0            1
1            2
2            3

Here the value 1 in search_term has two different values in code_action:

rep_term = pd.DataFrame({'search_term':[1,2,1], 'code_action':['ss','dd','gg']})
print (rep_term)
  code_action  search_term
0          ss            1
1          dd            2
2          gg            1


df['term_code'] = df.search_term.map(rep_term.set_index('search_term')['code_action'])
print (df)
#InvalidIndexError: Reindexing only valid with uniquely valued Index objects

So first identify the rows with duplicated values using duplicated:

print (rep_term[rep_term.duplicated(subset=['search_term'], keep=False)])
  code_action  search_term
0          ss            1
2          gg            1
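
As a quick complementary check, the uniqueness of the key column can also be tested directly before mapping. This is only a sketch on the same toy rep_term; the names has_unique_keys, counts and repeated are made up for illustration:

```python
import pandas as pd

rep_term = pd.DataFrame({'search_term': [1, 2, 1],
                         'code_action': ['ss', 'dd', 'gg']})

# False means at least one search_term occurs more than once,
# so using this column as a lookup index for map would fail
has_unique_keys = rep_term['search_term'].is_unique
print(has_unique_keys)

# Count rows per search_term and keep only the keys that repeat
counts = rep_term['search_term'].value_counts()
repeated = counts[counts > 1]
print(repeated)
```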

Then you can drop the duplicates, keeping either the first or the last value, with drop_duplicates:

rep_term1 = rep_term.drop_duplicates(subset=['search_term'], keep='first')
print (rep_term1)
  code_action  search_term
0          ss            1
1          dd            2

rep_term2 = rep_term.drop_duplicates(subset=['search_term'], keep='last')
print (rep_term2)
  code_action  search_term
1          dd            2
2          gg            1
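
With the duplicates removed, the original map call should work. A minimal sketch on the same toy data, assuming that keeping the first code_action per search_term is acceptable for your case (lookup is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'search_term': [1, 2, 3]})
rep_term = pd.DataFrame({'search_term': [1, 2, 1],
                         'code_action': ['ss', 'dd', 'gg']})

# Keep the first code_action for each search_term so the lookup index is unique
lookup = (rep_term.drop_duplicates(subset=['search_term'], keep='first')
                  .set_index('search_term')['code_action'])

# Values of search_term that are missing from the lookup (here 3) become NaN
df['term_code'] = df['search_term'].map(lookup)
print(df)
```

Which of the duplicated rows to keep depends on your data, so check the output of duplicated before choosing keep='first' or keep='last'.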
2 Comments

Should I rename the column? When this column had a different name, I got the same error.
Give me a sec, I'll add what you can do to the answer.
