
I have a long pandas DataFrame with a column called 'id' and another column called 'species', among other columns. I need to change values in the 'species' column based on specific values of the 'id' column.

For example, if the 'id' is '5555555' (a string), I want the 'species' value to change from its current value 'dove' (also a string) to 'hummingbird'. So far I have been using:

df.loc[df["id"] == '5555555', "species"] = 'hummingbird'

Here is a short sample data frame:

import pandas as pd
        
#Starting dataset
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'dove', 'dove', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'dove']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    dove        #wants to replace
2   33333333    dove        #wants to replace
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    dove        #wants to replace        
     
#Expected outcome
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'hummingbird', 'hummingbird', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'hummingbird']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    hummingbird #replaced
2   33333333    hummingbird #replaced
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    hummingbird #replaced

This is fine for a small number of rows, but I have to do it for about 1000 rows, each with its own 'id'. I thought I could write a loop and feed it the list of 'id' values, but I honestly do not know how to even start.

Thanks in advance!!

And thanks to Scott Boston for pointing me in the right direction to ask better questions!

  • kindly add sample dataset with expected output Commented Jul 7, 2021 at 0:56
  • Will do, thank you for the advice! Commented Jul 7, 2021 at 1:16
  • Why do 2222 and 3333 change but not 1111? Commented Jul 7, 2021 at 2:23
  • Okay... for this question it is best if you create a small toy dataset and a list of ids you want to change and the values you want to change them to. Your question as stated is pretty broad, hence you have no answers yet. I suspect what you are trying to accomplish is pretty easy; we are just not sure what your inputs or your expected output are. See this post to help us help you. Commented Jul 7, 2021 at 16:17
  • @ScottBoston Thank you so much for pointing me in the right direction to ask better questions! I'm sure I'll get better with time, but I hope this new version of my question can clarify what I want to accomplish. Commented Jul 7, 2021 at 17:08

1 Answer


Use isin

# The 'id' column holds strings, so the ids in the list must be strings too
humming_ids = ['22222222', '33333333', '88888888']
df.loc[df["id"].isin(humming_ids), "species"] = 'hummingbird'
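
For reference, here is a minimal self-contained sketch of the isin approach applied to the sample frame from the question. It assumes the ids are stored as strings, as in the question; the list of ids is illustrative and chosen to match the rows marked "wants to replace".

```python
import pandas as pd

# Sample frame copied from the question; ids are strings
df = pd.DataFrame({
    "id": ["11111111", "22222222", "33333333", "44444444",
           "55555555", "66666666", "77777777", "88888888"],
    "species": ["dove", "dove", "dove", "hummingbird",
                "hummingbird", "dove", "hummingbird", "dove"],
})

# Illustrative list of ids to update (strings, to match the column)
humming_ids = ["22222222", "33333333", "88888888"]

# Boolean mask: True for every row whose id appears in the list
mask = df["id"].isin(humming_ids)

# Overwrite 'species' only on the masked rows
df.loc[mask, "species"] = "hummingbird"

print(df)
```

The mask replaces the explicit loop: one vectorized membership test covers the whole list, so the same two lines should work whether the list has three ids or a thousand.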

1 Comment

Thank you so much @Vishnudev! This worked great. I also want to mention that I combined your answer with an answer from another post (stackoverflow.com/questions/41768196/…) since I also needed to extract the id from an excel sheet and then convert them into a python list to use your answer. You have no idea how much this will help me!
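
The step described in this comment, turning a spreadsheet column into a Python list for isin, might look roughly like the sketch below. The small id_sheet frame is a hypothetical stand-in for whatever pd.read_excel returns (the real file and column names are not given in the thread), and it assumes the ids come back as integers, so they are cast to strings to match the 'id' column.

```python
import pandas as pd

# Hypothetical stand-in for a sheet loaded with pd.read_excel(...);
# the actual file and column names are not known from the thread
id_sheet = pd.DataFrame({"id": [22222222, 33333333, 88888888]})

# Cast to str so the values compare equal to the string ids in df,
# then convert the column to a plain Python list
humming_ids = id_sheet["id"].astype(str).tolist()

# Sample frame copied from the question
df = pd.DataFrame({
    "id": ["11111111", "22222222", "33333333", "44444444",
           "55555555", "66666666", "77777777", "88888888"],
    "species": ["dove", "dove", "dove", "hummingbird",
                "hummingbird", "dove", "hummingbird", "dove"],
})

df.loc[df["id"].isin(humming_ids), "species"] = "hummingbird"
print(df)
```

If the sheet is read directly, passing dtype=str to pd.read_excel is one way to keep the ids as strings from the start and skip the cast.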
