Python Pandas - Replacing values of a part of data frame column based on index

Question

I am fetching the index of the first occurrence of a particular value in a pandas column, as shown below:

first_idx = df1.loc[df1.Column1.isin(['word1','word2'])].index.tolist()[0]

This gives me the index of the first occurrence of either 'word1' or 'word2'.
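A minimal sketch of this lookup on a small made-up frame; the sample data and the default integer index are assumptions, since the real df1 is not shown. It also shows an alternative that takes the first label straight from the filtered index instead of building the full list. Both forms raise an IndexError when nothing matches.

```python
import pandas as pd

# Hypothetical sample data; the real df1 is not shown in the question.
df1 = pd.DataFrame({"Column1": ["10", "a", "10", "word2", "10", "word1", "10"]})

# Boolean mask marking rows that hold either search word.
mask = df1["Column1"].isin(["word1", "word2"])

# Approach from the question: list every matching index label, take the first.
first_idx = df1.loc[mask].index.tolist()[0]

# Alternative: index the Index object with the mask and take the first label.
first_idx_alt = df1.index[mask.to_numpy()][0]

print(first_idx, first_idx_alt)
```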

Then I replace the old values in the records before that index with new values, as shown below:

df1.head(first_idx)['Column1'].replace({'10': '5'}, inplace=True)

This replaces every '10' that appears before first_idx with '5'. Any '10' that appears after first_idx is left unchanged.

Now I have to replace every '10' that appears after first_idx with '3'. I tried the following, which takes the length of the dataframe and subtracts first_idx from it:

len(df1)                         # Total number of rows in the dataframe.
temp = (len(df1)-first_idx)-1    # Number of rows after the row at first_idx.
df1.tail(temp)                   # Shows all rows after the row at first_idx.
df1.tail(temp)['Column1'].replace({'10': '3'}, inplace=True)

Is there a simpler or more efficient way to achieve the same result?

Eran · Accepted Answer · 2017-09-26 16:54:59Z


From the way you used

df1.head(first_idx)

I assume your indices are numeric. Thus, a simple

df1.iloc[first_idx + 1:, :]['Column1'].replace({'10': '3'}, inplace=True)

should do.
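Depending on the pandas version, a chained call like the one above can end up modifying a temporary copy and leave df1 unchanged. The sketch below is one way to avoid the chain, not the answer's original method: it assigns through a single .loc call with a boolean mask. The sample data and the names first_pos, pos and is_ten are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df1 is not shown in the question.
df1 = pd.DataFrame({"Column1": ["10", "a", "10", "word2", "10", "word1", "10"]})

# Row position (not label) of the first 'word1' / 'word2' match.
match = df1["Column1"].isin(["word1", "word2"]).to_numpy()
first_pos = int(np.flatnonzero(match)[0])

# Positional masks for rows before and after the first match.
pos = np.arange(len(df1))
is_ten = (df1["Column1"] == "10").to_numpy()

# Single .loc assignments write to df1 itself, with no chained indexing.
df1.loc[is_ten & (pos < first_pos), "Column1"] = "5"  # '10's before the match
df1.loc[is_ten & (pos > first_pos), "Column1"] = "3"  # '10's after the match

print(df1["Column1"].tolist())
```

Because the masks are built from row positions, this does not depend on the index being the default 0, 1, 2, ... sequence.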


2 Comments

JKC Over a year ago

Thank you @Eran. It works, but I tried the same with df1.loc and it does the same job. If possible, can you please explain the difference between the two, since both achieve the same result?

Eran Over a year ago

Sure @JKC. iloc selects by row position: df1.iloc[2:4] returns the rows at positions 2 and 3, regardless of their index labels. loc selects by the index labels of your dataframe, which can be numbers or non-numbers. If your index is the default 0, 1, 2, ... (as in your case), an open-ended slice like the one above selects the same rows with either. Note that a loc slice includes its end label, while an iloc slice excludes its end position. Read also about df.ix[], which combines both, although it is deprecated in recent pandas versions. I don't use it much though, I prefer the more explicit way of loc and iloc.
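A small sketch of the difference, using a made-up frame whose index labels deliberately differ from the row positions (the data and variable names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical frame: labels 10..50 sit at positions 0..4.
df = pd.DataFrame({"Column1": list("abcde")}, index=[10, 20, 30, 40, 50])

by_position = df.iloc[2:4]  # positions 2 and 3; the end position is excluded
by_label = df.loc[20:40]    # labels 20 through 40; the end label is included

print(by_position["Column1"].tolist())  # ['c', 'd']
print(by_label["Column1"].tolist())     # ['b', 'c', 'd']

# With the default 0..n-1 index, an open-ended slice matches in both.
default = df.reset_index(drop=True)
same = default.iloc[2:].equals(default.loc[2:])
print(same)  # True
```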
