I am fetching the index of the first occurrence of a particular value in a pandas column, as shown below:

first_idx = df1.loc[df1.Column1.isin(['word1','word2'])].index.tolist()[0]

This gives me the index of the first occurrence of either 'word1' or 'word2'.
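A minimal sketch of the same lookup that also handles the case where neither word is present. The sample data is hypothetical; only the column name and the search words come from the question.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df1 = pd.DataFrame({"Column1": ["10", "7", "word2", "10", "word1", "10"]})

# Boolean mask marking rows that hold either search word.
mask = df1["Column1"].isin(["word1", "word2"])

# Index labels of all matching rows; take the first one if any exist.
matches = df1.index[mask]
first_idx = matches[0] if len(matches) else None

print(first_idx)
# 2
```

Without the length check, indexing `[0]` on an empty result raises an `IndexError`, which is also what the one-liner above does when there is no match.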

Then I replace the old values in the records before that index with new values, as shown below:

df1.head(first_idx)['Column1'].replace({'10': '5'}, inplace=True)

This replaces every '10' that appears before first_idx in the dataframe with '5'. The '10's that appear after first_idx are not replaced.

Now I have to replace every '10' that appears after first_idx with '3'. I tried the code below, which takes the length of the dataframe and subtracts first_idx from it.

len(df1)                         # This will show the actual length / total number of records of a dataframe column.
temp = (len(df1)-first_idx)-1    # This will determine the remaining count of records barring the count of records until first_idx value.
df1.tail(temp)                   # This will show all records that are present after the first_idx value.
df1.tail(temp)['Column1'].replace({'10': '3'}, inplace=True)

Is there a better, more efficient, or simpler way to achieve the same result?

1 Answer

From the way you used

df1.head(first_idx)

I assume your indices are numeric. Thus, a simple

df1.iloc[first_idx + 1:, :]['Column1'].replace({'10': '3'}, inplace=True)

should do.
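One caveat: calling `replace(..., inplace=True)` on the result of a slice can end up modifying a temporary copy instead of `df1`, depending on the pandas version and settings. A minimal sketch that assigns back to `df1` explicitly is given here. The sample data and the names `first_pos`, `pos`, and `is_ten` are assumptions for illustration, and the sketch assumes at least one 'word1' or 'word2' is present.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name and values mirror the question.
df1 = pd.DataFrame({"Column1": ["10", "7", "10", "word2", "10", "word1", "10"]})

# Row position of the first 'word1'/'word2' (assumes at least one match exists).
marker = df1["Column1"].isin(["word1", "word2"]).to_numpy()
first_pos = int(np.flatnonzero(marker)[0])

pos = np.arange(len(df1))          # row positions 0..n-1
is_ten = df1["Column1"].eq("10")   # rows holding exactly '10'

# Assign through df1.loc so the original frame is modified, not a temporary slice.
df1.loc[is_ten & (pos < first_pos), "Column1"] = "5"
df1.loc[is_ten & (pos > first_pos), "Column1"] = "3"

print(df1["Column1"].tolist())
# ['5', '7', '5', 'word2', '3', 'word1', '3']
```

Because the comparison uses row positions rather than index labels, this works the same way whether or not the index is the default 0..n-1 range.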

2 Comments

Thank you @Eran. It works. But I tried the same with df1.loc, and it also does the same job. If possible, can you please explain the difference between the two, since both of them achieve the same result?
Sure @JKC. iloc selects by integer position: df1.iloc[2:4] returns the rows at positions 2 and 3, regardless of their index labels. loc selects by the index labels of your dataframe, which can be numbers or non-numbers. If your index is the default ordered integers (as in your case), an open-ended slice such as [first_idx + 1:] behaves the same with both. Note that a loc slice includes its end label, while an iloc slice excludes its end position. There was also df.ix[], which combined both, but it was deprecated and then removed in pandas 1.0. I prefer the more explicit loc and iloc anyway.
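A small sketch of the difference described in this comment, using a hypothetical frame whose index labels differ from its row positions:

```python
import pandas as pd

# Hypothetical frame: labels 10..50, positions 0..4.
df = pd.DataFrame({"Column1": list("abcde")}, index=[10, 20, 30, 40, 50])

by_position = df.iloc[2:4]["Column1"].tolist()  # positions 2 and 3
by_label = df.loc[20:40]["Column1"].tolist()    # labels 20 through 40, end included

print(by_position)
# ['c', 'd']
print(by_label)
# ['b', 'c', 'd']

# With the default 0..n-1 index, open-ended slices select the same rows...
df_default = df.reset_index(drop=True)
same_open_ended = df_default.iloc[3:].equals(df_default.loc[3:])

# ...but bounded slices do not, because loc includes the end label.
n_iloc = len(df_default.iloc[1:3])  # positions 1 and 2
n_loc = len(df_default.loc[1:3])    # labels 1, 2, and 3

print(same_open_ended, n_iloc, n_loc)
# True 2 3
```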
