I have a dataset with a variable named 'EntrySec', and I want to replace each value depending on the range it falls in.

Entrysec
1
21
32
9
43
66

Expectation: replace each value with

10 if it falls in the range 1-10
20 if it falls in the range 11-20
30 if it falls in the range 21-30, and so on

  • What have you tried so far? Commented Dec 28, 2019 at 9:36
  • I was creating a new variable, df['entrysec_0-10'] = df['EntrySec'].between(0, 10), and so on for the different ranges. But I don't want it that way; I want to replace the values within df['EntrySec']. Commented Dec 28, 2019 at 9:45

2 Answers

Here is a very simple solution that works for any number (it doesn't matter in which range a number is). It rounds the values of a Pandas DataFrame's column to the next ten:

df["Entrysec"] = df["Entrysec"]//10*10+10

How does it work?

  1. Perform integer (floor) division by 10, which for non-negative numbers discards the fractional part of the normal division. This gives the number of whole tens in the value. For example:

    • 43/10=4.3 (normal division)
    • 43//10=4 (integer division)
  2. Multiply by 10, getting the original number without its ones. For example: 4*10=40.

  3. Add 10 to get the desired result. For example, 40+10=50.
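The three steps above can be sketched one at a time on the sample values from the question. The intermediate names (tens, without_ones, rounded) are made up for illustration and are not part of the original one-liner.

```python
import pandas as pd

# Sample values taken from the question.
df = pd.DataFrame({"Entrysec": [1, 21, 32, 9, 43, 66]})

# Hypothetical intermediate names, used only to show each step.
tens = df["Entrysec"] // 10     # step 1: floor division, e.g. 43 // 10 == 4
without_ones = tens * 10        # step 2: e.g. 4 * 10 == 40
rounded = without_ones + 10     # step 3: e.g. 40 + 10 == 50

print(rounded.tolist())
```

Chaining the three steps gives the same result as the single expression df["Entrysec"]//10*10+10.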

Edit

While my solution rounds a value to its next ten, the user wants to round e.g. 20 to 20 (and not 30). This can be achieved by slightly modifying my approach:

df["Entrysec"] = (df["Entrysec"]-1)//10*10+10

In this way it is possible to get the desired output. Here are some corner cases:

  • 9 is rounded to 10
  • 10 is rounded to 10
  • 11 is rounded to 20

Note that with this approach 0 is rounded to 0, as implicitly asked.
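As a quick check of the corner cases, here is a minimal sketch that wraps the modified formula in a helper. The function name round_up_to_ten is hypothetical, and the sketch assumes integer input.

```python
import pandas as pd


def round_up_to_ten(values: pd.Series) -> pd.Series:
    """Hypothetical helper: map 1-10 to 10, 11-20 to 20, and so on."""
    # Subtracting 1 first keeps exact multiples of 10 in their own range.
    return (values - 1) // 10 * 10 + 10


corner_cases = pd.Series([0, 9, 10, 11, 20, 21])
result = round_up_to_ten(corner_cases)

print(result.tolist())  # [0, 10, 10, 20, 20, 30]
```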

1 Comment

Good answer. An explanation of the math would be great.

Try using df.loc

import pandas as pd
df = pd.DataFrame({'Entrysec': [1, 21, 32, 9, 43, 66]})

and then

df.loc[(df["Entrysec"] >= 1) & (df["Entrysec"] <= 10), "Entrysec"] = 10
df.loc[(df["Entrysec"] >= 11) & (df["Entrysec"] <= 20), "Entrysec"] = 20
df.loc[(df["Entrysec"] >= 21) & (df["Entrysec"] <= 30), "Entrysec"] = 30

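For values up to 100, the ranges can be generated in a loop:

for i in range(1, 11):
    # i*10 is the upper bound of each range (10, 20, ..., 100);
    # the lower bound is 9 less (1, 11, ..., 91).
    lower = i*10 - 9
    df.loc[(df["Entrysec"] >= lower) & (df["Entrysec"] <= i*10), "Entrysec"] = i*10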

Entrysec
0   10
1   30
2   40
3   10
4   50
5   70
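If the ranges should stay explicit but without repeated df.loc assignments, one possible alternative (a different technique from the one in this answer) is pd.cut. This is only a sketch: the names edges, labels and binned are made up for illustration, and it assumes every value lies between 1 and 100.

```python
import pandas as pd

# Sample values taken from the question.
df = pd.DataFrame({"Entrysec": [1, 21, 32, 9, 43, 66]})

# Assumption: all values are in 1..100. Values outside the bins would
# become NaN, and the integer conversion below would then fail.
edges = list(range(0, 101, 10))     # bin edges: 0, 10, ..., 100
labels = list(range(10, 101, 10))   # one label per bin: 10, 20, ..., 100

# By default the bins are closed on the right: (0, 10], (10, 20], ...
binned = pd.cut(df["Entrysec"], bins=edges, labels=labels)
df["Entrysec"] = binned.astype(int)

print(df["Entrysec"].tolist())
```

Covering a larger span of values only requires changing the end of the two range calls.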

4 Comments

What about larger numbers? You can't hardcode all the possibilities...
What about bigger ranges?
Then we only need to change the upper limit of the loop: for example, for values up to 1000, let i run up to 100.
This approach works, but it is much more complex for large ranges...
