
I have a large pandas DataFrame with the following columns:

The data used as the example here can be found here

import pandas as pd

x = pd.read_csv('example1_csv.')
x.head()

ID  Year    Y
22445   1991    40.0
29925   1991    43.333332
76165   1991    403.0
223725  1991    65.0
280165  1991    690.5312

I want to change the numbers in the column Y to the categories low, mid, high, where each category is specific to a range of numbers in Y:

  1. Low replaces any number within the range of -3000 to 600 in Y.

  2. Mid replaces any number within the range of 601 to 1500 in Y.

  3. High replaces any number within the range of 1501 to 17000 in Y.

For example, if an ID has a Y value between -3000 and 600, the numeric value in Y for that ID should be replaced with Low.

How does one make these replacements? I have tried several ways but have run into str and int type errors every time. The data file used in this question is at the GitHub link above. Many thanks in advance for the help.

2 Answers


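Use numpy.select. The conditions are checked in order and the first match wins. The string default ('Unknown' is only a placeholder label) catches anything above 17000 and keeps the result a string array; recent NumPy versions raise a TypeError when the implicit default of 0 is mixed with string choices.

import numpy as np
x.Y = np.select([x.Y.lt(601), x.Y.lt(1501), x.Y.le(17000)], ['Low', 'Mid', 'High'], default='Unknown')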
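To try this without the CSV file, here is a minimal self-contained sketch that rebuilds the five sample rows from the question by hand. The `Y_cat` column name and the `'Unknown'` fallback label are assumptions added for illustration; they are not part of the question or the answer.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's x.head() output.
x = pd.DataFrame({
    'ID': [22445, 29925, 76165, 223725, 280165],
    'Year': [1991] * 5,
    'Y': [40.0, 43.333332, 403.0, 65.0, 690.5312],
})

# Conditions are evaluated in order and the first match wins,
# so each later condition only labels values the earlier ones rejected.
conditions = [x['Y'].lt(601), x['Y'].lt(1501), x['Y'].le(17000)]
labels = ['Low', 'Mid', 'High']

# 'Unknown' is a hypothetical fallback for values above 17000.
# Writing to a new column (Y_cat) keeps the original numbers for checking.
x['Y_cat'] = np.select(conditions, labels, default='Unknown')
print(x)
```

Only the last sample row (690.5312) falls outside the Low range, so it is the only one labelled Mid.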

1 Comment

Glad I've been of help

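This should work too, assuming every value in Y lies between -3000 and 17000. The comparisons use single cut-offs so that boundary values such as 600, 601 and 1500 are not mislabelled as High.

x['Y'] = x['Y'].apply(lambda i: 'Low' if i < 601 else ('Mid' if i < 1501 else 'High'))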
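Another option, not proposed in either answer, is `pandas.cut`, which bins a numeric column by a list of edges. The sketch below is only an illustration under some assumptions: the edges 600 and 1500 are one way to close the gaps between the integer ranges in the question (so 600.5 would count as Mid here), the last two sample values are made up to show the boundaries, and `Y_cat` is a hypothetical column name.

```python
import pandas as pd

# First five values come from the question; 1500.0 and 1501.0 are
# made-up boundary values added for illustration.
x = pd.DataFrame({'Y': [40.0, 43.333332, 403.0, 65.0, 690.5312, 1500.0, 1501.0]})

# Bins are right-inclusive: [-3000, 600], (600, 1500], (1500, 17000].
# include_lowest=True makes the first bin include -3000 itself.
x['Y_cat'] = pd.cut(
    x['Y'],
    bins=[-3000, 600, 1500, 17000],
    labels=['Low', 'Mid', 'High'],
    include_lowest=True,
)
print(x)
```

The result is an ordered categorical column, and any value outside -3000 to 17000 becomes NaN rather than being silently assigned a label.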
