I have a dataframe that looks like the one below. I'd like to create another column based on the VALUE of the index (so any row whose index is less than 10 would be labeled "small" in the new column). I can do something like lengthDF[lengthDF.index < 10] to get the rows I want, but I'm not sure how to get the additional column. I've tried the approach from "Create Column with ELIF in Pandas" but can't get it to read the index...

             LengthFirst  LengthOthers
0             1           NaN
4           NaN             1
9           NaN             1
13          NaN             1
17            1             1
18          NaN             1
19          NaN             1
20            1           NaN
21            1             1
22            3             4
23            1           NaN
24            7             6
25            1             2
26           16            19
27            1             2
28           24             8
29            9            12
30           73            65
31           15            12
32           55            60
33           28            21
34           29            31

1 Answer

Something like this?

lengthDF['size'] = 'large'
# Use .loc with a row mask and a column label; chained indexing may assign to a copy.
lengthDF.loc[lengthDF.index < 10, 'size'] = 'small'
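For anyone who wants to run it, here is a minimal sketch of the same two-pass idea on a few rows taken from the question's table. It uses .loc for the second pass, which selects rows and column in a single step rather than through chained indexing. The subset of rows is just for illustration.

```python
import numpy as np
import pandas as pd

# A few rows copied from the question's table (index values matter most here).
lengthDF = pd.DataFrame(
    {
        "LengthFirst": [1, np.nan, 1, 24, 29],
        "LengthOthers": [np.nan, 1, 1, 8, 31],
    },
    index=[0, 9, 17, 28, 34],
)

# Pass 1: give every row the default label.
lengthDF["size"] = "large"

# Pass 2: overwrite the label where the index value is below 10.
lengthDF.loc[lengthDF.index < 10, "size"] = "small"

print(lengthDF)
```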
3 Comments

Yep that works, thanks. I'm still trying to figure out Pandas slicing & indexing. Question: is there a way to do this as a one liner or function? Say I wanted to make three buckets of small, medium, large all at once?
You could use the map function. Something like lengthDF['size'] = lengthDF.index.map(lambda x: 'small' if x < 10 else 'large'). Obviously, instead of the lambda, you can have a more complex named function that returns any number of values (e.g., 'small', 'medium', 'large') based on what each index value is. If performance is a concern, you may want to run some tests: I'm not sure the map function is vectorized, so it may be faster to do a couple of passes as in the answer above rather than use map to do it in one line.
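A rough sketch of the named-function variant of this comment, for three buckets. The function name size_bucket and the cutoffs 10 and 25 are made up for illustration; the question only specifies the "less than 10" rule.

```python
import numpy as np
import pandas as pd

# Small stand-in for lengthDF, using a few rows from the question.
lengthDF = pd.DataFrame(
    {"LengthFirst": [1, np.nan, 1, 24, 29]},
    index=[0, 9, 17, 28, 34],
)

def size_bucket(idx):
    """Label one index value. The cutoffs 10 and 25 are assumed, not from the question."""
    if idx < 10:
        return "small"
    if idx < 25:
        return "medium"
    return "large"

# Index.map calls size_bucket once per index value (a Python-level loop).
lengthDF["size"] = lengthDF.index.map(size_bucket)

print(lengthDF)
```

Because the function is called once per index value, this is convenient rather than fast; on large frames the mask-based passes are likely to be quicker, though that is worth timing on your own data.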
Check out numpy.where, which is a somewhat more concise way to do this sort of thing: docs.scipy.org/doc/numpy/reference/generated/numpy.where.html
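A short sketch of what that could look like, assuming the same lengthDF layout as in the question. numpy.where covers the two-label case; for three labels, numpy.select (not mentioned in the thread, added here as one possible option) takes a list of conditions and picks the first one that matches. The column name size3 and the cutoff of 25 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for lengthDF, using a few rows from the question.
lengthDF = pd.DataFrame(
    {"LengthFirst": [1, np.nan, 1, 24, 29]},
    index=[0, 9, 17, 28, 34],
)

# Two buckets: 'small' where the mask is True, 'large' elsewhere.
lengthDF["size"] = np.where(lengthDF.index < 10, "small", "large")

# Three buckets: conditions are checked in order, first match wins.
conditions = [lengthDF.index < 10, lengthDF.index < 25]
labels = ["small", "medium"]
lengthDF["size3"] = np.select(conditions, labels, default="large")

print(lengthDF)
```

Both calls work on the whole boolean mask at once, so they avoid the per-element Python function call that map with a lambda incurs.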