I am using Pandas to process my CSV file for machine learning.

The CSV file contains a column of tags, written in English, such as "math" and "literature". I want to map those tags to integers, e.g. "math": 1, "literature": 2. How can I do this with Pandas?

1 Answer

You can pass a dict with the strings as keys ('math', etc.) and the integers as values to the column's map method (Series.map). For example:

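>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1, 2, 3, 4, 5],
...                    'subject': ['math', 'literature', 'math', 'math', 'science']})
>>> df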

   x     subject
0  1        math
1  2  literature
2  3        math
3  4        math
4  5     science

>>> df['num'] = df.subject.map({'math':0,'literature':1,'science':2})
>>> df

   x     subject  num
0  1        math    0
1  2  literature    1
2  3        math    0
3  4        math    0
4  5     science    2
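
One thing to watch for with map: any tag that is missing from the dict comes back as NaN, which also turns the column into floats. Here is a minimal sketch of one way to catch that, assuming you are happy to use -1 as a sentinel for unknown tags. The 'history' row and the -1 sentinel are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical data: 'history' is deliberately absent from the mapping below.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "subject": ["math", "literature", "math", "math", "history"],
})

mapping = {"math": 0, "literature": 1, "science": 2}

# Tags not found in the dict become NaN, so the result is a float column.
mapped = df["subject"].map(mapping)

# Collect the tags that had no entry in the mapping.
unmapped = df.loc[mapped.isna(), "subject"].unique().tolist()

# Assumption: -1 is an acceptable "unknown" code; cast back to integers.
df["num"] = mapped.fillna(-1).astype(int)

print(unmapped)
print(df)
```

Checking unmapped before training is usually safer than silently filling, since a typo in a tag would otherwise end up as its own "unknown" class.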

You could also use pd.factorize to accomplish much the same thing, but you would not control the mapping from strings to integers: codes are assigned in order of first appearance (which, in this example, happens to give the same result):

>>> df['num'] = pd.factorize(df.subject)[0]
>>> df

   x     subject  num
0  1        math    0
1  2  literature    1
2  3        math    0
3  4        math    0
4  5     science    2
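
If you go the factorize route, it may help to keep the second return value too, since it holds the unique labels in code order and lets you recover the mapping afterwards. The sketch below rebuilds a dict from it and also shows sort=True, which assigns codes in sorted label order instead of order of first appearance. The variable names (label_to_code and so on) are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"subject": ["math", "literature", "math", "math", "science"]})

# factorize returns (codes, uniques); uniques[i] is the label for code i.
codes, uniques = pd.factorize(df["subject"])
df["num"] = codes

# Rebuild the label -> integer mapping so it can be reused with map later.
label_to_code = {label: code for code, label in enumerate(uniques)}

# With sort=True the codes follow the sorted order of the labels.
sorted_codes, sorted_uniques = pd.factorize(df["subject"], sort=True)

print(label_to_code)
print(list(sorted_uniques), sorted_codes.tolist())
```

Saving label_to_code and applying it with map to later data keeps the encoding consistent, whereas calling factorize again on a new file could assign different integers to the same tags.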