I have a dataframe; this is part of one of its columns:

category
Search
Search
Онлайн-магазин
Онлайн-магазин
Форумы и отзывы
Онлайн-магазин
Форумы и отзывы
Агрегатор
Информационный ресурс
Онлайн-магазин
Телеком
Онлайн-магазин

I need to create a new column that holds each category converted to a number, like this:

category                 numeric_category
Search                   1
Search                   1
Онлайн-магазин           2
Онлайн-магазин           2
Форумы и отзывы          3
Онлайн-магазин           2
Форумы и отзывы          3
Агрегатор                4
Информационный ресурс    5
Онлайн-магазин           2
Телеком                  6
Онлайн-магазин           2

How can I do that? Can it be done with numpy?

2 Answers

Use pd.factorize. It returns a (codes, uniques) tuple whose codes start at 0, so take element [0] and add 1:

df['numeric_category'] = pd.factorize(df.category)[0] + 1
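To make the one-liner above easier to follow, here is a minimal sketch that unpacks both return values of `pd.factorize`. The DataFrame is rebuilt from the values shown in the question; the variable names `codes` and `uniques` are only illustrative.

```python
import pandas as pd

# Category values copied from the question
df = pd.DataFrame({'category': [
    'Search', 'Search', 'Онлайн-магазин', 'Онлайн-магазин',
    'Форумы и отзывы', 'Онлайн-магазин', 'Форумы и отзывы',
    'Агрегатор', 'Информационный ресурс', 'Онлайн-магазин',
    'Телеком', 'Онлайн-магазин',
]})

# factorize returns (codes, uniques); codes are 0-based positions into
# uniques, assigned in order of first appearance
codes, uniques = pd.factorize(df['category'])

df['numeric_category'] = codes + 1  # shift so numbering starts at 1
print(df)
print(list(uniques))
```

Because the codes follow the order of first appearance, `Search` gets 1 and `Онлайн-магазин` gets 2, which matches the numbering requested in the question.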

You can then also convert the new column to a categorical dtype to save memory:

df['numeric_category'] = pd.Categorical(pd.factorize(df.category)[0] + 1)

Sample:

import pandas as pd

df = pd.DataFrame({'category':['a','s','a']})
print(df)
  category
0        a
1        s
2        a

df['numeric_category'] = pd.Categorical(pd.factorize(df.category)[0] + 1)
print(df)
  category numeric_category
0        a                1
1        s                2
2        a                1
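
The memory saving mentioned above can be checked with a rough sketch like the following. The data here is hypothetical (six made-up labels repeated many times), and the exact byte counts will depend on the platform and pandas version, so none are quoted.

```python
import pandas as pd

# Hypothetical data: 6 distinct labels repeated many times
labels = ['a', 'b', 'c', 'd', 'e', 'f'] * 10_000
df = pd.DataFrame({'category': labels})

codes = pd.factorize(df['category'])[0] + 1

as_int = pd.Series(codes)                  # plain integer column
as_cat = pd.Series(pd.Categorical(codes))  # categorical over the same codes

# Compare the bytes used by the values only (index excluded)
int_bytes = as_int.memory_usage(index=False, deep=True)
cat_bytes = as_cat.memory_usage(index=False, deep=True)
print(int_bytes, cat_bytes)
```

With only a few distinct values, the categorical column can store each row as a small integer code, so it should come out smaller. The trade-off is that a categorical column no longer behaves like ordinary integers, for example in arithmetic.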

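# Plain-Python alternative: give each category a 1-based number
# in order of first appearance (avoids shadowing the built-in dict).
category_codes = {}
for item in df.category:
    if item not in category_codes:
        category_codes[item] = len(category_codes) + 1

print("category\tnumeric_category")

for item in df.category:
    print("%s\t%s" % (item, category_codes[item]))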
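
The loop above only prints the numbering. If the goal is a new column, as in the question, one option is to pass the same mapping to `Series.map`. This is a sketch using the small sample frame from the other answer; `category_codes` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'category': ['a', 's', 'a']})

# Build the mapping: each new category gets the next 1-based number
category_codes = {}
for item in df['category']:
    if item not in category_codes:
        category_codes[item] = len(category_codes) + 1

# Look up every row's category in the mapping to fill the new column
df['numeric_category'] = df['category'].map(category_codes)
print(df)
```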
