10

My Pandas data frame contains the following data:

product,values
 a1,     10
 a5,     20
 a10,    15
 a2,     45
 a3,     12
 a6,     67

I have to sort this data frame based on the product column. Thus, I would like to get the following output:

product,values
 a10,     15
 a6,      67
 a5,      20
 a3,      12
 a2,      45
 a1,      10

Unfortunately, I'm facing the following error:

ErrorDuringImport(path, sys.exc_info())

ErrorDuringImport: problem in views - type 'exceptions.Indentation

2 Answers

17

You can first extract the digits with str.extract and cast them to int with astype. Then call sort_values on the helper column sort, and finally drop that column:

# Raw string so that \d is not treated as a string escape
df['sort'] = df['product'].str.extract(r'(\d+)', expand=False).astype(int)
df.sort_values('sort', inplace=True, ascending=False)
df = df.drop('sort', axis=1)
print(df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10

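The helper column is necessary because sort_values on its own compares the strings character by character, so a10 ends up between a2 and a1 instead of first: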

df.sort_values('product', inplace=True, ascending=False)
print(df)
  product  values
5      a6      67
1      a5      20
4      a3      12
3      a2      45
2     a10      15
0      a1      10
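To see why the plain string sort misplaces a10, here is a minimal sketch in pure Python, without pandas. It assumes every product name is a single letter followed by digits, as in the question; the variable names are only illustrative.

```python
products = ["a1", "a5", "a10", "a2", "a3", "a6"]

# String comparison works character by character: "a10" < "a2" because "1" < "2".
lexicographic = sorted(products, reverse=True)

# Comparing the numeric suffix as an integer gives the intended order.
# Assumption: each name is one letter followed by digits.
numeric = sorted(products, key=lambda s: int(s[1:]), reverse=True)

print(lexicographic)
print(numeric)
```

The first list keeps a10 next to a1, while the second puts it first, which matches the output requested in the question.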

Another option is to use the natsort library:

from natsort import index_natsorted, order_by_index

df = df.reindex(index=order_by_index(df.index, index_natsorted(df['product'], reverse=True)))
print(df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10

22 Comments

I'm using Python 2.7.
I think you should copy the text of the error: click edit under the tags of the question and paste it below the question text. Thanks.
OK, I will add that now.
It looks like your pandas installation is broken - see link.
0
import pandas as pd
df = pd.DataFrame({
   "product": ['a1,', 'a5,', 'a10,', 'a2,','a3,','a6,'],
   "value": [10, 20, 15, 45, 12, 67]
})
df
==>
  product   value
0   a1,      10
1   a5,      20
2   a10,     15
3   a2,      45
4   a3,      12
5   a6,      67


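# The key argument requires pandas 1.1 or newer; str[1:-1] drops the leading 'a' and the trailing ','
df.sort_values(by='product', key=lambda col: col.str[1:-1].astype(int), ascending=False)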
==>
  product   value
2   a10,     15
5   a6,      67
1   a5,      20
4   a3,      12
3   a2,      45
0   a1,      10
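The slice str[1:-1] relies on each value ending in a comma. If the product column holds plain names such as a10, as in the question's data, one possible variant is to extract the digits inside the key function instead. This is only a sketch: numeric_suffix is a hypothetical helper name, and it assumes pandas 1.1 or newer and that every value contains at least one digit.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["a1", "a5", "a10", "a2", "a3", "a6"],
    "values": [10, 20, 15, 45, 12, 67],
})

def numeric_suffix(col):
    # Hypothetical helper: pull the first run of digits out of each string
    # and compare those as integers. Assumes every value contains digits.
    return col.str.extract(r"(\d+)", expand=False).astype(int)

# sort_values passes the whole column to key and sorts by the returned Series.
result = df.sort_values(by="product", key=numeric_suffix, ascending=False)
print(result)
```

Unlike the helper-column approach, this leaves df itself unchanged and needs no temporary column.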
