I have the following dataset and I am trying to sort the items within each row of a column, but I have not found an efficient way to do it. I was hoping someone could point me to a more optimized approach:

|Column_to_Sort|Desired_Output|
|---|---|
| a, x, z,c | a, c, x, z |
| ball, apple | apple, ball |

Essentially, I am trying to rearrange the items in each Column_to_Sort value alphabetically, separated by commas.

I wrote the following code to perform the operation, but I don't believe it is the best way to do it:

def sort_val(x):
    # Split on commas, trim stray spaces, then sort and rejoin into one string
    items = [item.strip() for item in x.split(",")]
    return ", ".join(sorted(items))

df['Desired_Output'] = df['Column_to_Sort'].apply(sort_val)
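
For reference, here is a minimal self-contained sketch that rebuilds the two sample rows from the table and checks the result. The helper name `sort_items` and the list comprehension (used in place of `apply`) are illustrative choices, not something I have benchmarked, and it assumes every cell is a plain comma-separated string:

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({"Column_to_Sort": ["a, x, z,c", "ball, apple"]})


def sort_items(cell: str) -> str:
    """Sort the comma-separated items of one cell alphabetically."""
    # Strip each piece so "z,c" and "z, c" are treated the same way.
    items = [item.strip() for item in cell.split(",")]
    return ", ".join(sorted(items))


# A plain list comprehension over the column; assignment is positional.
df["Desired_Output"] = [sort_items(cell) for cell in df["Column_to_Sort"]]
print(df)
```

Either way the work is done row by row in Python, so the cost should grow roughly linearly with the number of rows.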
  • Your code looks fine to me. Commented Nov 20, 2019 at 14:57
  • Ah, perfect. For some reason I ran it on my machine twice and it was exceptionally slow. I just tried it again and had no issues. I should have waited before posting the question. Sorry about that. Commented Nov 20, 2019 at 14:59
  • Is it possible to close the question? Commented Nov 20, 2019 at 15:00
  • @Raptor776 you can always delete your question if you choose to. Commented Nov 20, 2019 at 15:07

1 Answer

A solution using dot and get_dummies:

s=df['Column_to_Sort'].str.get_dummies(', ').sort_index(axis=1)
s.dot(s.columns+',').str[:-1]
Out[547]: 
0       a,c,x,z
1    apple,ball
dtype: object
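
To make the steps above easier to follow, here is a small sketch that names each intermediate. It assumes the items are consistently separated by `", "` (comma plus space), so the sample row is written as `a, x, z, c`; a piece such as `z,c` with no space would stay a single label. The variable names `dummies`, `labels`, and `result` are mine.

```python
import pandas as pd

# Assumption: every item is separated by exactly ", ".
df = pd.DataFrame({"Column_to_Sort": ["a, x, z, c", "ball, apple"]})

# One 0/1 indicator column per distinct item, with columns in sorted order.
dummies = df["Column_to_Sort"].str.get_dummies(", ").sort_index(axis=1)

# Column labels with a trailing comma: "a,", "apple,", "ball,", ...
labels = dummies.columns + ","

# The dot product multiplies each label by its 0/1 flag and concatenates,
# so each row keeps only its own labels, already in column (sorted) order.
# .str[:-1] drops the final trailing comma.
result = dummies.dot(labels).str[:-1]
print(result)
```

Because the indicator matrix only records presence, repeated items in a row collapse to one, and the matrix has one column per distinct item in the whole column, so this may be memory-heavy when there are many distinct values.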