I have a DataFrame; a sample looks like this:
Type       SKU  Description  FullDescription  Size     Price
Variable   2    Boots        Shoes on sale    XL,S,M
Variation  2.5  Boots XL                      XL       330
Variation  2.6  Boots S                       S        330
Variation  2.7  Boots M                       M        330
Variable   3    Helmet       Helmet Sizes     E42,E41
Variation  3.8  Helmet E42                    E42      89
Variation  3.2  Helmet E41                    E41      89
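For reproducibility, the sample can be built roughly as follows. This is a minimal sketch: the empty cells are assumed to be NaN, and the split of each row into Description and FullDescription is inferred from the printed output further down.

```python
import numpy as np
import pandas as pd

# Sample data; NaN for the blank cells is an assumption.
df = pd.DataFrame({
    "Type": ["Variable", "Variation", "Variation", "Variation",
             "Variable", "Variation", "Variation"],
    "SKU": [2, 2.5, 2.6, 2.7, 3, 3.8, 3.2],
    "Description": ["Boots", "Boots XL", "Boots S", "Boots M",
                    "Helmet", "Helmet E42", "Helmet E41"],
    "FullDescription": ["Shoes on sale", np.nan, np.nan, np.nan,
                        "Helmet Sizes", np.nan, np.nan],
    "Size": ["XL,S,M", "XL", "S", "M", "E42,E41", "E42", "E41"],
    "Price": [np.nan, 330, 330, 330, np.nan, 89, 89],
})
print(df)
```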
I want to sort the rows by Size within each product (and reorder the Size list on the Variable row to match), so the final DataFrame should look like this:
Type       SKU  Description  FullDescription  Size     Price
Variable   2    Boots        Shoes on sale    S,M,XL
Variation  2.6  Boots S                       S        330
Variation  2.7  Boots M                       M        330
Variation  2.5  Boots XL                      XL       330
Variable   3    Helmet       Helmet Sizes     E41,E42
Variation  3.2  Helmet E41                    E41      89
Variation  3.8  Helmet E42                    E42      89
I could just use sort_values(), but I can't find a way to retain the order of Type and SKU, i.e. to keep each Variable row followed by its own Variation rows.
out = df.groupby(df.Type.eq('Variable').cumsum()).apply(
    lambda x: pd.concat([
        x.iloc[[0]].assign(Size=lambda y: y['Size'].str.split(',').str[::-1].str.join(',')),
        x.iloc[1:].iloc[::-1],
    ])
)
I tried the code above, but on the large dataset it prints the Variation rows before the Variable rows, and in reverse order. Note that Size is not limited to values like 'XL,M,S' and 'E42,E41'; it also contains values such as 5XXL and 39mm. Any help would be appreciated.
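One possible direction, as a minimal sketch rather than a definitive solution: start a new group at every Variable row, keep that row first, and order the Variation rows with an explicit size key. The names `LETTER_ORDER`, `size_key` and `sort_sizes` are hypothetical, and the ordering rule (known letter sizes first, everything else such as E41, 39mm or 5XXL in natural order) is an assumption that would need adjusting for the real catalogue. The Type comparison is case-insensitive because the printed output under "Edit:" shows lowercase `variable`, which `eq('Variable')` would never match.

```python
import re
import pandas as pd

# Assumed ranking of letter sizes; extend as needed for the real data.
LETTER_ORDER = ["XXS", "XS", "S", "M", "L", "XL", "XXL", "XXXL"]

def size_key(size):
    """Hypothetical sort key: known letter sizes by rank, the rest in natural order."""
    s = str(size).strip().upper()
    if s in LETTER_ORDER:
        return (0, LETTER_ORDER.index(s), ())
    # Natural order: digit runs compare as numbers, other runs as text.
    chunks = re.findall(r"\d+|\D+", s)
    natural = tuple((0, int(c), "") if c.isdigit() else (1, 0, c) for c in chunks)
    return (1, 0, natural)

def sort_sizes(df):
    # A new group starts at every 'Variable' row (case-insensitive).
    is_parent = df["Type"].str.lower().eq("variable")
    group = is_parent.cumsum()
    out = df.copy()
    # Reorder the comma-separated Size list on each parent row.
    out.loc[is_parent, "Size"] = out.loc[is_parent, "Size"].map(
        lambda s: ",".join(sorted(s.split(","), key=size_key)) if isinstance(s, str) else s
    )
    # Sort positions by (group, parent first, size key); all other columns ride along.
    order = sorted(
        range(len(out)),
        key=lambda i: (group.iloc[i], not is_parent.iloc[i], size_key(out["Size"].iloc[i])),
    )
    return out.iloc[order].reset_index(drop=True)

# Trimmed version of the sample; only the columns the sort needs.
df = pd.DataFrame({
    "Type": ["Variable", "Variation", "Variation", "Variation",
             "Variable", "Variation", "Variation"],
    "SKU": [2, 2.5, 2.6, 2.7, 3, 3.8, 3.2],
    "Size": ["XL,S,M", "XL", "S", "M", "E42,E41", "E42", "E41"],
})
out = sort_sizes(df)
print(out)
```

Using Python's `sorted` on row positions avoids relying on pandas to sort a column of tuples; on a very large frame a precomputed numeric rank column passed to `sort_values` would likely be faster.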
Edit:
grp = df.groupby('Type').cumcount()
   Type       SKU  Description  FullDescription  Size     Price
0  variable   2.0  Boots        Shoes on sale    S,M,XL   NaN
2  variation  2.6  Boots S      NaN              S        330.0
4  variable   3.0  Helmet       Helmet Sizes     E41,E42  NaN
3  variation  2.7  Boots M      NaN              M        330.0
1  variation  2.5  Boots XL     NaN              XL       330.0
6  variation  3.2  Helmet E41   NaN              E41      123.0
5  variation  3.8  Helmet E42   NaN              E42      112.0