0

I have the following Dataframe:

ORD_DATE   Weekday   ALL Sales       ALL Pred          Other Sales   Other Pred  
1/1/2020   2        48386.27        20815269.83        15386.27       5643158.509

I would have approx. 1000 different Sale/Pred combos. I want it this way:

ORD_DATE  Weekday   Sales            Pred          Filter  
1/1/2020   2        48386.27        20815269.83    All  
1/1/2020   2        15386.27        5643158.509    Other

I have tried Transpose but did not get the results I wanted.

3

3 Answers 3

3

To do this, first move all unaltered columns into the index, then split the remaining column names and create a MultiIndex out of their components, and finally stack one of the column levels.

df = pd.DataFrame.from_records(
    [
        {
            "ORD_DATE": "1/1/2020",
            "Weekday": 2,
            "ALL_Sales": 48386.27,
            "ALL_Pred": 20815269.83,
            "Other_Sales": 15386.27,
            "Other_Pred": 5643158.509,
        }
    ]
)

df = df.set_index(["ORD_DATE", "Weekday"])
split_cols = [c.split("_") for c in df.columns]
df.columns = pd.MultiIndex.from_tuples(split_cols, names=["Filter", None])
df = df.stack("Filter").reset_index()
print(df)

   ORD_DATE  Weekday Filter          Pred     Sales
0  1/1/2020        2    ALL  2.081527e+07  48386.27
1  1/1/2020        2  Other  5.643159e+06  15386.27
Sign up to request clarification or add additional context in comments.

Comments

0

I would use pandas.concat with slicing to do this:

import pandas as pd

df = pd.DataFrame({
    'ORD_DATE':['1/1/2020'],
    'Weekday':[2],
    'ALL Sales':[48386.27],
    'ALL Pred':[20815269.83],
    'Other Sales':[15386.27],
    'Other Pred':[5643158.509],
})

df2 = pd.concat([
    df[['ORD_DATE','Weekday','ALL Sales','ALL Pred']].rename(columns={
        'ALL Sales':'Sales',
        'ALL Pred':'Pred',
    }),
    df[['ORD_DATE','Weekday','Other Sales','Other Pred']].rename(columns={
        'Other Sales':'Sales',
        'Other Pred':'Pred',
    }),
])

print(df2)

Output:

   ORD_DATE  Weekday     Sales          Pred
0  1/1/2020        2  48386.27  2.081527e+07
0  1/1/2020        2  15386.27  5.643159e+06

Example in python tutor

Edit With the Filter Column added:

import pandas as pd

df = pd.DataFrame({
    'ORD_DATE':['1/1/2020'],
    'Weekday':[2],
    'ALL Sales':[48386.27],
    'ALL Pred':[20815269.83],
    'Other Sales':[15386.27],
    'Other Pred':[5643158.509],
})

df_all = df[['ORD_DATE','Weekday','ALL Sales','ALL Pred']].rename(columns={
        'ALL Sales':'Sales',
        'ALL Pred':'Pred',
})
df_all['Filter'] = 'All'
df_other = df[['ORD_DATE','Weekday','Other Sales','Other Pred']].rename(columns={
        'Other Sales':'Sales',
        'Other Pred':'Pred',
})
df_other['Filter'] = 'Other'

df2 = pd.concat([df_all,df_other,])

print(df2)

Output:

   ORD_DATE  Weekday     Sales          Pred Filter
0  1/1/2020        2  48386.27  2.081527e+07    All
0  1/1/2020        2  15386.27  5.643159e+06  Other

python tutor link

Comments

0

Adjust the positions of ALL and Other in the column names, and then reshape the dataframe using pandas' wide to long method.

#reshape column position
#allows manipulation with pd wide to long
df.columns = [' '.join(reversed(ent.split()))
              if ('ALL' in ent) or ('Other' in ent) else ent 
              for ent in df.columns ]

df

    ORD_DATE    Weekday Sales ALL   Pred ALL    Sales Other Pred Other
0   1/1/2020    2   48386.27    20815269.83 15386.27    5643158.509

#reshape data using pandas wide to long
res = (pd.wide_to_long(df,
                       stubnames = ['Sales','Pred'],
                       i = ['ORD_DATE','Weekday'],
                       j = 'Filter',
                       sep = ' ',
                       suffix = '[a-zA-Z]+'
                      )
       .reset_index()
      )

res

    ORD_DATE    Weekday Filter   Sales        Pred
0   1/1/2020       2     ALL    48386.27    2.081527e+07
1   1/1/2020       2    Other   15386.27    5.643159e+06

Another route, inspired by @BallPointPen's solution :

#set first two columns as index
df = df.set_index(['ORD_DATE','Weekday'])

#split column on whitespace
#using the str partition creates a multiindex 
#drop the 1st level, since it contains the whitespace, and is not needed
#assign back to columns
df.columns = df.columns.str.partition(' ').droplevel(1)

#stack the dataframe
res = df.stack(0).reset_index().rename(columns={'level_2':'Filter'})
res

    ORD_DATE    Weekday Filter     Pred         Sales
0   1/1/2020    2       ALL     2.081527e+07    48386.27
1   1/1/2020    2       Other   5.643159e+06    15386.27

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.