1

I have to convert the Series below to a binary table whose index is the order ID and whose columns are bread, cheese, eggs, flour, and jam. Each value is either 1 or 0: 1 if that order contains the product and 0 if it doesn't.

OrderNum
1000                   [eggs]
1001                  [bread]
1002     [eggs, bread, flour]
1003       [eggs, jam, bread]
1004                   [eggs]
                ...          
1495     [eggs, bread, flour]
1496    [eggs, cheese, bread]
1497                    [jam]
1498                  [bread]
1499            [eggs, bread]
Length: 500, dtype: object

It should look like this:

      bread  cheese  eggs  flour  jam
1000      0       0     1      0    0
1001      1       0     0      0    0
1002      1       0     1      1    0
1003      1       0     1      0    1
1004      0       0     1      0    0
                  ...
1495      1       0     1      1    0
1496      1       1     1      0    0
1497      0       0     0      0    1
1498      1       0     0      0    0
1499      1       0     1      0    0

Does anyone know how to do this?

2 Answers

4

You can use a combination of explode, pandas.get_dummies and groupby:

pd.get_dummies(df['OrderNum'].explode()).groupby(level=0).sum()
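
As a self-contained illustration of that one-liner, here is a minimal sketch. It assumes the lists live in a Series indexed by OrderNum, as the question's printout suggests; the variable names `orders`, `products` and `binary` and the four sample rows are made up for the example. It also swaps `sum()` for `max()` so the result stays 0/1 even if an item is repeated within one order, and reindexes the columns so that products missing from the sample (cheese here) still appear.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the question's Series.
orders = pd.Series(
    [["eggs"], ["bread"], ["eggs", "bread", "flour"], ["eggs", "jam", "bread"]],
    index=pd.Index([1000, 1001, 1002, 1003], name="OrderNum"),
)
products = ["bread", "cheese", "eggs", "flour", "jam"]

# explode() yields one row per (order, item) pair, get_dummies() one-hot
# encodes the items, and groupby(level=0) collapses back to one row per order.
binary = (
    pd.get_dummies(orders.explode(), dtype=int)
    .groupby(level=0)
    .max()  # max (not sum) keeps values at 0/1 when an item repeats
    .reindex(columns=products, fill_value=0)  # add products absent from the sample
)
print(binary)
```

If a count of items per order is wanted instead of a 0/1 flag, `sum()` as in the answer is the appropriate aggregation.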

-1
  • Simulate the data: a list of items for each OrderNo.
  • Use a dict comprehension to turn each list into a Series; after that it is simple isna() logic.
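import pandas as pd
import numpy as np

items = ["bread", "cheese", "eggs", "flour", "jam"]

# Simulate 20 orders, each holding a random set of distinct items.
df = pd.DataFrame({"OrderNo": range(1000, 1020),
                   "Items": [np.unique(np.random.choice(items, 8)) for _ in range(20)]})

# For each order, build a Series keyed by its items; ~isna() marks those items True.
# Items absent from an order come out as NaN once the rows are aligned, hence fillna(False).
df.join(df["Items"].apply(lambda l: ~pd.Series({i: i for i in l}).isna()).fillna(False).astype(int))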
OrderNo  Items                                    bread  cheese  eggs  flour  jam
1000     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1001     ['bread' 'cheese' 'eggs' 'jam']              1       1     1      0    1
1002     ['bread' 'cheese' 'flour' 'jam']             1       1     0      1    1
1003     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1004     ['cheese' 'eggs' 'flour' 'jam']              0       1     1      1    1
1005     ['bread' 'cheese' 'flour' 'jam']             1       1     0      1    1
1006     ['bread' 'eggs' 'jam']                       1       0     1      0    1
1007     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1008     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1009     ['bread' 'eggs' 'flour']                     1       0     1      1    0
1010     ['cheese' 'eggs' 'flour' 'jam']              0       1     1      1    1
1011     ['cheese' 'eggs' 'flour' 'jam']              0       1     1      1    1
1012     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1013     ['bread' 'eggs' 'flour' 'jam']               1       0     1      1    1
1014     ['bread' 'cheese' 'eggs' 'jam']              1       1     1      0    1
1015     ['bread' 'eggs' 'flour' 'jam']               1       0     1      1    1
1016     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1017     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1018     ['bread' 'cheese' 'eggs' 'flour' 'jam']      1       1     1      1    1
1019     ['bread' 'cheese' 'flour' 'jam']             1       1     0      1    1
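
The output above keeps OrderNo and Items as ordinary columns, whereas the question asks for a table indexed by order ID. A possible variant of the same dict-comprehension idea, sketched here on a made-up three-row frame (the names `df` and `result` and the sample rows are assumptions for illustration), builds one dict per order and lets the DataFrame constructor align the keys into columns:

```python
import pandas as pd

# Hypothetical small frame in the same shape as the simulated data.
df = pd.DataFrame({
    "OrderNo": [1000, 1001, 1002],
    "Items": [["eggs"], ["bread"], ["eggs", "bread", "flour"]],
})

# One dict per order mapping each item present to 1; keys missing from an
# order become NaN when the dicts are aligned, so they are filled with 0.
result = (
    pd.DataFrame(
        [{item: 1 for item in order_items} for order_items in df["Items"]],
        index=pd.Index(df["OrderNo"], name="OrderNo"),
    )
    .fillna(0)
    .astype(int)
    .sort_index(axis=1)  # alphabetical column order
)
print(result)
```

Only products that occur in at least one order get a column here; a product that never appears would need to be added explicitly, for example with reindex.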
