3

I am trying to write a small Python application that creates a CSV file containing data for a recipe system.

Imagine the following structure of Excel data:

Manufacturer  Product    Data 1  Data 2  Data 3
Test 1        Product 1  1       2       3
Test 1        Product 2  4       5       6
Test 2        Product 1  1       2       3
Test 3        Product 1  1       2       3
Test 3        Product 1  4       5       6
Test 3        Product 1  7       8       9

When merged, I would like the data to be displayed in the following format:

Test 1  Product 1   1   2   3   0   0   0   0   0   0
Test 1  Product 2   4   5   6   0   0   0   0   0   0
Test 2  Product 1   1   2   3   0   0   0   0   0   0
Test 3  Product 1   1   2   3   4   5   6   7   8   9

Any help would be gratefully received. So far I can read the pandas dataset and convert it to a CSV.

Regards Lee

1
  • Would anyone be open to a private message to help further, if I were to send a spreadsheet containing example data? I am still struggling despite the amazing help you are all offering. Commented May 22, 2018 at 15:52

3 Answers

2

Use melt, groupby, pd.Series, and unstack:

(df.melt(['Manufacturer','Product'])
  .groupby(['Manufacturer','Product'])['value']
  .apply(lambda x: pd.Series(x.tolist()))
  .unstack(fill_value=0)
  .reset_index())

Output:

  Manufacturer    Product  0  1  2  3  4  5  6  7  8
0       Test 1  Product 1  1  2  3  0  0  0  0  0  0
1       Test 1  Product 2  4  5  6  0  0  0  0  0  0
2       Test 2  Product 1  1  2  3  0  0  0  0  0  0
3       Test 3  Product 1  1  4  7  2  5  8  3  6  9
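
Note that for Test 3 the values above come out column by column (1 4 7 2 5 8 ...), because melt stacks one Data column at a time. If the row-by-row order from the question is needed, one possible variant is to keep the original row labels, sort on them with a stable sort, and number the values within each group. This is only a sketch: it assumes pandas 1.1 or later (for the ignore_index argument of melt), rebuilds the sample frame by hand, and the names long, pos, and wide are purely illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for this sketch
df = pd.DataFrame({
    "Manufacturer": ["Test 1", "Test 1", "Test 2", "Test 3", "Test 3", "Test 3"],
    "Product": ["Product 1", "Product 2", "Product 1",
                "Product 1", "Product 1", "Product 1"],
    "Data 1": [1, 4, 1, 1, 4, 7],
    "Data 2": [2, 5, 2, 2, 5, 8],
    "Data 3": [3, 6, 3, 3, 6, 9],
})

cols = ["Manufacturer", "Product"]

# Keep the original row labels, then sort on them with a stable sort so the
# values of each source row sit together in Data 1, Data 2, Data 3 order
long = df.melt(cols, ignore_index=False).sort_index(kind="mergesort")

# Number the values within each (Manufacturer, Product) group: 0, 1, 2, ...
long["pos"] = long.groupby(cols).cumcount()

# Spread the positions into columns, padding short groups with 0
wide = long.set_index(cols + ["pos"])["value"].unstack(fill_value=0)
print(wide.reset_index())
```

With this ordering the Test 3 row should read 1 to 9 from left to right, matching the layout asked for in the question.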

2

With groupby

df.groupby(['Manufacturer','Product']).agg(tuple).sum(1).apply(pd.Series).fillna(0)
Out[85]: 
                         0    1    2    3    4    5    6    7    8
Manufacturer Product                                              
Test1        Product1  1.0  2.0  3.0  0.0  0.0  0.0  0.0  0.0  0.0
             Product2  4.0  5.0  6.0  0.0  0.0  0.0  0.0  0.0  0.0
Test2        Product1  1.0  2.0  3.0  0.0  0.0  0.0  0.0  0.0  0.0
Test3        Product1  1.0  4.0  7.0  2.0  5.0  8.0  3.0  6.0  9.0
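
As far as I can tell, this works because .agg(tuple) leaves one tuple per Data column in each group, and .sum(1) then adds those tuples together, which for Python tuples means concatenation. A minimal plain-Python sketch of that step for the Test 3 group (the variable names are made up for illustration) shows why the values come out column by column, and how interleaving with zip would give row order instead:

```python
# One tuple per Data column, as .agg(tuple) would hold for Test 3 / Product 1
per_column = {"Data 1": (1, 4, 7), "Data 2": (2, 5, 8), "Data 3": (3, 6, 9)}

# Adding tuples concatenates them, which is what the row-wise sum relies on
column_major = sum(per_column.values(), ())

# zip(*...) regroups the same values row by row
row_major = tuple(v for row in zip(*per_column.values()) for v in row)

print(column_major)  # (1, 4, 7, 2, 5, 8, 3, 6, 9)
print(row_major)     # (1, 2, 3, 4, 5, 6, 7, 8, 9)
```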

2
cols = ['Manufacturer', 'Product']
d = df.set_index(cols + [df.groupby(cols).cumcount()]).unstack(fill_value=0)
d

Gets me

                       Data 1       Data 2       Data 3      
                            0  1  2      0  1  2      0  1  2
Manufacturer Product                                         
Test 1       Product 1      1  0  0      2  0  0      3  0  0
             Product 2      4  0  0      5  0  0      6  0  0
Test 2       Product 1      1  0  0      2  0  0      3  0  0
Test 3       Product 1      1  4  7      2  5  8      3  6  9

Followed up with

d.sort_index(axis=1, level=1).pipe(lambda d: d.set_axis(range(d.shape[1]), axis=1).reset_index())

  Manufacturer    Product  0  1  2  3  4  5  6  7  8
0       Test 1  Product 1  1  2  3  0  0  0  0  0  0
1       Test 1  Product 2  4  5  6  0  0  0  0  0  0
2       Test 2  Product 1  1  2  3  0  0  0  0  0  0
3       Test 3  Product 1  1  2  3  4  5  6  7  8  9

Or

cols = ['Manufacturer', 'Product']
pd.Series({
    n: d.values.ravel() for n, d in df.set_index(cols).groupby(cols)
}).apply(pd.Series).fillna(0).astype(int).rename_axis(cols).reset_index()

  Manufacturer    Product  0  1  2  3  4  5  6  7  8
0       Test 1  Product 1  1  2  3  0  0  0  0  0  0
1       Test 1  Product 2  4  5  6  0  0  0  0  0  0
2       Test 2  Product 1  1  2  3  0  0  0  0  0  0
3       Test 3  Product 1  1  2  3  4  5  6  7  8  9
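
Since the goal in the question is a CSV file, a wide frame like this can be written out with to_csv. Here is a small sketch, assuming the sample data from the question and using keyword arguments so it runs on recent pandas versions. It writes to a string rather than a real file so the result is easy to inspect; passing a path such as 'recipes.csv' (a made-up name) as the first argument would write to disk instead.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for this sketch
df = pd.DataFrame({
    "Manufacturer": ["Test 1", "Test 1", "Test 2", "Test 3", "Test 3", "Test 3"],
    "Product": ["Product 1", "Product 2", "Product 1",
                "Product 1", "Product 1", "Product 1"],
    "Data 1": [1, 4, 1, 1, 4, 7],
    "Data 2": [2, 5, 2, 2, 5, 8],
    "Data 3": [3, 6, 3, 3, 6, 9],
})

cols = ["Manufacturer", "Product"]

# Number repeated (Manufacturer, Product) rows, then spread them into columns
wide = (df.set_index(cols + [df.groupby(cols).cumcount()])
          .unstack(fill_value=0)
          .sort_index(axis=1, level=1))

# Flatten the two-level column labels to 0..8
wide.columns = range(wide.shape[1])

# With no path given, to_csv returns the CSV text instead of writing a file
csv_text = wide.reset_index().to_csv(index=False, header=False)
print(csv_text)
```

Dropping header=False would keep a header row with the Manufacturer and Product names followed by the numeric column labels.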

With defaultdict and itertools.count

from itertools import count
from collections import defaultdict

c = defaultdict(count)
pd.Series({(
    m, p, next(c[(m, p)])): v
    for _, m, p, *V in df.itertuples()
    for v in V
}).unstack(fill_value=0)

                  0  1  2  3  4  5  6  7  8
Test 1 Product 1  1  2  3  0  0  0  0  0  0
       Product 2  4  5  6  0  0  0  0  0  0
Test 2 Product 1  1  2  3  0  0  0  0  0  0
Test 3 Product 1  1  2  3  4  5  6  7  8  9
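
The defaultdict(count) part can look opaque at first: each new key gets its own itertools.count iterator, so next(c[key]) hands out 0, 1, 2, ... separately for every (Manufacturer, Product) pair, and those numbers become the column positions. A tiny standalone sketch (the keys list is invented for illustration):

```python
from collections import defaultdict
from itertools import count

# Every missing key is initialised with a fresh counter starting at 0
c = defaultdict(count)

# Hypothetical sequence of (Manufacturer, Product) keys, with repeats
keys = [
    ("Test 1", "Product 1"),
    ("Test 3", "Product 1"),
    ("Test 3", "Product 1"),
    ("Test 1", "Product 1"),
]

# Each key advances only its own counter
positions = [(key, next(c[key])) for key in keys]
for key, pos in positions:
    print(key, pos)
```

Each key counts up independently of the others, which is what lets the dict comprehension above assign a running column number to every value in a group.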
