1

I am working with Google OR-Tools, and one of its examples uses the data structure shown below. I would like to build this data structure from an Excel sheet.

This is the given data structure:

jobs = [[[(3, 0), (1, 1), (5, 2)], 
         [(2, 0), (4, 1), (6, 2)],
         [(2, 0), (3, 1), (1, 2)]],
        [[(2, 0), (3, 1), (4, 2)], 
         [(1, 0), (5, 1), (4, 2)],
         [(2, 0), (1, 1), (4, 2)]], 
        [[(2, 0), (1, 1), (4, 2)],
         [(2, 0), (3, 1), (4, 2)],
         [(3, 0), (1, 1), (5, 2)]]]
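
To make the nesting explicit, here is a minimal sketch of how I read it: the outer list holds one entry per job, each job holds one list per task, and each task holds one tuple per machine. The labels value and machine are my own assumption (the example does not name the tuple fields), and shape and flat exist only for illustration.

```python
jobs = [[[(3, 0), (1, 1), (5, 2)],
         [(2, 0), (4, 1), (6, 2)],
         [(2, 0), (3, 1), (1, 2)]],
        [[(2, 0), (3, 1), (4, 2)],
         [(1, 0), (5, 1), (4, 2)],
         [(2, 0), (1, 1), (4, 2)]],
        [[(2, 0), (1, 1), (4, 2)],
         [(2, 0), (3, 1), (4, 2)],
         [(3, 0), (1, 1), (5, 2)]]]

# Nesting: jobs[job][task] is a list of (value, machine) tuples.
# "value" and "machine" are assumed labels, not names taken from the example.
shape = (len(jobs), len(jobs[0]), len(jobs[0][0]))
print(shape)

# Flatten into (job, task, machine, value) records, in the order nested loops would visit them.
flat = [
    (job_id, task_id, machine, value)
    for job_id, job in enumerate(jobs)
    for task_id, task in enumerate(job)
    for value, machine in task
]
print(flat[0])
```

So whatever is read from Excel needs to end up as plain nested lists of tuples with this same three-level shape.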

What I would like to do is build jobs from an Excel sheet with the data laid out as follows:

Job Task    M1  M2  M3
1    1      3   1   5
1    2      2   4   6
1    3      2   3   1
2    1      2   3   4
2    2      2   5   4
2    3      2   1   4
3    1      2   3   4
3    2      3   1   5

2 Comments

  • What have you tried so far? Did you consider pandas.read_excel? Commented Apr 1, 2019 at 13:45
  • Yes, I tried pandas.read_excel (which works fine). But I am not sure how to convert the result into the data type shown above (a list of lists of tuples?). Commented Apr 1, 2019 at 14:07

2 Answers

1

You should reorganize all your data, grouping by Job. For example:

import pandas as pd

df = pd.read_excel('bb.xlsx')

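job_ids = sorted(set(df['Job']))  # unique job numbers; sorted, because a set has no guaranteed order
# One list per job, one list per task (row), one (cell value, machine index) tuple per column.
# int() turns the numpy integers from pandas into plain Python ints.
result = [[[(int(df['M1'][i]), 0), (int(df['M2'][i]), 1), (int(df['M3'][i]), 2)]
           for i in df.index if df['Job'][i] == job] for job in job_ids]
print(result)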

WARNING: the result is not exactly the structure you wrote. I think you mistyped some of the data in the sheet. Tell me if I am wrong.


4 Comments

Yes, you are right. M1 of Task 2 in Job 2 should be '1' instead of '2'. I tried to run your code, but unfortunately I get a syntax error:
    print [[[ (df['M1'][i],0), (df['M2'][i],1), (df['M3'][i],2) ] for i in df.index if df['Job'][i] == job] for job in jobs]
    ^ SyntaxError: invalid syntax
Are you using Python 2 or 3? In Python 3, print is a function, so the syntax is different: print(...)
Using print ([[ (df['M1'][i],0), (df['M2'][i],1), (df['M3'][i],2) ] for i in df.index if df['Job'][i] == job] for i in jobs) returns <generator object <genexpr> at 0x00000000094E4750>. How can I get the data structure shown in the initial question? It is used inside a couple of loops, for example.
I have edited my answer to work in both Python 2 and 3. You missed some brackets.
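
A minimal sketch of why the generator object appears: a comprehension wrapped only in parentheses is a generator expression, while one wrapped in square brackets builds a list straight away. The rows variable is a hypothetical stand-in for three rows of the sheet, not part of the original code.

```python
# (job, M1, M2, M3) for three rows of the sheet in the question
rows = [(1, 3, 1, 5), (1, 2, 4, 6), (2, 2, 3, 4)]

# Parentheses only: a generator expression, nothing is built yet.
lazy = ([(m1, 0), (m2, 1), (m3, 2)] for job, m1, m2, m3 in rows)
print(lazy)  # prints a generator object, not the data

# Square brackets: a list comprehension, the nested list is built immediately.
eager = [[(m1, 0), (m2, 1), (m3, 2)] for job, m1, m2, m3 in rows]
print(eager)

# A generator can also be turned into a list after the fact.
materialised = list(lazy)
print(materialised == eager)
```

Since the structure is indexed and looped over several times, the list form is the one to keep; a generator can only be consumed once.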
0

Here is another approach, using pandas' groupby API:

import pandas as pd

df = pd.read_excel('bb.xlsx')

# One list per job (group), one list per task (row), one tuple per machine column
result = [[[(row['M1'], 0), (row['M2'], 1), (row['M3'], 2)]
           for idx, row in grpdf.iterrows()]
          for grpname, grpdf in df.groupby('Job')]
print(result)
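
To check the grouping without the Excel file, the sketch below builds the DataFrame in memory from the first six rows of the sheet in the question. It is one possible variant, not the only way to do it: it assumes that every column other than Job and Task is a machine column (machine_cols is a hypothetical helper name), so the tuples are generated for any number of machines, and it converts the values to plain Python ints.

```python
import pandas as pd

# Stand-in for pd.read_excel(...): the first six rows of the sheet from the question.
df = pd.DataFrame(
    {
        "Job":  [1, 1, 1, 2, 2, 2],
        "Task": [1, 2, 3, 1, 2, 3],
        "M1":   [3, 2, 2, 2, 2, 2],
        "M2":   [1, 4, 3, 3, 5, 1],
        "M3":   [5, 6, 1, 4, 4, 4],
    }
)

# Assumption: every column other than Job and Task is a machine column.
machine_cols = [c for c in df.columns if c not in ("Job", "Task")]

result = [
    [
        # One (value, machine index) tuple per machine column of this row
        [(int(value), machine) for machine, value in enumerate(row)]
        for row in group[machine_cols].itertuples(index=False)
    ]
    # Sort first so tasks stay in order inside each job group
    for _, group in df.sort_values(["Job", "Task"]).groupby("Job")
]
print(result)
```

Sorting by Job and Task first means the output does not depend on the row order in the sheet.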
