I've got a dictionary that looks like this:

data = {'function_name': ['func1', 'func2', 'func3'],
        'argument': [('func1_arg1', 'func1_arg2'), 
                     ('func2_arg1',), 
                     ('func3_arg1', 'func3_arg2', 'func3_arg3')],
        'A': ['value_a1', 'value_a2', 'value_a3'],
        'B': 'b',
        'types': [('func1_type1', 'func1_type2'), 
                  ('func2_type1',),
                  ('func3_type1', 'func3_type2', 'func3_type3')]}

I'd like to convert it into a pandas DataFrame and make it look like this:

function_name    argument    types         A          B

func1            func1_arg1  func1_type1   value_a1   b
func1            func1_arg2  func1_type2   value_a1   b
func2            func2_arg1  func2_type1   value_a2   b
func3            func3_arg1  func3_type1   value_a3   b
func3            func3_arg2  func3_type2   value_a3   b
func3            func3_arg3  func3_type3   value_a3   b

As follows from here, if there were only one column of tuples, I would do this:

import pandas as pd


data_frame = pd.DataFrame(data)
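# axis is passed by keyword; the positional form drop('level_3', 1) was removed in pandas 2.0
new_frame = data_frame.set_index(['function_name','A','B'])['argument'].apply(pd.Series).stack().to_frame('argument').reset_index().drop('level_3', axis=1)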

But how do I go about it if I've got several columns of tuples?

EDIT:

There seems to be a small problem with the accepted solution. Namely, if a tuple column consists entirely of Nones or of empty tuples, it gets dropped in the process of forming new_frame. Is it possible to make pandas keep such columns?

The initial data looks like this:

data = {'function_name': ['func1', 'func2', 'func3'],
        'argument': [('func1_arg1', 'func1_arg2'), 
                     ('func2_arg1',), 
                     ('func3_arg1', 'func3_arg2', 'func3_arg3')],
        'A': ['value_a1', 'value_a2', 'value_a3'],
        'B': 'b',
        'types': [('func1_type1', 'func1_type2'), 
                  ('func2_type1',),
                  ('func3_type1', 'func3_type2', 'func3_type3')],
        'info': [(None, None), (None,), (None, None, None)]}

The 'info' column could also be [(), (), ()]; the outcome would still be the same.

2 Answers

Since there are multiple columns to expand, I don't think this can be done in a single line, but you can use apply with the pd.DataFrame constructor. The default value of dropna for the stack method is True, so set it to False to keep the None values, i.e.:

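index = ['function_name','A','B']
# The parentheses let the method chain span several lines.
new_frame = (data_frame.set_index(index)
             .apply(lambda x: pd.DataFrame(x.values.tolist()).stack(dropna=False), axis=1)
             .stack(dropna=False).reset_index().drop('level_3', axis=1))
# Restore the column names: index columns first, then the expanded ones.
new_frame.columns = index + [x for x in data_frame.columns if x not in index]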
  function_name         A  B    argument        types
0         func1  value_a1  b  func1_arg1  func1_type1
1         func1  value_a1  b  func1_arg2  func1_type2
2         func2  value_a2  b  func2_arg1  func2_type1
3         func3  value_a3  b  func3_arg1  func3_type1
4         func3  value_a3  b  func3_arg2  func3_type2
5         func3  value_a3  b  func3_arg3  func3_type3
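
For readers on a newer pandas, here is a minimal alternative sketch. It is not the method used in this answer: it assumes pandas 1.3 or later, where DataFrame.explode accepts a list of columns. It also assumes that, within each row, every tuple column holds the same number of elements, as in the question's data. The name tuple_columns is only illustrative.

```python
import pandas as pd

data = {'function_name': ['func1', 'func2', 'func3'],
        'argument': [('func1_arg1', 'func1_arg2'),
                     ('func2_arg1',),
                     ('func3_arg1', 'func3_arg2', 'func3_arg3')],
        'A': ['value_a1', 'value_a2', 'value_a3'],
        'B': 'b',
        'types': [('func1_type1', 'func1_type2'),
                  ('func2_type1',),
                  ('func3_type1', 'func3_type2', 'func3_type3')]}

data_frame = pd.DataFrame(data)

# Illustrative name: the columns whose cells hold tuples.
tuple_columns = ['argument', 'types']

# Explode the tuple columns in lockstep; the other columns are repeated.
# ignore_index=True renumbers the resulting rows from 0.
new_frame = data_frame.explode(tuple_columns, ignore_index=True)

print(new_frame)
```

The column order of the original frame is kept, so no renaming step is needed afterwards.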

With three columns to expand:

data = {'function_name': ['func1', 'func2', 'func3'],
    'argument': [('func1_arg1', 'func1_arg2'), 
                 ('func2_arg1',), 
                 ('func3_arg1', 'func3_arg2', 'func3_arg3')],
    'A': ['value_a1', 'value_a2', 'value_a3'],
    'B': 'b',
    'types': [('func1_type1', 'func1_type2'), 
              ('func2_type1',),
              ('func3_type1', 'func3_type2', 'func3_type3')],
    'info': [(None, None), (None,), (None, None, None)]}
  function_name         A  B    argument  info        types
0         func1  value_a1  b  func1_arg1  None  func1_type1
1         func1  value_a1  b  func1_arg2  None  func1_type2
2         func2  value_a2  b  func2_arg1  None  func2_type1
3         func3  value_a3  b  func3_arg1  None  func3_type1
4         func3  value_a3  b  func3_arg2  None  func3_type2
5         func3  value_a3  b  func3_arg3  None  func3_type3

Hope it helps.
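
For the case from the question's edit where 'info' is [(), (), ()], a version-independent sketch in plain Python may be easier to reason about than stack. This is an assumption-laden illustration rather than part of the answer above: it pads shorter or empty tuples with None via itertools.zip_longest, and the names tuple_columns, other_columns and rows are hypothetical.

```python
from itertools import zip_longest

import pandas as pd

data = {'function_name': ['func1', 'func2', 'func3'],
        'argument': [('func1_arg1', 'func1_arg2'),
                     ('func2_arg1',),
                     ('func3_arg1', 'func3_arg2', 'func3_arg3')],
        'A': ['value_a1', 'value_a2', 'value_a3'],
        'B': 'b',
        'types': [('func1_type1', 'func1_type2'),
                  ('func2_type1',),
                  ('func3_type1', 'func3_type2', 'func3_type3')],
        'info': [(), (), ()]}

data_frame = pd.DataFrame(data)

# Illustrative names: columns holding tuples, and all remaining columns.
tuple_columns = ['argument', 'types', 'info']
other_columns = [c for c in data_frame.columns if c not in tuple_columns]

rows = []
for record in data_frame.to_dict('records'):
    # zip_longest walks the tuples in parallel and fills missing
    # positions (shorter or empty tuples) with None.
    for values in zip_longest(*(record[c] for c in tuple_columns)):
        row = {c: record[c] for c in other_columns}
        row.update(zip(tuple_columns, values))
        rows.append(row)

# Passing columns= keeps 'info' even though every value in it is None.
new_frame = pd.DataFrame(rows, columns=list(data_frame.columns))

print(new_frame)
```

One design consequence of this sketch: a row whose tuple columns are all empty produces no output rows at all.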

6 Comments

Yep, seems like it works like a charm! Thank you ever so much for your help!
I've just come across an issue with the solution. If one of the tuple columns consists entirely of Nones, it gets dropped in the process of forming new_frame, and the second line errors out with "Length mismatch: expected axis has n elements, new values have n + k elements", where k is the number of dropped (all-None) columns. I tried to resolve it but couldn't. Is it possible to avoid dropping the columns if they consist entirely of Nones?
Can you update the data dict with this case?
Done! I thought it would be better to write it below my initial question cos other people answered it.
@BigBear The default value of dropna for the stack method is True, so set it to False. Hope it helps.

Consider nested list and dict comprehensions with the DataFrame constructor, if all items are of equal length (i.e., 3). The only challenge is the scalar item 'B': 'b', which can be assigned at the end if known in advance:

# One small DataFrame per function; the scalar 'B' (length 1) is skipped here.
dfs = [pd.DataFrame({k: v[i] for k, v in data.items() if len(v) > 1})
       for i in range(len(data['function_name']))]

df = pd.concat(dfs).reset_index(drop=True).assign(B='b') 

print(df)
#           A    argument function_name        types  B
# 0  value_a1  func1_arg1         func1  func1_type1  b
# 1  value_a1  func1_arg2         func1  func1_type2  b
# 2  value_a2  func2_arg1         func2  func2_type1  b
# 3  value_a3  func3_arg1         func3  func3_type1  b
# 4  value_a3  func3_arg2         func3  func3_type2  b
# 5  value_a3  func3_arg3         func3  func3_type3  b

1 Comment

Can you try your solution with the three columns to be expanded, using the data I provided in my answer? Your solution demands the types column to be of equal length.
