I'm working on a problem involving pandas in Python 3.4. I'm stuck on one small part, which involves reorganizing my DataFrames. To be more specific:

I have a table called "model" in the following format:

[Image: Model Input]

I wish to get output in a form equivalent to:

[Image: Desired Output]

I have looked into "Convert a python dataframe with multiple rows into one row using python pandas?" and "How to combine multiple rows into a single row with pandas". I am confused about whether I should use groupby or a pivot table. I tried both, but I either get a KeyError or output that is not in the format I want. Is there a specific library that can help achieve the above task?

2
  • 1
    Please read up on how to write a good pandas question; images are not very useful. Commented Feb 27, 2018 at 1:34
  • I apologise. Thank you for the resource. Commented Feb 27, 2018 at 1:53

1 Answer

You can use groupby and apply:

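import pandas as pd

# df is the input table from the question (called "model" there).
# Every column except the grouping key 'ID' holds values to flatten.
value_cols = [c for c in df.columns if c != 'ID']
num_V = len(value_cols)
# Largest number of rows sharing one ID; this sets the width of the output.
max_row = df.groupby('ID').ID.count().max()
df2 = (
        df.groupby('ID')
        # Flatten each group's value columns, row by row, into one 1-D array.
        .apply(lambda x: x[value_cols].values.reshape(-1))
        # Expand each array into columns; shorter groups are padded with NaN.
        .apply(pd.Series)
        .fillna(0)
)

# One name per (row position j, value column i), in the flattened order.
df2.columns = ['V{}_{}_{}'.format(i+1,j,i) for j in range(max_row) for i in range(num_V)]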
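
Since the input table was only posted as an image, here is a minimal, self-contained sketch of the same idea (group by ID, flatten each group's rows, pad short groups with 0) on made-up data. The sample frame, the column names V1 and V2, the helper name flatten_groups, and the "column_position" naming scheme are all assumptions for illustration. It also loops over the groups instead of calling apply, which is a swapped-in choice that keeps groups of different sizes easy to handle.

```python
import pandas as pd

def flatten_groups(df, key="ID"):
    """Hypothetical helper: one output row per key, with the group's rows laid side by side."""
    # Select value columns by name so the key column never leaks into the output.
    value_cols = [c for c in df.columns if c != key]
    keys, rows = [], []
    for k, group in df.groupby(key):
        keys.append(k)
        # Row-major flatten: all values of row 0, then row 1, and so on.
        rows.append(group[value_cols].values.reshape(-1).tolist())
    # Lists of unequal length are padded with NaN by the constructor; replace the padding with 0.
    wide = pd.DataFrame(rows, index=pd.Index(keys, name=key)).fillna(0)
    # Derive the number of row positions from the actual width to avoid a length mismatch.
    max_row = len(wide.columns) // len(value_cols)
    wide.columns = ["{}_{}".format(col, j) for j in range(max_row) for col in value_cols]
    return wide

# Made-up sample data (assumption): ID 1 has three rows, ID 2 has two.
sample = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "V1": [10, 11, 12, 20, 21],
    "V2": [100, 110, 120, 200, 210],
})
result = flatten_groups(sample)
print(result)
```

Because the column labels are built from the width of the flattened frame, the number of names always matches the number of columns, whatever the number of value columns is.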

8 Comments

I typed the code exactly as you described, and I get a syntax error at the f'V{i+1}_{j}_{i}' line. I'm running Python 3.4 in a Linux terminal.
@VinayAshokkumar, that's because your Python version is lower than 3.6 and does not support that syntax. Please try again with the updated answer.
The syntax is accepted now, but I get a new TypeError: set_axis() got multiple values for argument 'axis'. When I run the program without the set_axis() call, the tables are reformatted structurally the way I wanted. Thank you for that. Is there a way to get around this error?
Again, that's because of the version incompatibility. I don't have a lower version installed, so please try again now.
I'm getting a new error: Length mismatch: Expected axis has 30 elements, new values have 24 elements. But I think I know why. When I output the table earlier (without set_axis()), I noticed that after every four columns the ID column (holding the values 1 and 2) is repeated. That is why the result has 30 columns when it should have 24 (ID is printed unnecessarily 6 times). Is there a way to get rid of the duplicate ID columns and show only the numerical results? Thank you for your efforts so far. Really appreciate the help!
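
One way to keep ID out of the reshaped values altogether is to move it into the index before reshaping. The sketch below is an alternative to the groupby/apply answer, not the answerer's method: it numbers the rows within each ID with cumcount and then uses unstack. The sample data, the column names A and B, and the "column_position" label format are assumptions for illustration.

```python
import pandas as pd

# Made-up sample data (assumption) standing in for the "model" table.
model = pd.DataFrame({
    "ID": [1, 1, 2],
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
})

# Number the rows within each ID: 0, 1, 2, ...
position = model.groupby("ID").cumcount()

wide = (
    model.set_index(["ID", position])  # ID leaves the columns and becomes an index level
    .unstack(fill_value=0)             # one column per (value column, row position)
    .sort_index(axis=1, level=1)       # order by row position so each source row stays together
)
# Flatten the two-level column labels into plain strings such as "A_0".
wide.columns = ["{}_{}".format(col, pos) for col, pos in wide.columns]
print(wide)
```

Since ID is part of the index here, it cannot reappear among the value columns, and the column labels are generated from the columns that actually exist, so their count always matches.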