I have a dataframe where the rows are different cases and the columns are possible events, in the form:

df_prob
index  colA colB colC ColD
  0     NaN  0.9  NaN  0.1
  1     NaN  NaN  0.3  0.7
  2       1  NaN  NaN  NaN

I need to build a df where each case is listed with the possible events for that case:

df_order
index case event prob
  0    0    colB  0.9
  1    0    ColD  0.1
  2    1    colC  0.3
  3    1    ColD  0.7
  4    2    colA   1


The added difficulty is that the matrix is very sparse, so most of its values are NaN. I am looking for a method that avoids loops, since the df is approximately 30000 x 30000.

    Try with df.stack().reset_index() Commented May 24, 2019 at 4:00

1 Answer

Use stack and then reset the index:

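(df.set_index('index')   # skip if 'index' is already the DataFrame index
   .stack()              # wide -> long, one row per (case, event) cell
   .dropna()             # keep only the populated cells
   .reset_index()
   .set_axis(['case', 'event', 'prob'], axis=1))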

   case event  prob
0     0  colB   0.9
1     0  ColD   0.1
2     1  colC   0.3
3     1  ColD   0.7
4     2  colA   1.0
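
For anyone who wants to run this end to end, here is a minimal, self-contained sketch built on the sample data from the question. It assumes that `index` is the DataFrame's own index rather than a column. The helper names `to_long_stack` and `to_long_numpy` are made up for illustration. The second helper is an alternative that is not part of the original answer: it finds the non-NaN cells with a boolean mask and assumes every column is numeric.

```python
import numpy as np
import pandas as pd

# Sample data from the question; 'index' is assumed to be the DataFrame index.
df_prob = pd.DataFrame(
    {
        "colA": [np.nan, np.nan, 1.0],
        "colB": [0.9, np.nan, np.nan],
        "colC": [np.nan, 0.3, np.nan],
        "ColD": [0.1, 0.7, np.nan],
    }
)


def to_long_stack(df):
    """Reshape wide to long with stack(), dropping NaN cells explicitly."""
    long = df.stack().dropna().reset_index()
    long.columns = ["case", "event", "prob"]
    return long


def to_long_numpy(df):
    """Build the same long table from a boolean mask (numeric columns assumed)."""
    values = df.to_numpy()
    # Row and column positions of the non-NaN cells, in row-major order.
    rows, cols = np.nonzero(~np.isnan(values))
    return pd.DataFrame(
        {
            "case": df.index.to_numpy()[rows],
            "event": df.columns.to_numpy()[cols],
            "prob": values[rows, cols],
        }
    )


df_order = to_long_stack(df_prob)
print(df_order)
```

The explicit `dropna()` makes the result independent of whether `stack()` discards NaN cells on its own, which depends on the pandas version and its arguments. On a 30000 x 30000 frame the dense float64 values alone take roughly 7.2 GB, so it is worth benchmarking both helpers on a slice of the real data before running either on the full frame.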