I've got a data frame structured as below:

import pandas as pd

dict1 = {'id': {0: 11, 1: 12, 2: 13, 3: 14, 4: 15, 5: 16, 6: 19, 7: 18, 8: 17},
 'var1': {0: 20.272108843537413,
  1: 21.088435374149658,
  2: 20.68027210884354,
  3: 23.945578231292515,
  4: 22.857142857142854,
  5: 21.496598639455787,
  6: 39.18367346938776,
  7: 36.46258503401361,
  8: 34.965986394557824},
 'var2': {0: 27.731092436974773,
  1: 43.907563025210074,
  2: 55.67226890756303,
  3: 62.81512605042017,
  4: 71.63865546218487,
  5: 83.40336134453781,
  6: 43.48739495798319,
  7: 59.243697478991606,
  8: 67.22689075630252},
 'var3': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2}}
ex = pd.DataFrame(dict1).set_index('id')

id is already set as the index, and I would now like to build a MultiIndex from var3 and id. The following attempt fails with a KeyError, because id is no longer a column:

ex.set_index(['var3', 'id'])

How can I set a MultiIndex directly from the existing Index? I know I can call reset_index first and then set a MultiIndex, but it feels like there has to be a more elegant way.
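To make the failure and the reset_index round trip concrete, here is a minimal sketch on a cut-down frame. Only the column names come from the question; the three rows and their rounded values are made up for illustration.

```python
import pandas as pd

# Cut-down stand-in for the frame in the question (values are illustrative).
ex = pd.DataFrame(
    {"id": [11, 12, 19], "var1": [20.3, 21.1, 39.2], "var3": [1, 1, 2]}
).set_index("id")

# 'id' is now the index, not a column, so set_index cannot look it up.
try:
    ex.set_index(["var3", "id"])
    failed = False
except KeyError:
    failed = True

# The round trip mentioned above: move 'id' back to a column, then set both.
round_trip = ex.reset_index().set_index(["var3", "id"])
names = list(round_trip.index.names)
```

The round trip works, but it rebuilds the index from scratch, which is the step the question hopes to avoid.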

  • long way - ex.groupby('var3').apply(lambda x: x.assign()) Commented Nov 1, 2019 at 19:38

2 Answers

DataFrame.set_index has an append argument, which is False by default.

If you have a DataFrame already indexed by "id", and you'd like to append "var3" to that, simply invoke:

new_df = ex.set_index("var3", append=True)

As suggested by @piRSquared in the comments, you can also swap the order if you would like "var3" to come first by method chaining a call to swaplevel. I.e.:

new_df = ex.set_index("var3", append=True).swaplevel(0, 1)
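As a quick check of the level order each call produces, the following sketch runs both variants on a cut-down frame. The three rows and their values are illustrative, not the data from the question.

```python
import pandas as pd

# Cut-down stand-in for the frame in the question (values are illustrative).
ex = pd.DataFrame(
    {"id": [11, 12, 19], "var1": [20.3, 21.1, 39.2], "var3": [1, 1, 2]}
).set_index("id")

# append=True keeps the existing 'id' level and adds 'var3' after it.
appended = ex.set_index("var3", append=True)

# swaplevel exchanges the two levels so 'var3' becomes the outer one.
swapped = appended.swaplevel(0, 1)

names_appended = list(appended.index.names)
names_swapped = list(swapped.index.names)
```

Note that swaplevel only reorders the levels; the rows stay in their original order, so a sort_index() call may be wanted afterwards if the frame should be sorted by var3.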

2 Comments

ex.set_index('var3', append=True).swaplevel(0, 1)
@piRSquared Yeah, that would work, but in that case I'd say my solution is easier to read.

Like this:

ex.set_index(['var3', ex.index])
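This works because set_index accepts a mix of column labels and array-like objects such as an Index. A small sketch on a cut-down frame (illustrative values, not the data from the question) shows that the level names are preserved and that var3 is removed from the columns:

```python
import pandas as pd

# Cut-down stand-in for the frame in the question (values are illustrative).
ex = pd.DataFrame(
    {"id": [11, 12, 19], "var1": [20.3, 21.1, 39.2], "var3": [1, 1, 2]}
).set_index("id")

# 'var3' is looked up as a column; ex.index is passed as-is and keeps its name.
result = ex.set_index(["var3", ex.index])

# For comparison: the reset_index round trip from the question.
reference = ex.reset_index().set_index(["var3", "id"])

names = list(result.index.names)
remaining_columns = list(result.columns)
```

Both routes should give the same MultiIndex here, with var3 as the outer level and id as the inner one.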

4 Comments

For the record. This might be more readable to some and it should be quicker.
@piRSquared heck, even ex.reset_index().set_index(['var3', ex.index]) is more readable than your pro solution :)
I wouldn't say that is my recommendation. I just wanted to show that swaplevel was necessary to make the answer accurate. BTW, you could do this in place: df.index = [df.var3, df.index]
@piRSquared hehe, just kidding. Yeah well not really as var3 would still exist in the frame.
