1

Apologies for the image but I'm not sure how to reproduce this as I was given this data...

Multi index dataframe

I simply want to convert this into a simple dataframe where I have index columns time, lon, lat and the values in respective rows like so:

| time | lat | lon | data |

I tried doing .reset_index() but the time axis still goes across and not downward. How can I "explode" all the index values to get a simple dataframe with all index in columns?

Edit:

Dictionary of test data for reproducing:

{Timestamp('2001-01-01 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -1.68,
  (50.18000030517578, -4.9200439453125): -1.88,
  (50.18000030517578, -4.219970703125): -2.08},
 Timestamp('2001-01-02 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -1.95,
  (50.18000030517578, -4.9200439453125): -2.25,
  (50.18000030517578, -4.219970703125): -2.55},
 Timestamp('2001-01-03 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -0.76,
  (50.18000030517578, -4.9200439453125): -0.91,
  (50.18000030517578, -4.219970703125): -1.06},
 Timestamp('2001-01-04 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -2.9,
  (50.18000030517578, -4.9200439453125): -3.01,
  (50.18000030517578, -4.219970703125): -3.11},
 Timestamp('2001-01-05 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -2.06,
  (50.18000030517578, -4.9200439453125): -2.29,
  (50.18000030517578, -4.219970703125): -2.52}}
3
  • 1
    Print result of df.head(5).to_dict() from this I can pd.DataFrame(...) Commented Oct 20, 2019 at 9:31
  • Edited - although I couldn't work out how to preserve index names (to_dict() automatically removes them (?)) - thanks for offering to help. Commented Oct 20, 2019 at 9:51
  • I've added solution proposal. Commented Oct 20, 2019 at 10:23

1 Answer 1

3

Check this:

import pandas as pd
from pandas import Timestamp


d = {Timestamp('2001-01-01 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -1.68,
  (50.18000030517578, -4.9200439453125): -1.88,
  (50.18000030517578, -4.219970703125): -2.08},
 Timestamp('2001-01-02 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -1.95,
  (50.18000030517578, -4.9200439453125): -2.25,
  (50.18000030517578, -4.219970703125): -2.55},
 Timestamp('2001-01-03 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -0.76,
  (50.18000030517578, -4.9200439453125): -0.91,
  (50.18000030517578, -4.219970703125): -1.06},
 Timestamp('2001-01-04 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -2.9,
  (50.18000030517578, -4.9200439453125): -3.01,
  (50.18000030517578, -4.219970703125): -3.11},
 Timestamp('2001-01-05 00:00:00'): {(50.18000030517578,
   -5.6199951171875): -2.06,
  (50.18000030517578, -4.9200439453125): -2.29,
  (50.18000030517578, -4.219970703125): -2.52}}


df = pd.DataFrame(d)
df = df.stack().to_frame().reset_index()
df.columns = ['lat', 'lon', 'time', 'data']

Output:

      lat       lon       time  data
0   50.18 -5.619995 2001-01-01 -1.68
1   50.18 -5.619995 2001-01-02 -1.95
2   50.18 -5.619995 2001-01-03 -0.76
3   50.18 -5.619995 2001-01-04 -2.90
4   50.18 -5.619995 2001-01-05 -2.06
5   50.18 -4.920044 2001-01-01 -1.88
6   50.18 -4.920044 2001-01-02 -2.25
7   50.18 -4.920044 2001-01-03 -0.91
8   50.18 -4.920044 2001-01-04 -3.01
9   50.18 -4.920044 2001-01-05 -2.29
10  50.18 -4.219971 2001-01-01 -2.08
11  50.18 -4.219971 2001-01-02 -2.55
12  50.18 -4.219971 2001-01-03 -1.06
13  50.18 -4.219971 2001-01-04 -3.11
14  50.18 -4.219971 2001-01-05 -2.52
Sign up to request clarification or add additional context in comments.

2 Comments

.stack() seems like what I was looking for to flatten the row index! Thanks.
There is one thing to improve: columns name handling, it should done more auto-magically ;-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.