I have a DataFrame with a datetime index (shown below), and I would like to add a first index level (see "Target" below) so that every date is crossed with every value in First_column.

First_column = ['s0000', 's0001', 's0002', 's0003', 's0004', ...]

Does anyone have an idea how to proceed?

Thank you very much. Alexis

My DataFrame:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 17544 entries, 2015-01-01 00:00:00 to 2016-12-31 23:00:00
Data columns (total 12 columns):

Target:

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 996000 entries, (s0000, 2015-01-01 00:00:00) to (s0999, 2012-12-31 00:00:00)
Data columns (total 8 columns):

SCENARIO DATE

s0000    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0001    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0002    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0003    ...

2 Answers

You could use pd.concat with the keys parameter:

import pandas as pd
df = pd.DataFrame(range(10), index=pd.date_range('2015-2-27', freq='B', periods=10))
#             0
# 2015-02-27  0
# 2015-03-02  1
# 2015-03-03  2
# 2015-03-04  3
# 2015-03-05  4
# 2015-03-06  5
# 2015-03-09  6
# 2015-03-10  7
# 2015-03-11  8
# 2015-03-12  9
first_col = ['s{:04d}'.format(i) for i in range(1,5)]
# ['s0001', 's0002', 's0003', 's0004']

newdf = pd.concat([df]*len(first_col), keys=first_col)
print(newdf)

yields

                  0
s0001 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0002 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0003 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0004 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9

Happily, I just learned this yesterday from Joris.
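
To tie this back to the target layout in the question, here is a minimal sketch that also names the two index levels through the `names` parameter of `pd.concat`. The small daily frame, the `value` column, and the level names `SCENARIO` and `DATE` (borrowed from the question's printout) are assumptions for illustration, not the asker's real data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame: one column on a DatetimeIndex
df = pd.DataFrame({'value': range(5)},
                  index=pd.date_range('2015-01-01', periods=5, freq='D'))

# Scenario labels in the style of the question's First_column
scenarios = ['s{:04d}'.format(i) for i in range(3)]

# One copy of df per scenario; keys become the outer index level,
# and names labels both levels of the resulting MultiIndex
newdf = pd.concat([df] * len(scenarios), keys=scenarios,
                  names=['SCENARIO', 'DATE'])

# Selecting on the outer level returns that scenario's rows, indexed by DATE
one_scenario = newdf.loc['s0001']
```

Note that every scenario holds a full copy of the original rows, so memory use grows linearly with the number of labels.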

You could do something like this...

import pandas as pd

first_col = ['s0001', 's0002', 's0003', 's0004']

# Make your datetime index
dt_index = pd.date_range('2015-2-27', freq='B', periods=10)

# Repeat each label once per date and sort: s0001 x10, then s0002 x10, ...
first_col_index = sorted(len(dt_index) * first_col)

# Make a dataframe with a hierarchical index. The dates are tiled (the whole
# range repeated once per label) so that every label is paired with every date.
df = pd.DataFrame(range(len(first_col) * len(dt_index)),
                  index=[first_col_index, list(dt_index) * len(first_col)])
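
If the goal is simply every label crossed with every date, `pd.MultiIndex.from_product` may be a more direct route, since it builds the Cartesian product without any manual repeating or sorting. The sketch below is one possible variant, not the answerer's method; the level names `SCENARIO` and `DATE` and the `value` column are assumptions for illustration.

```python
import numpy as np
import pandas as pd

first_col = ['s0001', 's0002', 's0003', 's0004']
dt_index = pd.date_range('2015-2-27', freq='B', periods=10)

# Cartesian product: each label is paired with every date,
# with the label varying slowest (outer level)
index = pd.MultiIndex.from_product([first_col, dt_index],
                                   names=['SCENARIO', 'DATE'])

# Placeholder data, one row per (label, date) pair
df = pd.DataFrame({'value': np.arange(len(index))}, index=index)
```

A quick sanity check on a frame built this way is `df.index.is_unique`, which should be true when each (label, date) pair appears exactly once.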
