Repeat rows in pandas and add a sequence column

Question

I have a dataframe that takes a date from a calendar and extracts some features from it.

def processDate(self,date):
    WEEKDAY_MAP = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7}
    df = pandas.DataFrame(data=[date], columns = ['DATE'])
    df['DATE'] = pandas.to_datetime(df['DATE'])
    df['DATE'] = df['DATE'].astype(str)
    df['MONTH'] = pandas.DatetimeIndex(df['DATE']).month
    df['WEEKDAY'] = pandas.DatetimeIndex(df['DATE']).dayofweek
    df['WEEKDAY'] = df['WEEKDAY'].map(WEEKDAY_MAP)
    df['HOLIDAY'] = '0'
    set_holiday(df)
    df['INTERVALL'] = '1'
    df.append([df]*5,ignore_index=True)
    print(df)

Console Log:

     DATE        MONTH  WEEKDAY HOLIDAY INTERVALL
     2017-09-13     9     3      0      1

What I need: repeat the entry 48 times and increment the INTERVALL value on each row.

Outcome should be like this:

Console Log:

     DATE        MONTH  WEEKDAY HOLIDAY INTERVALL
     2017-09-13     9     3      0      1
     2017-09-13     9     3      0      2
     2017-09-13     9     3      0      3
     2017-09-13     9     3      0      4
     2017-09-13     9     3      0      5
     ...
     2017-09-13     9     3      0      48

I tried df.append([df]*48,ignore_index=True) but that didn't work.

cs95 · Accepted Answer · 2017-09-15 14:49:38Z

3

Use NumPy's repeat on the underlying values and create a new dataframe.

df = pd.DataFrame(df.values.repeat(48, axis=0), columns=df.columns)
df['INTERVALL'] = df.index + 1

df.head()

         DATE MONTH WEEKDAY HOLIDAY  INTERVALL
0  2017-09-13     9       3       0          1
1  2017-09-13     9       3       0          2
2  2017-09-13     9       3       0          3
3  2017-09-13     9       3       0          4
4  2017-09-13     9       3       0          5

df.shape
(48, 5)
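
One side effect worth knowing: `df.values` on a frame with mixed column types returns an object array, so the rebuilt frame's columns come back as object dtype. The following is a minimal sketch of a different technique, repeating the index labels with `Index.repeat` and selecting them with `.loc`, which keeps each column's dtype. The one-row sample frame is hypothetical and only mirrors the columns from the question.

```python
import pandas as pd

# Hypothetical one-row frame mirroring the question's columns.
df = pd.DataFrame({
    "DATE": ["2017-09-13"],
    "MONTH": [9],
    "WEEKDAY": [3],
    "HOLIDAY": ["0"],
    "INTERVALL": ["1"],
})

# Repeat the single row label 48 times, select those labels,
# then replace the duplicated labels with a fresh 0..47 index.
repeated = df.loc[df.index.repeat(48)].reset_index(drop=True)

# Overwrite the existing INTERVALL column with 1..48.
repeated["INTERVALL"] = repeated.index + 1

print(repeated.shape)
print(repeated.dtypes)
```

Here MONTH and WEEKDAY stay integer columns instead of being upcast to object.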

edited Sep 15, 2017 at 14:49

answered Sep 15, 2017 at 14:43

cs95


4 Comments

miradulo Over a year ago

Don't think they want two interval columns but that's minor.

miradulo Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ Or rather a non-misspelling :)

BENY Over a year ago

:) Just follow the out put you have , so I got a 'misspelling' too...:)

cs95 Over a year ago

@Wen LOL... 1 sec... need to undo some votes to +1 you

BENY · Accepted Answer · 2017-09-15 14:47:00Z

3

Or use pd.concat:

df = pd.concat([df]*48, axis=0).reset_index(drop=True)
df['INTERVALL'] = df.index + 1
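
One detail to check with the concat approach is how the index is reset. As a small sketch (the three-column sample frame is hypothetical), `reset_index()` without `drop=True` keeps the old, duplicated index as an extra column named `index`, while `drop=True` discards it. Writing to the existing `INTERVALL` name also avoids creating a second, differently spelled column.

```python
import pandas as pd

# Hypothetical one-row frame with a subset of the question's columns.
df = pd.DataFrame({"DATE": ["2017-09-13"], "MONTH": [9], "INTERVALL": ["1"]})

# Stack 48 copies; every row still carries the original label 0.
stacked = pd.concat([df] * 48, axis=0)

kept = stacked.reset_index()              # old index survives as an 'index' column
dropped = stacked.reset_index(drop=True)  # old index is discarded

print(list(kept.columns))
print(list(dropped.columns))

# Fill the existing column with 1..48 from the fresh 0..47 index.
dropped["INTERVALL"] = dropped.index + 1
```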

answered Sep 15, 2017 at 14:47

BENY


Cedric Zoppolo · Accepted Answer · 2017-09-15 14:58:38Z

1

You can use your own idea and then assign a range to the INTERVALL column:

df = df.append([df]*47, ignore_index=True)
df["INTERVALL"] = range(1, 49)

Note that append returns a new dataframe, so the result has to be assigned back. You append 47 copies (the original row plus 47 copies makes 48 rows) and then use the range from 1 to 48.
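
`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same idea has to be written differently. A minimal sketch with `pd.concat`, keeping the "original plus 47 copies" arithmetic, could look like this (the sample frame is hypothetical and mirrors the question's columns):

```python
import pandas as pd

# Hypothetical one-row frame mirroring the question's columns.
df = pd.DataFrame({
    "DATE": ["2017-09-13"],
    "MONTH": [9],
    "WEEKDAY": [3],
    "HOLIDAY": ["0"],
    "INTERVALL": ["1"],
})

# The original row plus 47 copies gives 48 rows,
# the same count as df.append([df] * 47, ignore_index=True).
copies = [df] * 47
df48 = pd.concat([df, *copies], ignore_index=True)

# Number the rows 1..48 in the existing column.
df48["INTERVALL"] = range(1, 49)

print(df48.tail(3))
```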

answered Sep 15, 2017 at 14:58

Cedric Zoppolo
