I need to create a DataFrame that looks like this:

day hour cal_hr
1   6    106
1   7    107
1   8    108
..
..
1   24   124
..
7   1    701
7   2    702
..
..
7   24   724

I want to loop through day and then hour, and concatenate the two, with the hour zero-padded to two digits (for example, day 1 and hour 6 give 106).

Something like:

for i in range(1, 8):
    for j in range(6, 25):
        df.append(i, j)
df = pd.DataFrame(df)

Can df.append add two values at the same time?

  • Unfortunately I need to create a DataFrame so that I can merge it back with another table. cal_hr can be created later once day and hour exist, because it is just a concatenation of day and hour: if day = 1 and hour = 6 then cal_hr = 106, and so on. Commented Jul 23, 2016 at 16:33

2 Answers

4

Append to a list, then convert the list to a DataFrame. That is much more efficient than appending to a DataFrame row by row.

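import pandas as pd

# One row per (day, hour) pair; cal_hr = 100*day + hour, e.g. day 1, hour 6 -> 106
df = pd.DataFrame([(i, j, 100*i + j)
                   for i in range(1, 8)
                   for j in range(6, 25)],
                  columns=['day', 'hour', 'cal_hr'])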

df.head()
Out[143]: 
   day  hour  cal_hr
0    1     6     106
1    1     7     107
2    1     8     108
3    1     9     109
4    1    10     110

df.tail()
Out[144]: 
     day  hour  cal_hr
128    7    20     720
129    7    21     721
130    7    22     722
131    7    23     723
132    7    24     724
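
For comparison, the nested loop from the question can be kept as written if the rows are appended to a plain list and converted once at the end. This is only a minimal sketch: the names rows and cal_hr_str are illustrative, and the zero-padded string column is just one reading of the "preceding 0" request.

```python
import pandas as pd

rows = []  # plain Python list; list.append takes one object, so append a tuple
for day in range(1, 8):
    for hour in range(6, 25):
        rows.append((day, hour))

# Convert once, after the loop
df = pd.DataFrame(rows, columns=["day", "hour"])

# cal_hr can be derived afterwards from the two columns
df["cal_hr"] = 100 * df["day"] + df["hour"]

# String variant: day followed by the hour zero-padded to two digits
df["cal_hr_str"] = df["day"].astype(str) + df["hour"].astype(str).str.zfill(2)
```

Both derived columns encode the same value; the integer form is usually easier to merge on, while the string form makes the padding explicit.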

2 Comments

Hi ayhan, can this logic be extended? Say I want to add another column 'week' that runs from 201501 to 201552 and then from 201601 to 201652. Can I do the following? core_hours = pd.DataFrame([(i, j, k, 100*i+j) for i in range(1, 8) for j in range(start_hour, end_hour+1) for k in range(201601, 201652)], columns = ['day', 'hour_i', 'week', 'hour']) But in that case, how do I make the loop restart from 1 when the week hits 201652 (say)?
@Mukul Can you edit the example in your post to include the week column? I don't understand exactly.
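
One possible reading of the week question is that every (day, hour) pair should be repeated for each week in 201501..201552 and 201601..201652. Under that assumption, the week labels can be built as their own list first, so nothing has to "restart" inside the loop. The values of start_hour and end_hour below are assumed (taken from the 6..24 range in the question), and the column names follow the comment.

```python
from itertools import product

import pandas as pd

start_hour, end_hour = 6, 24  # assumed values, matching the question's hour range

# Week labels yyyyww: weeks 1..52 for each year, so 201552 is followed by 201601
weeks = [year * 100 + week for year in (2015, 2016) for week in range(1, 53)]

core_hours = pd.DataFrame(
    [
        (day, hour, week, 100 * day + hour)
        for week, day, hour in product(
            weeks, range(1, 8), range(start_hour, end_hour + 1)
        )
    ],
    columns=["day", "hour_i", "week", "hour"],
)
```

Note that range(201601, 201652) in the comment stops at 201651, because the upper bound of range is exclusive.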
0

This isn't as fast or as intuitive as @ayhan's answer, but I think it's an interesting way to think about it.

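import numpy as np
import pandas as pd

day = pd.Series(np.arange(1, 8), name='day')
hour = pd.Series(np.arange(6, 25), name='hour')
# Outer sum on the underlying arrays (recent pandas does not support ufunc.outer on Series)
df = pd.DataFrame(np.add.outer(day.to_numpy() * 100, hour.to_numpy()), index=day, columns=hour)
df = df.stack().rename('cal_hr').reset_index()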

df.head()   # first rows match @ayhan's answer: cal_hr 106, 107, 108, ...
df.tail()   # last rows: cal_hr 720 through 724
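
A related way to get the same cross product without the outer sum is pd.MultiIndex.from_product, followed by to_frame. This is a sketch of an alternative, not part of the answer above; the variable name idx is illustrative.

```python
import pandas as pd

# Cartesian product of days 1..7 and hours 6..24 as a two-level index
idx = pd.MultiIndex.from_product(
    [range(1, 8), range(6, 25)], names=["day", "hour"]
)

# Turn the index levels into ordinary columns, then derive cal_hr
df = idx.to_frame(index=False)
df["cal_hr"] = 100 * df["day"] + df["hour"]
```

The result has the columns day, hour and cal_hr, so it can be merged with another table on day and hour.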
