3

I have the following code, which tries to extract the hour from the 'Dates' column of a data frame:

print(df['Dates'].head(3))
df['hour'] = df.apply(lambda x: find_hour(x['Dates']), axis=1)

def find_hour(self, input):
    return input[11:13].astype(float)

where the output of print(df['Dates'].head(3)) looks like:

0    2015-05-13 23:53:00
1    2015-05-13 23:53:00
2    2015-05-13 23:33:00

However, I got the following error:

    df['hour'] = df.apply(lambda x: find_hour(x['Dates']), axis=1)
NameError: ("global name 'find_hour' is not defined", u'occurred at index 0')

Does anyone know what I missed? Thanks!


Note that if I put the function directly in the lambda line like below, everything works fine:

df['hour'] = df.apply(lambda x: x['Dates'][11:13], axis=1).astype(float)
1
  • You can also extract the hour directly from x if it is a datetime object. Also, what is self supposed to be? Commented Apr 1, 2016 at 18:08

2 Answers

10

You are trying to use find_hour before it has been defined. You just need to switch things around:

def find_hour(self, input):
    return input[11:13].astype(float)

print(df['Dates'].head(3))
df['hour'] = df.apply(lambda x: find_hour(x['Dates']), axis=1)

Edit: Padraic has pointed out a very important point: find_hour() is defined as taking two parameters, self and input, but you are passing it only one argument. You should define it as def find_hour(input): instead. Note, however, that naming the parameter input shadows the built-in function of the same name, so you might consider renaming it to something a little more descriptive.
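
Putting both points together, a minimal sketch might look like the following. It assumes the Dates column holds strings in the 'YYYY-MM-DD HH:MM:SS' format shown in the question; the sample frame and the parameter name date_string are illustrative, not from the original code. Note that a plain Python string has no .astype method, so the sketch converts with float() instead.

```python
import pandas as pd

# Illustrative sample data mirroring the question (assumes string values).
df = pd.DataFrame({'Dates': ['2015-05-13 23:53:00',
                             '2015-05-13 23:53:00',
                             '2015-05-13 23:33:00']})

# Define the helper before the line that calls it.
# 'date_string' is a hypothetical name chosen to avoid shadowing input().
def find_hour(date_string):
    # Characters 11 and 12 hold the hour; a str has no .astype, so use float().
    return float(date_string[11:13])

# Applying to the single column avoids the row-wise lambda entirely.
df['hour'] = df['Dates'].apply(find_hour)
print(df)
```

Applying the function to the column rather than to each row (axis=1) is a design choice here: it does the same work with less overhead and no lambda.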

7

What is wrong with good old .dt.hour?

In [202]: df
Out[202]:
                 Date
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
2 2015-05-13 23:33:00

In [217]: df['hour'] = df.Date.dt.hour

In [218]: df
Out[218]:
                 Date  hour
0 2015-05-13 23:53:00    23
1 2015-05-13 23:53:00    23
2 2015-05-13 23:33:00    23

and if your Date column is of string type you may want to convert it to datetime first:

df.Date = pd.to_datetime(df.Date)

or, without converting, just slice the string column directly:

df['hour'] = df.Date.str[11:13].astype(int)
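
For reference, here is a self-contained sketch of both approaches. It assumes the Date column starts out as strings in the format shown above; the sample frame is made up for illustration, and hour_from_text is only an illustrative name.

```python
import pandas as pd

# Illustrative sample data; assumes the column initially holds strings.
df = pd.DataFrame({'Date': ['2015-05-13 23:53:00',
                            '2015-05-13 23:53:00',
                            '2015-05-13 23:33:00']})

# Approach 1: slice the text while the column is still of string type.
# int() cannot be called on a whole Series, so convert with .astype(int).
hour_from_text = df['Date'].str[11:13].astype(int)

# Approach 2: convert to datetime, then use the .dt accessor.
df['Date'] = pd.to_datetime(df['Date'])
df['hour'] = df['Date'].dt.hour

print(df)
```

The .dt.hour route does not depend on the exact character positions of the hour, so it is likely the more robust choice if the date format can vary.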
