0

I want to generate time/date format strings from the input data I got. Is there an easy way to do this?

My input data looks like this:

'01.12.2016 23:30:59,123'

So my code should generate the following format string:

'%d.%m.%Y %H:%M:%S,%f'

Background:

I used pandas.to_datetime() to generate datetime objects for further processing. This works great, but the function gets slow with a lot of data (more than about 50k entries) because it uses dateutil.parser.parse here. At the moment I hardcode the format string above in my code to speed up to_datetime(), which also works great. Now I want to generate the format string in code to be more flexible regarding the input data.

edit (because the first two answers do not fit my question):

I want to generate the format string not the datetime string.

edit2:

New approach to formulating the question: I'm reading in a file with a lot of data. Every line of data has a timestamp in the following format: '01.12.2016 23:30:59,123'. I want to convert these timestamps into datetime objects. For this I'm using pandas.to_datetime() at the moment. This function works perfectly, but it gets slow because some of my files have over 50k datasets. To speed this up I pass a format string to the function: pandas.to_datetime(format='%d.%m.%Y %H:%M:%S,%f'). This speeds up the process, but it is less flexible. Therefore I want to determine the format string only for the first dataset and reuse it for the remaining 50k or more datasets.

How is this possible?

14
  • So you are going to reinvent format guessing, which is already implemented in pandas.to_datetime()? ;) Do you know beforehand which formats you are going to have? Commented Jul 8, 2016 at 11:38
  • how do you know whether 01.12.2016 is 1 Dec or 12 Jan? Commented Jul 8, 2016 at 12:01
  • @MaxU: No I don't want to reinvent it because of that I'm asking. At the moment I know the format that's why I hard coded the format string into my code. But I want to make it more flexible and keep it fast. Maybe you should read the question... Commented Jul 8, 2016 at 12:08
  • 1
    You should consider restating your question. It is apparent many people are confused by what you are asking. It reads as if you want to format the datetime string, despite your edit. Instead of refuting what others are saying to try and help you, maybe you should take a different approach. Just my $0.02 Commented Jul 8, 2016 at 12:47
  • 1
    @Burner, I also didn't get from your question what is wrong with infer_datetime_format=True - it should do exactly what you are going to "re-invent", IMO Commented Jul 8, 2016 at 13:31

4 Answers 4

1

You can try the infer_datetime_format parameter, but be aware that pd.to_datetime() uses dayfirst=False by default.

Demo:

In [422]: s
Out[422]:
0    01.12.2016 23:30:59,123
1    23.12.2016 03:30:59,123
2    31.12.2016 13:30:59,123
dtype: object

In [423]: pd.to_datetime(s, infer_datetime_format=True)
Out[423]:
0   2016-01-12 23:30:59.123
1   2016-12-23 03:30:59.123
2   2016-12-31 13:30:59.123
dtype: datetime64[ns]

In [424]: pd.to_datetime(s, infer_datetime_format=True, dayfirst=True)
Out[424]:
0   2016-12-01 23:30:59.123
1   2016-12-23 03:30:59.123
2   2016-12-31 13:30:59.123
dtype: datetime64[ns]
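
If inference does not cover a particular input, one possible fallback is to test a short list of candidate formats against the first value and reuse whichever one parses it. Below is a minimal sketch using only the standard library. The names CANDIDATE_FORMATS and guess_format are hypothetical, the candidate list is an assumption you would adapt to your data, and this is not how pd.to_datetime() infers formats internally.

```python
from datetime import datetime

# Hypothetical candidate list; extend it with the formats you expect to see.
CANDIDATE_FORMATS = [
    '%d.%m.%Y %H:%M:%S,%f',
    '%d.%m.%Y %H:%M:%S',
    '%Y-%m-%d %H:%M:%S.%f',
    '%Y-%m-%d %H:%M:%S',
    '%m/%d/%Y %H:%M:%S',
]


def guess_format(sample, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses `sample`, else None."""
    for fmt in candidates:
        try:
            datetime.strptime(sample, fmt)
        except ValueError:
            continue  # this candidate does not fit, try the next one
        return fmt
    return None


# Determine the format once, from the first timestamp only.
fmt = guess_format('01.12.2016 23:30:59,123')
```

The resulting fmt could then be passed as format=fmt to pd.to_datetime() for the whole column, so the expensive guessing happens only once. The order of the candidates matters: an ambiguous value is assigned to the first format that fits.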

5 Comments

It's right this would solve my problem but I just tried it and it seems infer_datetime_format=True does not work with datetime strings with fractions of seconds :-(.
@Burner, well, you may try to dig into to_datetime() implementation - maybe it'll help you to figure out how to get format
Yes it seems there is no out of the box solution when using fractions of seconds. So my question is not that basic ;-). I think I will have a closer look because somehow to_datetime() is able to get it right with fractions.
@Burner, post an answer to your own question if you'll come with a working solution - it might help others in future...
I will! But I'm not sure if I succeed since I'm new to Python (~3 weeks)
0

Use "datetime" to return the date and time. I think this will help you.

import datetime
print(datetime.datetime.now().strftime('%d.%m.%Y %H:%M:%S,%f'))

1 Comment

I'm sorry, how will this help me?
0

You can use datetime.strptime() from the datetime module, which returns a datetime.datetime object.

In your case you should do something like:

from datetime import datetime
datetime.strptime('01.12.2016 23:30:59,123', '%d.%m.%Y %H:%M:%S,%f')

Once you have the datetime.datetime object, you can use its strftime() method to get the datetime in the desired string format.

3 Comments

I think you did not understand the thing I want to do. I want to generate the format string not the datetime string. I want to input '01.12.2016 23:30:59,123' and get '%d.%m.%Y %H:%M:%S,%f'.
So is there any particular pattern of string? Like time will be separated by ':' and date by '.'?
There are a lot of patterns since around the world the time and date is formatted differently. Date could be separated by '.' or '/'. Day and month could be switched. Month given by name or number. The year could be stated first. And so on. All this work is done by pandas.to_datetime() already. I only want to get the format string of it and not the datetime object.
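
A single value may not reveal whether the day or the month comes first, so it can help to check candidate formats against several samples and keep only those that parse all of them. The following is a rough sketch under that assumption. The function name formats_matching_all and the two candidate formats are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def formats_matching_all(samples, candidates):
    """Keep only the candidate formats that parse every sample."""
    matching = []
    for fmt in candidates:
        try:
            for sample in samples:
                datetime.strptime(sample, fmt)
        except ValueError:
            continue  # at least one sample does not fit this format
        matching.append(fmt)
    return matching


# Hypothetical candidates: month first versus day first.
candidates = ['%m.%d.%Y %H:%M:%S,%f', '%d.%m.%Y %H:%M:%S,%f']

# One sample with day <= 12 fits both formats, so it stays ambiguous.
ambiguous = formats_matching_all(['01.12.2016 23:30:59,123'], candidates)

# A second sample with day > 12 rules out the month-first format.
resolved = formats_matching_all(
    ['01.12.2016 23:30:59,123', '23.12.2016 03:30:59,123'], candidates
)
```

If more than one format survives, the data seen so far cannot settle the question, and a default such as dayfirst has to be chosen explicitly.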
0

You should probably have a look here: https://github.com/humangeo/DateSense/

From its documentation:

>>> import DateSense
>>> print DateSense.detect_format( ["15 Dec 2014", "9 Jan 2015"] )
%d %b %Y

3 Comments

DateSense.detect_format( ["01.01.02 15:30:59.123123"] ) -> %m.%d.%y %H:%M:%S.123123; DateSense.detect_format( ["01.01.02 15:30:59,1"] ) -> %m.%d.%y %H:%M:%S,%w; DateSense.detect_format( ["01.01.02 15:30:59,123"] ) -> %m.%d.%y %H:%M:%S,%Y. Not even close.
Maybe you should try with more datasets? By the way, it looks pretty close to me, except for whatever the last part is. To be honest, even as a human I do not fully understand what is to be expected.
But if it is the microsecond parameter, it looks to me like that is the only thing it gets wrong. Probably your best shot is then to adapt that code to your needs.
