3

File dataexample_df.txt:

2020-12-04_163024 26.15 26.37 19.40 24.57
2020-12-04_163026 26.15 26.37 19.20 24.57
2020-12-04_163028 26.05 26.37 18.78 24.57

I want to read it in as pandas dataframe where the index column has only the time part in format '%H:%M:%S', without the date.

import pandas as pd
df = pd.read_csv("dataexample_df.txt", sep=' ', header=None, index_col=0)
print(df)

Output:

                       1      2      3      4
0
2020-12-04_163024  26.15  26.37  19.40  24.57
2020-12-04_163026  26.15  26.37  19.20  24.57
2020-12-04_163028  26.05  26.37  18.78  24.57

However, wanted output:

              1      2      3      4
0
16:30:24  26.15  26.37  19.40  24.57
16:30:26  26.15  26.37  19.20  24.57
16:30:28  26.05  26.37  18.78  24.57

I have tried different date_parser= -functions (cf. Answers in Parse_dates in Pandas) but get only error messages. Also, somewhat relevant is Python/Pandas convert string to time only but no luck, I'm stuck. I'm using Python 3.7.

3 Answers 3

3

Considering your df to be this:

In [121]: df
Out[121]: 
                       1      2      3      4
0                                            
2020-12-04_163024  26.15  26.37  19.40  24.57
2020-12-04_163026  26.15  26.37  19.20  24.57
2020-12-04_163028  26.05  26.37  18.78  24.57

You can use Series.replace with Series.dt.time:

In [122]: df.reset_index(inplace=True)
In [127]: df[0] = pd.to_datetime(df[0].str.replace('_', ' ')).dt.time

In [130]: df.set_index(0, inplace=True)

In [131]: df
Out[131]: 
              1      2      3      4
0                                   
16:30:24  26.15  26.37  19.40  24.57
16:30:26  26.15  26.37  19.20  24.57
16:30:28  26.05  26.37  18.78  24.57
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, but I get error message from df['0']. It does work if I use df[0]and df.set_index(0, inplace=True)
1

Here, I created a simple function to formate your datetime column, Please try this.

import pandas as pd

df = pd.read_csv('data.txt', sep=" ", header=None)

def format_time(date_str):
    # split date and time
    time =  iter(date_str.split('_')[1])
    # retun the time value adding
    return ':'.join(a+b for a,b in zip(time, time))

df[0] = df[0].apply(format_time)

print(df)

3 Comments

Thanks, this is great example how to use apply method, I never knew much about that. But this still requires one more operation at the end: df.set_index(0, inplace=True) like in @Mayank answer above.
@user9393931 Also to add, apply is basically loops under the hood. So it is a bit slower when compared to the functions I used in my answer.
@user9393931 agreed with you.
0

You need to tell it what the format of your date is using the format argument (otherwise you'll get an error):

# gives an error:
pd.to_datetime('2020-12-04_163024')

# works:
pd.to_datetime('2020-12-04_163024', format=r'%Y-%m-%d_%H%M%S')

So you can apply this to your dataframe and then access the time by using dt.time:

df['time'] = pd.to_datetime(df.index, format=r'%Y-%m-%d_%H%M%S').dt.time

That will give you the time as an object, but if you want to format it, just use something like this:

df['time'] = df['time'].strftime('%H:%M:%S')

1 Comment

I get an error AttributeError: 'DatetimeIndex' object has no attribute 'dt' from the line df['time'] = pd.to_datetime(df.index, format=r'%Y-%m-%d_%H%M%S').dt.time ?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.