
I have two .csv files joined in Python with the Pandas module. One column is date with the format "dd.mm.yyyy".

Now I would like to extract only the month (as a 2-digit value with a leading zero) from it for further use.

I have so far accomplished the job but I had to cheat. Python thinks the string that I am getting is the DAY. I don't like half-baked things, so I wanted to ask the community how I can tell Python specifically which part of the date is the month so it can be returned to me?

Here is what I have so far:

import pandas

def saison(input):
    if input == "04" or input == "05" or input == "06" or input == "07" or input == "08" or input == "09":
        return "Sommer"
    else:
        return "Winter"

df_1 = pandas.read_csv("a.csv", sep=';', names=["DWD_ID", "Datum"], header=0)
df_2 = pandas.read_csv("b.csv", sep=';', names=["VEG", "DWD_ID"], header=0)
df_joined = pandas.merge(df_1, df_2, on="DWD_ID")
df_joined["Datum"] = pandas.to_datetime(df_joined["Datum"])

df_joined["Saison"] = saison(df_joined["Datum"].apply(lambda x: x.strftime('%d')))

If I use

x.strftime('%m') 

instead, it returns the day.

2 Answers

You could tell to_datetime how the input dates are laid out by passing the format argument:

pd.to_datetime(df['date_col'], format="%d.%m.%Y").dt.month.astype(str).str.zfill(2)
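
To make the one-liner above concrete, here is a minimal sketch on made-up data. The frame, the two sample dates, and the new column name `month` are assumptions for illustration; `date_col` is the placeholder name used in the answer.

```python
import pandas as pd

# Hypothetical sample data in the question's "dd.mm.yyyy" layout
df = pd.DataFrame({"date_col": ["24.02.2016", "03.11.2016"]})

# An explicit format removes any guessing about which part is the day
parsed = pd.to_datetime(df["date_col"], format="%d.%m.%Y")

# Month as a zero-padded two-character string
df["month"] = parsed.dt.month.astype(str).str.zfill(2)
print(df)
```

`parsed.dt.strftime("%m")` should give the same zero-padded strings in a single step.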

4 Comments

dayfirst=True would also work but format will be faster and will fail when something goes wrong instead of silently making choices that you might disagree with.
That did the job! Thanks!
@Khaled, I see that you haven't accepted answers to any of your prior questions, including this one ;-) You should accept an answer (by clicking the check mark) if it solved your problem. This tells anyone who comes across this page in the future that the solution worked.
Oops, sorry, I haven't been aware of this feature yet!
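
As a small illustration of the first comment's point that an explicit format fails loudly, the sketch below feeds to_datetime one value that does not match the expected layout. The sample values and the `failed` flag are made up for this example.

```python
import pandas as pd

# Hypothetical input: the second entry is not in "dd.mm.yyyy" layout
dates = pd.Series(["24.02.2016", "2016-02-24"])

try:
    pd.to_datetime(dates, format="%d.%m.%Y")
    failed = False
except ValueError as exc:
    # The mismatch is reported instead of being parsed by guesswork
    failed = True
    print("parse failed:", exc)
```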

First, it seems that month and day were swapped when the dates were parsed, so you need to pass the format argument to to_datetime (see Python's strftime directives). For the sample below, that is format='%Y-%d-%m':

import pandas as pd

df = pd.DataFrame({'Date': ['2016-24-02']})
print (df)
         Date
0  2016-24-02

print (pd.to_datetime(df.Date, format='%Y-%d-%m'))
0   2016-02-24
Name: Date, dtype: datetime64[ns]
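
The same effect can be sketched with the "dd.mm.yyyy" layout from the question. The date below is a made-up example chosen so that both readings are valid; without a format, pandas reads the first number as the month by default, which would explain why '%d' appeared to return the month.

```python
import pandas as pd

raw = "02.04.2016"  # meant as 2 April 2016 (dd.mm.yyyy)

# No format given: the default interpretation is month-first
guessed = pd.to_datetime(raw)

# Explicit format: day first, then month
explicit = pd.to_datetime(raw, format="%d.%m.%Y")

print(guessed.strftime("%d"), guessed.strftime("%m"))
print(explicit.strftime("%d"), explicit.strftime("%m"))
```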

After converting you can use:

dt.strftime:

print (df.Date.dt.strftime('%m'))

Another solution is to extract the month with dt.month, convert it to a string, and then pad it with a leading zero using zfill:

print (df.Date.dt.month.astype(str).str.zfill(2))

Sample:

start = pd.to_datetime('2015-02-24')
rng = pd.date_range(start, periods=10, freq='m')  # month-end frequency; recent pandas versions spell this 'ME'

df = pd.DataFrame({'Date': rng})  
print (df)
        Date
0 2015-02-28
1 2015-03-31
2 2015-04-30
3 2015-05-31
4 2015-06-30
5 2015-07-31
6 2015-08-31
7 2015-09-30
8 2015-10-31
9 2015-11-30


print (df.Date.dt.strftime('%m'))
0    02
1    03
2    04
3    05
4    06
5    07
6    08
7    09
8    10
9    11
Name: Date, dtype: object

print (df.Date.dt.month.astype(str).str.zfill(2))
0    02
1    03
2    04
3    05
4    06
5    07
6    08
7    09
8    10
9    11
Name: Date, dtype: object

Finally, you can use numpy.where with an isin condition to assign the season:

import numpy as np  # needed for np.where
saison = ["04","05","06","07","08","09"]
df['Saison'] = np.where(df.Date.dt.strftime('%m').isin(saison), 'Sommer','Winter')
print (df)
        Date  Saison
0 2015-02-28  Winter
1 2015-03-31  Winter
2 2015-04-30  Sommer
3 2015-05-31  Sommer
4 2015-06-30  Sommer
5 2015-07-31  Sommer
6 2015-08-31  Sommer
7 2015-09-30  Sommer
8 2015-10-31  Winter
9 2015-11-30  Winter
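
Putting the pieces together for data shaped like the question's, one possible variant compares the integer month directly instead of matching strings. This is only a sketch: the sample dates and the `Monat` column are assumptions, and `between(4, 9)` is swapped in for the `isin` list above (it includes both endpoints by default).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the merged frame from the question
df = pd.DataFrame({"Datum": ["24.02.2016", "15.04.2016", "30.09.2016", "01.10.2016"]})
df["Datum"] = pd.to_datetime(df["Datum"], format="%d.%m.%Y")

month = df["Datum"].dt.month                   # integer month, 1 to 12
df["Monat"] = month.astype(str).str.zfill(2)   # zero-padded string, e.g. "04"

# April through September counts as summer, everything else as winter
df["Saison"] = np.where(month.between(4, 9), "Sommer", "Winter")
print(df)
```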
