2

I have .csv data that I want to sort by its date column. My date format is as follows:

Week, Quarter, Year: for example, WK01Q12001.

When I .sort() my dataframe on this column, the result is sorted like:

WK01Q12001, WK01Q12002, WK01Q12003, WK01Q22001, WK01Q22002, WK01Q22003, ... WK02Q12001, WK02Q12002...

for example. This makes sense because it's sorting the strings in ascending order.
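To make the left-to-right comparison concrete, here is a minimal sketch in plain Python (no pandas needed). The three sample strings are picked for illustration only; they follow the WKwwQqYYYY format described above.

```python
# Plain string comparison goes left to right, so the week number
# dominates and the year (at the end of the string) matters least.
dates = ["WK02Q12001", "WK01Q12002", "WK01Q22001"]

lexicographic = sorted(dates)
print(lexicographic)
# ['WK01Q12002', 'WK01Q22001', 'WK02Q12001']
# Chronologically, WK02Q12001 (week 2 of Q1 2001) should come first.
```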

But I need my data sorted chronologically such that the result is like the following:

WK01Q12001, WK02Q12001, WK03Q12001, WK04Q12001, ... , WK01Q22001, WK02Q22001, ... WK01Q12002, WK02Q12002 ...

How can I sort it this way using pandas? Perhaps by sorting the string in reverse (right to left), or by creating some kind of datetime object?

I have also tried using Series(): pd.Series([pd.to_datetime(d) for d in weeklyData['Date']]), but the result is the same as with the .sort() method above.

UPDATE: My DataFrame is similar in format to an Excel sheet and currently looks like the following. I want to sort it chronologically by 'Date'.

Date          Price     Volume
WK01Q12001    32        500
WK01Q12002    43        400
WK01Q12003    55        300
WK01Q12004    58        350
WK01Q22001    33        480
WK01Q22002    40        450
.
.
.
WK13Q42004    60        400

3 Answers

2

You can add a new column to your dataframe containing the date components as a list.

e.g.

a = ["2001", "Q2", "WK01"]
b = ["2002", "Q2", "WK01"]
c = ["2002", "Q2", "WK02"]

So, you can apply a function to your data frame to do this...

import re

def tolist(x):
    # Split e.g. "WK01Q12001" into its parts, most significant (year) first
    g = re.match(r"(WK\d{2})(Q\d)(\d{4})", str(x))
    return [g.group(3), g.group(2), g.group(1)]

then...

df['datelist'] = df['Date'].apply(tolist)

which gives you your date as a list arranged in the order of importance...

         Date  Price  Volume          datelist
0  WK01Q12001     32     500  [2001, Q1, WK01]
1  WK01Q12002     22     400  [2002, Q1, WK01]
2  WK01Q12003     42     500  [2003, Q1, WK01]

Python compares lists of equal length element by element, from left to right, so lists ordered from the most to the least significant part compare chronologically. That means you can use the standard DataFrame sort to order your data.
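As a quick check of that claim, here is a small sketch using the three example lists from earlier in this answer. Note that it relies on the parts being fixed-width, zero-padded strings (e.g. "WK01", not "WK1"); that is an assumption about the data, taken from the format shown in the question.

```python
a = ["2001", "Q2", "WK01"]
b = ["2002", "Q2", "WK01"]
c = ["2002", "Q2", "WK02"]

# Lists are compared element by element: year first, then quarter, then week.
print(a < b)  # True, because "2001" < "2002"
print(b < c)  # True, year and quarter tie, so "WK01" < "WK02" decides

ordered = sorted([c, a, b])
print(ordered == [a, b, c])  # True
```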

So the default sorting in a Pandas series will work correctly when you do...

df.sort_values('datelist')  # on pandas older than 0.17 use df.sort('datelist'); DataFrame.sort was removed in 0.20
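If you would rather not keep lists in a column, a possible variation (not part of the approach above, just a sketch) is to pull the three parts out as integer columns with str.extract and sort on those. The column names week, quarter and year are my own choice, and the sample rows and prices are illustrative only.

```python
import pandas as pd

# Illustrative sample in the question's format (values are made up).
df = pd.DataFrame({
    "Date": ["WK01Q12002", "WK02Q12001", "WK01Q22001", "WK01Q12001"],
    "Price": [43, 35, 33, 32],
})

# Named groups become column names; astype(int) assumes every row matches.
parts = df["Date"].str.extract(
    r"WK(?P<week>\d{2})Q(?P<quarter>\d)(?P<year>\d{4})"
).astype(int)

df_sorted = (
    df.join(parts)
      .sort_values(["year", "quarter", "week"])  # most significant key first
      .drop(columns=["year", "quarter", "week"])
      .reset_index(drop=True)
)
print(df_sorted["Date"].tolist())
# ['WK01Q12001', 'WK02Q12001', 'WK01Q22001', 'WK01Q12002']
```

Sorting on integers also avoids depending on zero-padding in the week number.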

5 Comments

I wasn't sure if this would work with a Pandas Series but I just tried making a series with lists broken down this way and it worked just fine.
What is 'date'? Is this my DataFrame object? Please see the UPDATE on my question above. I get a TypeError when trying this: "TypeError: expected string or buffer". Thanks!
Likewise, when I try with re.match(r"(WK\d{2})(Q\d)(\d{4})", dataframeobj['date']), I get a buffer size mismatch error.
What Python version are you using? Are your strings unicode or plain strings?
I updated my answer. Your buffer mismatch error is due to unicode conversion; if your date strings are always going to be in the same format, you can safely convert them to plain strings.
1

Use str.replace to change the order of the keys first:

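import pandas as pd

s = "WK01Q12001, WK01Q12002, WK01Q12003, WK01Q22001, WK01Q22002, WK01Q22003, WK02Q12001, WK02Q12002"
date = [d.strip() for d in s.split(",")]  # a list, so len() also works on Python 3
df = pd.DataFrame({"date": date, "value": range(len(date))})
# Reorder each key to year, quarter, week so that string order is chronological
df["date2"] = df.date.str.replace(r"WK(\d\d)Q(\d)(\d{4})", r"\3Q\2WK\1", regex=True)
df.sort_values("date2")  # df.sort("date2") on old pandas; DataFrame.sort was removed in 0.20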
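On pandas 1.1 or newer, the same reordering can be passed as a sort key, so the whole dataframe is sorted without adding a helper column. This is only a sketch; chronological_key is a name I made up, and the sample rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["WK01Q12002", "WK02Q12001", "WK01Q22001", "WK01Q12001"],
    "value": range(4),
})

def chronological_key(col):
    # Receives the whole column as a Series and returns the reordered keys,
    # e.g. "WK01Q12001" -> "2001Q1WK01".
    return col.str.replace(r"WK(\d\d)Q(\d)(\d{4})", r"\3Q\2WK\1", regex=True)

# All columns move together with their row; only the ordering changes.
df_sorted = df.sort_values("date", key=chronological_key)
print(df_sorted["date"].tolist())
# ['WK01Q12001', 'WK02Q12001', 'WK01Q22001', 'WK01Q12002']
```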

1 Comment

The date column is just one column in my pandas dataframe. It's like an Excel sheet where I want to sort by date, but the date is in the wrong format. Will your method work to sort the entire dataframe? Also, I have 13 weeks per quarter, 4 quarters per year, and several years. That's a couple of hundred 'dates'. Is there a better way to do this? Thanks!
1

I was also able to accomplish this date reformatting very easily using SQL. When I first queried my data, I ran SELECT *, RIGHT([Date], 4) + SUBSTRING([Date], 5, 2) + LEFT([Date], 4) As 'SortedDate' FROM [Table] ORDER BY 'SortedDate' ASC.

Use the right tool for the job!
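For comparison, the same fixed-position slicing can be sketched in pandas with the .str accessor, in case the data is already in a dataframe. The column name SortedDate mirrors the alias in the query above; the sample rows are illustrative, and the slices assume every value is exactly ten characters long in the WKwwQqYYYY layout.

```python
import pandas as pd

df = pd.DataFrame({"Date": ["WK01Q12002", "WK02Q12001", "WK01Q22001", "WK01Q12001"]})

date = df["Date"]
# RIGHT(Date, 4) + SUBSTRING(Date, 5, 2) + LEFT(Date, 4), with 0-based Python slices
df["SortedDate"] = date.str[-4:] + date.str[4:6] + date.str[:4]

df_sorted = df.sort_values("SortedDate")
print(df_sorted["SortedDate"].tolist())
# ['2001Q1WK01', '2001Q1WK02', '2001Q2WK01', '2002Q1WK01']
```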
