0

I had a data frame in pandas where a variable consisted of Quarters and Years, and where I wanted to split those into Months (for Quarters) and Quarters and Months (for Years). The original data frame looked like this:

        volume  tenor      NewTenor
170     -3      quarter     Q1 21
516     -3      quarter     Q1 22
597     22      quarter     Q1 22
622     -3      year        Cal 21
625     22      quarter     Q2 22
657     -14     year        Cal 21
678     -16     quarter     Q1 22
704     -7      year        Cal 21
750     -16     quarter     Q1 22
934     -10     year        Cal 21

And using the following code:

def split_tenor(tenor):
    start, year = tenor.split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        return tenor

    return [f"{m} {year}" for m in months]

my_data["NewTenor"] = my_data["NewTenor"].apply(split_tenor)
my_data = my_data.explode("NewTenor")

I managed to turn it into this:

     volume     tenor      NewTenor
170     -3      quarter     Jan 21
170     -3      quarter     Feb 21
170     -3      quarter     Mar 21
516     -3      quarter     Jan 22
516     -3      quarter     Feb 22
516     -3      quarter     Mar 22
597     22      quarter     Jan 22
597     22      quarter     Feb 22
597     22      quarter     Mar 22
622     -3      year        Jan 22
622     -3      year        Feb 22
622     -3      year        Mar 22
622     -3      year        Q2 22
622     -3      year        Q3 22
622     -3      year        Q4 22
625     22      quarter     Apr 22
625     22      quarter     May 22
625     22      quarter     Jun 22
657     -14     year        Jan 22
657     -14     year        Feb 22

However, the volume in the data frame remains the same, although it should split throughout the period accordingly (e.g. when a quarter is split into months, the corresponding volume should also split into three equal volumes).

Can someone help me split the volume at the same time as I split the period? Thank you.

EDIT

The right code should return something like this:

df_1 = pd.DataFrame({'volume':[12, 9],
               'tenor':['year', 'quarter'],
               'NewTenor':['Cal 21', 'Q2 22']})

Sould return:

df_2 = pd.DataFrame({'volume':[1, 1, 1, 3, 3, 3, 3, 3, 3],
               'tenor':['year', 'year', 'year', 'year', 'year', 'year', 'quarter', 'quarter', 'quarter'],
               'NewTenor':['Jan 21', 'Feb 21', 'Mar 21', 'Q2 21', 'Q3 21', 'Q4 21', 'Apr 22', 'May 22', 'Jun 22']})
1
  • 1
    Why don't you share how do you want it to be (expected output)? Commented May 24, 2022 at 21:28

1 Answer 1

1

You could do it by modifying your existing code to create two columns at once:

df = pd.DataFrame({'volume':[12, 9],
                   'tenor':['year', 'quarter'],
                   'NewTenor':['Cal 21', 'Q2 22']})
def split_tenor(row):
    start, year = row['NewTenor'].split(" ")
    if start == "Cal":
        months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
        year = int(year) + 1
    elif start == "Q1":
        months = ["Jan", "Feb", "Mar"]
    elif start == "Q2":
        months = ["Apr", "May", "Jun"]
    elif start == "Q3":
        months = ["Jul", "Aug", "Sep"]
    elif start == "Q4":
        months = ["Oct", "Nov", "Dec"]
    else:
        return tenor, tenor

    if start == "Cal":
        split_vol = [row['volume']/12] * 3 + [row['volume']/4] * 3
    else:
        split_vol = [row['volume']/len(months)] * len(months)
    
    return [f"{m} {year}" for m in months], split_vol

df["NewTenor"], df["NewVolume"] = zip(*df[["volume", "NewTenor"]].apply(split_tenor, axis = 1))
# in case you are using earlier version of pandas, change this line to explode two columns separately
df = df.explode(["NewTenor", "NewVolume"])
Sign up to request clarification or add additional context in comments.

5 Comments

Thank you for the answer. It is close, but does not split the volume according to the size of the period, i.e. it gives the same share to the month as it gives to the quarter, ideally a quarter should have the same share as three quarters.
I am not following here, a quarter with a volume of 3 split into 3 months would result in a volume of 1 for each month right? You could share your target output in the question.
You are right about splitting of the quarter, the problem occurs when you split the year. I updated my question to give an example of the expected result.
ah I see what you did with the year back there. The updated code should be able to handle it.
Thank you very much, this time it gave the output I was expecting.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.