2

Okay so I went through some long way of getting my dataframe to look like this so that I was able to graph it:

data_mth  GROUP
201504    499 and below    0.001806
201505    499 and below    0.007375
201506    499 and below    0.000509
201507    499 and below    0.007344
201504    500 - 599        0.016672
201505    500 - 599        0.011473
201506    500 - 599        0.017733
201507    500 - 599        0.017651
201504    800 - 899        0.472784
201505    800 - 899        0.516837
201506    800 - 899        0.169811
201507    800 - 899        0.293966
201504    900 and above    0.065144
201505    900 and above    0.226626
201506    900 and above    0.251585
201507    900 and above    0.299850

Because of how much space this way was taking up I had to modify my code, and I have this dataframe now:

ptnr_cur_vntg_scor_band  499 and below  500 - 599  800 - 899  900 and above
data_mth
201504                   0.001806       0.016672   0.472784   0.065144
201505                   0.007375       0.011473   0.516837   0.226626
201506                   0.000509       0.017733   0.169811   0.251585
201507                   0.007344       0.017651   0.293966   0.299850

What is a good way to manipulate the second data frame to look like the first one?

My current code looks like so:

df = self.bunch['occ_data.all_data']
df = cpr.filter(df, 'ccm_acct_status', 'Open', 'Open-Inactive', 'Open-Active', 'OpenFraud', 'New')
df = df.groupby(['ptnr_cur_vntg_scor_band', 'data_mth']).sum()['ccm_curr_vntg_cnt']

df = df.unstack(0).fillna(0)

df.loc[:,"499andbelow":"NoVantageScore"] = df.loc[:, "499andbelow":"NoVantageScore"].div(df.sum(axis=1), axis=0)
df = df.fillna(0)

Its output is the second dataframe above.

2
  • And what is the output of your current code? Commented Jul 18, 2016 at 20:40
  • output is the second data frame that I listed Commented Jul 18, 2016 at 20:47

1 Answer 1

4
import io
import pandas as pd

data = io.StringIO('''\
499 and below,500 - 599,800 - 899,900 and above
201504,0.001806,0.016672,0.472784,0.065144
201505,0.007375,0.011473,0.516837,0.226626
201506,0.000509,0.017733,0.169811,0.251585
201507,0.007344,0.017651,0.293966,0.299850
''')

df = pd.read_csv(data)
df.index.name = 'data_mth'
df.columns.name = 'ptnr_cur_vntg_scor_band'
print(df)

# ptnr_cur_vntg_scor_band  499 and below  500 - 599  800 - 899  900 and above
# data_mth                                                                   
# 201504                        0.001806   0.016672   0.472784       0.065144
# 201505                        0.007375   0.011473   0.516837       0.226626
# 201506                        0.000509   0.017733   0.169811       0.251585
# 201507                        0.007344   0.017651   0.293966       0.299850

s = df.unstack().swaplevel()
s.index.names = 'data_mth', 'GROUP'
print(s)

Output:

data_mth  GROUP   
201504    499 and below    0.001806
201505    499 and below    0.007375
201506    499 and below    0.000509
201507    499 and below    0.007344
201504    500 - 599        0.016672
201505    500 - 599        0.011473
201506    500 - 599        0.017733
201507    500 - 599        0.017651
201504    800 - 899        0.472784
201505    800 - 899        0.516837
201506    800 - 899        0.169811
201507    800 - 899        0.293966
201504    900 and above    0.065144
201505    900 and above    0.226626
201506    900 and above    0.251585
201507    900 and above    0.299850
dtype: float64
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.