1

My question is related to this question:

Merge dataframe with another dataframe created from apply function?

Here is my version of the code:

import pandas as pd

col = ['State','Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

def get_taxes_from_api(state, annual_salary):
    return pd.DataFrame({'State': [state, state], 
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)], 
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)], 
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

How do I apply get_taxes_from_api to each row of df and combine the returned DataFrames into one DataFrame?

The only difference is that my function returns a multi-row DataFrame, not a one-row DataFrame, so the solution to the question above does not work for my situation. (And I don't have enough reputation to leave a comment there.)

3 Answers

1

This doesn't directly answer your question, but here is one way to build the same table without apply:

col = ['State','Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

# Create the "first" row of each state by adding columns
# (the values stay floats; the original function truncates them with int())
df['annual.fica.amount'] = df['Annual Salary'].multiply(0.067)
df['annual.federal.amount'] = df['Annual Salary'].multiply(0.3)
df['annual.state.amount'] = df['Annual Salary'].multiply(0.048)

# Create the "second" row of each state as a new df
# (salary * 1.067 is the same as salary * 0.067 + salary, and so on)
cumulative_df = df.copy()
cumulative_df['annual.fica.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.federal.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.state.amount'] += cumulative_df['Annual Salary']

# Concatenate the two tables and sort so each state's rows are next to each other
final_df = pd.concat((df, cumulative_df)).sort_values('State').reset_index(drop=True)
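The column arithmetic above leaves floats and sorts the states alphabetically. If the result should match the integer output of get_taxes_from_api and keep the original state order, something like the following sketch might work. The `multipliers` dict and the `tax_rows` helper are hypothetical names introduced for illustration; the factors are copied from the function in the question.

```python
import pandas as pd

df = pd.DataFrame(
    [["New York", 132826], ["New Hampshire", 128704], ["California", 127388],
     ["Vermont", 121599], ["Idaho", 120011]],
    columns=["State", "Annual Salary"],
)

# Hypothetical lookup: (first-row factor, second-row factor) for each output column,
# copied from get_taxes_from_api in the question.
multipliers = {
    "annual.fica.amount": (0.067, 1.067),
    "annual.federal.amount": (0.3, 1.3),
    "annual.state.amount": (0.048, 1.048),
}

def tax_rows(frame, position):
    """Build the first (position=0) or second (position=1) row for every state."""
    out = frame[["State"]].copy()
    for column, factors in multipliers.items():
        # astype(int) truncates toward zero, like int() in the original function
        out[column] = (frame["Annual Salary"] * factors[position]).astype(int)
    return out

# Stack both sets of rows, then sort by the original row label with a stable sort
# so each state's two rows sit next to each other in the original state order.
result = (
    pd.concat([tax_rows(df, 0), tax_rows(df, 1)])
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)
print(result)
```

This only applies when the per-row logic can be written as column arithmetic; a real API call per row would still need one of the row-by-row approaches.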


1

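You could apply the function to each row and then use pd.concat to combine the nested DataFrames:

# apply returns a Series whose elements are the per-row DataFrames
nested_df = df.apply(lambda x: get_taxes_from_api(x["State"], x["Annual Salary"]), axis=1)

# Concatenate them in a single call instead of growing a DataFrame inside a loop
result = pd.concat(nested_df.tolist())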

Result:

print(result)
           State  annual.fica.amount  annual.federal.amount  annual.state.amount
0       New York                8899                  39847                 6375
1       New York              141725                 172673               139201
0  New Hampshire                8623                  38611                 6177
1  New Hampshire              137327                 167315               134881
0     California                8534                  38216                 6114
1     California              135922                 165604               133502
0        Vermont                8147                  36479                 5836
1        Vermont              129746                 158078               127435
0          Idaho                8040                  36003                 5760
1          Idaho              128051                 156014               125771
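As a possible variation on the same idea, the per-row DataFrames can be collected with a list comprehension and joined back to the original columns. This is a minimal sketch, not the only way to do it; the names `frames`, `taxes`, and `merged` are made up for illustration, and the final merge assumes each state appears only once in df.

```python
import pandas as pd

def get_taxes_from_api(state, annual_salary):
    # Stand-in from the question: returns a two-row DataFrame per state
    return pd.DataFrame({
        "State": [state, state],
        "annual.fica.amount": [int(annual_salary * 0.067), int(annual_salary * 1.067)],
        "annual.federal.amount": [int(annual_salary * 0.3), int(annual_salary * 1.3)],
        "annual.state.amount": [int(annual_salary * 0.048), int(annual_salary * 1.048)],
    })

df = pd.DataFrame(
    [["New York", 132826], ["New Hampshire", 128704], ["California", 127388],
     ["Vermont", 121599], ["Idaho", 120011]],
    columns=["State", "Annual Salary"],
)

# One DataFrame per input row, built by iterating over the two columns directly
frames = [
    get_taxes_from_api(state, salary)
    for state, salary in zip(df["State"], df["Annual Salary"])
]

# ignore_index=True replaces the repeated 0/1 labels with a fresh 0..n-1 index
taxes = pd.concat(frames, ignore_index=True)

# Optional: bring back the original columns (assumes State is unique in df)
merged = df.merge(taxes, on="State", how="left")
print(merged)
```

Passing the whole list to pd.concat once avoids copying the accumulated result on every iteration.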


0

You can add the same constant join key to both DataFrames and then use pd.merge:

df["df_merge_key"] = "#"
df_after_apply["df_merge_key"] = "#"
details_df = pd.merge(df, df_after_apply, how="left", on="df_merge_key").drop(labels=["df_merge_key"], axis=1)

This is simpler and neater in my opinion.
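One thing to check with a constant key is the row count: every row on the left is paired with every row on the right. The sketch below uses a two-state subset to show the difference from joining on State. It assumes df_after_apply stands for the concatenated output of get_taxes_from_api, which the answer does not define; the sample values are taken from the output shown earlier on this page.

```python
import pandas as pd

df = pd.DataFrame({"State": ["New York", "Idaho"],
                   "Annual Salary": [132826, 120011]})

# Assumption: df_after_apply is the concatenated result of get_taxes_from_api
df_after_apply = pd.DataFrame({
    "State": ["New York", "New York", "Idaho", "Idaho"],
    "annual.fica.amount": [8899, 141725, 8040, 128051],
})

# Constant-key merge: each of the 2 rows in df meets each of the 4 rows on the right
left = df.assign(df_merge_key="#")
right = df_after_apply.assign(df_merge_key="#")
cross = pd.merge(left, right, how="left", on="df_merge_key").drop(columns="df_merge_key")
print(cross.shape)     # (8, 4), with State split into State_x and State_y

# Joining on the shared State column keeps only the matching rows
by_state = pd.merge(df, df_after_apply, how="left", on="State")
print(by_state.shape)  # (4, 3)
```

If a full cross product is what you want, the constant key does the job; if each state should only be matched with its own tax rows, merging on State is probably the better fit.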
