Merge dataframes returned from applying function to DataFrame?

Question

My question is related to this question:

Merge dataframe with another dataframe created from apply function?

Here is my version of code:

import pandas as pd

col = ['State','Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

def get_taxes_from_api(state, annual_salary):
    return pd.DataFrame({'State': [state, state], 
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)], 
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)], 
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

How do I apply get_taxes_from_api to each row of df and combine the returned dataframes into one dataframe?

The only difference is that my function returns a multi-row dataframe, not a 1-row dataframe, so the solution to the question above does not work for my situation. (And I don't have enough reputation to leave a comment there.)

mitoRibo · Accepted Answer · 2022-06-29 22:38:27Z

This doesn't directly answer your question, but here's one way that doesn't use apply:

col = ['State','Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

#Create the "first" row of each state from your function by adding columns
df['annual.fica.amount'] = df['Annual Salary'].multiply(0.067)
df['annual.federal.amount'] = df['Annual Salary'].multiply(0.3)
df['annual.state.amount'] = df['Annual Salary'].multiply(0.048)

#Create the "second" row of each state as a new df
cumulative_df = df.copy()
cumulative_df['annual.fica.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.federal.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.state.amount'] += cumulative_df['Annual Salary']

#Concatenate the two tables and sort so the states are right next to each other
final_df = pd.concat((df,cumulative_df)).sort_values('State').reset_index(drop=True)
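
The code above leaves the amounts as floats and keeps the Annual Salary column, whereas get_taxes_from_api truncates with int() and returns only State plus the three amounts. A minimal sketch of the same vectorized idea that reproduces the function's output shape is shown here. The names rates, first and second are illustrative and not part of the original answer, and the stable sort is an added assumption so that each state's "first" row stays ahead of its "second" row.

```python
import pandas as pd

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire', 128704], ['California', 127388],
       ['Vermont', 121599], ['Idaho', 120011]]
df = pd.DataFrame(dat, columns=col)

# Hypothetical lookup of the rates used inside get_taxes_from_api
rates = {
    'annual.fica.amount': 0.067,
    'annual.federal.amount': 0.3,
    'annual.state.amount': 0.048,
}

first = df[['State']].copy()   # rows built with salary * rate
second = df[['State']].copy()  # rows built with salary * (1 + rate)
for name, rate in rates.items():
    # astype(int) truncates toward zero, like int() in the question's function
    first[name] = (df['Annual Salary'] * rate).astype(int)
    second[name] = (df['Annual Salary'] * (1 + rate)).astype(int)

# mergesort is stable, so rows from `first` stay ahead of rows from `second`
final_df = (pd.concat([first, second])
              .sort_values('State', kind='mergesort')
              .reset_index(drop=True))
print(final_df)
```

Note that sorting by State orders the states alphabetically, so the row order differs from the order in df.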

Yefet · Accepted Answer · 2022-06-29 23:13:30Z

You could use pd.concat to combine the DataFrames returned by apply:

nested_df = df.apply(lambda x: get_taxes_from_api(x["State"],x["Annual Salary"]),axis=1)

result = pd.DataFrame()

for element in nested_df:
    result = pd.concat([result,element])

result:

print(result)

	State	annual.fica.amount	annual.federal.amount	annual.state.amount
0	New York	8899	39847	6375
1	New York	141725	172673	139201
0	New Hampshire	8623	38611	6177
1	New Hampshire	137327	167315	134881
0	California	8534	38216	6114
1	California	135922	165604	133502
0	Vermont	8147	36479	5836
1	Vermont	129746	158078	127435
0	Idaho	8040	36003	5760
1	Idaho	128051	156014	125771
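
As a variation on the loop above, the per-row DataFrames can be collected in a list and passed to pd.concat once, which avoids re-copying the growing result on every iteration. This is a sketch rather than the answerer's code: frames is an illustrative name, and ignore_index=True is an optional choice that replaces the repeated 0/1 index with a running 0..9 index.

```python
import pandas as pd

def get_taxes_from_api(state, annual_salary):
    # Same function as in the question
    return pd.DataFrame({'State': [state, state],
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)],
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)],
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire', 128704], ['California', 127388],
       ['Vermont', 121599], ['Idaho', 120011]]
df = pd.DataFrame(dat, columns=col)

# One DataFrame per input row, built without DataFrame.apply
frames = [get_taxes_from_api(state, salary)
          for state, salary in zip(df['State'], df['Annual Salary'])]

# A single concat call over the whole list
result = pd.concat(frames, ignore_index=True)
print(result)
```

Dropping ignore_index=True keeps the 0, 1, 0, 1, ... index shown in the output above.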

edited Jun 29, 2022 at 23:13

answered Jun 29, 2022 at 23:06

Anmol Deep · Accepted Answer · 2023-07-25 10:25:01Z

You can add a common join key to both DataFrames and merge them with pd.merge:

df["df_merge_key"] = "#"
df_after_apply["df_merge_key"] = "#"
details_df = pd.merge(df, df_after_apply, how="left", on="df_merge_key").drop(labels=["df_merge_key"], axis=1)

This is simpler and neater in my opinion.
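
One thing to be aware of: a constant key pairs every row of df with every row of df_after_apply (a cross join). When the function returns several rows per state, as in the question, merging on the State column may be closer to what is wanted. The sketch below compares the two on a two-state subset. Here df_after_apply is assumed to be the concatenated output of get_taxes_from_api (the answer does not define it), and only one amount column is kept for brevity, with values taken from the output shown earlier on this page.

```python
import pandas as pd

df = pd.DataFrame({'State': ['New York', 'Idaho'],
                   'Annual Salary': [132826, 120011]})

# Assumed stand-in for the answer's df_after_apply: two rows per state
df_after_apply = pd.DataFrame({
    'State': ['New York', 'New York', 'Idaho', 'Idaho'],
    'annual.fica.amount': [8899, 141725, 8040, 128051],
})

# Constant key: every left row matches every right row (2 * 4 rows)
cross = pd.merge(df.assign(df_merge_key='#'),
                 df_after_apply.assign(df_merge_key='#'),
                 how='left', on='df_merge_key').drop(labels=['df_merge_key'], axis=1)

# Key on State: each salary row matches only its own state's rows (4 rows)
details_df = pd.merge(df, df_after_apply, how='left', on='State')

print(len(cross), len(details_df))
print(details_df)
```

With the constant key, both frames contribute a State column, so pandas suffixes them as State_x and State_y in the merged result.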
