I would like to use a function that returns multiple values to create multiple new columns in an existing pandas DataFrame.
For example, say I have this test function, which returns two values:
def testfunc(TranspoId, LogId):
    thing1 = TranspoId + LogId
    thing2 = LogId - TranspoId
    return thing1, thing2
I can assign those returned values to two different variables like so:
Thing1,Thing2 = testfunc(4,28)
print(Thing1)
print(Thing2)
I tried to do this with a dataframe in the following way:
import pandas as pd
data = {'Name':['Picard','Data','Guinan'],'TranspoId':[1,2,3],'LogId':[12,14,23]}
df = pd.DataFrame(data, columns = ['Name','TranspoId','LogId'])
print(df)
df['thing1','thing2'] = df.apply(lambda row: testfunc(row.TranspoId, row.LogId), axis=1)
print(df)
What I want is something that looks like this:
data = {'Name':['Picard','Data','Guinan'],'TranspoId':[1,2,3],'LogId':[12,14,23], 'Thing1':[13,16,26], 'Thing2':[11,12,20]}
df = pd.DataFrame(data, columns=['Name','TranspoId','LogId','Thing1','Thing2'])
print(df)
In the real world that function is doing a lot of heavy lifting, and I can't afford to run it twice, once for each new variable being added to the df.
I've been banging my head against this for a few hours. Any insights would be greatly appreciated.
Is there a way to do this with apply, lambda and a custom function?