I have a piece of code that works fine on its own, but when I put it in a loop (or use the df.apply() method), it fails.
The code is:
import pandas as pd
from functools import partial
datadf = pd.DataFrame(data, columns=['X1', 'X2'])
for i in datadf.index.values.tolist():
    row = datadf.loc[i]
    x1 = row['X1']
    x2 = row['X2']
    set1 = set([x1, x2])
    links = data2[data2['Xset'] == set1]
    df1 = pd.DataFrame(range(1, 11), columns=['year'])
    def idlist1(row, var1):
        year = row['year']
        id1a = links[(links['xx1'] == var1) & (links['year'] == year)]
        id1a = id1a['id1'].values.tolist()
        id1b = links[(links['xx2'] == var1) & (links['year'] == year)]
        id1b = id1b['id2'].values.tolist()
        id1 = list(set(id1a + id1b))
        return id1
    df1['id1a'] = df1.apply(partial(idlist1, var1=x1), axis=1)
    # ... (do other stuff to return a value using "df1")
    del df1
Here data2 is another DataFrame; I'm trying to match the values of (x1, x2) against it.
The code works fine outside the loop, by which I mean when I specify (x1, x2) directly. But when I put the code in the loop or use df.apply, I always get this error message:
ValueError: could not broadcast input array from shape (0) into shape (1)
I don't understand why. Could someone help? Thanks!
(BTW, the version of pandas is 0.18.0.)
The full error message is:
  File "<ipython-input-229-541c0f3a4d2f>", line 19, in <module>
    df1['id1a']=df1.apply(partial(idlist1,var1=x1),axis=1)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 4042, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 4155, in _apply_standard
    result = self._constructor(data=results, index=index)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 223, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 359, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 5250, in _arrays_to_mgr
    return create_block_manager_from_arrays(arrays, arr_names, axes)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 3933, in create_block_manager_from_arrays
    construction_error(len(arrays), arrays[0].shape, axes, e)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 3895, in construction_error
    raise e
ValueError: could not broadcast input array from shape (0) into shape (1)
Update: I found that df.apply somehow does not work inside the loop, so I replaced every apply call in the loop with an explicit loop, and the code works fine now. Although I have "sort of" solved the issue, I'm still very confused about why this happens. If anyone knows why, I'd really appreciate an answer. Thanks!
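For reference, here is a minimal sketch of the kind of explicit-loop replacement described in the update. The traceback suggests that apply with axis=1 tries to assemble the returned lists into a new DataFrame, which may be what fails when a row returns an empty list; that reading is an assumption and is not confirmed anywhere in this post. The sample `links` data, the value 'a', and the changed signature of `idlist1` (it takes the year and the frame directly) are hypothetical and only serve to make the example runnable.

```python
import pandas as pd

# Hypothetical stand-in for the filtered `links` frame from the question.
links = pd.DataFrame({
    'year': [1, 1, 2, 3],
    'xx1': ['a', 'b', 'a', 'c'],
    'xx2': ['b', 'a', 'c', 'a'],
    'id1': [10, 11, 12, 13],
    'id2': [20, 21, 22, 23],
})


def idlist1(year, var1, links):
    """Collect the unique ids linked to var1 in the given year."""
    same_year = links[links['year'] == year]
    id1a = same_year.loc[same_year['xx1'] == var1, 'id1'].tolist()
    id1b = same_year.loc[same_year['xx2'] == var1, 'id2'].tolist()
    # sorted() only makes the output order deterministic;
    # the original code used list(set(...)).
    return sorted(set(id1a + id1b))


df1 = pd.DataFrame({'year': range(1, 11)})

# Build the list-valued column with a plain comprehension instead of
# df1.apply(..., axis=1), so pandas never has to guess how to combine
# the returned lists (some of which are empty).
df1['id1a'] = [idlist1(year, 'a', links) for year in df1['year']]
```

Each cell of df1['id1a'] then holds a plain Python list, including empty lists for years with no match, and the rest of the loop body can use df1 as before.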