I have a loop in pandas that is really slow (more than ten minutes). I am trying to replace it with a vectorized operation, but can't work out what to use. Multiple records can share the same relationship group number while having different household numbers. If a record's household number equals its relationship group number, I want to use that record's officer number and name for every record with that relationship group number (including records whose household number is different). See the code below:
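    import numpy as np

    # pd.np was deprecated and later removed from pandas, so use numpy directly
    rg['RG Officer Number'] = np.nan
    rg['RG Officer Name'] = np.nan
    for index, row in rg.iterrows():
        # A record "leads" its group when its household number matches the group number
        if row['Relationship Group'] == row['Household Number']:
            # Select every record in the same relationship group
            mask = rg['Relationship Group'] == row['Relationship Group']
            rg.loc[mask, 'RG Officer Number'] = row['Household Primary Officer Number']
            rg.loc[mask, 'RG Officer Name'] = row['Household Primary Officer Name']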
I tried the following, but got an error ("cannot use a single bool to index into setitem"). I think I am completely off track. Maybe this is impossible with a vectorized operation, but it seems like it should be possible.
    mask = row['Relationship Group'] == row['Household Number']
    rg.loc[mask, 'RG Officer Number'] = rg.loc['Household Primary Officer Number']
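To make the goal concrete, here is a minimal sketch of one direction that might work, run on made-up sample data. The column names come from the code above, but the values are hypothetical. The idea is to compare the two columns as whole Series (rather than on a single `row`), build a small lookup table from the matching rows, and then use `Series.map` to fill every record in the group. It keeps the last matching row per group to mirror how the loop overwrites earlier matches; this is an assumption about the desired behavior, not something confirmed on the real data.

```python
import pandas as pd

# Hypothetical sample data: column names are from the question, values are made up.
rg = pd.DataFrame({
    "Household Number": [100, 101, 102, 200, 201],
    "Relationship Group": [100, 100, 100, 200, 200],
    "Household Primary Officer Number": [7, 8, 9, 5, 6],
    "Household Primary Officer Name": ["Ann", "Bob", "Cy", "Dee", "Eli"],
})

# Boolean Series: True where a record's household number equals its group number.
is_lead = rg["Relationship Group"] == rg["Household Number"]

# Lookup table indexed by relationship group, one row per group.
# keep="last" mirrors the loop, where a later matching row overwrites an earlier one.
leads = (
    rg.loc[is_lead]
    .drop_duplicates("Relationship Group", keep="last")
    .set_index("Relationship Group")
)

# Map each record's group to the lead row's officer details.
# Groups with no matching record end up as NaN, as in the loop version.
rg["RG Officer Number"] = rg["Relationship Group"].map(
    leads["Household Primary Officer Number"]
)
rg["RG Officer Name"] = rg["Relationship Group"].map(
    leads["Household Primary Officer Name"]
)
```

On this sample, all three records in group 100 pick up the officer from household 100, and both records in group 200 pick up the officer from household 200. Whether this matches the loop on the real data would still need checking.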
Any help you offer would be appreciated.
