Boolean indexer script works without Error but doesn't work

Question

The first code block works; the second runs without an error but does not give the result I expected.

The first block creates a new column, ['Type']. Stores of the same kind but with different names are binned in ['Type']. For example, shop name A and shop name B are both in column ['Naam'], and the script labels both as 'Supermarket' in column ['Type']. So far so good.

The second block is supposed to label every store, shop, etc. that is not named in the Namendict.test dictionary. I want these unrecognised shops and stores labelled as 'Diversen.'. I hope someone has a suggestion. Thanks!

1: working code:

from Namendict import test

for value in df['Naam']:
     for i, (k,v) in enumerate(test.items()):          
        boolean_indexer = df['Naam'].str.contains(k)
        df.loc[boolean_indexer, 'Type'] = (v)

2: code that is supposed to work (no error, but also no 'Diversen.' in column ['Type'], just NaN):

from Namendict import test

for value in df['Naam']:
     for i, (k,v) in enumerate(test.items()):          
        boolean_indexer = df['Naam'].str.contains(k)
        if True:
            df.loc[boolean_indexer, 'Type'] = (v)
        else:
            df.loc[boolean_indexer, 'Type'] = ('Diversen.')

Many thanks. Janneman

if True: always evaluates to True, so 'Diversen.' never gets set. I think you need to add an actual condition to check.
– Shan S, Commented Jul 14, 2020 at 13:44
Oops... you're right, of course. What was I thinking... Thanks!
– Janneman, Commented Jul 14, 2020 at 13:59
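
To make the point of the comment above concrete: the else branch needs a real condition. One possible approach, sketched here under assumptions, is to remember which rows matched at least one key and to label the remaining rows afterwards. The sample DataFrame and the contents of the test dictionary are made up for illustration, since the real Namendict.test is not shown in the question.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's data; the real Namendict.test is not shown.
test = {"Albert Heijn": "Supermarket", "Jumbo": "Supermarket", "Shell": "Fuel"}
df = pd.DataFrame(
    {"Naam": ["Albert Heijn 1423", "Jumbo Utrecht", "Shell Express", "Bakkerij Jansen"]}
)

# All-False mask that records which rows matched at least one key.
matched = pd.Series(False, index=df.index)

for k, v in test.items():
    # regex=False treats the key as a literal substring, not a regular expression.
    boolean_indexer = df["Naam"].str.contains(k, regex=False)
    df.loc[boolean_indexer, "Type"] = v
    matched |= boolean_indexer

# Rows that matched no key at all get the fallback label.
df.loc[~matched, "Type"] = "Diversen."
print(df)
```

The fallback is assigned once, after the loop, because whether a row is unrecognised can only be known after every key has been tried.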

THeek · Accepted Answer · 2020-07-14 13:39:31Z

There are multiple ways to tackle this problem. The first option is simply to replace the NaN values afterwards with 'Diversen.' using the pandas fillna function. This looks as follows:

from Namendict import test

# Looping over all existing records in the dict
for k,v in test.items():          
   boolean_indexer = df['Naam'].str.contains(k)
   df.loc[boolean_indexer, 'Type'] = v

# Filling in all empty ("nan") values with "Diversen."
df['Type'] = df['Type'].fillna("Diversen.")
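
As a possible alternative to the loop, the same labelling can be sketched in one vectorised expression. This is a different technique from the one in the answer: it joins the dictionary keys into a single regular expression, extracts the matching key with str.extract, and maps it to its type. The sample data and dictionary contents are assumptions for illustration only.

```python
import re

import pandas as pd

# Hypothetical stand-ins for the asker's data; the real Namendict.test is not shown.
test = {"Albert Heijn": "Supermarket", "Jumbo": "Supermarket", "Shell": "Fuel"}
df = pd.DataFrame(
    {"Naam": ["Albert Heijn 1423", "Jumbo Utrecht", "Shell Express", "Bakkerij Jansen"]}
)

# One capture group containing every key, escaped so each key is matched literally.
pattern = "(" + "|".join(re.escape(k) for k in test) + ")"

# Extract the matched key per row (NaN if none), map it to its type,
# and fill the unmatched rows with the fallback label.
df["Type"] = (
    df["Naam"].str.extract(pattern, expand=False).map(test).fillna("Diversen.")
)
print(df)
```

One difference to be aware of: when a name contains several keys, the loop keeps the type of the last matching key in the dictionary, whereas this sketch keeps the key that matches first in the name.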

Another option is to check whether the name exists in the 'test' dictionary. If so, the type stored in the dictionary is written to the DataFrame; otherwise 'Diversen.' is filled in. This loops over the unique names in the column instead of over all the values, so the same action is not executed multiple times. Note that the dictionary check is an exact match: it only succeeds when a name is identical to a key.

from Namendict import test

for naam in df['Naam'].unique():  # Loop over all unique names in the DataFrame
    boolean_indexer = df['Naam'].str.contains(naam)

    if naam in test.keys():  # Check whether the name already exists in the dict
        # If True --> get the type from the dictionary
        df.loc[boolean_indexer, 'Type'] = test[naam]
    else:
        # If False --> fill in 'Diversen.'
        df.loc[boolean_indexer, 'Type'] = "Diversen."

1 Comment

Janneman Over a year ago

Hey THeek, thanks! The first solution obviously works: simple yet effective. Your second suggestion unfortunately doesn't work; it sets almost all rows to 'Diversen.' in ['Type']. I think I will simply use fillna, which works all the time. Thanks again!
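
A likely explanation for the second suggestion labelling almost everything as 'Diversen.': it tests whether the full name is equal to a dictionary key, while the working loop tests whether a key occurs inside the name. The sketch below shows the difference and one way to keep the "resolve each unique name once" idea with substring matching. The helper lookup_type, the sample data, and the dictionary contents are hypothetical and not part of the original posts.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's data; the real Namendict.test is not shown.
test = {"Albert Heijn": "Supermarket", "Jumbo": "Supermarket", "Shell": "Fuel"}
df = pd.DataFrame(
    {"Naam": ["Albert Heijn 1423", "Jumbo", "Shell Express", "Bakkerij Jansen"]}
)


def lookup_type(naam, mapping, default="Diversen."):
    """Return the type of the first key contained in naam, else the default."""
    for key, label in mapping.items():
        if key in naam:  # substring test, not equality
            return label
    return default


# Exact-match test as in the second suggestion: only names identical to a key pass.
exact = [naam in test for naam in df["Naam"]]
print(exact)

# Resolve each unique name once, then map the result back onto the column.
type_per_name = {naam: lookup_type(naam, test) for naam in df["Naam"].unique()}
df["Type"] = df["Naam"].map(type_per_name)
print(df)
```

This sketch assumes every value in ['Naam'] is a string; missing values would need to be handled before the substring test.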
