numpy where with multiple conditions linked to dataframe

Question

I'm using numpy where with multiple conditions to assign a category based on a text string a transaction description.

Part of the code is below

`import numpy as np

conditions = [
    df2['description'].str.contains('AEGON', na=False),                        
    df2['description'].str.contains('IB/PVV', na=False),                   
    df2['description'].str.contains('Picnic', na=False),                   
    df2['description'].str.contains('Jumbo', na=False),  
]    
values = [
    'Hypotheek',                                                  
    'Hypotheek',                                                  
    'Boodschappen',                                               
    'Boodschappen']   

df2['Classificatie'] = np.select(conditions, values, default='unknown')

I have many conditions which - only partly shown here. I want to create a table / dataframe in stead of including every seperate condition and value in the code. So for instance the following dataframe:

import pandas as pd

Conditions = {'Condition': ['AEGON','IB/PVV','Picnic','Jumbo'],
         'Value': ['Hypotheek','Hypotheek','Boodschappen','Boodschappen']
         
        }

df_conditions = pd.DataFrame(Conditions, columns= ['Condition','Value'])

How can I adjust the condition to look for (in the str.contains) a text string as listed in df_condictions['condition'] and to apply the Value column to df2['Classificatie']? The values are already a list in the variable explorer, but I can't find a way to have the str.contains to look for a value in a list / dataframe.

    Desired outcome:
    In [3]: iwantthis
    Out[3]:
       Description               Classificatie
    0  groceries Jumbo on date   boodschappen         
1  mortgage payment Aegon.    Hypotheek
    2  transfer picnic.           Boodschappen

The first column is the input data frame, te second column is what I'm looking for.

Please note that my current code already allows me to create this column, but I want to use another more automated way using de df_condtions table.

I'm not yet really familiair with Python and I can't find anything online.

Can you please edit your question and put there sample (small) input dataframe and expected output? — Andrej Kesely
– Andrej Kesely, Commented Apr 9, 2022 at 13:54
I edited my question, did it clarify? @AndrejKesely I found another similar topic in the mean time: stackoverflow.com/questions/62854013/… tried this, but it led to strange combinations, also when adjusting the cut-offs. But in essence it seems to be the same question — Renalt
– Renalt, Commented May 1, 2022 at 18:16
thanks. I get two errors: 1) InvalidIndexError: Reindexing only valid with uniquely valued Index objects and if I run again: raise KeyError(key) from err KeyError: 'Condition' I had some duplcicates in the conditions table, which I removed. But the error keeps occurring. I've added a verification of the integrity (both set tot TRUE and false), but this does not solve the issue df_conditions = df_conditions.set_index('Condition', verify_integrity=False) — Renalt
– Renalt, Commented May 6, 2022 at 12:11

Andrej Kesely · Accepted Answer · 2022-05-01 18:39:00Z

0

Try:

import re

df_conditions["Condition"] = df_conditions["Condition"].str.lower()
df_conditions = df_conditions.set_index("Condition")

tmp = df["Description"].str.extract(
    "(" + "|".join(re.escape(c) for c in df_conditions.index) + ")",
    flags=re.I,
)

df["Classificatie"] = tmp[0].str.lower().map(df_conditions["Value"])
print(df)

Prints:

               Description Classificatie
0  groceries Jumbo on date  Boodschappen
1  mortgage payment Aegon.     Hypotheek
2         transfer picnic.  Boodschappen

answered May 1, 2022 at 18:39

Andrej Kesely

196k15 gold badges60 silver badges105 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

numpy where with multiple conditions linked to dataframe

1 Answer 1

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related