Delete specific strings from pandas dataframe with operators chaining

Question

I want to remove the rows of my dataframe file_df whose Sorte column contains certain unwanted strings, some of which I wrote as regular expressions. I tried the following code:

file_df = file_df[(file_df.Sorte != 'sonstige') & (file_df.Sorte != 'verauslagte Portokosten')
                  & (file_df.Sorte != 'erhaltenenzahlung Re  vom')
                  & (file_df.Sorte != 'geleistetenzahlung aus Re-Nr')
                  & (file_df.Sorte != '^.*Holzkisten geliefert.*$')
                  & (file_df.Sorte != '^.*Infomaterialktionspakete.*$')
                  & (file_df.Sorte != '^.*Aloe Vera  haben wir nicht im Sortiment.*$') 
                  & (file_df.Sorte != '^.*Anzeigenvorlage Planten ut`norden.*$')]

But when I execute this code, these strings are still in the dataset and I cannot figure out why. I wanted to chain the conditions in one expression to avoid creating so many copies.

Update

The code worked for some strings in the dataset, but not for others.

You are not using regular expressions here, but simple string comparison.
– mozway, Commented Oct 12, 2021 at 7:51
Should you be using OR '|' instead? Your code seems to be saying Var != 'A' & Var != 'B'. Can't be A and B at the same time.
– EBDS, Commented Oct 12, 2021 at 8:18
@mozway How can I compare strings using regular expressions?
– jeiglsperger, Commented Oct 12, 2021 at 8:22
@EBDS As all the strings are in the column "Sorte", I guess that "&" should work, since the condition that none of the unwanted strings remain in the dataset can be fulfilled, but I can try "|".
– jeiglsperger, Commented Oct 12, 2021 at 8:29
I think you could try == and "|" since you want to remove them, then use ~ to negate it.
– EBDS, Commented Oct 12, 2021 at 8:58
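
The points raised in these comments can be illustrated with a minimal sketch. The sample rows below are made up for illustration; only the column name Sorte and the pattern text come from the question. It suggests why != leaves the "regex" rows in place, and how an "is unwanted" mask built with == and | can be negated with ~:

```python
import pandas as pd

# Hypothetical sample data; only the column name "Sorte" is from the question.
file_df = pd.DataFrame(
    {"Sorte": ["sonstige", "10 Holzkisten geliefert am Montag", "Rosen"]}
)

# != compares each cell with the literal text of the pattern, so no cell
# ever equals it and every row passes this condition.
literal_mask = file_df.Sorte != "^.*Holzkisten geliefert.*$"

# Series.str.contains treats its argument as a regular expression by default.
regex_mask = file_df.Sorte.str.contains("Holzkisten geliefert", na=False)

# Build one "is unwanted" mask with == and |, then negate it with ~.
unwanted = (file_df.Sorte == "sonstige") | regex_mask
kept = file_df[~unwanted]
```

Chaining != with & is logically equivalent for the exact-match strings; the difference here comes from replacing the literal comparison with str.contains for the pattern-like entries.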

tzinie · Accepted Answer · 2021-10-12 08:07:16Z

Maybe something like this:

ls = ['sonstige', 'verauslagte Portokosten', 'erhaltenenzahlung Re  vom', ...]
file_df = file_df[~file_df.Sorte.str.contains('|'.join(ls))]
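
One caveat with this approach is that str.contains interprets the joined string as a regular expression and matches substrings, so literal entries containing characters such as "." or "(" may behave unexpectedly. A possible variant, sketched here under the assumption that some entries should match exactly and others anywhere in the cell, combines isin with escaped fragments. The names exact and fragments and the sample rows are hypothetical:

```python
import re

import pandas as pd

# Hypothetical sample data; only the column name "Sorte" is from the question.
file_df = pd.DataFrame(
    {
        "Sorte": [
            "sonstige",
            "verauslagte Portokosten",
            "Aloe Vera  haben wir nicht im Sortiment (Info)",
            "Lavendel",
            None,
        ]
    }
)

# Strings that must match the whole cell exactly (assumed split, not from the source).
exact = ["sonstige", "verauslagte Portokosten"]
# Text fragments that may appear anywhere in the cell.
fragments = ["Holzkisten geliefert", "Aloe Vera  haben wir nicht im Sortiment"]

# re.escape keeps regex metacharacters in the fragments from being interpreted.
pattern = "|".join(re.escape(fragment) for fragment in fragments)

# na=False treats missing values as "no match", so they are kept rather than
# causing an error when the mask is used for indexing.
unwanted = file_df.Sorte.isin(exact) | file_df.Sorte.str.contains(pattern, na=False)
cleaned = file_df[~unwanted]
```

The filter still runs as a single boolean-indexing step, so only one new dataframe is created.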

jeiglsperger · Accepted Answer · 2022-08-08 10:04:45Z

I figured out another solution that worked for me, derived from the answer at https://stackoverflow.com/a/54410702/14553595. It is basically a combination of your suggestions in the comments:

file_df = file_df.loc[:,~(file_df.columns.str.contains('^.*Fracht.*$', case=False)
                          | file_df.columns.str.contains('^.*Angebotspaket.*$', case=False)
                          | file_df.columns.str.contains('^.*Werbe.*$', case=False)
                          | file_df.columns.str.contains('^.*Vita.Verde.*$', case=False)
                          | file_df.columns.str.contains('^.*zahlung.*$', case=False)
                          | file_df.columns.str.contains('^.*Europalette.*$', case=False)
                          | file_df.columns.str.contains('^.*Angebotspaket.*$', case=False)
                          | file_df.columns.str.contains('^.*Aufkleber fuer Saeule.*$', case=False)
                          | file_df.columns.str.contains('^.*Aufsetzer.*$', case=False)
                          | file_df.columns.str.contains('^.*Ausstellen der Pflanzen in die Beete  pauschal.*$', case=False)
                          | file_df.columns.str.contains('^.*Ausstellungsflaeche.*$', case=False)
                          | file_df.columns.str.contains('^.*Auswaschen.*$', case=False)
                          | file_df.columns.str.contains('^.*Bild.*$', case=False)
                          | file_df.columns.str.contains('^.*etikette.*$', case=False)
                          )]
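
Note that file_df.columns.str.contains combined with .loc[:, ...] selects columns by their names, whereas the question filtered rows by the values in Sorte. The long chain of | terms can also be written as one joined pattern, and since str.contains searches anywhere in the string, the surrounding ^.* and .*$ should not be needed for single-line values. A minimal sketch of both variants, with made-up sample data and a shortened pattern list:

```python
import pandas as pd

# Hypothetical sample data; only the column name "Sorte" is from the question.
file_df = pd.DataFrame(
    {
        "Sorte": ["Frachtkosten", "Werbematerial", "Rosen", "Preisetiketten"],
        "Menge": [1, 2, 3, 4],
    }
)

# A subset of the patterns from the answer, without the redundant ^.* and .*$.
patterns = ["Fracht", "Angebotspaket", "Werbe", "Vita.Verde", "zahlung", "etikette"]
combined = "|".join(patterns)

# Row filter: drop rows whose Sorte value matches any pattern, ignoring case.
row_mask = file_df["Sorte"].str.contains(combined, case=False, na=False)
rows_kept = file_df.loc[~row_mask]

# Column filter, as in the answer: drop columns whose *name* matches any pattern.
col_mask = file_df.columns.str.contains(combined, case=False)
cols_kept = file_df.loc[:, ~col_mask]
```

Keeping the patterns in a list makes it easier to spot duplicates and to extend the filter later.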
