I want to remove rows from my DataFrame file_df whose value in the column Sorte matches certain strings or regular expressions. I tried the following code:

file_df = file_df[(file_df.Sorte != 'sonstige') & (file_df.Sorte != 'verauslagte Portokosten')
                  & (file_df.Sorte != 'erhaltenenzahlung Re  vom')
                  & (file_df.Sorte != 'geleistetenzahlung aus Re-Nr')
                  & (file_df.Sorte != '^.*Holzkisten geliefert.*$')
                  & (file_df.Sorte != '^.*Infomaterialktionspakete.*$')
                  & (file_df.Sorte != '^.*Aloe Vera  haben wir nicht im Sortiment.*$') 
                  & (file_df.Sorte != '^.*Anzeigenvorlage Planten ut`norden.*$')]

However, when I run this code, these strings are still in the dataset, and I cannot figure out why. I chained the conditions into one expression to avoid creating many intermediate copies.

Update

The code worked for some strings in the dataset but not for others.

  • You are not using regular expressions here, but simple string comparison. Commented Oct 12, 2021 at 7:51
  • Should you be using OR '|' instead? Your code seems to be saying Var != 'A' & Var != 'B'. Can't be A and B at the same time. Commented Oct 12, 2021 at 8:18
  • @mozway How can I compare strings using regular expressions? Commented Oct 12, 2021 at 8:22
  • @EBDS As all the strings are in the column "Sorte", I guess that "&" should work, since the condition that none of the unwanted strings remain in the dataset can be fulfilled, but I can try "|". Commented Oct 12, 2021 at 8:29
  • I think you could try == and "|" since you want to remove them, then use ~ to negate it. Commented Oct 12, 2021 at 8:58
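
To illustrate the points raised in the comments, here is a minimal sketch of the difference between a literal comparison and a regex match. The DataFrame sample_df and its values are made up for illustration; only the column name Sorte comes from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for file_df (assumption, not the real dataset).
sample_df = pd.DataFrame({
    "Sorte": ["sonstige", "Rosen", "3 Holzkisten geliefert am Montag", "Tulpen"],
})

# != compares whole strings literally. The pattern text is never interpreted
# as a regex, so no row equals it and every row passes the filter.
literal_mask = sample_df.Sorte != "^.*Holzkisten geliefert.*$"

# Exact matches: build a mask with isin and negate it with ~.
exact_unwanted = ["sonstige", "verauslagte Portokosten"]
without_exact = sample_df[~sample_df.Sorte.isin(exact_unwanted)]

# Pattern matches: str.contains interprets its argument as a regex.
regex_mask = sample_df.Sorte.str.contains("Holzkisten geliefert", regex=True)
without_regex = sample_df[~regex_mask]
```

Note that chaining != with & is logically the same as negating a chain of == joined with |, so the operator choice is not what keeps the rows in the dataset. The rows survive because the regex-style strings are compared as plain text.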

2 Answers

Maybe something like this:

ls = ['sonstige', 'verauslagte Portokosten', 'erhaltenenzahlung Re  vom', ...]
file_df = file_df[~file_df.Sorte.str.contains('|'.join(ls))]
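
A slightly more defensive variant of this idea, applied to the Sorte column from the question, might look like the sketch below. It assumes that some entries are meant as exact strings and others as regex fragments. The sample data and the names sample_df, literals and fragments are hypothetical. Exact strings are escaped with re.escape and anchored so they do not match as substrings, and na=False treats missing values as non-matches.

```python
import re
import pandas as pd

# Hypothetical sample data (assumption); only the column name comes from the question.
sample_df = pd.DataFrame({
    "Sorte": [
        "sonstige",
        "Rosen",
        "erhaltenenzahlung Re  vom",
        "2 Holzkisten geliefert",
        None,
        "Tulpen (sonstige Farben)",
    ],
})

# Strings that must match the whole cell exactly.
literals = ["sonstige", "verauslagte Portokosten", "erhaltenenzahlung Re  vom"]
# Regex fragments that may appear anywhere in the cell.
fragments = ["Holzkisten geliefert", "Aloe Vera"]

# Escape and anchor the literals, then join everything into one alternation.
pattern = "|".join([f"^{re.escape(s)}$" for s in literals] + fragments)

# na=False makes missing values count as "no match", so those rows are kept.
mask = sample_df.Sorte.str.contains(pattern, regex=True, na=False)
filtered = sample_df[~mask]
```

Here the row "Tulpen (sonstige Farben)" is kept because the literal "sonstige" is anchored; without the anchors, str.contains would remove it as a substring match.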

I found another solution that worked for me, derived from the answer at https://stackoverflow.com/a/54410702/14553595. It essentially combines the suggestions from the comments:

file_df = file_df.loc[:,~(file_df.columns.str.contains('^.*Fracht.*$', case=False)
                          | file_df.columns.str.contains('^.*Angebotspaket.*$', case=False)
                          | file_df.columns.str.contains('^.*Werbe.*$', case=False)
                          | file_df.columns.str.contains('^.*Vita.Verde.*$', case=False)
                          | file_df.columns.str.contains('^.*zahlung.*$', case=False)
                          | file_df.columns.str.contains('^.*Europalette.*$', case=False)
                          | file_df.columns.str.contains('^.*Angebotspaket.*$', case=False)
                          | file_df.columns.str.contains('^.*Aufkleber fuer Saeule.*$', case=False)
                          | file_df.columns.str.contains('^.*Aufsetzer.*$', case=False)
                          | file_df.columns.str.contains('^.*Ausstellen der Pflanzen in die Beete  pauschal.*$', case=False)
                          | file_df.columns.str.contains('^.*Ausstellungsflaeche.*$', case=False)
                          | file_df.columns.str.contains('^.*Auswaschen.*$', case=False)
                          | file_df.columns.str.contains('^.*Bild.*$', case=False)
                          | file_df.columns.str.contains('^.*etikette.*$', case=False)
                          )]
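
As written, the expression above builds its mask from file_df.columns, so it drops columns whose labels match. If the goal is to drop rows by their value in Sorte, as in the question, a row-wise version could look like the following sketch. The sample data and the names sample_df and fragments are assumptions for illustration. Since str.contains searches anywhere in the string, a pattern such as '^.*Fracht.*$' can be shortened to 'Fracht' for single-line values, and the alternatives can be joined into one pattern.

```python
import pandas as pd

# Hypothetical sample data (assumption, not the real dataset).
sample_df = pd.DataFrame({
    "Sorte": ["Fracht Inland", "Rosen", "Anzahlung", "WERBEmaterial", "Tulpen"],
    "Menge": [1, 10, 1, 2, 5],
})

# A subset of the fragments from the answer, joined into a single regex.
fragments = ["Fracht", "Angebotspaket", "Werbe", "Vita.Verde", "zahlung", "Europalette"]
pattern = "|".join(fragments)

# Row-wise: test the values of the Sorte column and keep the non-matching rows.
row_mask = sample_df["Sorte"].str.contains(pattern, case=False, regex=True, na=False)
rows_kept = sample_df.loc[~row_mask]

# Column-wise, as in the answer: test the column labels instead of the values.
col_mask = sample_df.columns.str.contains(pattern, case=False, regex=True)
cols_kept = sample_df.loc[:, ~col_mask]
```

In this sample, rows_kept loses the three matching rows, while cols_kept is unchanged because neither column label matches any fragment.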
