I have a dataframe with 2 columns and a few thousand rows. I need to drop the rows whose column values contain 'css', 'jpg', 'png', 'favicon', etc. It looks like this:

Referer      Count

favicon.ico   24
ponto.css     21
mobil/net     16
private/net   14
ort.jpg       11

The desired output is this:

   Referer      Count

    mobil/net     16
    private/net   14

I've tried with this:

df[df['Referer'].str.contains('css', 'jpg', 'png', 'favicon.ico')]

But this is what I got:

unsupported operand type(s) for &: 'str' and 'int'

1 Answer

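str.contains takes a single pattern, so the extra strings in your call are passed as its case, flags and na arguments, which is what raises the error. Use | (which means or in regex) to combine the substrings into one pattern, then invert the boolean mask with ~.

So the pattern needs to be css or jpg ...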

df = df[~df['Referer'].str.contains('css|jpg|png|favicon.ico')]
print (df)
       Referer  Count
2    mobil/net     16
3  private/net     14

If the values are in a list, you can join them with | - the output is the same.

L = ['css','jpg','png','favicon.ico']

df = df[~df['Referer'].str.contains('|'.join(L))]
print (df)
       Referer  Count
2    mobil/net     16
3  private/net     14
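One caveat with the patterns above: the . in favicon.ico is a regex metacharacter that matches any character, and rows with a missing Referer would put NaN into the mask. Here is a minimal self-contained sketch of a more defensive variant. It assumes the sample data from the question, and the names pattern, mask and filtered are only illustrative. It escapes each substring with re.escape and passes na=False so missing values count as non-matching.

```python
import re

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Referer": ["favicon.ico", "ponto.css", "mobil/net", "private/net", "ort.jpg"],
    "Count": [24, 21, 16, 14, 11],
})

L = ["css", "jpg", "png", "favicon.ico"]

# re.escape turns '.' into a literal dot instead of "any character"
pattern = "|".join(re.escape(s) for s in L)

# na=False treats missing Referer values as non-matching,
# so such rows are kept after the mask is inverted
mask = df["Referer"].str.contains(pattern, na=False)

filtered = df[~mask]
print(filtered)
```

For this sample the result is the same two rows as above. The escaping only matters when a substring contains characters such as ., + or ( that have a special meaning in a regex.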

5 Comments

Glad I could help! Have a nice day!
@jezrael, you are really fast :-)
@Praveen - Sometimes yes, sometimes not. But I think your solution is the same, so it is best to delete it. Thanks
@jezrael I guess bracket notation is better than dot notation.
Yes, definitely. Dot notation also causes problems with columns named like sum or mean.
