Regex for deleting two patterns in a string

Question

I'm using regex to parse HTML. So, confessing that sin right off the bat. If you have a better way, answer it here because I feel dirty and wrong.

Still, I can't find an answer to this regex question, which applies beyond HTML as well.

I have a string like:

    tag = 'style="width: 2010px; background-color: red; height: 200px; font-size: 12px"'

and I want to remove only the width and height declarations, so I tried:

    r = r'style="(width:\s?\d+px;?)|(height:\s?\d+px;?)'
    tag = re.sub(r, "", tag)

The pattern seems to match on regex101, but I'm getting a TypeError: expected string or buffer.

Works for me without modifications: ' background-color: red; font-size: 12px"'.
– DYZ, Commented Feb 27, 2017 at 21:55
Are you sure tag is a string, and not a BeautifulSoup element or some other object?
– Ben, Commented Feb 27, 2017 at 21:58
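
Ben's comment points at a likely cause: `re.sub` only accepts `str` or bytes-like input, so passing any other object raises a TypeError ("expected string or buffer" on Python 2, "expected string or bytes-like object" on Python 3). The sketch below reproduces that with `FakeTag`, a hypothetical stand-in class that is not part of BeautifulSoup or any other library, and shows that converting to `str` first avoids the error. Whether this is what actually happened in the question is an assumption.

```python
import re


class FakeTag:
    """Hypothetical stand-in for a parsed element object (not a real library class)."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup


# Pattern copied from the question, unchanged.
r = r'style="(width:\s?\d+px;?)|(height:\s?\d+px;?)'
tag = FakeTag('style="width: 2010px; background-color: red; height: 200px; font-size: 12px"')

try:
    re.sub(r, "", tag)  # tag is not a str, so re.sub refuses it
    raised = False
except TypeError:
    raised = True

# Converting to a string first lets the substitution run.
cleaned = re.sub(r, "", str(tag))
print(raised, repr(cleaned))
```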

m87 · Accepted Answer · 2017-02-27 22:07:24Z

Try the following regex:

    (?:width|height):\s?\d+px;?\s?

    import re

    # Match either property name, its pixel value, and an optional trailing semicolon and space.
    regex = r"(?:width|height):\s?\d+px;?\s?"
    test_str = '<div id="attachment_9565" class="wp-caption aligncenter" style="width: 2010px;background-color:red;height:200px">'
    # count defaults to 0, which replaces every match.
    result = re.sub(regex, "", test_str)
    print(result)
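
To see why the grouping matters, here is a small side-by-side comparison on the string from the question. In the question's pattern the `|` splits the whole expression, so the first branch also consumes `style="`; the non-capturing group `(?:width|height)` limits the alternation to the property name. The variable names are only illustrative.

```python
import re

tag = 'style="width: 2010px; background-color: red; height: 200px; font-size: 12px"'

# Pattern from the question: the alternation is between
# `style="(width:...)` and `(height:...)`, so `style="` is deleted too.
question_pattern = r'style="(width:\s?\d+px;?)|(height:\s?\d+px;?)'

# Pattern from the answer: the alternation is confined to the property name.
answer_pattern = r'(?:width|height):\s?\d+px;?\s?'

from_question = re.sub(question_pattern, "", tag)
from_answer = re.sub(answer_pattern, "", tag)

print(repr(from_question))  # no longer starts with style="
print(repr(from_answer))    # 'style="background-color: red; font-size: 12px"'
```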

edited Feb 27, 2017 at 22:07

answered Feb 27, 2017 at 21:54
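
Since the question also asks for a better way, one possible alternative (not from the answer above) is to split the style value into declarations and drop the unwanted ones by exact property name. This avoids a pitfall of the regex, which would also match the `height` inside `line-height`. The helper name `strip_properties` is hypothetical, and this sketch assumes no declaration value contains a semicolon.

```python
def strip_properties(style, unwanted=("width", "height")):
    """Return the style string without the declarations named in `unwanted`."""
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if not sep:
            continue  # skip empty or malformed fragments
        if name.strip().lower() in unwanted:
            continue  # exact name match, so line-height is kept
        kept.append(f"{name.strip()}: {value.strip()}")
    return "; ".join(kept)


style = "width: 2010px; background-color: red; height: 200px; line-height: 20px"
print(strip_properties(style))  # background-color: red; line-height: 20px
```

If the style comes from a parsed document, the same function could be applied to the attribute value after it has been read out as a plain string.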