4

I've got a table with blob text fields. This blob has a lot of html in it. One of the pieces of html is an h2. I would like to find all instances of a word in the h2 tag, and replace it with another word (while leaving other words the same).

For example, I would like to replace "wiggles" with "bumbles" inside the h2 in the following:

Before:

<h2>This is some wiggles html!</h2>
<p>And here is some more wiggles html that could be ignored</p>
<h2>And this is a decoy h2</h2>

After:

<h2>This is some bumbles html!</h2>
<p>And here is some more wiggles html that could be ignored</p>
<h2>And this is a decoy h2</h2>

A pitfall I'm concerned about is the regex not stopping at the end of the first h2 and instead continuing through to the last closing </h2>.

I have access to shell and phpmyadmin.

  • Is it wrong to want to +1 just for bumbles and wiggles? Commented Jul 1, 2010 at 20:25
  • Haha, I won't complain :). Gotta keep things fun ;) Commented Jul 1, 2010 at 20:33

3 Answers

3

Replacing text in MySQL with Regular Expressions

You can add a library to MySQL to gain this feature.

Adding: LIB_MYSQLUDF_PREG
Allows: Regular expression search & replace using PCRE.
Site: http://www.mysqludf.org/lib_mysqludf_preg/

Examples:

SELECT PREG_REPLACE('/(.*?)(fox)/' , '$1dog' , 'the quick brown fox' );

Yields:

the quick brown dog

Matching HTML with Regular Expressions

Parsing HTML with regular expressions is not easy and has a lot of pitfalls. However, your example is simple enough that you should be able to do what you're looking to do.

I think this will be helpful: http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx
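
To make the "stop at the first closing h2" point concrete, here is a minimal sketch in Python rather than MySQL. The function name replace_in_h2 is made up for illustration. It assumes the h2 elements are not nested and contain plain text; the non-greedy `.*?` is what keeps each match from running on to the last </h2>.

```python
import re

# Non-greedy match of a single <h2>...</h2>; DOTALL lets the body span lines.
H2_RE = re.compile(r"(<h2\b[^>]*>)(.*?)(</h2>)", re.IGNORECASE | re.DOTALL)


def replace_in_h2(html, old, new):
    """Replace the whole word `old` with `new`, but only inside <h2> elements."""
    word_re = re.compile(r"\b" + re.escape(old) + r"\b")

    def fix_heading(match):
        open_tag, body, close_tag = match.groups()
        # A function replacement avoids treating `new` as a regex template.
        return open_tag + word_re.sub(lambda m: new, body) + close_tag

    return H2_RE.sub(fix_heading, html)


before = (
    "<h2>This is some wiggles html!</h2>\n"
    "<p>And here is some more wiggles html that could be ignored</p>\n"
    "<h2>And this is a decoy h2</h2>"
)
after = replace_in_h2(before, "wiggles", "bumbles")
print(after)
```

If a heading contains nested tags, this sketch would also rewrite the word inside their attributes, so it is only suitable for simple markup like the example in the question.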

1

There is no regexp replace feature in MySQL proper: the regex functions are match only.

There seems to be a user-defined function that adds the functionality, but it has to be compiled and installed on the database server, which is probably not an option.

I'd recommend doing this using a programming/scripting language like PHP, using its built-in regex replace functions to change the content, and update the records.
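
A rough sketch of that read, rewrite, update loop, written in Python rather than PHP so it can run standalone. Everything named here is an assumption for illustration: the table pages, its columns id and body, and the in-memory sqlite3 database standing in for the real MySQL connection.

```python
import re
import sqlite3

H2_RE = re.compile(r"(<h2\b[^>]*>)(.*?)(</h2>)", re.IGNORECASE | re.DOTALL)
WORD_RE = re.compile(r"\bwiggles\b")


def fix_html(html):
    """Swap the word only inside <h2>...</h2>; everything else is untouched."""
    return H2_RE.sub(
        lambda m: m.group(1) + WORD_RE.sub("bumbles", m.group(2)) + m.group(3),
        html,
    )


# Assumption: sqlite3 in memory stands in for the real MySQL connection,
# and the `pages` table with `id` and `body` columns is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO pages (id, body) VALUES (?, ?)",
    [
        (1, "<h2>Some wiggles html!</h2><p>more wiggles</p>"),
        (2, "<h2>A decoy h2</h2>"),
    ],
)

# Read every row, rewrite it in application code, write back only what changed.
rows = conn.execute("SELECT id, body FROM pages").fetchall()
changed = 0
for row_id, body in rows:
    new_body = fix_html(body)
    if new_body != body:
        conn.execute("UPDATE pages SET body = ? WHERE id = ?", (new_body, row_id))
        changed += 1
conn.commit()
print(changed, "row(s) updated")
```

Writing back only the rows that actually changed keeps the update small, and running the SELECT and the rewrite without the UPDATE first is an easy dry run.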

Edit: overlooked the php tag.

1 Comment

Yeah, Oracle & PostgreSQL are the only db's I know with native regex replace functionality. SQL Server 2005+, you have to build it yourself via CLR functions...
1

HTML is not a regular language, so trying to parse it with regex is not the best option. In my opinion, an HTML parser is the better tool for this job. Here is a sample parser.
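
As one possible illustration of the parser route (not the parser linked above), here is a small sketch using Python's standard html.parser module. The class and function names are invented for this example. It re-emits the markup as it goes and only touches text that sits inside an h2, so attributes and other elements are left alone. It assumes reasonably well-formed input, and end tags come back lowercased.

```python
import re
from html.parser import HTMLParser


class H2WordReplacer(HTMLParser):
    """Re-emit the input, changing one word only in text inside <h2>."""

    def __init__(self, old, new):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.word_re = re.compile(r"\b" + re.escape(old) + r"\b")
        self.new = new
        self.depth = 0  # greater than 0 while inside an <h2>
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.depth += 1
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> are copied through verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "h2" and self.depth:
            self.depth -= 1
        self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if self.depth:
            data = self.word_re.sub(lambda m: self.new, data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&" + name + ";")

    def handle_charref(self, name):
        self.out.append("&#" + name + ";")

    def handle_comment(self, data):
        self.out.append("<!--" + data + "-->")

    def handle_decl(self, decl):
        self.out.append("<!" + decl + ">")


def replace_in_h2_parsed(html, old, new):
    parser = H2WordReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


before = (
    "<h2>This is some wiggles html!</h2>\n"
    "<p>And here is some more wiggles html that could be ignored</p>\n"
    "<h2>And this is a decoy h2</h2>"
)
result = replace_in_h2_parsed(before, "wiggles", "bumbles")
print(result)
```

Compared with a plain regex, this handles headings that contain nested tags, since only text nodes are rewritten. Constructs the sketch does not handle, such as processing instructions, would be dropped from the output.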

Enjoy!
