I have a table with two columns: one holds an id and the other holds webpage content that contains href links. I would like to write an SQL query using regex that extracts all href links from each row and strips all other characters. I am currently stuck with the query below.

SELECT id, web_data FROM web_data_table WHERE web_data REGEXP 'href';

Current output:

+----+----------------------------------------------------------------+
| id |                            web_data                            |
+----+----------------------------------------------------------------+
|  1 | random txt,href="link1"                                        |
|  2 | random txt, random txt, href="link2", href="link3", random txt |
+----+----------------------------------------------------------------+

Desired output:

+----+---------------------------+
| id |         web_data          |
+----+---------------------------+
|  1 | href="link1"              |
|  2 | href="link2" href="link3" |
+----+---------------------------+

  • Which version of MySQL are you using? Commented Oct 30, 2019 at 9:16
  • Currently using 8.0 Commented Oct 30, 2019 at 9:24
  • This might help: stackoverflow.com/questions/2412895/… Commented Oct 30, 2019 at 10:07
  • I think you will have to select the rows whose web_data column contains at least one link, and then use the regex engine of your programming language to find all links in each row. Commented Oct 30, 2019 at 10:19
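
The approach suggested in the last comment (filter rows in SQL, then extract the links in application code) could be sketched roughly as follows in Python. This is only an illustration under some assumptions: the names `HREF_RE` and `extract_hrefs` are made up for this example, the rows are hard-coded rather than fetched from MySQL, and the pattern assumes every href value is wrapped in double quotes, as in the sample data.

```python
import re

# Assumption: href values are always double-quoted, as in the sample rows.
HREF_RE = re.compile(r'href="[^"]*"')


def extract_hrefs(web_data: str) -> str:
    """Return every href="..." found in web_data, joined by single spaces."""
    return " ".join(HREF_RE.findall(web_data))


# Stand-in for the rows returned by the SELECT ... REGEXP 'href' query.
rows = [
    (1, 'random txt,href="link1"'),
    (2, 'random txt, random txt, href="link2", href="link3", random txt'),
]

# Keep the id and replace web_data with only the extracted links.
result = [(row_id, extract_hrefs(web_data)) for row_id, web_data in rows]

for row_id, links in result:
    print(row_id, links)
```

In a real script, `rows` would come from a database cursor instead of a literal list. Single-quoted or unquoted href values would need a broader pattern.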
