I have a table with two columns: one holds an id and the other holds webpage content that contains href links. I would like to write an SQL query using regex that finds all href links within each row and strips all other characters. I am currently stuck with the query below.
SELECT id,web_data FROM web_data_table WHERE web_data REGEXP 'href'
Current output:
+----+----------------------------------------------------------------+
| id | web_data |
+----+----------------------------------------------------------------+
| 1 | random txt,href="link1" |
| 2 | random txt, random txt, href="link2", href="link3", random txt |
+----+----------------------------------------------------------------+
Desired output:
+----+---------------------------+
| id | web_data |
+----+---------------------------+
| 1 | href="link1" |
| 2 | href="link2" href="link3" |
+----+---------------------------+
Select the web_data columns where there is at least one link, and then use the regex engine of your programming language to find all links in each row.
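As a minimal sketch of the application-side step, assuming Python is the programming language and that the rows have already been fetched with the REGEXP query above, something like the following could extract the links. The function name `extract_hrefs` and the hard-coded `rows` list are hypothetical stand-ins for illustration, and the pattern assumes every href value is wrapped in double quotes, as in the sample data.

```python
import re

# Assumption: href values are always double-quoted, as in the sample rows.
HREF_PATTERN = re.compile(r'href="[^"]*"')

def extract_hrefs(rows):
    """Return (id, links) pairs where links holds only the href attributes."""
    result = []
    for row_id, web_data in rows:
        # findall returns every non-overlapping match in the row's text
        links = HREF_PATTERN.findall(web_data)
        if links:
            result.append((row_id, " ".join(links)))
    return result

# Hypothetical stand-in for the rows returned by the SELECT ... REGEXP query
rows = [
    (1, 'random txt,href="link1"'),
    (2, 'random txt, random txt, href="link2", href="link3", random txt'),
]

for row_id, links in extract_hrefs(rows):
    print(row_id, links)
```

With real data, `rows` would come from a database cursor instead of a literal list. Single-quoted or unquoted href values would need a broader pattern or a proper HTML parser.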