7

Say I have the following text

..(content).............
<A HREF="http://foo.com/content" >blah blah blah </A>
...(continue content)...

I want to delete the link, that is, remove the <A> tag itself while keeping the text in between. How do I do this with a regular expression, given that the URLs will all be different?

Much thanks

2

8 Answers

18

This will remove all tags:

$string = preg_replace("/<.*?>/", "", $string);

This will remove just the <a> tags:

$string = preg_replace("/<\\/?a(\\s+.*?>|>)/i", "", $string); // "i" so that <A HREF=...> matches too
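
As a minimal sketch of the anchor-only approach applied to the question's sample: the helper name `stripAnchorTags` is hypothetical, and the pattern is a slightly tightened variant of the one above (an assumption, not the answerer's exact regex). It uses `\b` so that tags such as `<abbr>` are left alone, and the `i` flag because the sample markup is uppercase.

```php
<?php
// Hypothetical helper: removes opening and closing <a> tags, keeps the text between them.
function stripAnchorTags(string $html): string
{
    // <\/?a   opening or closing anchor
    // \b      word boundary, so <abbr> or <article> are not touched
    // [^>]*   any attributes up to the end of the tag
    // i flag  matches <A HREF=...> as well as <a href=...>
    // Fall back to the input if preg_replace reports an error (returns null).
    return preg_replace('/<\/?a\b[^>]*>/i', '', $html) ?? $html;
}

$text = '..(content).. <A HREF="http://foo.com/content" >blah blah blah </A> ..(continue content)..';
echo stripAnchorTags($text), PHP_EOL;
```

Like any regex over HTML, this can misfire on unusual markup, for example an attribute value that contains a literal `>`.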

3 Comments

won't that wipe out every tag?
isn't that what was asked for?
perfect! direct and strict.
16

Avoid regular expressions whenever you can, especially when processing XML. In this case you can use strip_tags() or SimpleXML, depending on your string.
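
A parser-based alternative can be sketched as follows. Note that this swaps in PHP's bundled DOM extension (`DOMDocument`, `DOMXPath`) rather than SimpleXML, on the assumption that the input is HTML and may not be well-formed XML. The helper name `unwrapAnchors` and the wrapper id `unwrap-root` are made up for illustration.

```php
<?php
// Hypothetical helper: unwraps every <a> element, keeping its children in place.
function unwrapAnchors(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings about sloppy markup instead of printing them.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a known root; the XML prolog is a common hint
    // that makes libxml treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="unwrap-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $root  = $xpath->query('//div[@id="unwrap-root"]')->item(0);

    // The HTML parser lowercases tag names, so <A> is found as "a".
    foreach (iterator_to_array($xpath->query('.//a', $root)) as $anchor) {
        // Move each child of the anchor in front of it, then drop the empty anchor.
        while ($anchor->firstChild !== null) {
            $anchor->parentNode->insertBefore($anchor->firstChild, $anchor);
        }
        $anchor->parentNode->removeChild($anchor);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo unwrapAnchors('<p>See <A HREF="http://foo.com/content" >blah blah blah </A> here.</p>'), PHP_EOL;
```

The parser may normalize the surrounding markup (for example, lowercasing tag names), so the output is not guaranteed to be byte-identical to the input outside the removed anchors.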


4
<?php
// Example: extract the inner text of every anchor in a string
include('simple_html_dom.php');

$html = str_get_html('<A HREF="http://foo.com/content" >blah blah blah </A><A HREF="http://foo.com/content" >blah blah blah </A>');

// Print the text of each anchor
foreach ($html->find('a') as $e) {
    // simple_html_dom exposes this property in lowercase: innertext
    echo $e->innertext;
}
?>

See PHP Simple DOM Parser.


3

Not pretty but does the job:

$data = str_ireplace('</a>', '', $data); // case-insensitive, so </A> is removed too
$data = preg_replace('/<a[^>]+href[^>]+>/i', '', $data);

1 Comment

strip_tags works well when the HTML is well formed. I had a problem with an HTML file where attributes were missing quotes, and this approach worked. Thanks!
1

strip_tags() can also be used.

Please see examples here.
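
For reference, a minimal sketch of the two usual ways to call strip_tags(); the sample string is made up for illustration. With only one argument every tag is removed. With a second argument, the listed tags are kept and everything else, including the anchors, is stripped.

```php
<?php
$text = '<p>Intro <A HREF="http://foo.com/content" >blah blah blah </A> and <b>bold</b>.</p>';

// Strip every tag, leaving only the text.
echo strip_tags($text), PHP_EOL;

// Keep only the tags listed in the second argument.
// Anchors are not listed, so they are removed while <p> and <b> survive.
echo strip_tags($text, '<p><b>'), PHP_EOL;
```

The second form only helps when the set of tags to keep is known in advance; strip_tags() has no option to remove one specific tag and keep all others.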

2 Comments

Welcome to Stack Overflow! While this may answer the question, it would be better to include the essential parts of the answer here, and provide the link for reference.
@senderle, I generally agree with you, but this time it's not "any" external page: it's PHP.net's official page, which describes the strip_tags function, and copying code samples here isn't necessary ;) This answer already contains the function name and its link reference.
1
$pattern = '/href="([^"]*)"/';
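
Note that this pattern captures the URL inside a double-quoted href attribute; on its own it does not remove the tag. A small sketch of what it matches, assuming the `i` flag is added (it is not in the pattern above) so that the uppercase `HREF` in the question's sample is found:

```php
<?php
// Same pattern as above, plus the "i" flag (an assumption for uppercase markup).
$pattern = '/href="([^"]*)"/i';
$text    = '<A HREF="http://foo.com/content" >blah blah blah </A>';

// $matches[1] holds the captured URLs, one entry per href found.
preg_match_all($pattern, $text, $matches);
print_r($matches[1]);
```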


0

I use this to replace the anchors with a text string...

function replaceAnchorsWithText($data) {
    $regex  = '/<a\b[^>]*?'; // Start of anchor tag and any attributes before href
    $regex .= '\bhref\s*=\s*[\'"]\s*(?P<link>[^\'"]*?)\s*[\'"]'; // Grab the link
    $regex .= '[^>]*>\s*'; // Any attributes or spaces that may or may not exist before the tag closes
    $regex .= '(?P<name>.*?)'; // Grab the name (lazy, so it may contain spaces)
    $regex .= '\s*<\/a\s*>/is'; // Closing anchor tag (case insensitive, dot also matches newlines)

    // A closure is used as the callback so the function also works outside a class
    return preg_replace_callback($regex, function ($match) {
        // This is what will replace the link (modify to your liking)
        return "{$match['name']}({$match['link']})";
    }, $data);
}


-2

use str_replace

2 Comments

how should he do this with different href strings ?
(I'm not the downvoter, but as it seems he will not explain why he downvoted, which is not that helpful, might I add, let's guess why...) With str_replace, you cannot specify a "pattern", which is a problem, as the URL can change; and even if it did not change, you'd have to use two calls to str_replace: one for the opening tag, and one for the closing one, as you want to keep what is between.
