
I have a string that looks like:

$string = '<a href="http://google.com">http://google.com</a>';

How can I remove the http:// part from the link text, but leave it in the href attribute?


7 Answers

11

Without using a full-blown parser, this may do the trick for most situations:

$str = '<a href="http://google.com">http://google.com</a>';

$regex = '/(?<!href=["\'])http:\/\//';

$str = preg_replace($regex, '', $str);

var_dump($str); // string(42) "<a href="http://google.com">google.com</a>"

It uses a negative lookbehind to make sure there is no href=" or href=' preceding it.

See it on IDEone.

It also takes into account people who delimit their attribute values with '.
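If the links might also use https://, the same lookbehind idea should carry over by making the "s" optional. This is only a sketch: the function name stripSchemeOutsideHref is made up for illustration, and it shares the limits of the original pattern (it only recognizes href=" or href=' written with no spaces).

```php
<?php
// Hypothetical helper name; the pattern is the one above with "s?" added.
function stripSchemeOutsideHref(string $html): string
{
    // (?<!href=["']) : skip a match that comes right after href=" or href='.
    // https?:\/\/    : match either http:// or https://.
    $regex = '/(?<!href=["\'])https?:\/\//';

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($regex, '', $html) ?? $html;
}

$str = '<a href="https://google.com">https://google.com</a>';
echo stripSchemeOutsideHref($str), PHP_EOL; // <a href="https://google.com">google.com</a>
```

Both alternatives in the lookbehind have the same length, which is what PCRE needs for a lookbehind to be accepted.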


1 Comment

that works, tx. nice site this ideone, you can actually run php code on it :)
9
$string = '<a href="http://google.com">http://google.com</a>'; 
$var = str_replace('>http://','>',$string); 

Just tried this in IDEone.com and it has the desired effect.

3 Comments

Just worth throwing out there, this won't catch > http://..., but if you trim out the spaces beforehand this should do it.
Nah, a space between the <a> tags, like <a href='...'> Text </a>
@Robert Or newline if you indent your text nodes (I often do for readability.)
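For the whitespace cases raised in these comments, one possible variation is to swap str_replace for preg_replace and let the pattern tolerate spaces or newlines after the ">". A rough sketch, with a made-up function name (stripSchemeAfterTagClose); it is still a heuristic, since it would also strip a scheme that follows the ">" of any other tag.

```php
<?php
// Hypothetical helper: strips http:// or https:// that follows a ">",
// allowing optional whitespace (spaces, tabs, newlines) in between.
function stripSchemeAfterTagClose(string $html): string
{
    // (>\s*) captures the ">" and the whitespace so $1 can put them back.
    return preg_replace('/(>\s*)https?:\/\//', '$1', $html) ?? $html;
}

$spaced = "<a href='http://google.com'> http://google.com </a>";
echo stripSchemeAfterTagClose($spaced), PHP_EOL; // <a href='http://google.com'> google.com </a>
```

The whitespace around the link text is kept as it was; only the scheme is dropped.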
4

In this simple case, the preg_replace function will probably work. For more stability, try using DOMDocument:

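$string = '<a href="http://google.com">http://google.com</a>';
$dom = new DOMDocument;
// Parse the snippet as XML; the <a> element becomes the document's root node.
$dom->loadXML($string);

$link = $dom->firstChild;
// Only the text content is changed, so the href attribute keeps its http://.
$link->nodeValue = str_replace('http://', '', $link->nodeValue);
// Serialize just the <a> node, which leaves out the XML declaration.
$string = $dom->saveXML($link);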

1 Comment

Just an edge case: you may want to use a regex so that you strip it from the beginning only. What about a link like http://example.com/send-to-friend?url=http://somewhere.com? Also, +1 for using a parser.
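A sketch of what this comment suggests, combining the DOMDocument approach with an anchored pattern so that only a leading scheme is removed. It assumes the input is a single well-formed <a> element, as in the question, and the variable name $result is just for illustration.

```php
<?php
$string = '<a href="http://example.com/send-to-friend?url=http://somewhere.com">'
        . 'http://example.com/send-to-friend?url=http://somewhere.com</a>';

$dom = new DOMDocument;
$dom->loadXML($string);

// The <a> element is the root of the parsed snippet.
$link = $dom->documentElement;

// ^\s* anchors the match to the start of the link text (allowing leading
// whitespace), so the second http:// in the query string is left alone.
$link->nodeValue = preg_replace('#^\s*https?://#', '', $link->nodeValue);

$result = $dom->saveXML($link);
echo $result, PHP_EOL;
```

With a plain str_replace the url= parameter in the visible text would lose its scheme too, which is probably not what you want.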
4
$str = 'http://www.google.com';
$str = preg_replace('#^https?://#', '', $str);
echo $str; // www.google.com

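That will work for both http:// and https://. Because of the ^ anchor it only matches at the very start of the string, so apply it to the link text itself rather than to the whole <a> element.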



2

Any simple regular expression or string replacement code is probably going to fail in the general case. The only "correct" way to do it is to actually parse the chunk as an SGML/XML snippet and remove the http:// from the value.

For any other (reasonably short) string manipulation code, finding a counterexample that breaks it will be pretty easy.
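To illustrate how easy such counterexamples are to find, here are two inputs that trip up the short approaches from this thread. The inputs and variable names are made up for the demonstration; the pattern and the str_replace call are taken from the other answers.

```php
<?php
// 1. Lookbehind regex: spaces around "=" defeat the href=" check,
//    so the scheme is removed from the attribute as well.
$lookbehind = '/(?<!href=["\'])http:\/\//';
$spacedAttr = '<a href = "http://google.com">http://google.com</a>';
$brokenHref = preg_replace($lookbehind, '', $spacedAttr);
echo $brokenHref, PHP_EOL; // <a href = "google.com">google.com</a>

// 2. str_replace on ">http://": a single space after ">" means
//    nothing matches and the string comes back unchanged.
$spacedText = '<a href="http://google.com"> http://google.com</a>';
$unchanged  = str_replace('>http://', '>', $spacedText);
echo $unchanged, PHP_EOL;
```

Both inputs are valid markup that a browser renders the same way as the original string.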

2 Comments

Well, the incorrect way is still more appropriate. There's not enough edge case potential to warrant using the overkill solution (html parser) here. A regular expression is sufficient. (The no regex for html parsing meme is somewhat dated.)
One man's "meme" is another man's correctness. We don't know how critical it is for this to work all the time, or how trustworthy the input might be. Regex will probably work, but I don't want to give @Alexandra the impression that their problem is solved for every possible input.
2

Assuming that "http://" always appears twice in $string, search the string for "http://" backwards using strripos. If the search succeeds, you'll know the start index of the "http://" you want to remove (and you know its length, of course). Now you can use substr to extract everything that comes before and after the chunk you want to remove, and join the two pieces.
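In code, that could look roughly like the following. The function name removeLastOccurrence is made up for this sketch, and it relies on the same assumption as above: the last "http://" in the string is the one in the link text.

```php
<?php
// Hypothetical helper: removes only the last occurrence of $needle.
function removeLastOccurrence(string $haystack, string $needle): string
{
    // strripos() searches from the end (case-insensitively) and
    // returns false when the needle is not found.
    $pos = strripos($haystack, $needle);
    if ($pos === false) {
        return $haystack;
    }

    // Keep everything before the match and everything after it.
    return substr($haystack, 0, $pos) . substr($haystack, $pos + strlen($needle));
}

$string = '<a href="http://google.com">http://google.com</a>';
echo removeLastOccurrence($string, 'http://'), PHP_EOL; // <a href="http://google.com">google.com</a>
```

If the link text does not contain "http://" at all, this would strip it from the href instead, so it only fits input that matches the assumption.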


1
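$string = '<a href="http://google.com">http://google.com</a>';
// Splits into ['<a href="', 'google.com">', 'google.com</a>'].
$var = explode('http://', $string);
// Put the scheme back before the href value only; $var[2] alone is just "google.com</a>".
echo $var[0] . 'http://' . $var[1] . $var[2];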

