I am trying to extract a substring from this string:

...width="100%" height="166" scrolling="no" frameborder="no" src="https://w.soundcloud.com/player/?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247&amp;color=da1716&amp;auto_play=false&amp;show_artwork=true"...

All I want is the URL portion:

http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247&amp

This is the regex I made:

$regex = "/^httpamp$/", but it does not work.

4 Answers

1

This will work:

$regex = '~url=\K\S+?&amp~';

(\K resets the start of the reported match, so only the characters matched after it are returned.)

Or, with a lookbehind:

$regex = '~(?<=url=)\S+?&amp~';
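A minimal sketch of how either pattern could be run with preg_match. The $html sample is an assumption: it is the snippet from the question with its ampersands encoded as &amp;, which is what the desired output suggests. The array keys are only labels for this example.

```php
<?php
// Hypothetical sample input: the embed attributes from the question,
// assuming the ampersands are HTML-encoded as &amp;.
$html = 'width="100%" height="166" scrolling="no" frameborder="no" '
      . 'src="https://w.soundcloud.com/player/?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247'
      . '&amp;color=da1716&amp;auto_play=false&amp;show_artwork=true"';

$patterns = [
    'keep-out'   => '~url=\K\S+?&amp~',     // \K drops "url=" from the reported match
    'lookbehind' => '~(?<=url=)\S+?&amp~',  // "url=" is asserted but never consumed
];

$results = [];
foreach ($patterns as $name => $regex) {
    // preg_match returns 1 on a match; $m[0] holds the whole match.
    if (preg_match($regex, $html, $m) === 1) {
        $results[$name] = $m[0];
    }
}

print_r($results);
```

Both entries should come out as http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247&amp because the lazy \S+? stops at the first &amp it meets.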

1

You could use:

/url=([^ ]*)/

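The URL would be in the first capturing group. Note that [^ ]* runs to the next space, so the capture also includes the remaining query parameters and the closing quotation mark.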
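As a sketch of the capturing-group approach, here is a variant with a tighter character class than the [^ ]* above (this is a swapped-in pattern, not the answer's own), so the capture stops at the first ampersand, quote, or whitespace. The $html sample and variable names are assumptions for illustration.

```php
<?php
// Hypothetical sample input, assuming &amp;-encoded ampersands as in the embed code.
$html = 'frameborder="no" src="https://w.soundcloud.com/player/'
      . '?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247'
      . '&amp;color=da1716&amp;auto_play=false&amp;show_artwork=true"';

$encoded = null;
$decoded = null;

// Group 1 collects everything after "url=" that is not &, a double quote, or whitespace.
if (preg_match('/url=([^&"\s]+)/', $html, $m) === 1) {
    $encoded = $m[1];              // still percent-encoded
    $decoded = urldecode($m[1]);   // turns %3A and %2F back into ":" and "/"
}

echo $encoded, PHP_EOL;
echo $decoded, PHP_EOL;
```

Stopping at the ampersand leaves out the trailing &amp that the question asked for; if it is needed, it can be appended afterwards.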

0

It is not advisable to parse HTML with a regex; it is better to use a DOM parser.

However, if you know what you're doing, you can use the following regex:

~\?url=([^;]+)~

Your desired URL is available in capture group 1.

Live Demo: http://www.rubular.com/r/yiqtviBaSd
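For comparison, the DOM-parser route recommended at the start of this answer might look roughly like the sketch below. It assumes PHP's DOM extension is available and that the input is a complete iframe element; the $html sample is hypothetical, built from the attributes in the question.

```php
<?php
// Hypothetical sample: a complete iframe built from the attributes in the question.
$html = '<iframe width="100%" height="166" scrolling="no" frameborder="no" '
      . 'src="https://w.soundcloud.com/player/?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247'
      . '&amp;color=da1716&amp;auto_play=false&amp;show_artwork=true"></iframe>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$trackUrl = null;
foreach ($doc->getElementsByTagName('iframe') as $iframe) {
    // The DOM returns the attribute with entities already decoded (&amp; becomes &).
    $src = $iframe->getAttribute('src');
    $query = parse_url($src, PHP_URL_QUERY);
    if (!is_string($query)) {
        continue;   // no query string on this iframe
    }
    // parse_str splits the query and percent-decodes each value.
    parse_str($query, $params);
    if (isset($params['url'])) {
        $trackUrl = $params['url'];
        break;
    }
}

echo $trackUrl, PHP_EOL;
```

Here the parser and parse_str handle the entity and percent decoding, so the result is the plain track URL rather than the encoded form with a trailing &amp.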

0

@anubhava's solution is good, but I've tweaked it to support the new HTML5 iframe SoundCloud embed code:

\?url=([^"]+)

That will match everything after url= up to, but not including, the closing quotation mark. This way a user could paste a SoundCloud widget, and the code would strip out the widget markup, leaving only the URL from the input.

Then, when you render to the screen (output from your database), a second script could match a SoundCloud URL beginning with http://api.soundcloud.com and wrap it in the widget.
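The two steps described above could be sketched as follows. The function names extractTrackUrl and renderWidget are hypothetical, the iframe attributes are copied from the question, and storing the decoded URL (rather than the encoded one) is an assumption of this example.

```php
<?php
// Hypothetical helper: reduce pasted embed code to the bare track URL.
function extractTrackUrl(string $input): ?string
{
    // Everything after "?url=" up to the closing quote.
    if (preg_match('~\?url=([^"]+)~', $input, $m) !== 1) {
        return null;
    }
    // Keep only the first query parameter (the track URL) and percent-decode it.
    $first = explode('&', $m[1], 2)[0];
    return urldecode($first);
}

// Hypothetical helper: wrap a stored track URL in the player iframe again.
function renderWidget(string $url): ?string
{
    // Trailing slash so that look-alike hosts such as api.soundcloud.com.example fail.
    if (strpos($url, 'http://api.soundcloud.com/') !== 0) {
        return null;
    }
    $src = 'https://w.soundcloud.com/player/?url=' . rawurlencode($url);
    return '<iframe width="100%" height="166" scrolling="no" frameborder="no" src="'
        . htmlspecialchars($src, ENT_QUOTES) . '"></iframe>';
}

// Usage with a hypothetical pasted widget.
$embed = '<iframe src="https://w.soundcloud.com/player/'
       . '?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F84915247&amp;color=da1716"></iframe>';
$stored = extractTrackUrl($embed);    // what would be saved to the database
$widget = renderWidget((string) $stored);

echo $stored, PHP_EOL;
echo $widget, PHP_EOL;
```

Checking the prefix before rendering means arbitrary URLs from the database are never placed into the iframe src.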
