
I am a newbie to regex and I am unable to get this right. Can someone help me? I have the source code of a web page, in which I need to locate this pattern:

http://cdn1b.mobile.website.com/videos/201102/18/174092/240P_245K_174092.mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e

I need only one instance of this URL, even if the page contains more. Instead, my match runs from the first http in the page to the last

mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e

in the page. I am using PHP.

Edited: This is what I was trying (sorry if it's stupid):

(http(s?):).*\.(mp4|flv|mkv|avi)(\?rs=[A-Za-z0-9=]+).*.(ri=[A-Za-z0-9=]+).*.(s=[A-Za-z0-9=]+).*.(e=[A-Za-z0-9=]+).*.(h=[A-Za-z0-9=]+)

Edited: Here is a pastebin of what I am getting with my expression:

http://pastebin.com/trmNzMti

  • Can you show your current regex/code? Commented Mar 7, 2014 at 22:00
  • Please provide your current regexp in PHP and exactly what you want as output. Commented Mar 7, 2014 at 22:03
  • @tenub I think he's after URIs of that form, not a specific string. Commented Mar 7, 2014 at 22:03
  • Pattern example: cdn1b.mobile.website.com/videos/201102/18/174092/… The part cdn1b.mobile.website.com/videos/201102/18/174092/240P_245K_174092 may change in the future, so the pattern should be based on: starts with http or https, ends with .mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e Commented Mar 7, 2014 at 22:04
  • PHP has special parameters on its regex functions to limit the match count. Otherwise use a callback and break out of the function after one loop cycle. Commented Mar 7, 2014 at 22:07

2 Answers


This should do it:

preg_match_all("/(http(s?):)([^\s]+)\.(mp4|flv|mkv|avi)(\?rs=[A-Za-z0-9=]+)([^\s]+)(ri=[A-Za-z0-9=]+)([^\s]+)(s=[A-Za-z0-9=]+)([^\s]+)(e=[A-Za-z0-9=]+)([^\s]+)(h=[A-Za-z0-9=]+)/", $html, $matches, PREG_SET_ORDER);

// each occurrence
foreach ($matches as $val) {
    echo "matched: " . $val[0] . "\n";
}

// first occurrence (only if something matched)
if (!empty($matches)) {
    echo $matches[0][0];
}

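I changed .* to ([^\s]+), which matches any run of characters except whitespace, so a single match can no longer stretch across whitespace from one URL to the next. You can add other characters you wish to exclude to that bracket expression between the desired matches.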
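As a rough illustration of the same idea, here is a minimal sketch that returns only the first URL. It assumes the URLs sit inside quoted HTML attributes and that the page uses a plain & between parameters (not &amp;); the $html sample, the second URL, and the extra excluded characters (quotes and angle brackets) are assumptions for the example, not taken from the original page.

```php
<?php
// Hypothetical sample input: two video URLs inside HTML attributes, with no
// whitespace between the tags (so excluding whitespace alone is not enough).
$html = '<a href="http://cdn1b.mobile.website.com/videos/201102/18/174092/240P_245K_174092.mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e">one</a>'
      . '<a href="https://cdn2.example.com/v/other.mp4?rs=125&ri=600&s=1&e=2&h=abc123">two</a>';

// The negated class [^\s"'<>] keeps the match inside one attribute value,
// so it cannot run from the first "http" to the last ".mp4?..." on the page.
$pattern = '~https?://[^\s"\'<>]+?\.(?:mp4|flv|mkv|avi)\?rs=\w+&ri=\w+&s=\w+&e=\w+&h=\w+~i';

// preg_match() stops after the first match, unlike preg_match_all().
$firstUrl = preg_match($pattern, $html, $m) === 1 ? $m[0] : null;

echo $firstUrl, "\n";
```

Using preg_match() instead of preg_match_all() is one way to get a single instance without post-processing the result array; if the page encodes ampersands as &amp;, the pattern would need adjusting.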


3 Comments

It didn't work; it gives the same output as mine. It selects from the first occurrence of http to the last occurrence of .mp4?.... Please see the pastebin link.
OK, for some reason that just worked! Thank you very much. It was a mountain for me on my own. Unfortunately I am a newbie here and can't upvote.
No problem, pleased it helped.

If you want to find a URL in a string with a regular expression, check this link, which has full patterns for different requests.

If you already have the URL string and want to find a parameter in the query string, use parse_url().

Example:

$query = parse_url('http://cdn1b.mobile.website.com/videos/201102/18/174092/240P_245K_174092.mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e');
// To get everything after http:// or https:// and before the filename in the URL:
$specific_section = $query['host'] . str_replace(basename($query['path']), '', $query['path']);

$query_parts = explode('&', $query['query']);
$params = array(); 
foreach ($query_parts as $param) { 
    $item = explode('=', $param); 
    $params[$item[0]] = $item[1]; 
} 

// Do your stuffs with $params
print_r( $params );
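As an alternative sketch of the same steps, PHP's built-in parse_str() can decode the query string without the manual explode() loop, and dirname() can strip the filename from the path. The variable names $url, $parts and $directory are chosen here for illustration only.

```php
<?php
$url = 'http://cdn1b.mobile.website.com/videos/201102/18/174092/240P_245K_174092.mp4?rs=125&ri=600&s=1394217550&e=1394228350&h=b99d1d9d38da8ba3ab99601de0cf794e';

// Split the URL into scheme, host, path and query.
$parts = parse_url($url);

// Decode the query string into an associative array of string values.
parse_str($parts['query'], $params);

// Host plus the directory part of the path, without the filename.
$directory = $parts['host'] . dirname($parts['path']) . '/';

print_r($params);
echo $directory, "\n";
```

This only helps once a single URL has already been extracted from the page; it does not locate the URL in the page source.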

2 Comments

javad, I don't want the params. For example, if this is the URL: cdn1b.mobile.website.com/videos/201102/18/174092/… I want to find the URL in the page code. The part after http, namely cdn1b.mobile.website.com/videos/201102/18/174092/, may be anything, and it may even change in the future. The trailing part is what identifies the URL in the page.
Great, you can still use parse_url(); check the section I added at the top of my solution. I am sure it will help you.
