I want to extract all .mp4 URLs from this string using a regex.

I also want to know how to get only the last .mp4 URL using a regex.

Thanks

contentType=application/x-mpegURL, url=https://video.twimg.com/amplify_video/822938952332144642/pl/BjHU8aBCbOgZNzXQ.m3u8}, 

Variant{bitrate=0, contentType=application/dash+xml, url=https://video.twimg.com/amplify_video/822938952332144642/pl/BjHU8aBCbOgZNzXQ.mpd}, 

Variant{bitrate=320000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/320x180/YqZ72rzLj3VWVhy4.mp4}, 

Variant{bitrate=832000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/640x360/A2vMgzo2ElpPP6TE.mp4}, 

Variant{bitrate=2176000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/1280x720/j9xbNzRZqEbYs_2s.mp4}]}]";

2 Answers

Regex:

https?.*?\.mp4

Literal http

Followed by an optional 's': s?

Remove the question mark if they will all use HTTPS.

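Followed by as few characters as possible: .*?

Note that . also matches spaces and commas, so when all the URLs sit on one line a match can start at an earlier non-mp4 URL and run on to the first .mp4. Using [^\s,}]+? in place of .*? keeps each match inside a single URL.

Followed by an mp4 extension (literal dot): \.mp4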
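
To collect every match in Java, one option is to call find() in a loop. The sketch below is not part of the original answer: the class name Mp4Urls, the helper findAll and the example.com URLs are made up for illustration, and it assumes a URL never contains whitespace, a comma or a closing brace, which is why it uses [^\s,}]+? rather than .*? so that a match cannot start inside an earlier .m3u8 or .mpd URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Mp4Urls {
    // Assumption: whitespace, ',' and '}' never occur inside a URL;
    // they are what separates the variants in the question's string.
    private static final Pattern MP4_URL =
            Pattern.compile("https?://[^\\s,}]+?\\.mp4");

    // Returns every .mp4 URL in the input, in order of appearance.
    static List<String> findAll(String input) {
        List<String> urls = new ArrayList<>();
        Matcher m = MP4_URL.matcher(input);
        while (m.find()) {          // each call resumes after the previous match
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical stand-in for the single-line string in the question.
        String input = "url=https://example.com/pl/a.m3u8}, "
                + "Variant{contentType=video/mp4, url=https://example.com/vid/320x180/b.mp4}, "
                + "Variant{contentType=video/mp4, url=https://example.com/vid/640x360/c.mp4}]}]";
        // prints [https://example.com/vid/320x180/b.mp4, https://example.com/vid/640x360/c.mp4]
        System.out.println(findAll(input));
    }
}
```

If the real URLs can contain commas or braces, the excluded characters in the class would need adjusting.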

1 Comment

This should be the accepted answer! However, it would be helpful if you could provide some more information on how to exclude some Strings (like jpg) within the URL. i.e. I don’t want to match <img src="https://image.jpg"><a href="http://video.mp4">
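
One way to handle the case raised in this comment is to stop the match from crossing attribute boundaries, so that a .jpg URL can never be the start of a match. This is only a sketch: the class name HtmlMp4Urls and the helper findAll are hypothetical, and it assumes a URL in the markup ends at whitespace, a quote or an angle bracket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlMp4Urls {
    // Assumption: whitespace, quotes and angle brackets never occur inside a URL,
    // so a match cannot run from one attribute value into the next.
    private static final Pattern MP4_URL =
            Pattern.compile("https?://[^\\s\"'<>]+?\\.mp4");

    // Returns every .mp4 URL found in the markup, skipping URLs with other extensions.
    static List<String> findAll(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = MP4_URL.matcher(html);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<img src=\"https://image.jpg\"><a href=\"http://video.mp4\">";
        // prints [http://video.mp4]
        System.out.println(findAll(html));
    }
}
```

With https?.*?\.mp4 the same input would yield a single match running from https://image.jpg through to video.mp4, because .*? is free to cross the quotes and tags in between.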

2 Approaches:

  1. If you're sure the URLs will always begin with https:// and the text after the end of the URL will not contain mp4 again, then you can use pattern = "https://.*mp4";

    String[] arr = {
        "contentType=application/x-mpegURL, url=https://video.twimg.com/amplify_video/822938952332144642/pl/BjHU8aBCbOgZNzXQ.m3u8}",
    
        "Variant{bitrate=0, contentType=application/dash+xml, url=https://video.twimg.com/amplify_video/822938952332144642/pl/BjHU8aBCbOgZNzXQ.mpd}",
    
        "Variant{bitrate=320000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/320x180/YqZ72rzLj3VWVhy4.mp4}",
    
        "Variant{bitrate=832000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/640x360/A2vMgzo2ElpPP6TE.mp4}",
    
        "Variant{bitrate=2176000, contentType=video/mp4, url=https://video.twimg.com/amplify_video/822938952332144642/vid/1280x720/j9xbNzRZqEbYs_2s.mp4}]}]" 
    };
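    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    // .* is greedy, so each array element should contain at most one URL.
    String pattern = "https://.*mp4";
    Pattern r = Pattern.compile(pattern);
    
    for (String line : arr) {
        Matcher m = r.matcher(line);
        // find() searches for the pattern anywhere in the line
        if (m.find()) {
            System.out.println(m.group(0)); // group(0) is the whole match
        } else {
            System.out.println("NO MATCH");
        }
    }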
    
  2. Otherwise, to support all types of URLs, change your pattern to the one defined here, with a small modification:

    // General URL pattern; the trailing "mp4" restricts matches to URLs ending in mp4.
    String pattern = 
        "(((ht|f)tp(s?)\\:\\/\\/|~\\/|\\/)|www.)" + 
        "(\\w+:\\w+@)?(([-\\w]+\\.)+(com|org|net|gov" + 
        "|mil|biz|info|mobi|name|aero|jobs|museum" + 
        "|travel|[a-z]{2}))(:[\\d]{1,5})?" + 
        "(((\\/([-\\w~!$+|.,=]|%[a-f\\d]{2})+)+|\\/)+|\\?|#)?" + 
        "((\\?([-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" + 
        "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)" + 
        "(&(?:[-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" + 
        "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)*)*" + 
        "(#([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)?\\b"+"mp4";
    

Output:

NO MATCH
NO MATCH
https://video.twimg.com/amplify_video/822938952332144642/vid/320x180/YqZ72rzLj3VWVhy4.mp4
https://video.twimg.com/amplify_video/822938952332144642/vid/640x360/A2vMgzo2ElpPP6TE.mp4
https://video.twimg.com/amplify_video/822938952332144642/vid/1280x720/j9xbNzRZqEbYs_2s.mp4
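
Neither approach covers the "last URL only" part of the question. One possible way to get it is to let the greediness of .* work in your favour: a leading .* consumes as much input as it can, so a capture group placed after it ends up on the last URL. The sketch below is an illustration rather than part of this answer; the class name LastMp4Url, the helper findLast and the example.com URLs are hypothetical, and it assumes URLs contain no whitespace, commas or closing braces.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastMp4Url {
    // The greedy .* in front backtracks only as far as needed, so group 1
    // captures the last .mp4 URL. DOTALL lets .* cross line breaks as well.
    // Assumption: whitespace, ',' and '}' never occur inside a URL.
    private static final Pattern LAST_MP4 =
            Pattern.compile(".*(https?://[^\\s,}]+?\\.mp4)", Pattern.DOTALL);

    // Returns the last .mp4 URL in the input, or empty if there is none.
    static Optional<String> findLast(String input) {
        Matcher m = LAST_MP4.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened, hypothetical stand-in for the single-line string in the question.
        String input = "Variant{contentType=video/mp4, url=https://example.com/vid/320x180/b.mp4}, "
                + "Variant{contentType=video/mp4, url=https://example.com/vid/640x360/c.mp4}]}]";
        // prints https://example.com/vid/640x360/c.mp4
        System.out.println(findLast(input).orElse("NO MATCH"));
    }
}
```

Collecting all matches in a loop and taking the final one would work too; the greedy prefix simply does it in a single find() call.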

1 Comment

The first approach will not work, since my URLs are all in one single-line string, not on separate lines or in an array. But the second approach works perfectly, thanks.
