I'm trying to create a regex that captures everything after the final "/" in a possible URL, provided the final character isn't a "/".

I have this so far:

(?<url>(http(s)?://)?([\w-]+\.)+[\w-]+[.com]+?[a-zA-Z0-9\.\/\?\:@\-_=#]+(/[/?%&=]*))

My test URLs are

https://linkedin.com/in/username

https://www.facebook.com/username

username

https://plus.google.com/u/0/username/

This passes on all except the final one. The correct result would be username for each test.

  • What is the output for the third and last ones? Commented May 24, 2016 at 14:42
  • Why use a regular expression and not just Substring based on LastIndexOf? Commented May 24, 2016 at 14:43
  • if (url.Contains("/") && !url.EndsWith("/")) { result = url.Substring(url.LastIndexOf('/') + 1); } and so on; just add the necessary logic for strings that do not contain "/". Commented May 24, 2016 at 14:45
  • System.Uri also has some useful features you could use. Commented May 24, 2016 at 14:46
  • @AdrianoRepetti is right: use System.Uri for URL manipulation and validation. It is much easier and more reliable. Commented May 24, 2016 at 14:54

3 Answers


I think you can benefit from the Uri class the framework provides. It does not give you the whole solution (it does not handle segments ending with "/"), but it does most of the job.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    List<string> strings = new List<string>
    {
        "https://linkedin.com/in/username",
        "https://www.facebook.com/username",
        "username",
        "https://plus.google.com/u/0/username/"
    };

    // Each tuple holds the index of the input and either its last segment or an error message
    List<Tuple<int, string>> results = new List<Tuple<int, string>>();

    for (int i = 0; i < strings.Count; i++)
    {
        var s = strings[i];
        try
        {
            Uri uri = new Uri(s);
            var lastSegment = uri.Segments.LastOrDefault();
            // Check for null/empty first, then skip segments that end with "/"
            if (!string.IsNullOrEmpty(lastSegment) && !lastSegment.EndsWith("/"))
                results.Add(new Tuple<int, string>(i, lastSegment));
        }
        catch (UriFormatException ex)
        {
            // s is not a valid URI, so a Uri object could not be created from it
            results.Add(new Tuple<int, string>(i, ex.Message));
        }
    }

    foreach (var segment in results)
        Console.WriteLine(segment);

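The output is a list of tuples in which the number is the element's index in your sample. The third element ("username") is reported with the exception message, because on its own it is not a valid URI. The last element is not added, because you do not want segments ending with "/".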
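If the trailing-slash case should also yield username, as the question asks, one possible extension is sketched below. It is only a sketch under stated assumptions: the class name UrlTail and the method name LastSegment are hypothetical, and treating a string that is not an absolute URI as a plain slash-separated path is my assumption, not something the answer above specifies. Uri.TryCreate is used so that inputs such as "username" do not throw.

```csharp
using System;
using System.Linq;

public static class UrlTail
{
    // Hypothetical helper: returns the last non-empty path segment, or null if there is none.
    public static string LastSegment(string input)
    {
        if (string.IsNullOrWhiteSpace(input))
            return null;

        // TryCreate returns false instead of throwing for strings like "username".
        if (!Uri.TryCreate(input, UriKind.Absolute, out Uri uri))
        {
            // Assumption: treat anything that is not an absolute URI as a plain path.
            string tail = input.TrimEnd('/').Split('/').Last();
            return tail.Length == 0 ? null : tail;
        }

        // Segments looks like ["/", "u/", "0/", "username/"]; skip the bare root "/"
        // and strip the trailing slash from whatever segment comes last.
        string last = uri.Segments.LastOrDefault(seg => seg != "/");
        return last?.TrimEnd('/');
    }
}
```

With the four sample strings from the question, this should return username each time. Note that Uri.Segments comes from the path only, so a query string or fragment is ignored, and the segment is returned in its escaped form.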

1 Comment

Well, the guys were already talking about using System.Uri above... I came too late.

If you really want to go with regex (demo):

\/(\w+)$|(\w+)$|\/(\w+)\/$
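Because the pattern has three alternatives, each with its own capture group, the caller has to pick whichever group actually matched. A minimal sketch of doing that in C# follows; the class name RegexTail and the method name Extract are hypothetical, and the pattern is the one above, unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTail
{
    // The pattern from this answer, unchanged.
    private static readonly Regex Tail = new Regex(@"\/(\w+)$|(\w+)$|\/(\w+)\/$");

    // Hypothetical helper: returns the captured tail, or null when nothing matches.
    public static string Extract(string input)
    {
        Match m = Tail.Match(input);
        if (!m.Success)
            return null;

        // Only one of the three groups participates in a given match;
        // group 0 is the whole match, so start at 1.
        for (int i = 1; i < m.Groups.Count; i++)
        {
            if (m.Groups[i].Success)
                return m.Groups[i].Value;
        }
        return null;
    }
}
```

For the four sample strings in the question this should give username. Keep in mind that \w does not match "-" or ".", so a name such as user-name would only be captured in part.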

If you want to go full C# with a bit of LINQ:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    List<string> urls = new List<string>
    {
        @"https://linkedin.com/in/username",
        @"https://www.facebook.com/username",
        @"username",
        @"https://plus.google.com/u/0/username/",
    };

    foreach (string url in urls)
    {
        // Drop any trailing "/", then take the last "/"-separated part
        Console.Out.WriteLine(url.TrimEnd('/').Split('/').Last());
    }

(?<url>(http(s)?://)?([\w-]+\.)+[\w-]+[.com]+?[a-zA-Z0-9\.\/\?\:@\-_=#]+(/*[/?%&=]*)) 

This should cover all the cases except "username". The regex Thomas wrote should cover that one.
