16

From any URL I want to extract its path.

For example:

URL: https://stackoverflow.com/questions/ask
Path: questions/ask

It shouldn't be difficult:

url[/(?:\w{2,}\/).+/]

But I think I'm using the wrong pattern for "ignore this" ('?:' doesn't work). What is the right way?

2 Answers

36

I would suggest you don't do this with a regular expression and instead use the built-in URI library:

require 'uri'

uri = URI::parse('http://stackoverflow.com/questions/ask')

puts uri.path # results in: /questions/ask

It has a leading slash, but that's easy to deal with =)
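For what it's worth, here is a minimal sketch of one way to drop that leading slash. The helper name path_without_leading_slash is made up for illustration (it is not part of the URI library), and String#delete_prefix assumes Ruby 2.5 or newer; on older versions something like sub(%r{\A/}, '') should do the same job.

```ruby
require 'uri'

# Hypothetical helper (not part of the URI library): returns the URL's
# path with a single leading slash removed.
def path_without_leading_slash(url)
  # URI.parse(...).path gives "/questions/ask"; delete_prefix strips
  # the leading "/" only if it is present (Ruby 2.5+).
  URI.parse(url).path.delete_prefix('/')
end

puts path_without_leading_slash('http://stackoverflow.com/questions/ask')
# prints: questions/ask
```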


1 Comment

I agree, using the built-in class is best. However, if you are interested in learning how to parse URIs for academic reasons, check out the source code of lib/uri/common.rb -- I've linked to Rubinius' source code because I find it easy to read. The (very complex) regular expressions are at the top of the file; the absolute URI pattern is at line 188.
3

You can use regex in this case, which is faster than URI.parse:

s = 'http://stackoverflow.com/questions/ask'

s[s[/.*?\/\/[^\/]*\//].size..-1]
# => "questions/ask"  (6.8 times faster)

s[/\/(?!.*\.).*/]
# => "/questions/ask" (9.9 times faster, but with a leading slash)

But if you don't care about the speed, use URI as ctcherry showed; it is more readable.
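As for why '?:' didn't behave as "ignore this" in the question: a non-capturing group only skips creating a capture; the text it matches is still part of the overall match, and str[regex] returns the whole match. One way to keep only part of a match is a capturing group plus the two-argument form str[regex, 1]. A rough sketch follows; the helper name path_via_capture and the pattern are my own assumptions rather than anything from the answers above, and the pattern is deliberately simplistic (it would also return any query string or fragment).

```ruby
# Hypothetical helper: returns everything after "scheme://host/" using a
# capture group, or nil when the URL does not match the pattern.
def path_via_capture(url)
  # \A\w+://  scheme and separator (matched, but outside the capture)
  # [^/]+/    host up to and including the first slash (also not captured)
  # (.*)      capture group 1: the rest of the string
  url[%r{\A\w+://[^/]+/(.*)}, 1]
end

puts path_via_capture('https://stackoverflow.com/questions/ask')
# prints: questions/ask
```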

4 Comments

If you want correctness (e.g. s = 'http://stackoverflow.com/questions//ask/stuff'), use URI.parse. Don't worry about the speed difference until you have found URI parsing to be a real bottleneck in your code.
A negative lookahead! Wow, crazy! I have to go figure out how that one works.
In a development environment a common URL may be http://localhost/test. The negative lookahead expression will fail.
I would now strongly recommend using URI.parse regardless of the speed improvement of the regex version. URI.parse is much more robust and will handle edge cases, as @ClintPachl noted.
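To make the edge cases concrete, here is a small sketch comparing the negative-lookahead regex from the answer with URI.parse. The helper name lookahead_path and the example.com URL are assumptions added for illustration. The lookahead picks the first slash that has no dot anywhere after it, so it goes wrong when the host has no dot or when the path itself contains one.

```ruby
require 'uri'

# Hypothetical wrapper around the negative-lookahead regex from the answer.
def lookahead_path(url)
  url[/\/(?!.*\.).*/]
end

urls = [
  'http://localhost/test',               # host without a dot
  'http://example.com/files/report.pdf'  # dot inside the path
]

urls.each do |url|
  puts "regex: #{lookahead_path(url).inspect}  URI: #{URI.parse(url).path.inspect}"
end
# prints:
# regex: "//localhost/test"  URI: "/test"
# regex: nil  URI: "/files/report.pdf"
```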
