0

I've got a large database of projects and issue trackers, some of which have urls.

I'd like to query it to figure out a list of urls for each project, but many have extra data I'd like to avoid.

I'd like to do something like this:

substring(tracker_extra_field_data.field_data FROM 'http://([^/]*).*')

Except some URLs use https, and I'd like to capture the scheme as well as the first subdirectory.

For example, given the url:

https://dev.foo.com/bar/action/?param=val

I'd like the select to return:

https://dev.foo.com/bar/

Is there a semi-simple way to do this with substring/regex in pgsql?

2 Answers

4

Try this:

select substring('https://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');

template1=# select substring('https://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');
        substring
-------------------------
 https://dev.foo.com/bar/
(1 row)

template1=# select substring('http://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');
       substring
------------------------
 http://dev.foo.com/bar/
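
Applied to the column from the question, the same pattern could look like the sketch below. It assumes tracker_extra_field_data is a table with a text column field_data, as the question's expression suggests; the alias base_url is only illustrative. The question does not name a project column, so grouping per project is left out.

```sql
-- Sketch only: assumes a table tracker_extra_field_data
-- with a text column field_data (names taken from the question).
SELECT DISTINCT
       substring(field_data FROM '(https?://([^/]*/){1,2})') AS base_url
FROM   tracker_extra_field_data
WHERE  field_data ~ 'https?://';   -- skip rows that contain no http(s) URL
```

Because the pattern requires at least one slash after the host, a value such as http://dev.foo.com with no trailing slash yields NULL here rather than the bare host.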

2 Comments

+1 Since you had most of it first. Consider this sqlfiddle, though.
Yeah, in my original I had the / optional (and I didn't anchor the https). If I use `(https?://([^/]*/?){1,2})` it matches more. The OP asked for the first directory as well, which is why I went with {1,2}. But yes, it all depends on how normalized the data is.
0

Updated, since I didn't read the question properly at first.

Use the pattern

^https?://[^/]+(?:/[^/]+)?/?

^ .. start of string
? .. zero or one of the preceding atom
(?:) .. non-capturing parens
[^/]+ .. any character except /, 1 or more of them

This only accepts URLs starting with http:// or https:// (protocol header required).
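
A quick way to try this pattern is to run it over a handful of literal values. The sample URLs and the alias names below are made up for illustration; only the first one comes from the question.

```sql
-- Sketch: test the anchored pattern against a few sample URLs.
SELECT url,
       substring(url FROM '^https?://[^/]+(?:/[^/]+)?/?') AS base_url
FROM  (VALUES
         ('https://dev.foo.com/bar/action/?param=val'),  -- host plus first directory
         ('http://dev.foo.com'),                         -- host only, no trailing slash
         ('ftp://dev.foo.com/bar/')                      -- wrong scheme, no match
      ) AS t(url);
```

Since the pattern has no capturing parentheses, substring returns the whole match: https://dev.foo.com/bar/ for the first row, http://dev.foo.com for the second, and NULL for the third.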

->SQLfiddle with a bigger test case.
