src.match(/^(https?\:\/\/.*)\//)
I know regular expressions but the syntax is not familiar to me. Can somebody explain to me what it's matching?
Matches anything that starts with http:// or https:// followed by any number of any character (.*), followed by another / slash.
The / slashes need to be escaped because an unescaped / would end the regex literal. I don't know why the colon is escaped too; it doesn't need to be.
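A small sketch of the escaping point, using a made-up sample URL and made-up variable names: the colon matches the same with or without the backslash, and if the pattern is built with the RegExp constructor instead of a literal, the slashes don't need escaping either, since there are no / delimiters to close.

```javascript
// Hypothetical sample input, not from the original question.
const src = "http://example.com/images/logo.png";

const withEscapedColon = /^(https?\:\/\/.*)\//;        // the regex from the question
const withPlainColon = /^(https?:\/\/.*)\//;           // same pattern, colon left unescaped
const fromConstructor = new RegExp("^(https?://.*)/"); // no literal delimiters, so no \/ needed

// All three print: http://example.com/images
console.log(src.match(withEscapedColon)[1]);
console.log(src.match(withPlainColon)[1]);
console.log(src.match(fromConstructor)[1]);
```

Note that this assumes the regex is used without the u flag, as in the question.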
.* is greedy, so if src is http://mysite.com/mypage.php">go to mypage</a>, match[1] will be http://mysite.com/mypage.php">go to mypage<. (Because of the ^ anchor, the string has to begin with the URL for there to be any match at all.) I'm not saying this regex is flawed in that fashion, you just need to be aware of it when using it.
^ start of string
( start of a capture group
http the characters "http"
s? zero or one of the character "s"
\: a colon character (escaped, though not necessary)
\/\/ two forward slash characters (escaped so that they don't close the regex literal)
.* zero or more of any character, except a line break
) end of the capture group
\/ a forward slash character (escaped so that it doesn't close the regex literal)
The starting and ending / characters simply denote regular expression literal notation.
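To see the greedy .* and the ^ anchor in action, here is a minimal sketch. The input strings are invented for illustration; only the regex comes from the question.

```javascript
const re = /^(https?\:\/\/.*)\//;

// Hypothetical input: a URL followed by leftover markup.
const messy = 'http://mysite.com/mypage.php">go to mypage</a>';
const m = messy.match(re);
// .* runs as far as it can, so the capture ends just before the LAST slash.
console.log(m[1]); // http://mysite.com/mypage.php">go to mypage<

// A string that does not begin with http:// or https:// fails because of ^.
const anchored = '<a href="http://mysite.com/mypage.php">go to mypage</a>'.match(re);
console.log(anchored); // null
```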
It's a pretty ordinary regex:
^ At the start of the string
( Start a capture
http Match "http" literally
s? Match an optional "s"
\: Match a literal colon
\/ Match a literal slash
\/ Match a literal slash
.* Then as many characters as possible
) End the capture
\/ Ending at a literal slash
The regex has the effect of capturing the protocol, host, and path from a URL and excluding any file at the end, because the greedy .* runs up to the last slash. For instance, in the case of https://www.host.com/path/to/my/file.cgi, https://www.host.com/path/to/my would be captured.
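As a rough sketch of how this might be used, the regex can be wrapped in a small function. The name baseOf is hypothetical and exists only for this illustration; the null check is there because match() returns null when the string doesn't match.

```javascript
// baseOf is a hypothetical helper name used only for this illustration.
function baseOf(src) {
  const match = src.match(/^(https?\:\/\/.*)\//);
  return match ? match[1] : null; // match() returns null when nothing matches
}

console.log(baseOf("https://www.host.com/path/to/my/file.cgi")); // https://www.host.com/path/to/my
console.log(baseOf("http://www.host.com/"));                     // http://www.host.com
console.log(baseOf("http://www.host.com"));                      // null: no slash after the host
```

The last case shows that a bare host with no trailing slash doesn't match at all, since the pattern requires at least one / after the http:// or https:// prefix.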