Let's say I have a db url string which looks like this:

"mysql2://foo:bar@baz.com/fizz?reconnect=true"

and I came up with these regexes for extracting the username, password and host name:

/\w:\/\/(\w+):/ # extracts username ("foo")
/\w:\/\/\w+:(\w+)/ # extracts password ("bar")
/\w:\/\/\w+:\w+@([\w.-]+)/ # extracts host name ("baz.com")

How can this regex be improved / made more efficient?

  • It is pretty safe to say that regex runtime is not polynomial in the worst case, so it is not an efficient approach for large data anyway. For small data like this, it doesn't even matter. Commented Apr 3, 2020 at 17:42
  • Check out this SO question: "What's the Time Complexity of Average Regex algorithms?" Commented Apr 3, 2020 at 17:50
  • Thanks! Very good point about data size! Commented Apr 3, 2020 at 18:06

1 Answer

Here's a regex combining your 3 into one regex with 3 different capturing groups:

\w:\/{2}(\w+):(\w+)@(\w+\.\w+)
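
As a minimal sketch of how the combined pattern could be used, assuming Ruby (which the regex literals in the question suggest) and assuming the sample URL's password and host are "bar" and "baz.com" as the question's comments state. The variable names here are illustrative only:

```ruby
# Sample URL from the question (password and host assumed to be "bar" and "baz.com").
url = "mysql2://foo:bar@baz.com/fizz?reconnect=true"

# One pattern, three capture groups: username, password, host name.
db_url_pattern = /\w:\/{2}(\w+):(\w+)@(\w+\.\w+)/

# match returns nil when the URL does not fit the pattern,
# so only unpack the captures when there is a match.
match = db_url_pattern.match(url)
username, password, host = match.captures if match

puts username
puts password
puts host
```

Note that `(\w+\.\w+)` only covers hosts of the form `name.tld`; a host with subdomains or hyphens would need a wider character class.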

They seem to be pretty straightforward and fast regexes to begin with, but here's a good tool to test your regexes: https://regex101.com/. It shows you how many steps it takes to run based on your samples and the capture groups. For me it's one of the first tools I pull up when working on a new regex that isn't simple.

As for improving regexes, you want the engine to perform as few steps as possible, so a regex that matches quickly and fails quickly helps. For example, if the scheme is always mysql2, you can start the regex with 2:\/{2} instead of \w:\/{2}, and that cuts out 10 steps based on the regex I have above.
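
The literal-prefix idea can be sketched like this, again assuming Ruby and the sample URL with "bar" and "baz.com". It only shows that both variants capture the same values; the step counts themselves are what regex101 reports and are not measured here:

```ruby
url = "mysql2://foo:bar@baz.com/fizz?reconnect=true"

# Combined pattern from the answer: \w can match at any word character,
# so the engine may try several starting positions before "2://".
generic = /\w:\/{2}(\w+):(\w+)@(\w+\.\w+)/

# Variant that assumes the scheme always ends in "2" (i.e. it is always mysql2):
# the literal "2" lets the engine reject other starting positions immediately.
literal_prefix = /2:\/{2}(\w+):(\w+)@(\w+\.\w+)/

# &. keeps the result nil instead of raising when there is no match.
generic_captures = generic.match(url)&.captures
literal_captures = literal_prefix.match(url)&.captures

p generic_captures
p literal_captures
```

The trade-off is that the literal version no longer matches URLs whose scheme ends in a different character, so it only fits when the scheme is fixed.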
