
Note: this is a theoretical question about the PHP flavor of regex, not a practical question about validation in PHP. I am merely using domain names for lack of a better example.

"Second Level Domain" here refers to the combination of letters, digits, periods, and/or dashes placed between http:// or http://www. and .com (or .co, .info, etc.).

I am only interested in second level domains that use the English version of the Latin alphabet.

This pattern:

[A-Za-z0-9.-]+

matches valid domain names, such as stackoverflow, StackOverflow, stackoverflow.co (as in stackoverflow.co.uk), stack-overflow, or stackoverflow123.

However, the same pattern would also match something like stack...overflow, stack---over--flow, ........, --------, or even a single . or -.
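To make the problem concrete, here is a minimal sketch that runs the pattern above against the examples from this question. It uses Python's re module purely as a test harness (an assumption for illustration; the question is about PHP, but a plain character class like this one behaves the same way in PCRE). The names LOOSE, wanted and unwanted are made up for the example.

```python
import re

# The character-class pattern from the question. fullmatch() requires the
# whole string to consist of allowed characters, nothing more.
LOOSE = re.compile(r"[A-Za-z0-9.-]+")

# Strings that should be accepted.
wanted = ["stackoverflow", "StackOverflow", "stackoverflow.co",
          "stack-overflow", "stackoverflow123"]

# Strings that should be rejected, but are not.
unwanted = ["stack...overflow", "stack---over--flow",
            "........", "--------", ".", "-"]

for s in wanted + unwanted:
    # The class only restricts WHICH characters appear, not their order,
    # so every string in both lists is accepted.
    print(f"{s!r:24} {bool(LOOSE.fullmatch(s))}")
```

Every line prints True, because a single character class says nothing about where the periods and dashes may appear.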

How can that pattern be rewritten, to indicate that period signs and dashes, even though they can be used multiple times in a node,

  • cannot be used without other symbols,
  • cannot be placed twice or more side by side with each other,
  • and cannot be placed in the beginning or end of the node?

Thank you in advance!

  • Thank you, @Gordon, your edit makes the question much clearer! Commented Feb 14, 2013 at 15:51
  • You are welcome. Tbh, I am not sure using Second Level Domains is a good example. At least, stack---overflow.com is a valid SLD. Maybe you can rewrite it to exclude any hints to Domains at all. Your rules for what the Regex should allow are pretty clear without using SLDs as the example purpose. Commented Feb 14, 2013 at 16:00
  • Hmm... I'm a bit reluctant to make the question appear too abstract. Someone may move it to some other StackExchange website. Commented Feb 14, 2013 at 16:03
  • The only valid separator in a DNS name is the period (.); anything else is part of that "level" of the FQDN. I don't think two consecutive periods (..) are valid in DNS, since that would mean an "empty" level. Commented Feb 14, 2013 at 16:03

2 Answers


I think something like this should do the trick:

^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$

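What this tries to do:

  • ^ and $ anchor the match to the beginning and the end of the string,
  • [a-zA-Z0-9]+ matches one or more letters or digits,
  • [.-] matches exactly one dot or hyphen right after them,
  • ( ... )* repeats the group above zero or more times,
  • the final [a-zA-Z0-9]+ requires one or more letters or digits at the end.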
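The breakdown above can be checked against the three rules from the question with a small sketch. Python's re module is used only as a harness here (an assumption for illustration; the pattern uses no syntax specific to either engine, so PHP's preg_match should give the same verdicts). The names STRICT and is_valid are hypothetical.

```python
import re

# The pattern from this answer: alphanumeric runs joined by single
# separators, with an alphanumeric run required at the end.
STRICT = re.compile(r"^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$")

def is_valid(s: str) -> bool:
    """Return True if s satisfies the pattern from this answer."""
    return STRICT.match(s) is not None

# Separators may appear several times, as long as each one is
# surrounded by letters or digits.
accepted = ["stackoverflow", "stack-overflow", "stack-over-flow",
            "stackoverflow.co", "a"]

# Each of these breaks one of the three rules from the question.
rejected = [".", "-",                       # separator without other symbols
            "stack...overflow", "stack.-overflow",  # two separators side by side
            "-stack", "stack-"]             # separator at the beginning or end

for s in accepted + rejected:
    print(f"{s!r:24} {is_valid(s)}")
```

One caveat: in both Python and PCRE, $ also matches just before a single trailing newline, so input read from a form or file may need trimming first.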


Assuming that you are looking for a regex that does not allow two consecutive . or - characters, you can use:

^[a-zA-Z0-9]+([-.][a-zA-Z0-9]+)*$

regexr demo
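As a quick sanity check, the sketch below compares this pattern with the one from the other answer on a handful of inputs, including strings with more than one separator. It is written in Python only for convenience (an assumption; neither pattern relies on engine-specific syntax), and the names PREFIX_FORM, SUFFIX_FORM and samples are invented for the example.

```python
import re

# Pattern from the other answer: (run + separator)* followed by a run.
PREFIX_FORM = re.compile(r"^([a-zA-Z0-9]+[.-])*[a-zA-Z0-9]+$")

# Pattern from this answer: a run followed by (separator + run)*.
SUFFIX_FORM = re.compile(r"^[a-zA-Z0-9]+([-.][a-zA-Z0-9]+)*$")

samples = ["stackoverflow", "stack-overflow", "stack-over-flow",
           "stack.over-flow.co", "stack--overflow", "stack..overflow",
           ".stack", "stack.", "-", ""]

for s in samples:
    a = PREFIX_FORM.match(s) is not None
    b = SUFFIX_FORM.match(s) is not None
    # Both forms describe the same set of strings, so a and b agree.
    print(f"{s!r:24} prefix={a!s:5} suffix={b!s:5}")
```

Because the group is repeated with *, any number of separators is allowed, so stack-over-flow is accepted just like stack-overflow.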

2 Comments

Thank you @Salman A, but wouldn't it validate something like this: stack-overflow, while considering stack-over-flow as invalid?
Thank you for the edit @Salman A, and for the link. I wasn't aware of that regex tool, it's very useful!
