
I am trying to create a regex for the first time. It should validate a URL under the following conditions:

  1. It may start with or without www
  2. It must not start with http:// or https://
  3. The only allowed special characters are hyphen (-) and dot (.)
  4. It ends with a TLD such as .com, .io, .co.in, etc.

Some of the examples for reference:

Valid:

xyz.com
xyz-px.com
www.xyz.com
abc.xyz.px.io
www.abc.xyz.px.io

Invalid:

xyz-.com
xy_pm.com
http://xyz.io
https://xyz.io
http://www.pm.xyz
[email protected]
xyz.io$#
www.xy
xyz.io-

I have created this regex:

/(^(?!https|http)?:\/\/(www\.?)[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,})/gi

But it does not work as desired.

  • Maybe like ^[^\W_]+(?:[-.][^\W_]+)*(?:\.[a-z]{2,})+$ regex101.com/r/E08Ce3/1 Commented Jun 7, 2021 at 9:08
  • It won't work for the fourth condition (ending with a top-level domain) Commented Jun 7, 2021 at 10:59
  • What does not work exactly? Commented Jun 7, 2021 at 11:01
  • www.xy is accepted as a valid URL, but it should not be Commented Jun 7, 2021 at 11:03
  • Great! This will work in my case Commented Jun 7, 2021 at 11:47

3 Answers

You might use

^[^\W_]+(?:[-.][^\W_]+)*\.(?:io|com)$

The pattern matches:

  • ^ Start of string
  • [^\W_]+ Match 1+ word chars without an _
  • (?:[-.][^\W_]+)* Optionally repeat matching . or - and 1+ word chars without _
  • \. Match a literal dot
  • (?:io|com) Match any of the alternatives
  • $ End of string
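As a quick check, the pattern above can be run against the sample inputs from the question. This is only a sketch: the helper name `isAllowedHost` is made up for illustration, and the obfuscated email-style sample from the question is left out because its original text is not recoverable.

```javascript
// Pattern from this answer; only .io and .com are accepted as TLDs.
const pattern = /^[^\W_]+(?:[-.][^\W_]+)*\.(?:io|com)$/;

// Sample inputs copied from the question.
const valid = ["xyz.com", "xyz-px.com", "www.xyz.com", "abc.xyz.px.io", "www.abc.xyz.px.io"];
const invalid = ["xyz-.com", "xy_pm.com", "http://xyz.io", "https://xyz.io",
                 "http://www.pm.xyz", "xyz.io$#", "www.xy", "xyz.io-"];

// Hypothetical helper name, not part of the original post.
function isAllowedHost(input) {
  return pattern.test(input);
}

for (const s of valid) console.log(s, isAllowedHost(s));
for (const s of invalid) console.log(s, isAllowedHost(s));
```

The sketch deliberately drops the `g` flag used in the question: `RegExp.prototype.test` on a global regex updates `lastIndex` between calls, which can give surprising results when the same regex object is reused for several strings.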

Regex demo


You can use

^(?!(http|https):\/\/)(?:www\.)?(?:(?:[a-zA-Z0-9])(?:[-\.][a-zA-Z0-9])?)+(?:\.[a-zA-Z0-9]+)$

DEMO - REGEXR


3 Comments

How can I make sure that the string ends with a TLD? Right now it accepts www.xy, which is not a valid URL.
That is very hard to do, because there could be more than 1000 top-level domains.
Yes, but the string must end with some top-level domain, not necessarily .com or .in; it can be .xyz or anything else. That is, www.xy.xy is allowed but www.xy is not.
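
One possible way to express the rule from these comments (any TLD is fine, but a bare www.xy is rejected) is a negative lookahead that refuses "www." followed by a single label. This is a sketch, not the answerer's pattern: the names `hostPattern` and `looksLikeHost` are made up, and it assumes a TLD is simply two or more letters. It does not check that the TLD really exists.

```javascript
// Assumption: a TLD is any run of 2+ letters; no real TLD list is consulted.
// (?!www\.[^.]*$) rejects "www." followed by only one more label (e.g. www.xy).
// Each label is alphanumeric, with single hyphens allowed only between
// alphanumeric runs, so a label cannot start or end with "-".
const hostPattern =
  /^(?!www\.[^.]*$)[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+(?:-[a-z0-9]+)*)*\.[a-z]{2,}$/i;

// Hypothetical helper name, not from the original thread.
function looksLikeHost(input) {
  return hostPattern.test(input);
}

console.log(looksLikeHost("www.xy.xy")); // accepted: www plus label plus TLD
console.log(looksLikeHost("www.xy"));    // rejected by the lookahead
console.log(looksLikeHost("xyz.co.in")); // accepted: multi-part ending
```

The scheme prefixes need no separate check here, since ":" and "/" are not in any of the character classes.
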

The modified regex is

/^(?!(https|http):\/\/)(?:www\.)?[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9](?:.[^\s\W]{2,})+$/gi

https://regex101.com/r/aNy8St/1

3 Comments

Explain your modifications please.
It won't work when the URLs must end with a TLD such as .com, .io, .co.in, etc.
There's no need to bother with the 'www' part - it's just a subdomain, which is handled already by the next capture group.
