
I have built an API where you can register a callback URL.

The URLs are validated using the Apache UrlValidator class.

I now have to add a feature that allows placeholders in the configured URL, for example:

https:/foo.com/${placeholder1}/bar/${placeholder2}

These placeholders will be dynamically replaced using the Apache StrSubstitutor or something similar.

Now my issue: how do I validate the URLs that contain placeholders?

I have thought of a solution:

  • I replace the expected placeholders with an example value
  • Then I validate the URL using the Apache UrlValidator

My issue with this solution is that the Apache UrlValidator only returns a boolean so the error message will be quite ambiguous.

Is there another solution besides creating my own regex?

Update: following discussions in the comments

There is a finite number of allowed placeholders.

The format of the Strings that will replace the placeholders is also known.

The first objective is to be able to check, at the time it is configured, whether the given URL (which may contain placeholders) is valid.

The second objective is, if the URL is not valid, to return an intelligible error message.

There are multiple error cases:

  • A placeholder used in the URL is not in the allowed placeholder list
  • The URL is not valid independently of the placeholders
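The first error case can be checked on its own, before any URL validation. The following is a minimal sketch, assuming placeholders always have the `${name}` form shown above; the class name `PlaceholderCheck` and the method `findUnknownPlaceholders` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderCheck {
    // Matches ${name}; the name is captured reluctantly up to the first closing brace.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.*?)\\}");

    /** Returns one message per placeholder that is not in the allowed set (hypothetical helper). */
    public static List<String> findUnknownPlaceholders(String url, Set<String> allowed) {
        List<String> errors = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(url);
        while (m.find()) {
            String name = m.group(1);
            if (!allowed.contains(name)) {
                // Report the name and its position so the caller can build a precise message.
                errors.add("Unknown placeholder '" + name + "' at index " + m.start());
            }
        }
        return errors;
    }
}
```

An empty list means every placeholder found is allowed; it says nothing yet about whether the rest of the URL is valid, which is the second error case.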
10
  • When are you going to validate: before or after the substitution? Commented Jan 31, 2018 at 15:52
  • Before the substitution, when the callback configuration is created. The placeholders are known, I may do a sample substitution. Commented Jan 31, 2018 at 15:54
  • Are you going to sanitize replacement strings separately? Let's say you have a good skeleton and then use something like ":::::" as the replacement? Commented Jan 31, 2018 at 15:57
  • The replacement strings are already specified and sanitized (serial numbers and ids...). Commented Jan 31, 2018 at 15:59
  • Is this https:/foo.com/ a valid URL? Also, where do you expect the placeholders to be within the URL? Commented Jan 31, 2018 at 17:01

2 Answers

1

For a minimal URL validation, you could use the java.net.URL constructor (it will work with your https:/foo.com/${placeholder1}/bar/${placeholder2} example).

According to the docs, it throws:

MalformedURLException - if no protocol is specified, or an unknown protocol is found, or spec is null.

You can then leverage the URL methods as a bonus, to get parts of it such as path, protocol, etc.
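As a rough sketch of this lenient check, the snippet below wraps the `java.net.URL` constructor and surfaces the exception message instead of a bare boolean. The class name `LenientUrlCheck` and the method `check` are hypothetical. Note that the `URL(String)` constructor is deprecated in recent JDKs (Java 20 onwards), although it is still available.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class LenientUrlCheck {
    /** Returns null when the URL parses, otherwise the parser's error message (hypothetical helper). */
    @SuppressWarnings("deprecation") // URL(String) is deprecated since Java 20 but still present
    public static String check(String spec) {
        try {
            new URL(spec); // only checks that a known protocol is present
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }
}
```

With this, `check("https:/foo.com/${placeholder1}/bar/${placeholder2}")` returns null, while a string without a protocol, such as `foo.com/bar`, yields a message from the exception.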

I would definitely advise against re-inventing the wheel with regex for URL validation.

Note that java.net.URI has a much stricter validation and would fail your example with placeholders as is.

Edit

As discussed, since you need to validate placeholders as well, you probably want to actually try to fill them first and fail fast if something's wrong, then proceed and validate the populated URL against java.net.URI, for strict validation.
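A minimal sketch of that fill-then-validate approach follows. To stay dependency-free it uses a plain regex replacement rather than StrSubstitutor, and it assumes one sample value per allowed placeholder; the class name `FillThenValidate`, the method `validate`, and the sample map are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FillThenValidate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.*?)\\}");

    /** Returns null when the template is valid, otherwise an error message (hypothetical helper). */
    public static String validate(String template, Map<String, String> samples) {
        StringBuffer filled = new StringBuffer();
        Matcher m = PLACEHOLDER.matcher(template);
        while (m.find()) {
            String sample = samples.get(m.group(1));
            if (sample == null) {
                // Fail fast: the placeholder is not in the allowed list.
                return "Placeholder '" + m.group(1) + "' is not allowed";
            }
            m.appendReplacement(filled, Matcher.quoteReplacement(sample));
        }
        m.appendTail(filled);
        try {
            new URI(filled.toString()); // strict syntax check on the populated URL
            return null;
        } catch (URISyntaxException e) {
            return "Invalid URL after substitution: " + e.getMessage();
        }
    }
}
```

Because `{` and `}` are not legal URI characters, a malformed placeholder that is left unfilled (for example a missing closing brace) should also be rejected by the `URI` step, with a message that includes the offending index.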

General caveat

You might also want to make your life easier and leverage an existing framework that would allow you to use annotated path variables in the first place (e.g. Spring, etc.), but that's quite a broad discussion.


6 Comments

I should have added that there is a list of "allowed" placeholders and I want to check that only those are used.
@A.Malle and are you doing that originally by configuring the UrlValidator?
The placeholder handling is part of the new feature. Until now, the URLs did not allow any placeholders.
@A.Malle right, so you could use URL for the basics, then regex to validate anything found in between ${ and } (reluctantly matched), against a set of accepted placeholders. Of course, things would get more complicated if there was a possibility to have "malformed" placeholders and such like. Or, you could try and populate with StrSubstitutor, fail fast if something wrong happens, and otherwise validate against the end result with stricter URI.
However, if I use java.net.URL for the validation with the placeholders, and then just check the placeholders with a regex... that may not be too ugly?
0

The UrlValidator from commons-validator accepted placeholders in URLs before version 1.7, because it used a custom regexp instead of validating through new URI().

So, while it's not a ready-to-use solution, you could take the source code of UrlValidator and DomainValidator#unicodeToASCII from version 1.6 of the validator library and compose a custom PlaceholdersAllowingUrlValidator.
