I've written a regex to help validate a String for game character names. It's somehow passing seemingly invalid strings and not passing seemingly valid strings.

Requirements:

  • Starts with a capital letter
  • Has any number of alphanumeric characters after that (this includes spaces)

This is the rails code that does the validation in the Character Model:

validates :name, format: { with: %r{[A-Z][a-zA-Z0-9\s]*} }

Here's the unit test I'm using

test "character name should be properly formatted and does not contain any special characters" do
    character = get_valid_character
    assert character.valid?

    character.name = "aBcd"
    assert character.invalid?, "#{character.name} should be invalid"

    character.name = "Number 1"
    assert character.valid?, "#{character.name} should be valid"

    character.name = "McDonalds"
    assert character.valid?, "#{character.name} should be valid"

    character.name = "Abcd."
    assert character.invalid?, "#{character.name} should be invalid"

    character.name = "Abcd%"
    assert character.invalid?, "#{character.name} should be invalid"
end

The problem: the validation accepts "aBcd", "Abcd.", and "Abcd%" when it shouldn't. I tested the same pattern in Python, and there it behaves just as you would expect.

What gives?

Thank you for your help!

2 Answers

Regular expressions look for matches anywhere in the given string unless told otherwise.

So the test string 'aBcd' is invalid, but it contains a valid substring: 'Bcd'. Same with 'Abcd%', where the valid substring is 'Abcd'.
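To see which substring the unanchored pattern latches onto, here is a small plain-Ruby sketch (no Rails needed). The variable name `unanchored` is only for illustration; the pattern itself is the one from the question.

```ruby
# The unanchored pattern from the question.
unanchored = %r{[A-Z][a-zA-Z0-9\s]*}

# String#[] with a regex returns the first matched substring, or nil if nothing matches.
["aBcd", "Abcd.", "Abcd%"].each do |name|
  puts "#{name.inspect} matches on #{name[unanchored].inspect}"
end
```

Each of the three "invalid" names contains a stretch that satisfies the pattern, which is all the format validator needs in order to pass.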

If you want to match the entire string, use this as your regex:

# \A matches string beginning, \z matches string end
%r{\A[A-Z][a-zA-Z0-9\s]*\z}

PS: Some people will say to match the beginning of a string with ^ and the end with $. In Ruby, those symbols match the beginning and end of a line, not a string. So "ABCD\n%" would still match if you used ^ and $, but won't match if you use \A and \z. See the Rails security guide for more on this.
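The line-versus-string difference can be checked in plain Ruby. This is a minimal sketch; the variable names `line_anchored` and `string_anchored` are made up for the example.

```ruby
# Same character classes, two different kinds of anchors.
line_anchored   = /^[A-Z][a-zA-Z0-9\s]*$/    # ^ and $ anchor to a line
string_anchored = /\A[A-Z][a-zA-Z0-9\s]*\z/  # \A and \z anchor to the whole string

name = "ABCD\n%"

# The first line, "ABCD", satisfies ^...$ on its own, so this prints true.
puts line_anchored.match?(name)

# The whole string must match here, and "%" is not allowed, so this prints false.
puts string_anchored.match?(name)
```

One related caveat: `\s` matches tabs and newlines as well as spaces, so even the `\A...\z` version accepts a name such as "Ab\ncd". If only literal spaces should be allowed, putting a plain space in the character class instead of `\s` would be stricter.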

3 Comments

Would that properly forbid newlines? Some Google-hits suggest that in Ruby, ^ and $ default to meaning "start-of-line" and "end-of-line" (like other languages' multiline-mode regexes), so you need to use \A and \z to mean "start-of-string" and "end-of-string". (I have not tested this myself.)
Rushing to get the answer, and I forgot about that gotcha. :)
Thank you that clears this up a lot! It works wonderfully now!

If you only want to match the capital letter at the beginning of the string, you need to put in the "start of line" marker ^ so it would look like:

validates :name, format: { with: %r{^[A-Z][a-zA-Z0-9\s]*} }

Check out Rubular to play around with your regex
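A quick plain-Ruby check of the start-anchored pattern against the names from the question's test (the variable name `start_only` is just for illustration):

```ruby
# Pattern anchored at the start only, as in the validation above.
start_only = %r{^[A-Z][a-zA-Z0-9\s]*}

puts start_only.match?("aBcd")   # false: the first character must be a capital letter
puts start_only.match?("Abcd%")  # true: "Abcd" matches and nothing forces the match to reach the end
puts start_only.match?("Abcd.")  # true, for the same reason
```

So anchoring the start alone rejects "aBcd", but names with trailing special characters still pass unless the end is anchored as well.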

2 Comments

Rubular is actually where I tested the answer to this question. :) Great resource.
No problem. so close though, but Jergason's is exactly what you need. Good luck!
