In my Rails app, I want to validate a string field that holds any number of keywords. A keyword may consist of more than one natural-language word (e.g. "document number"). To tell the individual keywords apart, I enter them separated by ", "; the last keyword ends at the end of the string.

For this I use

validates :keywords, presence: true, format: { with: /((\w+\s?-?\w+)(,|\z))/i, message: "please enter keywords in correct format"}

It should allow the attribute keywords (string) to contain: "word1, word2, word3 word4, word5-word6"

It should not allow any other pattern, e.g. "word1; word2;". However, it incorrectly allows "word1; word2".

On Rubular, this regex works; yet in my Rails app it allows, for example, "word1; word2" or "word3; word-".

Where is my error? (I have to say I am a beginner in Ruby and regex.)

  • There can be whitespace or hyphen in between 1+ word chars, right? Commented Nov 11, 2016 at 10:15
  • Try \A(\w+(?:[\s-]\w+)?)(?:,\s(\g<1>))*\z. Commented Nov 11, 2016 at 10:16
  • Yes, correct; e.g. "credit note" or "credit-note" or "credit - note", but not "credit-". Commented Nov 11, 2016 at 10:16
  • Or \A(\w+(?:[\s-]*\w+)?)(?:,\s*(\g<1>))*\z Commented Nov 11, 2016 at 10:17
  • On Rubular that works; but in my app it still allows, for example, "word1; word2" but not "word1; word2;", the same as before. Commented Nov 11, 2016 at 10:23

1 Answer

You need to use anchors \A and \z and modify the pattern to fit that logic as follows:

/\A(\w+(?:[\s-]*\w+)?)(?:,\s*\g<1>)*\z/

See the Rubular demo

Details:

  • \A - start of string
  • (\w+(?:[\s-]*\w+)?) - Group 1 capturing:
    • \w+ - 1 or more word chars
    • (?:[\s-]*\w+)? - 1 or 0 sequences of:
      • [\s-]* - 0+ whitespaces or -
      • \w+ - 1 or more word chars
  • (?:,\s*\g<1>)* - 0 or more sequences of:
    • ,\s* - comma and 0+ whitespaces
    • \g<1> - the same pattern as in Group 1
  • \z - end of string.
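
To see why the anchors matter, here is a minimal plain-Ruby sketch that runs both patterns against a few inputs. The method names `original_match?` and `anchored_match?` are made up for this illustration; only the two regexes come from the question and the answer. Without `\A` and `\z`, the pattern only has to match somewhere in the string, so the trailing "word2" in "word1; word2" is enough to pass.

```ruby
# Pattern from the question: no anchors, so a match anywhere in the string passes.
def original_match?(keywords)
  /((\w+\s?-?\w+)(,|\z))/i.match?(keywords)
end

# Pattern from the answer: \A and \z force the whole string to fit the format.
# \g<1> re-runs the Group 1 subpattern for every keyword after a comma.
def anchored_match?(keywords)
  /\A(\w+(?:[\s-]*\w+)?)(?:,\s*\g<1>)*\z/.match?(keywords)
end

samples = [
  "word1, word2, word3 word4, word5-word6",
  "word1; word2",
  "word1; word2;",
  "credit - note",
  "credit-"
]

samples.each do |s|
  puts "#{s.inspect}: original=#{original_match?(s)} anchored=#{anchored_match?(s)}"
end
```

In the Rails model, the anchored regex would simply replace the one passed to `format: { with: ... }` in the question's `validates` call.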

2 Comments

Wiktor, I need to extend the regex to allow "credit note, credit-note, credit note date, credit notes" pattern. i.e. any number of words followed by comma. Have tried now for 2+ hrs but can't find a solution. Help appreciated.
Sorry, there is a national holiday here in Poland, so I was offline. I understand you want to match any number of hyphenated/space-separated words that may be followed by 0+ sequences of a comma + space and more hyphenated/space-separated words. I think all you need is to replace the ? quantifier in Group 1 with *: \A(\w+(?:[\s-]*\w+)*)(?:,\s*\g<1>)*\z.
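
The difference between the `?` and `*` quantifiers in Group 1 can be checked with a short plain-Ruby sketch. The method names `two_word_keywords?` and `multi_word_keywords?` are hypothetical and used only for this illustration; the two regexes are the ones given in the answer and in the comment above.

```ruby
# Answer's pattern: each keyword is one word, or two words joined by spaces/hyphens.
def two_word_keywords?(keywords)
  /\A(\w+(?:[\s-]*\w+)?)(?:,\s*\g<1>)*\z/.match?(keywords)
end

# Extended pattern from the comment: '*' instead of '?' allows any number of words.
def multi_word_keywords?(keywords)
  /\A(\w+(?:[\s-]*\w+)*)(?:,\s*\g<1>)*\z/.match?(keywords)
end

input = "credit note, credit-note, credit note date, credit notes"

puts two_word_keywords?(input)    # the three-word keyword "credit note date" does not fit
puts multi_word_keywords?(input)  # every keyword fits
puts multi_word_keywords?("word1; word2")
puts multi_word_keywords?("credit-")
```

One possible tightening, which is not part of the answer: because `[\s-]*` can match nothing, `\w+(?:[\s-]*\w+)*` can split a single word in many ways, which may backtrack heavily on long invalid input. Writing `[\s-]+` instead should accept the same strings without that ambiguity.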