0

I'm using Ruby and I have a string that is formatted like this:

id,part#,foreign_key_id,false

OR

id,part#,foreign_key_id,false|id,part#,foreign_key_id,true

The actual data would look something like this:

253,RFTL 9984.0,588,false

There may be one array with 0 pipe delimiters, or multiple arrays with multiple delimiters. I'm using the following regex to match the array data:

/\d*,.*,\d*,(true|false)/

However, I am not sure how to account for 0 or more | characters with 0 or more arrays. I thought about splitting my original string on | and then checking that each element of the resulting array matches my regex above, but this would be significantly slower than matching the entire string, since I would have to loop through every element. So I am looking for a regex pattern that matches the entire string.
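
For reference, the split-and-check idea described above could be sketched roughly as follows. The constant QUESTION_RECORD and the method all_records_match? are hypothetical names, and the \A and \z anchors are an assumption added so that each piece has to match in full.

```ruby
# The regex from the question, with \A and \z added (an assumption)
# so that a piece must match from start to end.
QUESTION_RECORD = /\A\d*,.*,\d*,(true|false)\z/

# Hypothetical helper: split on "|" and check every piece.
def all_records_match?(str)
  return false if str.empty?
  # A limit of -1 keeps trailing empty strings, so a dangling "|" yields
  # an empty piece that fails the check instead of being dropped silently.
  str.split("|", -1).all? { |piece| QUESTION_RECORD.match?(piece) }
end

puts all_records_match?("253,RFTL 9984.0,588,false")
puts all_records_match?("253,RFTL 9984.0,588,false|254,RFTL 9985.0,589,true")
puts all_records_match?("253,RFTL 9984.0,588,false|")
```

Whether this is actually slower than a single regex for realistic input sizes would have to be measured.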

2
  • If you want a regex, you need to know what requirements it must meet. Please define them in a clear way. Commented Nov 13, 2020 at 17:46
  • Could you please edit to explain what you are attempting to do with the strings (before focusing on the problem you are having)? For example, do you wish to save the comma-separated values to variables? In your second example, does false|id mean that that "field" contains either "false" or an id (that represents an integer)? If so, do you wish to save it to one variable if it's "false" and another if it's an id? It would also be helpful to give a few example strings (not just the one). Commented Nov 14, 2020 at 22:39

2 Answers

1

You can use an "unroll the loop" approach: first match the pattern once, then optionally repeat a | followed by the same pattern.

  • If you don't need the value of the capturing group, you can make it non-capturing: (?:true|false)
  • The .* part first matches to the end of the string and then has to backtrack. If the field should not contain a comma, you could use a negated character class instead: [^,]+
  • \d* matches zero or more digits, so it also accepts an empty field. If at least one digit must be present, use \d+
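
To illustrate the last two points, here is a small sketch that compares the question's pattern with the tightened one on a few inputs. The constant names and sample strings are made up for illustration, and both patterns are anchored with \A and \z (an assumption) so that they are compared on whole strings.

```ruby
# The question's pattern: optional digits and a greedy .* in the middle.
LOOSE_RECORD  = /\A\d*,.*,\d*,(?:true|false)\z/
# Same shape with \d+ and [^,]+ as suggested in the bullets above.
STRICT_RECORD = /\A\d+,[^,]+,\d+,(?:true|false)\z/

samples = [
  "253,RFTL 9984.0,588,false", # well-formed: accepted by both
  ",RFTL 9984.0,,false",       # empty numeric fields: only LOOSE_RECORD accepts
  "253,RFTL,extra,588,false"   # extra comma-separated field: only LOOSE_RECORD accepts
]

samples.each do |s|
  puts "#{s.inspect} loose=#{LOOSE_RECORD.match?(s)} strict=#{STRICT_RECORD.match?(s)}"
end
```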

The pattern might look like

\A(\d+,[^,]+,\d+,(?:true|false))(?:\|\d+,[^,]+,\d+,(?:true|false))*\z
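
A minimal sketch of how this pattern might be used in Ruby. The names RECORD, RECORD_LIST and valid_record_list? are hypothetical, and the full pattern is built by interpolating the single-record regex so that it is written only once.

```ruby
# One record: digits, a part number without commas, digits, then true or false.
RECORD = /\d+,[^,]+,\d+,(?:true|false)/

# One record followed by zero or more "|record" repetitions,
# anchored with \A and \z so the whole string has to match.
RECORD_LIST = /\A#{RECORD}(?:\|#{RECORD})*\z/

# Hypothetical helper wrapping the check.
def valid_record_list?(str)
  RECORD_LIST.match?(str)
end

puts valid_record_list?("253,RFTL 9984.0,588,false")                          # prints true
puts valid_record_list?("253,RFTL 9984.0,588,false|254,RFTL 9985.0,589,true") # prints true
puts valid_record_list?("253,RFTL 9984.0,588,false|")                         # prints false
```

Note that [^,]+ excludes only commas, so a part number containing a | would still be accepted as part of a single record. If that is not wanted, [^,|]+ is one possible variation.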

Or, a bit more compact: put the first pattern in a capturing group and reuse that subpattern with \g<1>.

\A(\d+,[^,]+,\d+,(?:true|false))(?:\|\g<1>)*\z
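
As a quick sanity check, the sketch below runs the written-out pattern and the \g<1> version side by side on a few made-up inputs. The constant names are hypothetical.

```ruby
# Written-out version: the record pattern appears twice.
UNROLLED  = /\A(\d+,[^,]+,\d+,(?:true|false))(?:\|\d+,[^,]+,\d+,(?:true|false))*\z/
# Compact version: \g<1> re-runs the pattern of group 1.
WITH_CALL = /\A(\d+,[^,]+,\d+,(?:true|false))(?:\|\g<1>)*\z/

CHECK_SAMPLES = [
  "253,RFTL 9984.0,588,false",
  "253,RFTL 9984.0,588,false|254,RFTL 9985.0,589,true",
  "253,RFTL 9984.0,588,false|",
  "253,RFTL 9984.0,588,maybe"
].freeze

CHECK_SAMPLES.each do |s|
  puts "#{s.inspect} unrolled=#{UNROLLED.match?(s)} with_call=#{WITH_CALL.match?(s)}"
end
```

Both patterns should give the same verdict for every input. The compact form mainly avoids repeating the record pattern.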

0

You can use this regex - https://regex101.com/r/wLyNPv/1/

Pattern: ^(?:\d*,[^,]*,\d*,(?:true|false)\|?)+(?<!\|)$

Changes from your pattern

  • [^,]* instead of .*, so that the second field cannot extend across a comma into the following fields
  • \|? - optionally match literal |
  • ^(?:\d*,[^,]*,\d*,(?:true|false)\|?)+ - match the array 1 or more times
  • (?<!\|)$ - use negative lookbehind to ensure | is not at the end of the string
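
The linked demo is on regex101, so this pattern may behave slightly differently in Ruby, where ^ and $ match at line boundaries rather than only at the ends of the string. The sketch below compares the pattern as written with a variant that uses \A and \z. The constant names and sample strings are made up for illustration.

```ruby
# The pattern as written: in Ruby, ^ and $ anchor to lines.
LINE_ANCHORED   = /^(?:\d*,[^,]*,\d*,(?:true|false)\|?)+(?<!\|)$/
# Same body with \A and \z, so the whole string has to match.
STRING_ANCHORED = /\A(?:\d*,[^,]*,\d*,(?:true|false)\|?)+(?<!\|)\z/

multiline = "not a record\n253,RFTL 9984.0,588,false"
puts LINE_ANCHORED.match?(multiline)   # prints true: the second line alone satisfies ^...$
puts STRING_ANCHORED.match?(multiline) # prints false

# The lookbehind rejects a trailing pipe.
puts STRING_ANCHORED.match?("253,RFTL 9984.0,588,false|") # prints false

# Because \|? is optional, two records with no pipe between them are accepted.
puts STRING_ANCHORED.match?("253,RFTL 9984.0,588,false254,RFTL 9985.0,589,true") # prints true
```

If records must always be separated by a |, a pattern that requires the pipe between repetitions is stricter than this one.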
