I need to extract values from a string using a regex (for performance reasons). The cases might be as follows:

  1. RED,100
  2. RED,"100"
  3. RED,"100,"
  4. RED,"100\"ABC\"200"

The resulting separated [label, value] array should be:

  1. ['RED','100']
  2. ['RED','100']
  3. ['RED','100,']
  4. ['RED','100"ABC"200']

I looked into existing solutions, and even a popular library just splits the entire string to get the values, e.g. 'RED,100'.split(/,/) might do the job.

But I am trying to write a regex that splits on a comma only if that comma is not enclosed within a quoted value.

This might not be standard CSV behaviour, but it is very easy for the end user to enter values: enter label,value. Anything is allowed inside the value if it is surrounded by quotes. To include a quote inside it, escape the quote with a backslash.

Any help is appreciated.

  • Yes, take case 3. If we split by comma, it will break into 3 pieces, right? We don't want that. Commented Jan 18, 2018 at 15:22
  • Can a comma be in the first part (RED)? Commented Jan 18, 2018 at 15:27
  • Yes, the same is true for the first part, i.e. the label, as well. Commented Jan 18, 2018 at 15:29
  • From the user's point of view it is a simple entry. The format is label,value. But if you want to enter a comma (,) or quotes (") inside either the label or the value, we won't be splitting on that. Commented Jan 18, 2018 at 15:30

2 Answers

You can use this regex that takes care of escaped quotes in string:

/"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g

RegEx Explanation:

  • ": Match a literal opening quote
  • [^"\\]*: Match 0 or more of any character that is not \ and not a quote
  • (?:\\.[^"\\]*)*: Match an escaped character (a \ followed by any character) and then 0 or more characters that are not a quote and not \. Repeat this combination 0 or more times to get through all escaped characters
  • ": Match closing quote
  • |: OR (alternation)
  • [^,"]+: Match 1+ of non-quote, non-comma string

const regex = /"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g;

const arr = [`RED,100`, `RED,"100"`, `RED,"100,"`,
`RED,"100\\"ABC\\"200"`];
let m;

for (var i = 0; i < arr.length; i++) {
  var str = arr[i];
  var result = [];
  while ((m = regex.exec(str)) !== null) {
    result.push(m[0]);
  }
  console.log("Input:", str, ":: Result =>", result);
}
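
Note that the matches above still carry the surrounding quotes and the backslashes, while the question asks for 100 and 100"ABC"200. A minimal sketch of a post-processing step, assuming every quoted field should lose its outer quotes and every \x should become x; parseFields is a hypothetical helper name, not part of the answer:

```javascript
// Same pattern as in the answer above.
const FIELD_RE = /"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g;

// Hypothetical helper: split one line into fields, then clean quoted fields.
function parseFields(line) {
  const matches = line.match(FIELD_RE) || [];
  return matches.map((field) => {
    // Unquoted field: return it unchanged.
    if (!field.startsWith('"')) return field;
    // Quoted field: drop the outer quotes, then replace each \x with x.
    return field.slice(1, -1).replace(/\\(.)/g, '$1');
  });
}

const samples = ['RED,100', 'RED,"100"', 'RED,"100,"', 'RED,"100\\"ABC\\"200"'];
for (const s of samples) {
  console.log(s, '=>', parseFields(s));
}
```

Because the pattern needs at least one character per unquoted field, an empty field (as in RED,) is skipped rather than returned as an empty string.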

1 Comment

Thanks a ton. This is incredible. Took almost 10mins to go by it. Thanks again. You just inspired me about the powers of regex.

You could use String#match and take only the groups.

var array = ['RED,100', 'RED,"100"', 'RED,"100,"', 'RED,"100\\"ABC\\"200"'];

console.log(array.map(s => s.match(/^([^,]+),(.*)$/).slice(1)))
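
The group-based idea can also be combined with the quoted-field pattern, so that a quoted label may contain a comma and the quotes are removed from the result. The following is only a sketch under assumptions: each line must be exactly label,value, anything else yields null, and the names FIELD, PAIR_RE, unquote and parsePair are hypothetical:

```javascript
// One field: a quoted string with backslash escapes, or a run without , and ".
const FIELD = /"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/.source;

// Exactly two fields separated by one comma, anchored to the whole line.
const PAIR_RE = new RegExp(`^(${FIELD}),(${FIELD})$`);

// Remove outer quotes and backslash escapes from a quoted field.
function unquote(field) {
  if (!field.startsWith('"')) return field;
  return field.slice(1, -1).replace(/\\(.)/g, '$1');
}

// Returns [label, value], or null when the line is not a label,value pair.
function parsePair(line) {
  const m = PAIR_RE.exec(line);
  if (m === null) return null;
  return m.slice(1).map(unquote);
}

console.log(parsePair('RED,"100\\"ABC\\"200"'));
console.log(parsePair('"A,B",5'));
console.log(parsePair('RED,1,2')); // three fields, so no match
```

Anchoring with ^ and $ makes the pattern reject malformed lines instead of silently returning extra or missing pieces, which is the main difference from the global-match approach.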
