I'm new to regex and I'm struggling to find a string inside a pattern.

I have this string:

{"linha":""},
{"linha":"              REDE GETNET"},
{"linha":"               SANTANDER"},
{"linha":""},
{"linha":"20/04/15 09:07:32 AUT:006299 DOC:000235"},
{"linha":"EC:000000000370484 TERM: T0385403    M"},
{"linha":"CV:010000024       CAIXA:00003333"},
{"linha":""},
{"linha":"CARTAO              ************2125"},
{"linha":""},
{"linha":"            CREDITO A VISTA"},
{"linha":"VALOR:        12,00"},
{"linha":""},
{"linha":"  ______________________________"},
{"linha":"         ASSINATURA"},
{"linha":""}, 
{"linha":""},
{"linha":"CUPOM: 00000000000000        MAC: 9235"},
{"linha":"NSU_CTF: 001899  LOJA: 0019  PDV: 897"},
{"linha":""},
{"linha":""}

I'd like to find the occurrences between:

{"linha":

and

},

I want to get only the double-quoted string after the colon.

So far my regex is:

(\{".*(linha).[:])

and it's getting only

{"linha":

Can someone help me? I intend to do it in JavaScript.

  • Are you sure that regular expressions are the best solution for this problem? Your data appears to be JSON. Commented Apr 20, 2018 at 13:05
  • It is something like: /\{\s*\"linha\"\s*\:\s*\"(.*)\"\}/g? regex101.com/r/IjAJgB/1 Commented Apr 20, 2018 at 13:06
  • Even if it wasn't JSON, a good ol' substring would be enough since the start and end parts seem to be constant. Commented Apr 20, 2018 at 13:08
  • To make Hunter McMillen's hint more precise: decode your string from JSON (examples: json_decode in PHP, JSON.parse in JavaScript), then you can loop over the objects and extract the linha properties. Commented Apr 20, 2018 at 13:11
  • @Aaron Regex could be used, but it's not a good idea if you can avoid it. Use the right tool for the job. If this is JSON, then parse it as JSON. For example, what if one line is: {"linha":"hello {\"world\"}!" },, or {"linha":null}, or {"linha":123 },? Regex could fail in all sorts of weird ways (as well as being harder to understand/update!), and there is almost certainly a better simple solution. Commented Apr 20, 2018 at 13:15

2 Answers

Solutions in JavaScript:

Using Regex:

var input_str = '{"linha":""},{"linha":"              REDE GETNET"},{"linha":"               SANTANDER"},{"linha":""},{"linha":"20/04/15 09:07:32 AUT:006299 DOC:000235"},{"linha":"EC:000000000370484 TERM: T0385403    M"},{"linha":"CV:010000024       CAIXA:00003333"},{"linha":""},{"linha":"CARTAO              ************2125"},{"linha":""},{"linha":"            CREDITO A VISTA"},{"linha":"VALOR:        12,00"},{"linha":""},{"linha":"  ______________________________"},{"linha":"         ASSINATURA"},{"linha":""}, {"linha":""},{"linha":"CUPOM: 00000000000000        MAC: 9235"},{"linha":"NSU_CTF: 001899  LOJA: 0019  PDV: 897"},{"linha":""},{"linha":""}'
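// Match "linha": followed by a quoted value. (?:[^"\\]|\\.)* stops at the first
// unescaped closing quote, so a greedy match cannot run across several objects.
var re = /"linha"\s*:\s*("(?:[^"\\]|\\.)*")/g;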
var myArray;
while ((myArray = re.exec(input_str)) !== null) {
  var msg = 'Found ' + myArray[1];
  console.log(msg);
}
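
As a possible variation on the loop above, here is a minimal sketch that collects the values into an array with String.prototype.matchAll and decodes escape sequences such as \" by handing each capture back to JSON.parse. The helper name extractLinhas and the sample string are made up for illustration, and the sketch assumes every value is a valid JSON string.

```javascript
// Hypothetical helper: collect every "linha" string value with a regex.
function extractLinhas(text) {
  // [^"\\] matches an ordinary character, \\. matches an escaped one (e.g. \"),
  // so each match ends at the first unescaped closing quote.
  const pattern = /"linha"\s*:\s*"((?:[^"\\]|\\.)*)"/g;
  const values = [];
  for (const match of text.matchAll(pattern)) {
    // Re-wrap the capture in quotes and let JSON.parse decode the escapes.
    values.push(JSON.parse('"' + match[1] + '"'));
  }
  return values;
}

// Shortened, made-up sample in the same shape as the question's data.
const sample = '{"linha":""},{"linha":"   REDE GETNET"},{"linha":"say \\"hi\\""}';
console.log(extractLinhas(sample));
```

This still only recognises string values; anything else (null, numbers) is silently skipped, which is one more reason to prefer the JSON.parse approach when the input really is JSON.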

Using JSON.parse: Notice that I have added [] to wrap the string as a JSON array.

var input_str1 = '[{"linha":""},{"linha":"              REDE GETNET"},{"linha":"               SANTANDER"},{"linha":""},{"linha":"20/04/15 09:07:32 AUT:006299 DOC:000235"},{"linha":"EC:000000000370484 TERM: T0385403    M"},{"linha":"CV:010000024       CAIXA:00003333"},{"linha":""},{"linha":"CARTAO              ************2125"},{"linha":""},{"linha":"            CREDITO A VISTA"},{"linha":"VALOR:        12,00"},{"linha":""},{"linha":"  ______________________________"},{"linha":"         ASSINATURA"},{"linha":""}, {"linha":""},{"linha":"CUPOM: 00000000000000        MAC: 9235"},{"linha":"NSU_CTF: 001899  LOJA: 0019  PDV: 897"},{"linha":""},{"linha":""}]'
var parsed_json = JSON.parse(input_str1); 
console.log(parsed_json);
parsed_json.forEach(x => console.log(x.linha))
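
If the raw text arrives without the surrounding brackets, as in the question, the wrapping step could be folded into a small function. The following is only a sketch: parseLinhas and the receipt sample are hypothetical, and it assumes the input is a comma-separated run of JSON objects, optionally with a trailing comma.

```javascript
// Hypothetical helper: turn comma-separated JSON objects into an array of strings.
function parseLinhas(raw) {
  // Drop surrounding whitespace and a trailing comma, if any, then wrap in [].
  const body = raw.trim().replace(/,\s*$/, '');
  const items = JSON.parse('[' + body + ']');
  // Keep only entries whose "linha" is a string (skips null, numbers, missing keys).
  return items
    .map(item => item?.linha)
    .filter(value => typeof value === 'string');
}

// Made-up sample that includes the edge cases raised in the comments.
const receipt =
  '{"linha":""},\n' +
  '{"linha":"VALOR:        12,00"},\n' +
  '{"linha":null},\n' +
  '{"linha":"hello {\\"world\\"}!" },';
console.log(parseLinhas(receipt));
```

Because the parser handles escapes and whitespace itself, values containing braces or escaped quotes come back intact, and malformed input raises a SyntaxError instead of producing a partial match.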

3 Comments

Upvoting because you showed also the amazing JSON way :)
But the regex you suggest does not work for me on the last case... regex101.com/r/SqgwBt/1 (but it works for all OP cases)
Yes, there was an issue in the regex (it did not support quotes in the value); I have edited the answer now.

As far as I can see, the example is JSON, so you should use JSON methods to get the data you want. BUT, if for any reason you are "forced" to use a regular expression, then something like this may do the job:

/\{\s*\"linha\"\s*\:\s*\"(.*)\"\}/g

You can test it here. It is also robust with respect to (at least) {"linha": " \" \" "}, and it also matches { "linha": ""}. Note that it assumes each object sits on its own line, because the greedy .* would otherwise run to the last "} on the line. There are certainly cases in which this regex will not work correctly (for example, it captures only string values, not numeric ones).

Thus, again, you should really check out JSON. It is amazing! :)
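
To make the limits of the regex in this answer concrete, here is a small check of how it behaves on three made-up inputs. The helper name captures and the sample strings are illustrative assumptions; the pattern itself is copied unchanged from above.

```javascript
const pattern = /\{\s*\"linha\"\s*\:\s*\"(.*)\"\}/g;

// Hypothetical helper: return the first capture group of every match.
function captures(text) {
  return Array.from(text.matchAll(pattern), m => m[1]);
}

// One object per line: "." does not cross newlines, so each line matches separately.
const multiLine = '{"linha":"A"},\n{"linha": " \\" \\" "},\n{"linha":""}';
console.log(captures(multiLine));

// Same kind of data on a single line: the greedy ".*" runs to the last "}.
const singleLine = '{"linha":"A"},{"linha":"B"}';
console.log(captures(singleLine));

// Non-string values are not matched at all.
console.log(captures('{"linha":123}'));
```

On the multi-line input each value is captured separately (escapes are returned raw, backslashes included), while on the single-line input the two objects collapse into one oversized capture.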
