Could you please help me parse some JSON content in Google Sheets cells?

I can match the first occurrence of the pattern with regex capturing groups, but not the following ones. I couldn't work out how to apply the /gmi flags or how to adapt other code examples to my case, and I have been stuck on this for 2 days. Thanks a lot.

The JSON in the cell:

[
{"idcode":"1AGLG";parent:"1A";level:"Genus";title:"Aglaonema";IsGroup:true};
{"idcode":"1ALDG";parent:"1A";level:"Genus";title:"Alocasia";IsGroup:true};
{"idcode":"1BBSG";parent:"1A";level:"Genus";title:"Ambrosina";IsGroup:true};{"idcode":"1AMUG";parent:"1A";level:"Genus";title:"Amorphophallus";IsGroup:true}
]

My formula:

REGEXEXTRACT(A1; """idcode"":""([\w]+)""(?:.*?title:"")([\w]+)""")

And the sheet file: https://docs.google.com/spreadsheets/d/17YSCK2S8IeqFE_Y_kqWQLVwT9VONXkxvCY3Hlr-8Xpc/edit

5 Answers

How about this sample formula?

Sample formula:

=ARRAYFORMULA(TRIM(SPLIT(TRANSPOSE(SPLIT(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(A1,"[\[{}\]]",""),"""idcode"":""([\w]+)""(?:.*?title:"")([\w]+)"";IsGroup:true;?","$1,$2,"),"(([\w\s\S]+?,){2})","$1@"),"@")),",")))
  • In this sample formula, the value of [{"idcode":"1AGLG";parent:"1A";level:"Genus";title:"Aglaonema";IsGroup:true};{"idcode":"1ALDG";parent:"1A";level:"Genus";title:"Alocasia";IsGroup:true};{"idcode":"1BBSG";parent:"1A";level:"Genus";title:"Ambrosina";IsGroup:true};{"idcode":"1AMUG";parent:"1A";level:"Genus";title:"Amorphophallus";IsGroup:true}] is put in the cell "A1".
  • The flow of this formula is as follows.
    1. Replace [\[{}\]] in the original value with "" using REGEXREPLACE.
    2. Replace ""idcode"":""([\w]+)""(?:.*?title:"")([\w]+)"";IsGroup:true;? in the 1st replaced value with $1,$2, using REGEXREPLACE.
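    3. Insert @ after every two comma-terminated values in the 2nd replaced value using REGEXREPLACE, then split on @ and on , using SPLIT and TRANSPOSE to get rows of 2 columns.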
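
The flow above can be checked outside Sheets with a rough Python sketch. This is only an approximation under stated assumptions: Python's `re` module stands in for the RE2 engine that Sheets uses, the names `CELL` and `extract_pairs` are made up for illustration, and the cell value is assumed to contain no line breaks.

```python
import re

# Sample value of cell A1 from the question, written on one line.
CELL = (
    '[{"idcode":"1AGLG";parent:"1A";level:"Genus";title:"Aglaonema";IsGroup:true};'
    '{"idcode":"1ALDG";parent:"1A";level:"Genus";title:"Alocasia";IsGroup:true};'
    '{"idcode":"1BBSG";parent:"1A";level:"Genus";title:"Ambrosina";IsGroup:true};'
    '{"idcode":"1AMUG";parent:"1A";level:"Genus";title:"Amorphophallus";IsGroup:true}]'
)


def extract_pairs(cell):
    """Hypothetical helper mirroring the three steps of the formula."""
    # Step 1: strip the square brackets and braces.
    stripped = re.sub(r'[\[{}\]]', '', cell)
    # Step 2: reduce every record to "idcode,title,".
    flat = re.sub(
        r'"idcode":"(\w+)"(?:.*?title:")(\w+)";IsGroup:true;?',
        r'\1,\2,',
        stripped,
    )
    # Step 3: put "@" after every two comma-terminated values,
    # then split into rows on "@" and into columns on ",".
    marked = re.sub(r'((?:[\w\s\S]+?,){2})', r'\1@', flat)
    rows = [row for row in marked.split('@') if row]
    return [[field.strip() for field in row.split(',') if field] for row in rows]


for code, title in extract_pairs(CELL):
    print(code, title)
```

Dropping the empty strings after each split imitates the default behaviour of SPLIT in Sheets, which discards empty pieces.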

Note:

  • As another method, the following sample formula returns the same result as the formula above. In this formula, SPLIT is used twice, with @ and , as the delimiters.

      =ARRAYFORMULA(SPLIT(TRANSPOSE(SPLIT(REGEXREPLACE(REGEXREPLACE(A1,"[\[{}\]]",""),"""idcode"":""([\w]+)""(?:.*?title:"")([\w]+)"";IsGroup:true;?","$1,$2@"),"@")),","))
    
  • If , or @ appears in the values of the original data, please change the delimiters in the formulas above.

1 Comment

Thank you very much, your second formula works like a charm (I didn't succeed with the first one)

One could also use the following formulas where we use 2 capturing groups with REGEXREPLACE in combination with the JOIN function, or REGEXEXTRACT for more "flexibility".

In both cases an ArrayFormula as well as the SPLIT function are a must:

As a single cell

=ArrayFormula(JOIN(" / ",REGEXREPLACE(SPLIT($A1,"};{",0), 
                                         ".*(\d\D{2,5})"".*""(\D+)"".*$","$1 - $2")))

In separate cells in a row

={ArrayFormula(REGEXEXTRACT(SPLIT($A1,"};{",0),"(\d\D{2,5})"""));
  ArrayFormula(REGEXEXTRACT(SPLIT($A1,"};{",0),".*""(\D+)"".*$"))}

In separate cells as a list

={ArrayFormula(TRANSPOSE(REGEXEXTRACT(SPLIT($A1,"};{",0),"(\d\D{2,5})"""))),
  ArrayFormula(TRANSPOSE(REGEXEXTRACT(SPLIT($A1,"};{",0),".*""(\D+)"".*$")))}
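
For readers who want to test the patterns above outside Sheets, here is a minimal Python sketch of the same idea: split the cell on the whole "};{" string, then apply the regexes to each record. Python's `re` is only a stand-in for the RE2 engine used by Sheets, the names `CELL`, `split_records`, `codes_and_titles` and `joined` are hypothetical, and the cell is assumed to hold the records on one line.

```python
import re

# Sample value of cell A1 from the question, written on one line.
CELL = (
    '[{"idcode":"1AGLG";parent:"1A";level:"Genus";title:"Aglaonema";IsGroup:true};'
    '{"idcode":"1ALDG";parent:"1A";level:"Genus";title:"Alocasia";IsGroup:true};'
    '{"idcode":"1BBSG";parent:"1A";level:"Genus";title:"Ambrosina";IsGroup:true};'
    '{"idcode":"1AMUG";parent:"1A";level:"Genus";title:"Amorphophallus";IsGroup:true}]'
)


def split_records(cell):
    # Counterpart of SPLIT($A1,"};{",0): split on the whole "};{" string,
    # not on each of its characters.
    return cell.split('};{')


def codes_and_titles(cell):
    # Counterpart of the two REGEXEXTRACT calls, one list per capturing group.
    records = split_records(cell)
    codes = [re.search(r'(\d\D{2,5})"', rec).group(1) for rec in records]
    titles = [re.search(r'.*"(\D+)".*$', rec).group(1) for rec in records]
    return codes, titles


def joined(cell):
    # Counterpart of the single-cell JOIN + REGEXREPLACE variant.
    return " / ".join(
        re.sub(r'.*(\d\D{2,5})".*"(\D+)".*$', r'\1 - \2', rec)
        for rec in split_records(cell)
    )


print(codes_and_titles(CELL))
print(joined(CELL))
```

Note that the patterns rely on the shape of the data: the code must start with a digit followed by 2 to 5 non-digits, and the title must be the last quoted value without digits.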

12 Comments

Thank you very much again marikamitsos, your formula works perfectly and looks elegant. Sorry, I can't accept both answers.
Tanaike's formula seems less elegant, but it matches exactly what I want between the " ". I'll try to combine it with yours.
...what I want between the " ". What is it that you want? I think it does match your example. If you have something else, please let me know.
With Tanaike's formula I can specify the query with a 'flag name' (for example, matching the word that follows "parent", or different flags in another JSON situation). I hope that is more precise. Anyway, you have both helped me solve my problem, so please don't blame me because I can't accept 2 solutions. Thank you so much again.
After testing your formula with your original example, it gave me wrong results. Am I missing something? Also, could you please give an example for your comment?

I finally combined both formulas in order to specify the targeted flags and to match more than one word following them:

=ARRAYFORMULA(join(" ; ";TRANSPOSE(SPLIT(REGEXREPLACE(REGEXREPLACE(A1;"[\[{}\]]";"");"""idcode"":""(\d\D{2,5})""(?:.*?title:"")(\D+)"";IsGroup:true;?";"$1 = $2@");"@"))))

Extract anything between quotation marks after a given pattern (here the word 'parent') (inspired by marikamitsos)

=ArrayFormula(JOIN(" ; ";REGEXREPLACE(SPLIT($B20;"};{";0); 
                                         ".*(parent):""((.+?))"".*$";"$1 = $2")))

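The same lookup can be sketched in Python with the flag name as a parameter. This is an illustration only: Python's `re` stands in for the RE2 engine of Sheets, the names `CELL` and `value_after_key` are hypothetical, and `re.escape` is an extra safeguard that the formula does not have.

```python
import re

# Sample value from the question, written on one line.
CELL = (
    '[{"idcode":"1AGLG";parent:"1A";level:"Genus";title:"Aglaonema";IsGroup:true};'
    '{"idcode":"1ALDG";parent:"1A";level:"Genus";title:"Alocasia";IsGroup:true};'
    '{"idcode":"1BBSG";parent:"1A";level:"Genus";title:"Ambrosina";IsGroup:true};'
    '{"idcode":"1AMUG";parent:"1A";level:"Genus";title:"Amorphophallus";IsGroup:true}]'
)


def value_after_key(cell, key):
    """Hypothetical helper: 'key = value' for every record, joined by ' ; '."""
    # Build the pattern around the flag name; re.escape protects against
    # regex metacharacters in the key.
    pattern = r'.*(' + re.escape(key) + r'):"(.+?)".*$'
    # Like REGEXREPLACE, re.sub leaves a record unchanged when the key is absent.
    return " ; ".join(
        re.sub(pattern, r'\1 = \2', record) for record in cell.split('};{')
    )


print(value_after_key(CELL, 'parent'))
print(value_after_key(CELL, 'title'))
```

The pattern expects the form key:"value", so it fits the unquoted keys (parent, level, title) but not "idcode", whose key is itself wrapped in quotation marks.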

Extract the strings between quotation marks that follow 3 given patterns (here 'flag1', 'parent' and '3rdFlag')

JSON text example:

[{"flag1":"1AGLG";parent:"1A is 2nd to retrive";level:"Genus";3rdFlag:"Aglaonema is the way i like it"};{"flag1":"1ALDG";parent:"12A is 2nd to retrive";level:"Genus";3rdFlag:"Alocasia"};{"flag1":"1AOWG";parent:"BA is 2nd to retrive";level:"Genus";3rdFlag:"Anchomanes"};{"flag1":"1AUIG";parent:"1A is 2nd to retrive";level:"Genus";3rdFlag:"Anubias"};{"flag1":"1AQOG";parent:"2CA is 2nd to retrive";level:"Genus";3rdFlag:"Ariopsis at the end"}]

Formula :

=ArrayFormula(JOIN(" ; ";REGEXREPLACE(SPLIT($B24;"};{";0); 
                                         ".*(flag1"":""(.+?)"")(.+?)(parent:""(.+?)"")(.+?)(3rdFlag:""(.+?)"").*$";"$2 / $5 / $8")))

Result :

1AGLG / 1A is 2nd to retrive / Aglaonema is the way i like it ; 1ALDG / 12A is 2nd to retrive / Alocasia ; 1AOWG / BA is 2nd to retrive / Anchomanes ; 1AUIG / 1A is 2nd to retrive / Anubias ; 1AQOG / 2CA is 2nd to retrive / Ariopsis at the end
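
The result above can be reproduced with a short Python sketch of the same pattern. Python's `re` is used here only as a stand-in for the RE2 engine of Sheets, and the names `CELL` and `three_flags` are hypothetical; the capturing groups are kept as in the formula so that groups 2, 5 and 8 line up with $2, $5 and $8.

```python
import re

# The JSON text example from above.
CELL = (
    '[{"flag1":"1AGLG";parent:"1A is 2nd to retrive";level:"Genus";'
    '3rdFlag:"Aglaonema is the way i like it"};'
    '{"flag1":"1ALDG";parent:"12A is 2nd to retrive";level:"Genus";3rdFlag:"Alocasia"};'
    '{"flag1":"1AOWG";parent:"BA is 2nd to retrive";level:"Genus";3rdFlag:"Anchomanes"};'
    '{"flag1":"1AUIG";parent:"1A is 2nd to retrive";level:"Genus";3rdFlag:"Anubias"};'
    '{"flag1":"1AQOG";parent:"2CA is 2nd to retrive";level:"Genus";'
    '3rdFlag:"Ariopsis at the end"}]'
)

# Same pattern as in the formula, with the doubled quotes of Sheets undone.
PATTERN = r'.*(flag1":"(.+?)")(.+?)(parent:"(.+?)")(.+?)(3rdFlag:"(.+?)").*$'


def three_flags(cell):
    """Hypothetical helper: 'flag1 / parent / 3rdFlag' per record, joined by ' ; '."""
    return " ; ".join(
        re.sub(PATTERN, r'\2 / \5 / \8', record) for record in cell.split('};{')
    )


print(three_flags(CELL))
```

The three flags must appear in this order within each record, because the pattern matches them from left to right.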
