Ruby parse through string

Question

I have a string that looks like the one below, and I have to remove everything between the first bracket and the last bracket. All bets are off on what's in between (regarding other brackets). What would be the best approach? Thanks.

'[

        { "foo":
            {"bar":"foo",
                "bar": {
                    ["foo":"bar", "foo":"bar"]
                }
            }
        }

    ],

"foo":"bar","foo":"bar"'

result:

  ',

    "foo":"bar","foo":"bar"'

Your example data doesn't seem to be valid JSON. Was it supposed to be? The deviations are: the innermost array contains pairs, so it should be an object; and the outermost scope seems to be a list, but it contains both [] and "":"" pairs, so it's neither an object body nor an array body. Is this the way you intended it?
– Nigel Thorne, Commented Nov 23, 2011 at 21:50

mu is too short · Accepted Answer · 2011-10-07 19:00:27Z

1

If your data really does look like that and you don't have any brackets in the bit at the end, then:

s.gsub(/\[.*\]/m, '')

If you want to be a little more paranoid, then you can look for ], followed by an end-of-line:

s.gsub(/\[.*\],$/m, ',')

Hard to say any more than that without a specification of your data format.
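To illustrate the difference between the two patterns above, here is a small sketch. The sample string is made up for the illustration (it is not the asker's data) and deliberately has a bracket in the trailing part. Note that in Ruby `$` matches at the end of any line, so the anchored pattern stops at the `],` that ends a line, while the plain greedy pattern runs on to the last `]` in the string.

```ruby
# Hypothetical sample: a bracketed block, then a tail that also contains brackets.
s = "[\n  {\"a\": [1]}\n],\n\"foo\": [2], \"bar\": \"x\""

# Plain greedy pattern: removes from the first "[" through the LAST "]",
# which swallows part of the tail as well.
plain = s.gsub(/\[.*\]/m, '')

# Anchored pattern: the match must end with "]," at an end-of-line,
# so the tail after that line is left alone.
anchored = s.gsub(/\[.*\],$/m, ',')

p plain
p anchored
```

This only helps when the closing bracket really is followed by a comma and a line break; for data on a single line the anchored pattern offers no extra protection.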

answered Oct 7, 2011 at 19:00


6 Comments

dt1000 Over a year ago

Darn, good observation, actually the bit at the end MAY have brackets. I have to find the corresponding close-bracket for the first open-bracket, and remove what's in between. Incidentally, this is json, but I can't treat it like a hash because the order matters. Total pain. So, this almost works, but what if i have brackets at the end? Thanks!

mu is too short Over a year ago

@dt1000: But what you posted in your question is not JSON. You might want to update your question with real data (or at least valid fake data). I take it that fixing the JSON producer to produce sensible JSON (i.e. ordered things are in arrays) is out of the question?

mu is too short Over a year ago

@dt1000: Also, how exactly do you identify the part that you want to remove?

dt1000 Over a year ago

Not sure if this will be readable, but I am trying to remove the key "thingToRemove" and its value, thanks: '{ "groupRateSDI":"0.125","groupRate":"0.55", "coverageLevels":[".5",".6" ], "thingToRemove": [ { "memberCoverage": {"formula":"Formulas.levelsMultiplier", "parameters": { "type":"memberCoverage" } } }, { "memberMaxIncrements": {"formula":"Formulas.maxIncrement", "parameters": { "type":"member", "incrementType":"salaryMultiplierCoverageArray" } } } ], "someKey":"800","someKey2":"180"}'

mu is too short Over a year ago

@dt1000: Do you know the order that the keys should be in in the JSON? If you do then you could parse it as JSON, snip out the stuff you don't want, and then put it back into JSON format piece by piece.
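For the case raised in the comments above, where the trailing part may itself contain brackets, one option is to walk the string and find the bracket that closes the first open bracket. The following is only a sketch: `matching_bracket_index` and `remove_first_bracketed` are hypothetical helper names, and it assumes that only square brackets need balancing, that strings are double-quoted with backslash escapes, and that the first `[` in the text is not inside a quoted string.

```ruby
# Hypothetical helper: index of the "]" that closes the "[" at open_idx,
# or nil if it is never closed. Brackets inside double-quoted strings are ignored.
def matching_bracket_index(text, open_idx)
  depth = 0
  in_string = false
  escaped = false
  (open_idx...text.length).each do |i|
    ch = text[i]
    if in_string
      if escaped
        escaped = false          # the character after a backslash is skipped
      elsif ch == '\\'
        escaped = true
      elsif ch == '"'
        in_string = false
      end
      next
    end
    case ch
    when '"' then in_string = true
    when '[' then depth += 1
    when ']'
      depth -= 1
      return i if depth.zero?
    end
  end
  nil
end

# Hypothetical helper: removes the first bracketed region, brackets included.
# Returns the text unchanged if there is no "[" or it is never closed.
def remove_first_bracketed(text)
  start = text.index('[')
  return text unless start
  stop = matching_bracket_index(text, start)
  return text unless stop
  text[0...start] + text[(stop + 1)..-1]
end

puts remove_first_bracketed('[ {"a": ["]", 1]} ],"foo":"bar","x":[3]')
```

Unlike the greedy regex, this leaves the later `[3]` in place, because the scan stops as soon as the nesting depth returns to zero.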


psyho · Accepted Answer · 2011-10-07 19:03:20Z

0

Here you go:

string.gsub(/\[.*\]/m, '')

You need to use the m flag for the . to match newline characters. .* is already greedy, so it will match any number of brackets in between.
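A quick sketch of what the m flag and greediness each contribute, using a made-up sample string rather than the asker's data:

```ruby
# Hypothetical sample spanning several lines, with a nested one-line array.
s = "[\n  {\"a\": [1, 2]}\n],\n\"foo\":\"bar\""

# Without /m, "." does not cross newlines, so only the bracket pair
# that sits on a single line ("[1, 2]") is matched and removed.
without_m = s.gsub(/\[.*\]/, '')

# With /m, "." also matches "\n", and the greedy ".*" extends to the last "]".
greedy = s.gsub(/\[.*\]/m, '')

# A lazy ".*?" would stop at the first "]" instead, leaving the outer bracket behind.
lazy = s.sub(/\[.*?\]/m, '')

p without_m
p greedy
p lazy
```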

answered Oct 7, 2011 at 19:03


Andy Waite · Accepted Answer · 2011-10-07 19:04:33Z

0

It's difficult to tell what you're trying to achieve, but that looks like JSON to me so it would probably be much easier to parse it and then manipulate it that way.
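A minimal sketch of the parse-and-manipulate route, assuming the input is valid JSON (the sample below is a shortened, made-up variant of the data posted in the comments). Ruby hashes keep insertion order since Ruby 1.9 and `JSON.parse` fills the hash in document order, so the remaining keys come back out in their original order. What is not preserved is the original whitespace, and duplicate keys within one object would collapse to a single entry.

```ruby
require 'json'

# Hypothetical, shortened sample based on the data shown in the comments.
raw = '{ "groupRate":"0.55", "coverageLevels":[".5",".6"], ' \
      '"thingToRemove":[{"a":{"b":"c"}}], "someKey":"800", "someKey2":"180" }'

data = JSON.parse(raw)        # a Hash whose keys are in document order
data.delete('thingToRemove')  # drop the key together with its nested value

result = JSON.generate(data)  # compact output; original spacing is not kept
puts result
```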

answered Oct 7, 2011 at 19:04


1 Comment

dt1000 Over a year ago

Can't, I've got to keep the order. I can't change the JSON.

Tilo · Accepted Answer · 2011-10-07 19:09:52Z

0

You need multi-line mode (the m flag), so that . also matches newlines:

str.gsub(/\[.*\]/m, '')

answered Oct 7, 2011 at 19:09


Nigel Thorne · Accepted Answer · 2011-11-23 23:05:30Z

You could use something like Parslet to write a parser. Here's an example I wrote, based on the JSON grammar from http://www.json.org/

require 'parslet'

#This needs a few more 'as' calls to annotate the output 
class JSONParser < Parslet::Parser
  rule(:space)              { match('[\s\n]').repeat(1)}
  rule(:space?)             { space.maybe }
  rule(:digit)              { match('[0-9]') }
  rule(:hexdigit)           { match('[0-9a-fA-F]') }

  rule(:number)             { space? >> str('-').maybe >> 
                                (str('0') | (match('[1-9]') >> digit.repeat)) >> 
                                (str('.') >> digit.repeat).maybe >> 
                                ((str('e')| str('E')) >> (str('+')|str('-')).maybe >> digit.repeat ).maybe }

  rule(:escaped_character)  { str('\\') >> (match('["\\\\/bfnrt]') | (str('u') >> hexdigit.repeat(4,4))) }
  rule(:string)             { space? >> str('"') >> (match('[^\"\\\\]') | escaped_character).repeat >> str('"') }
  rule(:value)              { space? >> (string | number | object | array | str('true') | str('false') | str('null')) }

  rule(:pair)               { string >> str(":") >> value }
  rule(:pair_list)          { pair >> (space? >> str(',') >> pair).repeat }
  rule(:object)             { str('{') >> space? >> pair_list.maybe >> space? >> str('}') }

  rule(:value_list)         { value >> (space? >> str(',') >> value).repeat }
  rule(:array)              { space? >> str('[') >> space? >> value_list.maybe >> space? >> str(']') >> space?}

  rule(:json)               { value.as('value') >> (space? >> str(',') >> value.as('value')).repeat }
  root(:json)
end

# I've changed your doc to be a list of JSON values
doc = '[

        { "foo":
            {"bar":"foo",
                "bar": [
                    {"foo":"bar", "foo":"bar"}
                ]
            }
        }

    ],

{"foo":"bar"},{"foo":"bar"}'

puts JSONParser.new.parse(doc)[1..-1].map{|value| value["value"]}.join(",")
# => {"foo":"bar"},{"foo":"bar"}

However, as your document isn't valid JSON (as far as I know), you can change the grammar above so that a pair is accepted wherever a value is:

require 'parslet'

class YourFileParser < Parslet::Parser
  rule(:space)              { match('[\s\n]').repeat(1)}
  rule(:space?)             { space.maybe }
  rule(:digit)              { match('[0-9]') }
  rule(:hexdigit)           { match('[0-9a-fA-F]') }

  rule(:number)             { space? >> str('-').maybe >> 
                                (str('0') | (match('[1-9]') >> digit.repeat)) >> 
                                (str('.') >> digit.repeat).maybe >> 
                                ((str('e')| str('E')) >> (str('+')|str('-')).maybe >> digit.repeat ).maybe }

  rule(:escaped_character)  { str('\\') >> (match('["\\\\/bfnrt]') | (str('u') >> hexdigit.repeat(4,4))) }
  rule(:string)             { space? >> str('"') >> (match('[^\"\\\\]') | escaped_character).repeat >> str('"') }
  rule(:value)              { space? >> (string | number | object | array | str('true') | str('false') | str('null')) }

  rule(:pair)               { string >> str(":") >> value }
  rule(:pair_list)          { (pair|value) >> (space? >> str(',') >> (pair|value)).repeat }
  rule(:object)             { str('{') >> space? >> pair_list.maybe >> space? >> str('}') }

  rule(:value_list)         { (pair|value) >> (space? >> str(',') >> (pair|value)).repeat }
  rule(:array)              { space? >> str('[') >> space? >> value_list.maybe >> space? >> str(']') >> space?}

  rule(:yourdoc)           { (pair|value).as('value') >> (space? >> str(',') >> (pair|value).as('value')).repeat }
  root(:yourdoc)
end

doc = '[

        { "foo":
            {"bar":"foo",
                "bar": {
                    ["foo":"bar", "foo":"bar"]
                }
            }
        }

    ],

"foo":"bar","foo":"bar"'

puts YourFileParser.new.parse(doc)[1..-1].map{|value| value["value"]}.join(",")
