2

I'm trying to parse something like this: Key1=[val123, val456], Key2=[val78, val123]

into a Map<String, List<String>>. One complication is that both the keys and the values can contain non-alphanumeric characters such as . : - _

This looks like something a regex with capturing groups should make short work of, without writing a parser by hand, but I'm not having any luck getting the expression to work. Any regex gurus?

4 Answers

6

Try

([^=\s]+)\s*=\s*\[\s*([^\s,]+),\s*([^\s,]+)\s*\]

This matches one key/values pair with exactly two values, capturing the key in group 1, the first value in group 2, and the second value in group 3.

In Java this could look something like this:

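import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString holds the input, e.g. "Key1=[val123, val456], Key2=[val78, val123]"
Pattern regex = Pattern.compile("([^=\\s]+)\\s*=\\s*\\[\\s*([^\\s,]+),\\s*([^\\s,]+)\\s*\\]");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    String key  = regexMatcher.group(1);  // text before the =
    String val1 = regexMatcher.group(2);  // first value inside the brackets
    String val2 = regexMatcher.group(3);  // second value inside the brackets
}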

Explanation:

([^=\s]+)   # Match one or more characters except whitespace or =
\s*=\s*     # Match =, optionally surrounded by whitespace
\[\s*       # Match [ plus optional whitespace
([^\s,]+)   # Match one or more characters except whitespace or commas
,\s*        # Match a comma plus optional whitespace
([^\s,]+)   # Match one or more characters except whitespace or commas
\s*\]       # Match optional whitespace and ]
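
The commented breakdown above can also be given to Java more or less as written. As a sketch (the class name CommentedPairPattern and the parse helper are made up for illustration), Pattern.COMMENTS tells the regex engine to ignore whitespace in the pattern and to treat # as the start of a comment, and the two captured values are collected into a Map<String, List<String>>:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class CommentedPairPattern {
    // Same pattern as above in free-spacing mode: whitespace in the pattern
    // is ignored and # starts a comment that runs to the end of the line.
    static final Pattern PAIR = Pattern.compile(
          "([^=\\s]+)   # key: one or more characters except whitespace or =\n"
        + "\\s*=\\s*    # =, optionally surrounded by whitespace\n"
        + "\\[\\s*      # [ plus optional whitespace\n"
        + "([^\\s,]+)   # first value\n"
        + ",\\s*        # comma plus optional whitespace\n"
        + "([^\\s,]+)   # second value\n"
        + "\\s*\\]      # optional whitespace and ]\n",
        Pattern.COMMENTS);

    // Collects every key with its two values; insertion order is preserved.
    static Map<String, List<String>> parse(String input) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            List<String> values = new ArrayList<>();
            values.add(m.group(2));
            values.add(m.group(3));
            result.put(m.group(1), values);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("Key1=[val123, val456], Key2=[val78, val123]"));
    }
}
```

Because the pattern has exactly two value groups, a key whose list holds more or fewer than two values is skipped rather than partially parsed.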

2

Here's a way in Groovy:

import java.util.regex.*

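def map = [:]
// group 1 is the key, group 2 is everything between the brackets
def matcher = "Key1=[val123, val456], Key2=[val78, val123, val666]" =~ /(\S+)=\[([^]]*)]/
matcher.each { 
  // split the bracket contents on commas; "as List" turns the String[] into a List
  map.put(it[1], it[2].split(/,\s*/) as List) 
}
println map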

which produces:

[Key1:[val123, val456], Key2:[val78, val123, val666]]

Test rig can be found here: http://ideone.com/6oFsU

2 Comments

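Thanks Bart, I appreciate your assistance. I combined your regexp and Tim's to come up with ([^=\s]+)\s*=\s*\[([^]]*)]
It’s a shame that Java’s \S also matches some whitespace: by default \s covers only ASCII whitespace, so \S matches Unicode whitespace such as a non-breaking space. :(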
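
The combined pattern from the comment above translates to Java roughly as follows. This is a sketch, not a hardened parser: KeyListParser and parse are illustrative names, and the ] inside the character class is escaped here purely for readability. Each bracketed list is captured as a whole and then split on commas, so any number of values per key is accepted:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class KeyListParser {
    // group 1: the key; group 2: everything between [ and ]
    private static final Pattern ENTRY =
        Pattern.compile("([^=\\s]+)\\s*=\\s*\\[([^\\]]*)\\]");

    static Map<String, List<String>> parse(String input) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            List<String> values = new ArrayList<>();
            // Split the bracket contents on commas and drop surrounding whitespace.
            for (String raw : m.group(2).split(",")) {
                String value = raw.trim();
                if (!value.isEmpty()) {
                    values.add(value);
                }
            }
            result.put(m.group(1), values);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("Key1=[val123, val456], Key2=[val78, val123, val666]"));
    }
}
```

Keys and values may contain characters such as . : - _ because the pattern only excludes whitespace, =, and ]. One caveat: the key group takes any run of non-whitespace characters before the =, so entries separated by a comma with no following space would get the comma glued onto the next key.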
0

You can get your example to work using this Groovy:

def str = 'Key1=[val123, val456], Key2=[val78, val123]'

class Evaluator extends Binding {
  def parse( s ) {
    GroovyShell shell = new GroovyShell( this );
    shell.evaluate( s )
  }
  Object getVariable( String name ) { name }
}

new Evaluator().parse "[$str]".tr( '=', ':' )

But you say you can have more complex examples?

The best, safest solution would be to have the program that generates this output use a proper data format such as XML or JSON.

However, that is of course not always possible.

1 Comment

Yes, I had considered this, but unfortunately the input string is untrusted, user-entered text, so for security reasons this was a no-go. Thanks for the reply.
0

A more idiomatic way, building on Bart's method:

def map = [:]
("Key1=[val123, val456], Key2=[val78, val123, val666]" =~ /(\S+)=\[([^]]*)]/ ).each { text, key, value ->
    map[key] = value.split(/,\s*/)
}
