I have an XML response containing a lot of characters, like this:

def xmlString = "<TAG1>1239071ABCDEFGH</TAG1><TAG2>1239071ABCDEFGH</TAG2>"

I'm using XmlSlurper to keep only the digits:

def node = new XmlSlurper().parseText(xmlString)
def nodelist = [node.tag1.tag2]

After this, "node" has a value like "1239071123907112390711239071", and I try to use a Java regex to split the digits into groups of 7:

System.out.println(java.util.Arrays.toString( nodelist.node.split("(?<=\G.{7})") ))

What did I do wrong? It doesn't work.

  • How does <TAG1>1239071ABCDEFGH</TAG1><TAG2>1239071ABCDEFGH</TAG2> give you 1239071123907112390711239071? And why are you splitting by 7 chars? Commented Aug 14, 2013 at 9:31
  • Also, XmlSlurper won't slurp that XML as it's not valid (no root node). Commented Aug 14, 2013 at 9:32
  • Also, I believe def nodelist = [node.tag1.tag2] would return [ null ]. Commented Aug 14, 2013 at 9:32
  • There are a lot of <TAG1><TAG2>...<TAGX> tags with the same type of content, but I need only the digits, separated into groups of 7 chars, to get [1239071 1239071 1239071] etc. Commented Aug 14, 2013 at 9:35

1 Answer

Assuming you have some valid XML like:

def xmlString = """<document>
                  |    <TAG1>1239071ABCDEFGH</TAG1>
                  |    <TAG2>1239071ABCDEFGH</TAG2>
                  |</document>""".stripMargin()

Then you can get all elements starting with TAG, and for each of these trim off the end chars which aren't digits:

def nodeList = new XmlSlurper().parseText( xmlString )
                               .'**'
                               .findAll { node ->
                                   node.name().startsWith( 'TAG' )
                               }
                               .collect { node ->
                                   node.text().takeWhile { ch ->
                                       Character.isDigit( ch )
                                   }
                               }

nodeList in this example would then equal:

assert nodeList == ['1239071', '1239071']

If you want to keep these numbers associated with the TAG that contained them (assuming TAGn tags are unique), then you can change to collectEntries:

def nodeList = new XmlSlurper().parseText( xmlString )
                               .'**'
                               .findAll { node ->
                                   node.name().startsWith( 'TAG' )
                               }    
                               .collectEntries { node ->
                                   [ node.name(), node.text().takeWhile { Character.isDigit( it ) } ]
                               }


assert nodeList == [TAG1:'1239071', TAG2:'1239071']
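
If the digits have already been concatenated into one long string, as described in the question, they can still be cut into groups of seven. This is only a minimal sketch: the `digits` value is a made-up example, and it assumes every number is exactly seven digits long.

```groovy
// Hypothetical input: four 7-digit numbers already run together.
def digits = '1239071123908212390931239104'

// Option 1: collect every non-overlapping run of exactly seven digits.
def byRegex = digits.findAll(/\d{7}/)

// Option 2: split into single characters, group them seven at a time, re-join.
def byCollate = digits.toList().collate(7).collect { it.join('') }

assert byRegex == ['1239071', '1239082', '1239093', '1239104']
assert byCollate == byRegex
```

Writing the pattern as a slashy string (`/.../`) avoids having to escape the backslashes, which is easy to get wrong inside a double-quoted Groovy string.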

2 Comments

Thank you, that's very helpful! One more question, please: what should I do if not only the TAGn names are unique, but the values too: '1239071', '1239082', etc.?
The Map variant should be fine. Values in a map can be unique or duplicated; it doesn't matter.
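
To illustrate that reply, here is a sketch of the Map variant run against input where each tag holds a different number. The second value (1239082) is taken from the comment above. The import is an assumption for Groovy 3 or newer; on older versions XmlSlurper lives in groovy.util and needs no import.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; drop this line on Groovy 2.x

// Example document with a different number in each tag.
def xmlString = '''<document>
    <TAG1>1239071ABCDEFGH</TAG1>
    <TAG2>1239082ABCDEFGH</TAG2>
</document>'''

// Map each TAGn name to the leading digits of its text.
def byTag = new XmlSlurper().parseText(xmlString)
        .'**'
        .findAll { node -> node.name().startsWith('TAG') }
        .collectEntries { node ->
            [node.name(), node.text().takeWhile { Character.isDigit(it) }]
        }

assert byTag == [TAG1: '1239071', TAG2: '1239082']
```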
