XML parsing in Scala doesn't seem to be as easy and straightforward as it should be.

What I needed was something that behaves like document.getElementsByTagName(name) in JavaScript, except that for my purposes I only needed the first element with a particular tag name. Here is what I ended up with:

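import scala.xml.{Document, Elem, Node}
import scala.xml.parsing.ConstructingParser

// Depth-first search for the first node whose label equals `search`.
def _getFirstMatchingElementByName(search: String, n: Node): Option[Node] = {
    if (n.label == search) {
        Some(n)
    } else {
        var i = 0
        var result: Option[Node] = None
        try {
            // Try each child in turn until one of the subtrees yields a match.
            while (result == None) {
                result = _getFirstMatchingElementByName(search, n.child(i))
                i += 1
            }
        } catch {
            // Running past the last child ends the loop; result is still None.
            case e: IndexOutOfBoundsException => None
        }
        result
    }
}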

It recurses depth-first through the tree until a match is found or all nodes have been visited.

Now that the feature that needed this has been released, I have reviewed the code a little more, and it really bugs me. I'm sure there are many Java libraries available to help parse XML, but given Scala's native support for generating XML (i.e. it can pretty much be inlined anywhere), I am curious whether I am missing something.

Is there a better way to do this in Scala?

  • I would recommend using XPath (tag added) to deal with XML selection. Here is an article on dealing with XML in Scala and a more recent SO post where XOM was selected. Commented Aug 31, 2011 at 22:29
  • I assume there's a problem with n \\ search. Could you explain what it is? Commented Sep 1, 2011 at 1:05

1 Answer

You're doing it wrong! You wrote:
"all I needed was the first element of a particular tag-name"
Given this XML:

val page = 
  <root>
    <need>text1</need>
    <doesnotneed>text2</doesnotneed>
    <doesnotneed>text3</doesnotneed>
    <need>text4</need>
  </root>

Calling this code gives you a list of all nodes with the given tag name:

scala> page \\ "need"
res3: scala.xml.NodeSeq = NodeSeq(<need>text1</need>, <need>text4</need>)
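
In case the difference between the two projection operators is unclear, here is a minimal sketch, assuming the scala-xml library is on the classpath. The document `doc` and the value names are made up for illustration. `\` looks only at direct children, while `\\` also descends into nested elements:

```scala
import scala.xml.NodeSeq

// Illustrative document: one <need> is a direct child, one is nested deeper.
val doc =
  <root>
    <need>top</need>
    <wrapper><need>nested</need></wrapper>
  </root>

// `\` selects matching direct children only.
val direct: NodeSeq = doc \ "need"

// `\\` selects matching descendants at any depth, in document order.
val all: NodeSeq = doc \\ "need"

val directTexts = direct.map(_.text) // contains only "top"
val allTexts = all.map(_.text)       // contains "top", then "nested"
```

For the page in this answer both operators return the same nodes, because every need element is a direct child of root.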

To get only the first one:

scala> (page \\ "need").head
res4: scala.xml.Node = <need>text1</need>

P.S. Matches come back in depth-first (document) order, so head is the first match encountered in that order.
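
One caveat: head throws a NoSuchElementException when nothing matches. If you want an Option[Node], as in the question, headOption should do it. This is a minimal sketch, assuming scala-xml is on the classpath; the helper name firstByName is hypothetical and not part of the library:

```scala
import scala.xml.Node

// Hypothetical helper (not part of scala.xml): first descendant-or-self
// node with the given label, or None when there is no match.
def firstByName(root: Node, name: String): Option[Node] =
  (root \\ name).headOption

val page =
  <root>
    <need>text1</need>
    <doesnotneed>text2</doesnotneed>
    <doesnotneed>text3</doesnotneed>
    <need>text4</need>
  </root>

val found = firstByName(page, "need").map(_.text) // text of the first <need>
val missing = firstByName(page, "absent")         // no such tag, so None
```

This keeps the Option-returning signature from the question without the manual recursion or the exception handling.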

1 Comment

That's awesome, I didn't realize that the lift-json DSL-syntax was used anywhere else. Thanks!
