12

I'm trying to parse an RSS feed that looks like this and read the "date" attribute:

<rss version="2.0">
<channel>
    <item>
        <y:c date="AA"></y:c>
    </item>
</channel>
</rss>

I tried several variations of this (rssFeed contains the RSS data):

println(((rssFeed \\ "channel" \\ "item" \ "y:c" \"date").toString))

But nothing seems to work. What am I missing?

Any help would really be appreciated!

2
  • 1
    rssFeed ? Shouldn't it be rss? Commented May 17, 2010 at 17:48
  • 1
    rssFeed is a variable containing the RSS data Commented May 17, 2010 at 17:52

4 Answers

20

The "y" in <y:c> is a namespace prefix, not part of the element's name, so the selector has to use the bare label "c". Also, attributes are selected with a leading '@'. Try this:

println(((rssFeed \\ "channel" \\ "item" \ "c" \ "@date").toString))
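
To see why the bare label is needed, here is a minimal sketch that inspects the element directly. It assumes the scala-xml module is on the classpath (it shipped with the standard library up to Scala 2.10 and is a separate dependency after that); the object name PrefixDemo is made up for the example.

```scala
object PrefixDemo {
  // Same shape as the feed in the question.
  val rssFeed =
    <rss version="2.0">
      <channel>
        <item>
          <y:c date="AA"></y:c>
        </item>
      </channel>
    </rss>

  // The label and the namespace prefix are stored separately on the node.
  val c = (rssFeed \\ "c").head
  val label = c.label   // the local name, without the prefix
  val prefix = c.prefix // the part before the colon

  // Selecting by the prefixed name compares against the label and finds nothing.
  val byPrefixedName = rssFeed \\ "y:c"

  // Selecting the attribute needs the leading '@'.
  val date = (c \ "@date").text

  def main(args: Array[String]): Unit =
    println(s"label=$label prefix=$prefix date=$date")
}
```

So the selector only ever sees "c"; the "y" is available through prefix if you need to tell namespaces apart.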

14

Attributes are retrieved using the "@attrName" selector. Thus, your selector should actually be something like the following:

println((rssFeed \\ "channel" \\ "item" \ "c" \ "@date").text)

2 Comments

Note the .text to get the date as a String rather than a Node
Indeed. The text method is generally preferable to toString since it will gracefully handle the case where your selector grabbed a chunk of XML rather than a Text node.
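
A small sketch of the difference, assuming scala-xml is available. The title element is a hypothetical addition to the item, only there to have some text content to select; TextDemo is an invented name.

```scala
object TextDemo {
  // Hypothetical item with a text-bearing child, for illustration only.
  val item = <item><title>Hello</title><y:c date="AA"></y:c></item>

  // toString serializes whatever was selected, tags included.
  val asMarkup = (item \ "title").toString

  // text returns only the text content of the selection.
  val asText = (item \ "title").text

  // For an attribute selection, text is the attribute value as a String.
  val attrText = (item \ "c" \ "@date").text

  def main(args: Array[String]): Unit = {
    println(asMarkup)
    println(asText)
    println(attrText)
  }
}
```
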
3

Also, think about the difference between \ and \\. \\ looks for a descendant, not just a child, like this (note that it jumps from channel to c, without item):

scala> (rssFeed \\ "channel" \\ "c" \ "@date").text
res20: String = AA

Or this sort of thing if you just want all the <c> elements, and don't care about their parents:

scala> (rssFeed \\ "c" \ "@date").text            
res24: String = AA

And this specifies an exact path:

scala> (rssFeed \ "channel" \ "item" \ "c" \ "@date").text
res25: String = AA
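
To make the child versus descendant distinction concrete, here is a minimal sketch on the same feed (assuming scala-xml is on the classpath; AxisDemo is just a name for the example). It shows \ coming back empty when a level is skipped, while \\ still finds the element.

```scala
object AxisDemo {
  val rssFeed =
    <rss version="2.0">
      <channel>
        <item>
          <y:c date="AA"></y:c>
        </item>
      </channel>
    </rss>

  // item is a grandchild of rss, so the child selector finds nothing.
  val directItems = rssFeed \ "item"

  // The descendant selector searches the whole subtree.
  val anyItems = rssFeed \\ "item"

  // An exact path has to name every level in between.
  val exact = (rssFeed \ "channel" \ "item" \ "c" \ "@date").text

  def main(args: Array[String]): Unit =
    println(s"direct=${directItems.length} any=${anyItems.length} exact=$exact")
}
```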

3

Think about using sequence comprehensions, too. They're useful for dealing with XML, particularly if you need complicated conditions.

For the simple case:

for {
  c <- rssFeed \\ "@date"
} yield c

This gives you the date attribute of every element in rssFeed that has one.
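
As a sketch of how that plays out with several elements (assuming scala-xml is available; DatesDemo is an invented name), the comprehension can yield the text of each attribute to get plain strings. One thing to watch, at least in the scala-xml versions I have looked at: a single \ "@date" applied to a sequence holding more than one element comes back empty, so going element by element, or using \\ "@date", is the safer route.

```scala
object DatesDemo {
  val rssFeed =
    <rss version="2.0">
      <channel>
        <item>
          <y:c date="AA"></y:c>
          <y:c date="AB"></y:c>
          <y:c date="AC"></y:c>
        </item>
      </channel>
    </rss>

  // Every date attribute in the document, as strings, in document order.
  val dates: Seq[String] = for (d <- rssFeed \\ "@date") yield d.text

  // The same result, visiting each c element and reading its attribute.
  val viaMap: Seq[String] = (rssFeed \\ "c").map(c => (c \ "@date").text)

  def main(args: Array[String]): Unit =
    println(dates.mkString(", "))
}
```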

But if you want something more complex:

val rssFeed = <rss version="2.0">
                <channel>
                  <item>
                    <y:c date="AA"></y:c>
                    <y:c date="AB"></y:c>
                    <y:c date="AC"></y:c>
                  </item>
                </channel>
              </rss>

val sep = "\n----\n"

for {
  channel <- rssFeed \ "channel"
  item <- channel \ "item"
  y <- item \ "c"
  date <- y \ "@date" if date.text == "AA"
} yield {
  val s = List(channel, item, y, date).mkString(sep)
  println(s)
}

Gives you:

    <channel>
                        <item>
                          <y:c date="AA"></y:c>
                          <y:c date="AB"></y:c>
                          <y:c date="AC"></y:c>
                        </item>
                      </channel>
    ----
    <item>
                          <y:c date="AA"></y:c>
                          <y:c date="AB"></y:c>
                          <y:c date="AC"></y:c>
                        </item>
    ----
    <y:c date="AA"></y:c>
    ----
    AA
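
The comprehension above prints inside yield, so its result is only a sequence of Unit values. If you would rather get the matching elements back, a variant might look like the sketch below (assuming scala-xml is on the classpath; FilterDemo and the val names are made up for the example).

```scala
object FilterDemo {
  val rssFeed =
    <rss version="2.0">
      <channel>
        <item>
          <y:c date="AA"></y:c>
          <y:c date="AB"></y:c>
          <y:c date="AC"></y:c>
        </item>
      </channel>
    </rss>

  // Keep only the c elements whose date attribute is "AA".
  val matches = for {
    item <- rssFeed \\ "item"
    c <- item \ "c"
    if (c \ "@date").text == "AA"
  } yield c

  // Read the attribute back from each match.
  val matchedDates = matches.map(c => (c \ "@date").text)

  def main(args: Array[String]): Unit =
    matches.foreach(println)
}
```

Yielding the nodes keeps the side effect out of the comprehension, so the caller decides whether to print, count or transform them.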
