I'm using Ruby, Nokogiri and Nori. I would like some thoughts on how I should go about parsing this XML file.

In this schema, an entity can include multiple contacts.

I need to return a hash of the following:

  • :id
  • :first_name
  • :last_name
  • :preferred_email
  • :manager

I thought about using XPath to try to return the preferred email contact.

entities = doc.xpath("/entity_list/entity").each do |entity|
   puts entity.xpath("contact_list/contact[contains(type,'Email') and contains(preferred, '1')]")
end



  <entity>
    <id>21925</id>
    <last_name>Smith</last_name>
    <first_name>John</first_name>
    <preferred_name>Johnny</preferred_name>
    <manager>Timmy</manager>
    <dob>1970-01-01</dob>
    <type>individual</type>
    <contact_list>
      <contact>
        <type>Mobile Phone</type>
        <preferred>0</preferred>
        <value>563478653478</value>
      </contact>
      <contact>
        <type>Pager</type>
        <preferred>0</preferred>
        <value>7354635345</value>
      </contact>
      <contact>
        <notes>None</notes>
        <type>Home Email</type>
        <preferred>1</preferred>
        <value>[email protected]</value>
        <comments>None</comments>
      </contact>
      <contact>
        <notes>None</notes>
        <type>Work Email</type>
        <preferred>0</preferred>
        <value>[email protected]</value>
        <comments>None</comments>
      </contact>
      <contact>
        <type>Home Phone</type>
        <preferred>1</preferred>
        <value>56537646365</value>
      </contact>
    </contact_list>
  </entity>

What would be the best way to approach this problem?

Thanks

  • Show us the output you are looking for. There is a bit of confusion about the email selection. Commented Sep 13, 2013 at 8:03
  • It should return the first email address that has the preferred status set to 1. In this example, it should return [email protected] because it is the preferred email. Commented Sep 13, 2013 at 20:20

1 Answer

Here is one way of doing it (off the top of my head, based on your initial solution):

entities = doc.xpath("/entity_list/entity").map do |entity|
  {
    :id => entity.at_xpath("id").content.to_i,
    :first_name => entity.at_xpath("first_name").content,
    :last_name => entity.at_xpath("last_name").content,
    :preferred_email => entity.at_xpath("contact_list/contact[contains(type,'Email') and contains(preferred, '1')]/value").content,
    :manager => entity.at_xpath("manager").content
  }
end

EDIT

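To cope with missing nodes you could use ActiveSupport's try method, or just tack a rescue nil onto the end of each line (bear in mind that this swallows any StandardError raised on that line, not only the missing-node case), e.g.: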

:first_name => (entity.at_xpath("first_name").content rescue nil),

But it would be better to use a helper method, something like:

def get_node_content(entity, xpath)
  # at_xpath returns nil when nothing matches, so guard before calling content
  node = entity.at_xpath(xpath)
  node ? node.content : nil
end

Then you could use it like:

:first_name => get_node_content(entity, "first_name"),

1 Comment

This seems to work well, until it hits an XPath query where the result is nil and .content can't be called on it. Is there a simple way around this issue?
