Total noob here so I am sorry for my ignorance in advance.

Most of what I have searched and messed around with has centered around using XML::LibXML with XPath.

The problem is that I am not looking to capture the text between tags: I need an attribute value and the names of the tags themselves.

This is my XML structure

<users>
  <entry name="asd">
    <permissions>
      <role-based>
        <superuser>yes</superuser>
      </role-based>
    </permissions>
  </entry>
  <entry name="fgh">
    <permissions>
      <role-based>
        <superuser>yes</superuser>
      </role-based>
    </permissions>
    <authentication-profile>RSA Two-Factor</authentication-profile>
  </entry>
  <entry name="jkl">
    <permissions>
      <role-based>
        <superreader>yes</superreader>
      </role-based>
    </permissions>
    <authentication-profile>RSA Two-Factor</authentication-profile>
  </entry>
</users>

I am trying to grab the name attribute (without the quotes) and also determine whether this person is a superuser or superreader.

I am stuck at not being able to do much other than print off the nodes. I need to turn this into a CSV file in the structure of username; role

4 Answers

The easiest way to extract information from XML documents with XML::LibXML is to use the find family of methods. These methods use an XPath expression to select nodes and values from the document. The following script extracts the data you need:

use XML::LibXML;

my $doc = XML::LibXML->load_xml(location => 'so.xml');

for my $entry ($doc->findnodes('//entry')) {
    my $name = $entry->getAttribute('name');
    my $role = $entry->findvalue(
        'local-name(permissions/role-based/*[.="yes"])'
    );
    print("$name;$role\n");
}   

It prints

asd;superuser
fgh;superuser
jkl;superreader

I used the local-name XPath function to get the name of the role element.

Note that you might want to use Text::CSV to create CSV files in a more robust way.
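A minimal sketch of that suggestion, assuming Text::CSV is installed from CPAN. The inline XML string and the @lines array are only there to keep the example self-contained; they are not part of the script above.

```perl
use strict;
use warnings;
use Text::CSV;
use XML::LibXML;

# Sample data inlined for illustration; a real script would load a file.
my $xml = <<'XML';
<users>
  <entry name="asd">
    <permissions><role-based><superuser>yes</superuser></role-based></permissions>
  </entry>
  <entry name="jkl">
    <permissions><role-based><superreader>yes</superreader></role-based></permissions>
  </entry>
</users>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# sep_char => ';' matches the username;role layout asked for in the question.
my $csv = Text::CSV->new({ binary => 1, sep_char => ';' })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

my @lines;
for my $entry ($doc->findnodes('//entry')) {
    my $name = $entry->getAttribute('name');
    my $role = $entry->findvalue(
        'local-name(permissions/role-based/*[.="yes"])'
    );
    # combine() quotes and escapes fields only where needed.
    $csv->combine($name, $role)
        or die "Cannot combine fields: " . $csv->error_diag;
    push @lines, $csv->string;
}

print "$_\n" for @lines;
```

The benefit over a plain print is that a name containing a semicolon or a quote character would still produce a valid CSV row.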

2 Comments

Thanks for the quick response. I will have to try this out. When I was attempting to use the getAttribute call before, it was telling me that it was unknown in my package. I was using XML::LibXML; maybe I have dependency issues somewhere?
@user2891632, if you're still having problems, post a new question showing the actual code you're using and the errors you're getting.

Another solution with a different module, XML::Twig:

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

my ($name, $role);

my $twig = XML::Twig->new(
    twig_handlers => {
        'entry' => sub { 
            $name = $_->att('name');
            if ( defined $name && defined $role ) { 
                printf qq|%s;%s\n|, $name, $role;
            }   
            undef $_ for $name, $role;
        },  
        'role-based' => sub { $role = $_->first_child->tag },
    },  
)->parsefile( shift );

Run it like:

perl script.pl xmlfile

That yields:

asd;superuser
fgh;superuser
jkl;superreader

Using XML::Rules:

use XML::Rules;

print "name is_superuser is_superreader\n";
my @rules = (
  entry => sub {
    my $entry = $_[1];
    $_ ||= 'no' for @$entry{qw(superuser superreader)};
    print "$entry->{name} $entry->{superuser} $entry->{superreader}\n";
  },
  'permissions,role-based' => 'pass no content',
  'superuser,superreader' => 'content',
  _default => undef,
);

my $p = XML::Rules->new(rules => \@rules);
my $s = $p->parse(doc());

sub doc {
return <<XML;
<users>
   <entry name="asd">
       <permissions>
            <role-based>
                <superuser>yes</superuser>
            </role-based>
       </permissions>
   </entry>
   <entry name="fgh">
       <permissions>
            <role-based>
                <superuser>yes</superuser>
            </role-based>
       </permissions>
       <authentication-profile>RSA Two-Factor</authentication-profile>
   </entry>
   <entry name="jkl">
       <permissions>
            <role-based>
                <superreader>yes</superreader>
            </role-based>
       </permissions>
       <authentication-profile>RSA Two-Factor</authentication-profile>
   </entry>
</users>
XML
}

Or an alternative set of rules, assuming the content of your key fields is always 'yes' (plus some other assumptions):

my $name;
my @rules = (
  '^entry' => sub {
    $name = $_[1]->{name};
  },
  'superuser,superreader' => sub {
    print "$name,$_[0]\n";
  },
  _default => undef,
);

I like using XML::Simple for projects like this.

For example:

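use strict;
use warnings;
use XML::Simple;

my $su = $ARGV[0];
die "Usage: $0 xmlfile\n" unless defined $su && -e $su;

# ForceArray keeps a lone <entry> in an array, so it is still folded
# into a hash keyed on its name attribute.
my $su_xml = XMLin($su, ForceArray => [ 'entry' ]);
my $suref  = $su_xml->{entry};

foreach my $key (keys %{$suref}) {
    my $rb = $suref->{$key}{permissions}{'role-based'};
    foreach my $rbkey (keys %{$rb}) {
        print "$key\t$rbkey\t$rb->{$rbkey}\n";
    }
}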

prints (the order may vary, since Perl hash keys are unordered):

fgh     superuser       yes
asd     superuser       yes
jkl     superreader     yes
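What ForceArray changes can be sketched with a document that has a single <entry>. This is an illustration under XML::Simple's default KeyAttr settings, with the XML inlined as a string; the variable names $plain and $forced are made up for the example.

```perl
use strict;
use warnings;
use XML::Simple;

# A document with only one <entry>, inlined for illustration.
my $xml = <<'XML';
<users>
  <entry name="asd">
    <permissions>
      <role-based>
        <superuser>yes</superuser>
      </role-based>
    </permissions>
  </entry>
</users>
XML

# Without ForceArray a lone <entry> becomes a plain hash of its own
# attributes and children, so the top-level keys are not user names.
my $plain = XMLin($xml);

# With ForceArray the entry is parsed as a one-element array first,
# which is then folded into a hash keyed on the name attribute.
my $forced = XMLin($xml, ForceArray => ['entry']);

my $plain_keys  = join ',', sort keys %{ $plain->{entry} };
my $forced_keys = join ',', sort keys %{ $forced->{entry} };

print "without ForceArray: $plain_keys\n";
print "with ForceArray:    $forced_keys\n";
```

Only the second form has the name-keyed shape that the loop in the script expects, whether the file holds one entry or many.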

2 Comments

XML::Simple is often too simple. For example, your script breaks if there's only one <entry> within <users>.
@SlavenRezic Good catch! Fortunately, XML::Simple is highly configurable and easily accounts for this case through the use of ForceArray. Solution updated.
