0

I have an XML file which contains the data below:

<User text="HHd5">
         <max string="0"/>
         <min string="pick up"/>
         <valat string="0"/>
         <valon string="0"/>
         <time string="GMT"/>     
 </User>

In my script, I need to look for the User element whose text attribute is HHd5. If it is found, I must extract the valat and valon values. Please help.

My code:

$file = "text.xml" 
$xml = new XML::Simple( KeyAttr => [] );
$data = $xml->XMLin("$file");
my $booklist = XMLin('$file');
foreach my $var ( @{ $booklist->{ User text } } ) {
    if ( $var->{ User text } eq "HHd5" ) { $var->{valat}; $var->{valon}; }

And:

#!/usr/bin/perl 
open( fp, "<", "testing.xml" );
$s = "HHd5";
while (<fp>) {
    $a = $_;
    if ( $a =~ /$s/ ) {
        while (<fp>) {
            $f = $_;
            if ( $f =~ /valon string="(\d+)/ ) { print "valon $1 \n"; }
            if ( $f =~ /valat string="(\d+)/ ) { print "valat $1 \n"; }
        }
    }
}
  • You will get better responses if you can show what you've tried so far, and explain what problems you're having. Otherwise this looks a bit like a homework question, so may well attract less constructive responses. Commented Mar 25, 2015 at 13:16
  • Can we achieve this using XML::Simple? I am new to scripting. Please help. Commented Mar 25, 2015 at 13:20
  • $file="text.xml" $xml = new XML::Simple (KeyAttr=>[]); $data = $xml->XMLin("$file"); my $booklist = XMLin('$file'); foreach my $var (@{$booklist->{User text}}) { if($var->{User text} eq "HHd5"){ $var->{valat}; $var->{valon}; } Commented Mar 25, 2015 at 13:23
  • Look at the doc for XML::Simple: search.cpan.org/~grantm/XML-Simple-2.20/lib/XML/… Specifically: "The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces." Commented Mar 25, 2015 at 13:25
  • Can I suggest reformatting that code and editing your post. As it stands, it doesn't work. I don't want to amend it because that might change the context of the question. Commented Mar 25, 2015 at 13:27

3 Answers

4

Using XML::XSH2, a wrapper around XML::LibXML:

open file.xml ;
for //User[@text='HHd5']
    echo valat/@string valon/@string ;

Or, a more verbose solution using XML::LibXML only:

#! /usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $xml = 'XML::LibXML'->load_xml( location => 'file.xml' );
for my $user ($xml->documentElement->findnodes('//User[@text="HHd5"]')) {
    print $_->{string},"\n" for $user->findnodes('valat | valon');
}
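
As a rough sketch of how the same lookup could be wrapped in a reusable function: the sample document, its `<Users>` root element and the helper name `coords_for_user` are all assumptions made for illustration, not part of the question. The name is compared in Perl rather than interpolated into the XPath expression, which sidesteps quoting problems if the name ever contains a quote character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical sample document; <Users> is an assumed root element.
my $sample = <<'XML';
<Users>
  <User text="AAa1">
    <valat string="11"/>
    <valon string="22"/>
  </User>
  <User text="HHd5">
    <max string="0"/>
    <valat string="51"/>
    <valon string="-3"/>
  </User>
</Users>
XML

# Return (valat, valon) for the first User whose text attribute equals
# $name, or an empty list if there is no such User.
sub coords_for_user {
    my ( $xml_string, $name ) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml_string );
    for my $user ( $doc->findnodes('//User') ) {
        my $text = $user->getAttribute('text');
        next unless defined $text && $text eq $name;
        # findvalue returns the string value of the XPath result.
        return map { $user->findvalue("$_/\@string") } qw(valat valon);
    }
    return;
}

my ( $valat, $valon ) = coords_for_user( $sample, 'HHd5' );
print "valat=$valat valon=$valon\n";
```

For a file on disk, `load_xml( location => ... )` as in the answer above would replace the `string =>` form.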

2

Let me start with a personal peeve. XML is a strict language spec, with formal definitions of what is and isn't allowed. It is therefore very easy to parse with a parser, and gets horribly messy if you try to use a hand-rolled solution such as a regular expression.

Not least because XML can contain line feeds, or be reformatted, and still be valid.
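
To make the reformatting point concrete, here is a small sketch using only core Perl. The two strings are illustrative samples, and `match_by_line` is a hypothetical helper that mimics the line-by-line regex from the question. Both strings are the same well-formed element, but only the first layout is found.

```perl
use strict;
use warnings;

# The same element written two ways; both are equivalent, well-formed XML.
my $one_line = qq{<valat string="0"/>\n};
my $reflowed = qq{<valat\n    string = '0'\n/>\n};

# Line-by-line matching in the style of the question's second script.
sub match_by_line {
    my ($xml) = @_;
    for my $line ( split /\n/, $xml ) {
        return $1 if $line =~ /valat string="(\d+)/;
    }
    return;    # no line matched
}

for my $case ( [ 'one line' => $one_line ], [ 'reflowed' => $reflowed ] ) {
    my ( $label, $xml ) = @$case;
    my $got = match_by_line($xml);
    printf "%-9s %s\n", $label, defined $got ? "matched $got" : "no match";
}
```

The regex finds the value in the one-line form and reports no match for the reflowed form, even though an XML parser would treat the two identically.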

I would also suggest: don't use XML::Simple. Its own module page says:

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces.

Also - it's really important that you start a script with use strict; and use warnings;. These are really good ways to help diagnose problems, and you will also get much better responses if you're posting code on Stack Overflow.
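
As a small illustration of what strict buys you, the sketch below compiles the same two statements twice with string eval, once with strict switched off and once with it on. The misspelled variable `$cuont` is a deliberate, made-up typo.

```perl
use strict;
use warnings;

# Two statements; $cuont is a deliberate typo for $count.
my $code = 'my $count = 1; $cuont = 2; 1;';

# Without strict the typo silently creates a new global variable.
my $lax_ok = eval "no strict; no warnings; $code";
print "without strict: ",
    ( $lax_ok ? "compiles, typo goes unnoticed" : "failed" ), "\n";

# With strict the undeclared variable is a compile-time error.
my $strict_ok  = eval "use strict; $code";
my $strict_err = $@;
print "with strict: ",
    ( $strict_ok ? "compiled\n" : "rejected: $strict_err" );
```

Under strict, Perl refuses to compile the snippet and names the undeclared variable in the error message, which is usually the fastest way to find this kind of mistake.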

With that in mind, I'd suggest picking up XML::Twig which has the ability to set twig_handlers - subroutines that are triggered to process a specific chunk of XML. In the example below - I specify twig_roots which indicates to the parser that I don't really care about anything else.

process_user is called with each User element. We test the User element for it having the appropriate attribute - and if it does, we extract the string attributes from the two subelements you're interested in.

Something like this:

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

sub process_user {
    my ( $twig, $user ) = @_;
    if ( $user->att('text') eq "HHd5" ) {
        print $user->first_child('valat')->att('string'), ":",
            $user->first_child('valon')->att('string');
    }
}

my $parser = XML::Twig->new( twig_roots => { 'User' => \&process_user, } );
$parser->parse( \*DATA );

__DATA__
<User text="HHd5">
         <max string="0"/>
         <min string="pick up"/>
         <valat string="0"/>
         <valon string="0"/>
         <time string="GMT"/>     
 </User>

But simplifying a bit perhaps, to make it similar to your existing code:

use strict;
use warnings;

use XML::Twig;

my $xml_twig = XML::Twig->new();
$xml_twig->parsefile("test.xml");

foreach my $user ( $xml_twig->root->children('User') ) {
    if ( $user->att('text') eq "HHd5" ) {
        print $user->first_child('valat')->att('string');
        print ":";
        print $user->first_child('valon')->att('string');
    }
}

(NB: The example above doesn't quite work with your XML snippet, because I'm assuming that User isn't the root node of your XML. In a file with more than one User it couldn't be, since a document has exactly one root element.)
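
To show the simplified loop running end to end, here is a sketch with the User elements wrapped in a root element. The `<Users>` wrapper and the second User are assumptions made for illustration; the question does not show the real root element. The document is parsed from a string so that the example is self-contained.

```perl
use strict;
use warnings;

use XML::Twig;

# <Users> is a hypothetical root element, not taken from the question.
my $xml = <<'XML';
<Users>
  <User text="HHd5">
    <valat string="0"/>
    <valon string="0"/>
  </User>
  <User text="other">
    <valat string="7"/>
    <valon string="9"/>
  </User>
</Users>
XML

my $twig = XML::Twig->new();
$twig->parse($xml);

my @pairs;
foreach my $user ( $twig->root->children('User') ) {
    # Skip users without a text attribute, or with a different one.
    next unless ( $user->att('text') // '' ) eq 'HHd5';
    push @pairs, join ':',
        map { $user->first_child($_)->att('string') } qw(valat valon);
}
print "$_\n" for @pairs;
```

Note that this sketch would die if a matching User lacked a valat or valon child, because first_child returns undef in that case; real code may want to check for that.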

-1

To deal with parsing XML, the best way is to download a module from CPAN like XML::Simple. It will be worth your time to get this module, or one like it, and learn how to use it if you are going to work with XML. These modules basically convert XML into a complex Perl variable (a hash reference). Manually parsing XML is not advised on a large scale.

However, in the case of a quick ad-hoc situation, you could parse it with regex.
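
For reference, this is roughly what the hash-reference approach looks like with XML::Simple, bearing in mind that the module's own documentation discourages its use in new code. It is only a sketch: the `<Users>` root element is an assumption, and the `KeyAttr` and `ForceArray` settings are one reasonable choice rather than the only one.

```perl
use strict;
use warnings;

use XML::Simple qw(XMLin);

# <Users> is a hypothetical root element, not taken from the question.
my $xml = <<'XML';
<Users>
  <User text="HHd5">
    <valat string="0"/>
    <valon string="0"/>
  </User>
</Users>
XML

# KeyAttr => []   : do not fold elements into hashes keyed on an attribute.
# ForceArray      : make User an array reference even if there is only one.
my $data = XMLin( $xml, KeyAttr => [], ForceArray => ['User'] );

my @found;
for my $user ( @{ $data->{User} } ) {
    next unless $user->{text} eq 'HHd5';
    push @found, "valat=$user->{valat}{string} valon=$user->{valon}{string}";
}
print "$_\n" for @found;
```

Without ForceArray, a file containing a single User would yield a hash reference instead of an array reference, which is one of the inconsistencies that makes this module awkward to use.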

open( my $xml, "<", "file.xml" ) or die "Cannot open file.xml: $!";

my ($user, $valat, $valon);
while (my $line = <$xml>) {
    # regexes to capture your variables
}

4 Comments

I'm afraid I disagree on both points - XML::Simple is nasty. And so is parsing XML with regular expressions.
I've implied both of those things in my answer, but remember that by all appearances, OP is a beginner. For beginners, sometimes what is the easy way to us is the hard way for them.
That's the problem though - XML parsing via regex isn't the easy way. It may look like it, but it's "easy" in exactly the same way that turning off strict and warnings makes errors go away. It may well work, but it's also extremely prone to creating brittle code and exploding messily later. I wouldn't suggest to anyone that they do either of these things, but especially not to a beginner - an expert may well know better.
Fair enough. One could abstract OP's question out to "what's the best way to parse XML?". Then again maybe it's just "what's the easiest way to extract 3 strings from some text?" :)
