1

I have been trying to figure out how to parse an XML data source into a CSV file, and it's driving me a little crazy. I also need to include the node ID as a column. Here is what I have:

        #!/usr/bin/perl
        use warnings;
        use strict;
        use XML::XPath;

        #Name of the CSV File
        my $filename = "parse.csv";

        #Create the file.
        open(INPUT,">$filename") or die "Cannot create file";

        #Collect the XML and set nodes
        my($xp) = XML::XPath->new( join('', <DATA>) );
        my(@records) = $xp->findnodes( '/CATALOG/CD' );
        my($firstTime) = 0;

        #Loop through each record
        foreach my $record ( @records ) {
            my(@fields) = $xp->find( './child::*', $record )->get_nodelist();
            unless ( $firstTime++ ) {
                #Print Headers
                print( join( ',', map { $_->getName() } @fields ), "\n");
            }
            #Print Content
            print( join( ',', map { $_->string_value() } @fields ), "\n");
        }
        #Close the file.
        close(INPUT);


        __DATA__
        <FOOD>
            <ITEM id='1'>
                <Color>Brown</Color>
                <Name>Steak</Name>
            </ITEM>
            <ITEM id='2'>
                <Color>Blue</Color>
                <Name>Blueberries</Name>
            </ITEM>
            <ITEM id='3'>
                <Color>Red</Color>
                <Name>Apple</Name>
            </ITEM>
        </FOOD>

It creates a CSV file, but the file is empty, and I think it's because of the print lines in the foreach loop.

Any help would be greatly appreciated!

1
  • As a matter of style, don't hardcode filenames into your scripts if you can avoid it. Making them optional arguments, reading input from <> (or doing the equivalent) and writing output to STDOUT makes your scripts much easier to reuse, combine and test. Commented Jan 9, 2015 at 16:59
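
To illustrate the comment above, here is a minimal sketch of a filter that takes its input and output handles as arguments instead of hardcoding filenames. The helper name tsv_to_csv and the tab-separated input are assumptions made only for this example; the demonstration uses in-memory handles so it runs without any files.

```perl
use strict;
use warnings;

# Hypothetical filter: read tab-separated lines from $in, write CSV lines to $out.
sub tsv_to_csv {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;
        # The -1 limit keeps trailing empty fields.
        print {$out} join(',', split(/\t/, $line, -1)), "\n";
    }
    return;
}

# Demonstration with in-memory handles; a real script would instead call
# tsv_to_csv(\*STDIN, \*STDOUT) so it can sit in a pipeline.
my $input = "Color\tName\nBrown\tSteak\n";
open(my $in, '<', \$input) or die "Cannot open in-memory input: $!";
my $output = '';
open(my $out, '>', \$output) or die "Cannot open in-memory output: $!";
tsv_to_csv($in, $out);
close($out) or die "Cannot close in-memory output: $!";
print $output;
```

Because the sub never names a file, the same code can be driven by a test, a pipe, or a redirect.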

3 Answers

2

You are printing your headers and content to standard output, not to your output file. To write to the file, pass the file handle as the first argument to print, with no comma between it and the list you want to print. Something like: print FILE join(',', ...), "\n";

I would also recommend not using INPUT as the name of a file handle you write to; it makes the code confusing to read.
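
As a rough sketch of both points, the following writes rows through an explicitly named, lexical output handle. The helper write_rows and the sample rows are hypothetical, chosen only for illustration; the three-argument open and the print {$handle} LIST form are standard Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: write each row (an array ref) as one CSV line.
sub write_rows {
    my ($filename, @rows) = @_;

    # Three-argument open with a lexical handle instead of a bareword like INPUT.
    open(my $out, '>', $filename) or die "Cannot create $filename: $!";

    for my $row (@rows) {
        # No comma after the handle: the braces mark $out as the print target.
        print {$out} join(',', @$row), "\n";
    }

    close($out) or die "Cannot close $filename: $!";
    return;
}

write_rows('parse.csv', [ 'Color', 'Name' ], [ 'Brown', 'Steak' ]);
```

Note that this does no CSV quoting, so it is only safe for values without commas or quotes.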


2

Given the simplicity of the XML schema, this is easier to do with AnyData.

For instance:

#!/usr/bin/perl
# This script converts an XML file to CSV format.
use strict;
use warnings;

# Load the AnyData XML to CSV conversion modules
use XML::Parser;
use XML::Twig;
use AnyData;

my $input_xml  = "test.xml";
my $output_csv = "test.csv";

# Treat each ITEM element as one record
my $flags = { record_tag => 'ITEM' };
adConvert( 'XML', $input_xml, 'CSV', $output_csv, $flags );

Would convert your data structure (XML) into:

id,Color,Name
1,Brown,Steak
2,Blue,Blueberries
3,Red,Apple


1

Your XPath expression, /CATALOG/CD, does not match your data, whose root element is FOOD with ITEM children, so no records are found. Use something like:

my(@records) = $xp->findnodes( '/FOOD/ITEM' );
....
...
...
print INPUT ( join( ',', map { $_->getName() } @fields ), "\n" );
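
Putting these fixes together, and adding the id attribute as a column as the question asks, a minimal sketch might look like the following. It assumes XML::XPath is installed; the helper name xml_to_csv is hypothetical, and reading the attribute with findvalue('@id', ...) is one possible approach rather than the only one.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical helper: turn <FOOD><ITEM id="..."> records into CSV text.
sub xml_to_csv {
    my ($xml) = @_;
    my $xp      = XML::XPath->new( xml => $xml );
    my @records = $xp->findnodes('/FOOD/ITEM');
    my $csv     = '';
    my $first   = 1;

    for my $record (@records) {
        # Child elements of the current ITEM, in document order.
        my @fields = $xp->findnodes( './*', $record );

        if ($first) {
            # Header row: "id" followed by the child element names.
            $csv .= join( ',', 'id', map { $_->getName() } @fields ) . "\n";
            $first = 0;
        }

        # Data row: the id attribute followed by each element's text.
        my $id = $xp->findvalue( '@id', $record );
        $csv .= join( ',', "$id", map { $_->string_value() } @fields ) . "\n";
    }
    return $csv;
}

my $xml = <<'END_XML';
<FOOD>
    <ITEM id='1'>
        <Color>Brown</Color>
        <Name>Steak</Name>
    </ITEM>
    <ITEM id='2'>
        <Color>Blue</Color>
        <Name>Blueberries</Name>
    </ITEM>
</FOOD>
END_XML

print xml_to_csv($xml);
```

The sketch returns the CSV as a string so the caller decides where it goes (a file handle or STDOUT). It does no quoting, so values containing commas or quotes would need a dedicated CSV module such as Text::CSV.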
