
I've spent the last two nights searching here for a solution but I haven't been able to figure this out. I have multiple XML files and I'd like to loop through all of them for nodes that contain the same attributes and merge all of their nodes to an output file. Here's an example:

File1.xml

<game rotnum="241">
  <gamedate> 123 </gamedate>
  <team1> Giants </team1>
  <team2> Eagles </team2>
</game>

File2.xml

<game rotnum="241">
  <line> 4 </line>
  <points> 200 </points>
</game>

Merged.xml

<game rotnum="241">
  <gamedate> 123 </gamedate>
  <team1> Giants </team1>
  <team2> Eagles </team2>
  <line> 4 </line>
  <points> 200 </points>
</game>

More or less. I need to do this in PHP, and I'm thinking that using the DOM will be easier than XSLT (which I don't know much about).

There are a lot of nodes spread across multiple files, and I need to match up the data based on matching rotnum attributes.

I don't know if I necessarily need to do this, but it's the easiest way I can think of to get all of my data into one SimpleXML object that I can foreach through to generate a separate table for each game.
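To make the goal concrete, here is a rough sketch of that last step, assuming the merged data ends up under a single wrapper element. The `<games>` wrapper and the `gameTables()` helper are hypothetical names used only for illustration.

```php
<?php
// Sketch only: build one HTML table per <game> from a merged document.
// The <games> wrapper and gameTables() are hypothetical names.
function gameTables(SimpleXMLElement $games): array
{
    $tables = [];
    foreach ($games->game as $game) {
        $rows = '';
        // One table row per child element of the game.
        foreach ($game->children() as $field) {
            $rows .= sprintf(
                "<tr><th>%s</th><td>%s</td></tr>",
                htmlspecialchars($field->getName()),
                htmlspecialchars(trim((string) $field))
            );
        }
        // Key each table by the game's rotnum attribute.
        $tables[(string) $game['rotnum']] = "<table>$rows</table>";
    }
    return $tables;
}

$merged = new SimpleXMLElement(
    '<games><game rotnum="241"><gamedate> 123 </gamedate><team1> Giants </team1>'
    . '<team2> Eagles </team2><line> 4 </line><points> 200 </points></game></games>'
);

foreach (gameTables($merged) as $rotnum => $table) {
    echo "Game $rotnum: $table", PHP_EOL;
}
```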

These XML files come from several API feeds. I created the files with SimpleXML and cached them locally:

function stripAndSaveFile($xml) {
    $output = new SimpleXMLElement("<justbetlinesfeed></justbetlinesfeed>");
    // Loop over every <game> in the feed instead of assuming exactly 16.
    foreach ($xml->game as $game) {
        // Get the spread values and rotation number of each game.
        $spread = $game->line->spread->attributes();
        $rotnum = (string) $game->attributes()->team1rotnum;
        $insert = $output->addChild("game");
        $insert->addAttribute("rotnum", $rotnum);
        $insert->addChild("spreadpoints", (string) $spread->points);
        $insert->addChild("spreadteam1", (string) $spread->team1adj);
        $insert->addChild("spreadteam2", (string) $spread->team2adj);
    }
    // Cache the stripped-down feed locally.
    file_put_contents($this->filePath, $output->asXML());
}
}
  • How big are your XML files? If they are not very large, you can use SimpleXML. Have you written any code for this? If so, please show it. Commented Aug 2, 2012 at 6:12
  • How complex are XML structures in your files? Do they have a lot of nested elements? How do you merge elements with the same name? Commented Aug 2, 2012 at 6:14
  • This post may help you: stackoverflow.com/questions/4186016/merge-xml-in-php-dom Commented Aug 2, 2012 at 6:25
  • I did some simplexml to create the data I'm working with. Commented Aug 2, 2012 at 7:06
  • @galymzhan the XML structure won't be too complex since I'm essentially creating it myself. My plan is to only have one element with the same name across all files--<game>. The data from site #1 would be something like <site1line>2</site1line> etc. Commented Aug 4, 2012 at 5:39

2 Answers


In this sort of situation I prefer to use something like DOMDocument.

Reading the document:

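$dom = new DOMDocument();
$dom->load('filename.xml');
// Query the loaded document for every <game> element.
$games = $dom->getElementsByTagName("game");
foreach ($games as $game) {
    if ($game->hasChildNodes()) {
        foreach ($game->childNodes as $element) {
            // Skip whitespace text nodes between the child elements.
            if ($element->nodeType !== XML_ELEMENT_NODE) {
                continue;
            }
            // Read the element's text
            $value = $element->nodeValue;
            // Store to array or similar
        }
    }
}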

1 Comment

I agree, I think DOMDocument is probably a lot easier than setting up Saxon on Apache/PHP. Basically I need to be able to parse multiple XML files and reorder them by the one matching "rotnum" attribute.
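For what it's worth, a merge keyed on rotnum could be sketched with DOMDocument roughly as below. This is only an illustration under some assumptions: the `mergeGamesByRotnum()` helper and the `<games>` wrapper element are made-up names, and the inputs are passed as strings so the snippet runs on its own (real code would probably read each cached file instead).

```php
<?php
// Sketch only: merge <game> elements that share a rotnum attribute.
// mergeGamesByRotnum() and the <games> wrapper are hypothetical names.
function mergeGamesByRotnum(array $xmlStrings): DOMDocument
{
    $merged = new DOMDocument('1.0', 'UTF-8');
    $merged->formatOutput = true;
    $root = $merged->appendChild($merged->createElement('games'));
    $byRotnum = []; // rotnum => merged <game> element

    foreach ($xmlStrings as $xml) {
        $doc = new DOMDocument();
        $doc->preserveWhiteSpace = false;
        $doc->loadXML($xml);

        foreach ($doc->getElementsByTagName('game') as $game) {
            $rotnum = $game->getAttribute('rotnum');
            // First time this rotnum is seen: create its merged <game>.
            if (!isset($byRotnum[$rotnum])) {
                $target = $merged->createElement('game');
                $target->setAttribute('rotnum', $rotnum);
                $root->appendChild($target);
                $byRotnum[$rotnum] = $target;
            }
            foreach ($game->childNodes as $child) {
                if ($child->nodeType === XML_ELEMENT_NODE) {
                    // importNode() copies a node from another document.
                    $byRotnum[$rotnum]->appendChild($merged->importNode($child, true));
                }
            }
        }
    }
    return $merged;
}

$file1 = '<game rotnum="241"><gamedate> 123 </gamedate><team1> Giants </team1><team2> Eagles </team2></game>';
$file2 = '<game rotnum="241"><line> 4 </line><points> 200 </points></game>';

echo mergeGamesByRotnum([$file1, $file2])->saveXML();
```

Children are appended in the order the inputs are given, so duplicate element names across files would simply appear twice; deciding how to resolve those is left out of this sketch.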

This may not be what you're looking for, but maybe it will give you an idea. This would require an XSLT 2.0 processor such as Saxon.

It goes through all XML files in a collection (a local directory for testing) and groups them by the rotnum attribute of the root element, effectively merging all children of the matching root elements into one file. Each file is created in a separate directory and named after the rotnum value.

The input to the XSLT would be any XML. I used the stylesheet itself as the input.

Here's the example...

XML files in the "input_dir" directory:

file1.xml

<game rotnum="241">
  <gamedate> 123 </gamedate>
  <team1> Giants </team1>
  <team2> Eagles </team2>
</game>

file2.xml

<game rotnum="241">
  <line> 4 </line>
  <points> 200 </points>
</game>

XSLT 2.0 (tested with Saxon-HE 9.4)

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:variable name="vCollection" select="collection('file:///C:/some_absolute_uri/input_dir?input=*.xml')"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <xsl:for-each-group select="$vCollection/*" group-by="@rotnum">
            <xsl:result-document href="file:///C:/some_absolute_uri/output_dir/{@rotnum}.xml">
                <xsl:copy>
                    <xsl:apply-templates select="@*"/>
                    <xsl:apply-templates select="$vCollection/*[@rotnum=current-grouping-key()]/*"/>                    
                </xsl:copy>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>

XML files in the "output_dir" directory:

241.xml

<game rotnum="241">
   <gamedate> 123 </gamedate>
   <team1> Giants </team1>
   <team2> Eagles </team2>
   <line> 4 </line>
   <points> 200 </points>
</game>

4 Comments

Sorry for the delay. Thank you for going into such detail. How exactly are you executing that XSL with Saxon? From what I understand, I need another PHP script that actually processes everything, but I'm not sure how to go about that since there's going to be several .xml files in the directory. Your solution seems exactly what I'm looking for.
@CommanderKeen - I originally tested in oXygen XML Editor, but I also tried executing it from the command line. I don't really know PHP, but maybe you could execute it using shell_exec()? You should only need to execute the XSLT once no matter how many XML files are in the directory. The command line I used was java -cp saxonHE\saxon9he.jar net.sf.saxon.Transform -s:merge.xsl -xsl:merge.xsl.
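A rough sketch of what the shell_exec() idea might look like from PHP, reusing the command line above. The `buildSaxonCommand()` helper and the jar path are assumptions for illustration, and the actual call is left commented out because it needs Java and the Saxon jar on the machine.

```php
<?php
// Sketch only: build the Saxon command line and (optionally) run it from PHP.
// buildSaxonCommand() is a hypothetical helper; the jar path is an assumption.
function buildSaxonCommand(string $jarPath, string $stylesheet): string
{
    // The stylesheet doubles as the source document, as in the command above.
    return 'java -cp ' . escapeshellarg($jarPath)
        . ' net.sf.saxon.Transform'
        . ' ' . escapeshellarg('-s:' . $stylesheet)
        . ' ' . escapeshellarg('-xsl:' . $stylesheet)
        . ' 2>&1'; // send error output to the same stream
}

$command = buildSaxonCommand('saxonHE/saxon9he.jar', 'merge.xsl');
echo $command, PHP_EOL;

// Uncomment to run; requires Java and the Saxon jar on this machine.
// $output = shell_exec($command);
// echo $output;
```

Since the stylesheet writes its results via xsl:result-document, a single run like this should cover every file in the input directory.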
Also, Saxon loads all of the XML files into memory so if you have a lot of XML and have memory issues, I can show you how to use the Saxon extension function saxon:discard-document().
Ah okay! That makes a little more sense. I was looking to run this sort of merge over a certain time interval via a CRON job. I'll look into running this XSLT with PHP a little more.
