
Quick question: I need to transform a standard RSS structure into another XML format.

The RSS file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0">
        <channel>
            <title>Name des RSS Feed</title>
            <description>Feed Beschreibung</description>
            <language>de</language>
            <link>http://xml-rss.de</link>
            <lastBuildDate>Sat, 1 Jan 2000 00:00:00 GMT</lastBuildDate>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
            <item>
                <title>Titel der Nachricht</title>
                <description>Die Nachricht an sich</description>
                <link>http://xml-rss.de/link-zur-nachricht.htm</link>
                <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
                <guid>01012000-000000</guid>
            </item>
        </channel>
    </rss>

...and I want to extract only the item elements (with their children and attributes) as XML like:

<?xml version="1.0" encoding="ISO-8859-1"?>
<item>
    <title>Titel der Nachricht</title>
    <description>Die Nachricht an sich</description>
   <link>http://xml-rss.de/link-zur-nachricht.htm</link>
   <pubDate>Sat, 1. Jan 2000 00:00:00 GMT</pubDate>
   <guid>01012000-000000</guid>
</item>
...

It doesn't have to be stored in a file. I just need the output.

edit: One more thing you need to know: the RSS file can contain any number of items. This is just a sample, so the solution has to loop over them (while, for, foreach, ...).

I tried different approaches with DOMNode, SimpleXML, XPath, ... but without success.

Thanks chris

  • I have posted a reply below in case you hadn't noticed. It should explain everything :o) Commented Jun 16, 2010 at 21:06

3 Answers


A different approach would be to use an XSLT:

$xsl = <<<XSL
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
  <!-- xsl:copy-of must be empty; it already deep-copies each item -->
  <xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;

The above stylesheet has just one rule: deep-copy all <item> elements from the source XML into the result and ignore everything else. The nodes are copied into an <items> element, which serves as the root node. To process this, you'd do:

$xslDoc = new DOMDocument();           // create Doc for XSLT
$xslDoc->loadXML($xsl);                // load stylesheet into it
$xmlDoc = new DOMDocument();           // create Doc for RSS
$xmlDoc->loadXML($xml);                // load your XML/RSS into it
$proc = new XSLTProcessor();           // init XSLT engine
$proc->importStylesheet($xslDoc);      // load stylesheet into engine
echo $proc->transformToXML($xmlDoc);   // output transformed XML

Instead of echoing the result, you could write the return value to a file.
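
As a rough end-to-end sketch of the steps above, the following runs the transformation on a small made-up feed and writes the result to a file instead of echoing it. The sample feed and the target path (`items.xml` in the system temp directory) are assumptions for illustration only; it also assumes PHP's xsl extension is enabled.

```php
<?php
// Hypothetical sample feed; replace it with your own RSS string.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Feed</title>
    <item><title>First</title><guid>1</guid></item>
    <item><title>Second</title><guid>2</guid></item>
  </channel>
</rss>
XML;

// Stylesheet: wrap deep copies of all <item> elements in an <items> root.
$xsl = <<<XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<items>
  <xsl:copy-of select="//item"/>
</items>
</xsl:template>
</xsl:stylesheet>
XSL;

$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);               // load the stylesheet
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);               // load the RSS

$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);

$result = $proc->transformToXML($xmlDoc);   // a string on success

// Hypothetical target path in the system temp directory.
$target = sys_get_temp_dir() . '/items.xml';
if (is_string($result)) {
    file_put_contents($target, $result);    // write instead of echo
}
```

Checking the return value before writing avoids creating an empty file when the transformation fails.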



5 Comments

I will try it tomorrow and give you feedback. I hadn't thought about an XSLT approach. Thanks for this!
Hey Gordon, where do I include (or reference) my RSS file? I'm asking because in the PHP part, the fourth comment says "load your XML/RSS", but the var $xml is already used for the XSL above. XSL is pretty new to me, so I guess I'm overcomplicating it. edit: Okay, I am blind or still tired. I didn't see that there are two different vars ($xml and $xsl). Let's give it a try ;)
@Chris you can assign the $xml var the same way you assign $xsl, with HEREDOC syntax. Or use ->load('filename.xml').
Hm, are you sure you didn't forget something? I don't get any output.
Yes, it works. My fault, I had a silly typo in my code. Thanks again.

What you ask for is hardly a transformation. You are basically just extracting the <item> elements as they are. Also, the result you give is not well-formed XML, as it lacks a single root node.

Apart from that, you can simply do it like this:

$dom = new DOMDocument;           // init new DOMDocument
$dom->loadXML($xml);              // load some XML into it

$xpath = new DOMXPath($dom);      // create a new XPath
$nodes = $xpath->query('//item'); // Find all item elements
foreach($nodes as $node) {        // Iterate over found item elements
    echo $dom->saveXML($node);    // output the item node's outer XML
}

The above would echo the <item> nodes. You could instead buffer the output, concatenate it into a string, or collect it in an array and implode it, and then write it to a file.
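
A minimal sketch of the collect-and-implode variant, using a small made-up feed in place of your RSS (the sample data and the `$chunks` and `$output` variable names are assumptions for illustration):

```php
<?php
// Hypothetical sample feed; replace it with your own RSS string.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Feed</title>
    <item><title>First</title><guid>1</guid></item>
    <item><title>Second</title><guid>2</guid></item>
  </channel>
</rss>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$chunks = array();
foreach ($xpath->query('//item') as $node) {
    // Serialize one <item> with its children; no XML declaration is added.
    $chunks[] = $dom->saveXML($node);
}

$output = implode("\n", $chunks);   // one string containing all items
echo $output;
```

Because nothing is echoed inside the loop, `$output` can just as well be passed to file_put_contents() or processed further.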

If you want to do it properly with DOM (and a root node), the full code would be:

$dom = new DOMDocument;                          // init DOMDocument for RSS
$dom->preserveWhiteSpace = FALSE;                // drop whitespace-only text nodes on load
$dom->loadXML($xml);                             // load some XML into it

$items = new DOMDocument;                        // init DOMDocument for new file
$items->formatOutput = TRUE;                     // make output pretty
$items->loadXML('<items/>');                     // create root node

$xpath = new DOMXPath($dom);                     // create a new XPath
$nodes = $xpath->query('//item');                // Find all item elements
foreach($nodes as $node) {                       // iterate over found item nodes
    $copy = $items->importNode($node, TRUE);     // deep copy of item node
    $items->documentElement->appendChild($copy); // append item nodes
}
echo $items->saveXML();                          // outputs the new document

Instead of saveXML(), you'd use save('filename.xml') to write it to a file.

2 Comments

Thanks Gordon, it looks good, but I get an error message and couldn't find out what is wrong: "Warning: DOMDocument::loadXML() [domdocument.loadxml]: Start tag expected, '<' not found in Entity, line: 1 in /home/chris/http/dev/xmlfeed/index3.php on line 4"
@Chris I've used the RSS XML you gave for $xml. Remember, loadXML() loads from a string. If you want to load from a URL or file, use load() instead.

Try:

<?php
$xmlFile = new DOMDocument();            // instantiate a new DOMDocument
$xmlFile->load("URL TO RSS/XML FILE");   // load the XML/RSS file

$title = $description = $link = $pubDate = $guid = array();

// Loop over the <item> elements and read each item's own child nodes.
// Querying the whole document with item(0) would return the channel's
// values on every iteration.
foreach ($xmlFile->getElementsByTagName("item") as $item)
{
    $title[]       = $item->getElementsByTagName("title")->item(0)->nodeValue;
    $description[] = $item->getElementsByTagName("description")->item(0)->nodeValue;
    $link[]        = $item->getElementsByTagName("link")->item(0)->nodeValue;
    $pubDate[]     = $item->getElementsByTagName("pubDate")->item(0)->nodeValue;
    $guid[]        = $item->getElementsByTagName("guid")->item(0)->nodeValue;
}
?>

Untested, but the arrays

$title, $description, $link, $pubDate, $guid

should be populated with all of the data that you need!

EDIT: OK so another approach:

<?php
$xmlString = file_get_contents("URL TO RSS/XML FILE");
// preg_match_all() collects every match; index 1 holds the captured text.
// "#" is the delimiter because the patterns contain "/".
preg_match_all("#<title>(.*?)</title>#s", $xmlString, $m);
$titles = $m[1];
preg_match_all("#<description>(.*?)</description>#s", $xmlString, $m);
$descriptions = $m[1];
preg_match_all("#<link>(.*?)</link>#s", $xmlString, $m);
$links = $m[1];
preg_match_all("#<pubDate>(.*?)</pubDate>#s", $xmlString, $m);
$pubDates = $m[1];
preg_match_all("#<guid>(.*?)</guid>#s", $xmlString, $m);
$guids = $m[1];
?>

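In this example each variable is filled with an array of the matching values. Note that $titles, $descriptions and $links also include the channel-level elements, not only those inside <item>.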

3 Comments

It would be kind of you if you could extend your approach. Thanks.
Thank you cheif17, but this doesn't seem like a clean solution for this kind of problem. With your code, you have to pick out every single element and build the new XML document from the arrays.
OK, I have put an edit at the bottom with a totally different approach!
