After executing the code below (minus the database call), I am receiving an "error on line 331 at column 7: Extra content at the end of the document" error. I went through these forums but could not find a solution. I don't have any stray characters or any code that should be adding extra whitespace. Any ideas?

<?php 
header('Content-type: text/xml');
mysql_connect("localhost", "---", "---");
mysql_select_db("---");

$query = "SELECT title FROM table";
$result = mysql_query($query);

$xml = new XMLWriter();
$xml->openURI("php://output");
$xml->startDocument();
$xml->setIndent(true);
$xml->writeRaw('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">');
$xml->startElement('url');
while ($row = mysql_fetch_assoc($result)) {
    if(!empty($row)){
        $title = $row['title'];
        $xml->startElement('loc');
            $xml->writeRaw('http://domain.com/article/');
        $xml->endElement();
        $xml->startElement('news:news');
            $xml->startElement("news:publication");
                $xml->startElement("news:name");
                    $xml->writeRaw('Name');
                $xml->endElement();
                $xml->startElement("news:language");
                    $xml->writeRaw('en');
                $xml->endElement();
            $xml->endElement();
            $xml->startElement('news:title');
                $xml->writeRaw($title);
            $xml->endElement();
            $xml->endElement();
    }
}
$xml->endElement();
$xml->flush();

1 Answer

One key point in programming is to reduce the complexity of code. That includes reducing the indentation so that fewer blocks are nested inside each other; deeply nested code is often hard to follow.

For example, the if clause inside your while body can be flattened to a great extent by inverting its condition, which moves the inner code one level up:

while ($row = mysql_fetch_assoc($result)) {

    if (empty($row)) {
        continue;
    }

    $title = $row['title'];
    ...
}

The continue inside the loop just says: next iteration.

There is also indentation for the XML tags you create. Not all of it can be avoided, but some can. The XMLWriter::writeElement() method, for example, outputs a whole element including its inner text. This reduces the following three lines:

$xml->startElement('loc');
    $xml->writeRaw('http://domain.com/article/');
$xml->endElement();

To a single one:

$xml->writeElement('loc', 'http://domain.com/article/');
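
To check that both forms produce the same markup, here is a minimal sketch. It writes to memory with openMemory() instead of php://output so the two results can be compared, and it uses text() rather than writeRaw() in the long form; the variable names are only for illustration.

```php
<?php
// Long form: open the element, write its text, close it.
$long = new XMLWriter();
$long->openMemory();
$long->startElement('loc');
$long->text('http://domain.com/article/');
$long->endElement();
$longOutput = $long->outputMemory();

// Short form: a single call writes the whole element.
$short = new XMLWriter();
$short->openMemory();
$short->writeElement('loc', 'http://domain.com/article/');
$shortOutput = $short->outputMemory();

// Both variables hold <loc>http://domain.com/article/</loc>
var_dump($longOutput === $shortOutput);
```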

As there are multiple groups of such lines, the code is now considerably shorter. The end can be improved as well by ending the document with XMLWriter::endDocument(), which also closes any elements that are still open; then there is not even a need to flush. To make the nesting more readable, you can also use curly braces to express the indentation:

while ($row = mysql_fetch_assoc($result)) {

    if (empty($row)) {
        continue;
    }

    $title = $row['title'];

    $xml->writeElement('loc', 'http://domain.com/article/');

    $xml->startElement('news:news');
    {
        $xml->startElement("news:publication");
        {
            $xml->writeElement("news:name", 'Name');
            $xml->writeElement("news:language", 'en');
        }
        $xml->endElement();

        $xml->writeElement('news:title', $title);
    }
    $xml->endElement();
}

$xml->endDocument();
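
For reference, here is how the whole script might look once the root element is also written through the writer instead of writeRaw(). This is only a sketch under a few assumptions that are not in the question: the helper name buildNewsSitemap() is made up, a plain array of titles stands in for the database rows, the output goes to memory rather than php://output, and each article gets its own url element.

```php
<?php
// Hypothetical helper: builds the sitemap for a list of titles.
// The array stands in for the rows fetched from the database.
function buildNewsSitemap(array $titles): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');

    // Root element written through the writer, so endDocument() can close it.
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    $xml->writeAttribute('xmlns:news', 'http://www.google.com/schemas/sitemap-news/0.9');

    foreach ($titles as $title) {
        // Assumption: one <url> entry per article.
        $xml->startElement('url');
        {
            $xml->writeElement('loc', 'http://domain.com/article/');

            $xml->startElement('news:news');
            {
                $xml->startElement('news:publication');
                {
                    $xml->writeElement('news:name', 'Name');
                    $xml->writeElement('news:language', 'en');
                }
                $xml->endElement();

                $xml->writeElement('news:title', $title);
            }
            $xml->endElement();
        }
        $xml->endElement();
    }

    // Closes <urlset> and anything else still open.
    $xml->endDocument();

    return $xml->outputMemory();
}

$sitemap = buildNewsSitemap(['First article', 'hackers <3 noodles']);
echo $sitemap;
```

Returning a string keeps the sketch easy to inspect; with openURI("php://output") as in the question, the same calls would stream the document directly.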

So this is not only more readable; the good news is that the case where you ran into the error is fixed, too. That is because the calls to XMLWriter::writeRaw() have been removed from the code. What that method does is write raw text, that is, unescaped:

$title = 'hackers <3 noodles';

$xml->startElement('news:title');
    $xml->writeRaw($title);
$xml->endElement();

Output:

<news:title>hackers <3 noodles</news:title>
                    ^

As the output demonstrates, the < character went into the output verbatim. Depending on the title, even raw XML markup can be injected, destroying the whole document structure and leading to the error. XMLWriter::writeElement() is immune to that:

$title = 'hackers <3 noodles';

$xml->writeElement('news:title', $title);

Output:

<news:title>hackers &lt;3 noodles</news:title>
                    ^^^^

As the output demonstrates, the proper XML entity is used here, preserving the document structure.
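
The difference can also be checked with a parser. The following is a small sketch that assumes the DOM extension is available; it uses a plain title element rather than news:title so that no namespace declaration is needed, and the variable names are only illustrative.

```php
<?php
// Collect parser errors instead of emitting warnings.
libxml_use_internal_errors(true);

$title = 'hackers <3 noodles';

// Unescaped: the "<" ends up in the markup as-is.
$raw = new XMLWriter();
$raw->openMemory();
$raw->startElement('title');
$raw->writeRaw($title);
$raw->endElement();
$rawXml = $raw->outputMemory();

// Escaped: writeElement() encodes the "<" as &lt;.
$safe = new XMLWriter();
$safe->openMemory();
$safe->writeElement('title', $title);
$safeXml = $safe->outputMemory();

$doc = new DOMDocument();
$rawParses = $doc->loadXML($rawXml);   // false: the markup is not well-formed
$safeParses = $doc->loadXML($safeXml); // true

var_dump($rawParses, $safeParses);
```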

So the method you were originally looking for is XMLWriter::text(). But you don't need it for this case any longer, because the optimized code does not have that problem. All text output is properly encoded through XMLWriter::writeElement(). See also "Retain XML code when using PHP XMLWriter::writeElement", which covers the same topic from the opposite direction.
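
XMLWriter::text() remains useful where writeElement() does not fit, for example when the element also needs an attribute. A minimal sketch; the element and attribute names (note, lang) are made up for illustration and are not part of the sitemap format.

```php
<?php
$xml = new XMLWriter();
$xml->openMemory();

$xml->startElement('note');
$xml->writeAttribute('lang', 'en');
// text() escapes special characters, just like writeElement() does.
$xml->text('hackers <3 noodles');
$xml->endElement();

$noteXml = $xml->outputMemory();
echo $noteXml; // <note lang="en">hackers &lt;3 noodles</note>
```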

I hope this is still of use to you, as the question is a little older.
