
I need a template engine for a project that parses SGML content and converts user-defined tags like <ext:grid />. In nearly every case the input is valid. A common problem is the link generator: it produces & instead of &amp;, which is the main cause of my parsing problems. I can't change this behavior, because the output of that generator is used in many other places where the links are required to contain & rather than &amp;.

I have tried DOMDocument, SimpleXML and xml_parser. They all abort on entity problems. Any ideas? All I want is for the parser to simply ignore this "problem".

Here is a test template:

<template xmlns:grid="templates/grid" xmlns:std="templates/std">
    <std:header text="Overview" type="h1" />
    <grid:base width="100%">
        <grid:columns />
        <grid:body>
            <?php foreach($products as $product): /* @var $product Dfm_Shop_Model_Product */ ?>
            <grid:row selectable="1">
                <grid:cell>
                    <div><?php echo $this->esc($product->getTitle()) ?></div>
                </grid:cell>
                <grid:cell>
                    <a href="<?php env()->http()->to(array('controller' => 'Dfm_Shop_Controller_Products', 'method' => 'showEdit')) ?>"><std:img src="icons/pencil.png" hint="Edit" /></a>
                </grid:cell>
            </grid:row>
            <?php endforeach ?>
        </grid:body>
    </grid:base>
</template>

2 Answers

Can you just search-replace your & to &amp; before trying to parse the document?
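
One way to do such a search-replace without double-escaping existing entities is a regular expression with a negative lookahead. This is only a minimal sketch: the function name `escape_bare_ampersands` and the sample input are made up for illustration, and a regex may be slower than plain string replacement on large templates.

```php
<?php
// Hypothetical helper: escape only ampersands that do not already start an entity.
function escape_bare_ampersands(string $content): string
{
    // Negative lookahead: leave "&name;", "&#123;" and "&#x1F;" references alone.
    $pattern = '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/';

    // preg_replace() returns null on error; fall back to the untouched input.
    return preg_replace($pattern, '&amp;', $content) ?? $content;
}

// Made-up sample: one raw "&", one "&amp;" and two "&apos;" entities.
$input = '<a href="index.php?a=1&b=2&amp;c=3">Tom &apos;n&apos; Jerry</a>';
echo escape_bare_ampersands($input), PHP_EOL;
```

Text that merely looks like an entity (an ampersand followed by a name and a semicolon) is left untouched by this pattern, so it is a heuristic rather than a guarantee.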


Edit: Just for completeness, there is also QueryPath, for example, which can handle invalid tags too.

According to the thread linked above, the libxml functions should also have worked.


4 Comments

preg_replace would cost too much performance. A simple str_replace would turn &amp; into &amp;amp;. I should try strtr for this...
You can first str_replace &amp; with &, and then replace back :) That would of course fail with any other proper XML entities, like &apos; ...
$content = strtr($content, array('&amp;' => '&amp;', '&' => '&amp;')); does the trick
The problem is actually that your XML is not XML at all, but something close to it. The content is virtually correct XML. Now DOMDocument can parse the document without any problems.
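
The strtr approach from the comments can be sketched as follows. The sample markup is made up for illustration; the point is that strtr with an array tries the longest key first and does not rescan text it has already replaced, which is why the identity mapping for &amp; protects existing entities.

```php
<?php
// Made-up sample with one raw and one already-escaped ampersand.
$content = '<root><a href="index.php?a=1&b=2&amp;c=3">Edit</a></root>';

// Longest key wins and replaced text is never rescanned, so "&amp;"
// maps to itself while a bare "&" becomes "&amp;".
$fixed = strtr($content, array('&amp;' => '&amp;', '&' => '&amp;'));

$doc = new DOMDocument();
$ok = $doc->loadXML($fixed);

// The parser decodes the entities again, so the attribute holds plain "&".
$href = $doc->getElementsByTagName('a')->item(0)->getAttribute('href');
echo $href, PHP_EOL;
```

Other entities such as &apos; or &lt; would still be double-escaped by this map; adding identity entries for them (for example '&lt;' => '&lt;') should cover that case.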

If you can, try to use cdata with xml.

<![CDATA[your content]]>

http://www.w3schools.com/xml/xml_cdata.asp
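
As a rough illustration of the CDATA suggestion, the sketch below parses a made-up fragment in which the raw link sits in element content. Note that CDATA sections are only allowed in element content, not inside attribute values, so this would not help for a link placed directly in an href attribute as in the test template.

```php
<?php
// Made-up fragment: the raw link is element content wrapped in CDATA.
$xml = '<link><![CDATA[index.php?a=1&b=2]]></link>';

$doc = new DOMDocument();
$parsed = $doc->loadXML($xml);

// CDATA content is taken literally, so the bare "&" does not break parsing.
$text = $doc->documentElement->textContent;
echo $text, PHP_EOL;
```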

2 Comments

That would require me to add CDATA everywhere I use my link generator. That would be the same as wrapping every link-generator call in htmlentities. Too easy to forget :/
But it can solve your problem permanently. Building a good system takes some extra effort.
