3

I'm trying to extract information from an ncx file via SimpleXML. The file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<head>
    <meta name="dtb:uid" content="http://www.hxa7241.org/articles/content/epup-guide_hxa7241_2007_1.epub"/>
</head>
<docTitle>
    <text>Der Weg der Könige</text>
</docTitle>
<navMap>
    <navPoint id="toc1" playOrder="1">
        <navLabel>
            <text>Widmung</text>
        </navLabel>
        <content src="e9783641059446_ded01.html"/>
    </navPoint>
    <navPoint id="toc2" playOrder="2">
        <navLabel>
            <text>Inhaltsverzeichnis</text>
        </navLabel>
        <content src="e9783641059446_toc01.html"/>
    </navPoint>
    <navPoint id="toc3" playOrder="3">
        <navLabel>
            <text>PRÄLUDIUM</text>
        </navLabel>
        <content src="e9783641059446_fm02.html"/>
    </navPoint>
    <navPoint id="toc4" playOrder="4">
        <navLabel>
            <text>4500 JAHRE SPÄTER</text>
        </navLabel>
        <content src="e9783641059446_fm03.html"/>
    </navPoint>
    <navPoint id="toc5" playOrder="5">
        <navLabel>
            <text>PROLOG - TÖTEN</text>
        </navLabel>
        <content src="e9783641059446_fm04.html"/>
    </navPoint>
    <navPoint id="toc6" playOrder="6">
        <navLabel>
            <text>ERSTER TEIL - Über dem Schweigen</text>
        </navLabel>
        <content src="e9783641059446_p01.html"/>
        <navPoint id="toc7" playOrder="7">
            <navLabel>
                <text>1 - STURMGESEGNET</text>
            </navLabel>
            <content src="e9783641059446_c01.html"/>
        </navPoint>
        <navPoint id="toc8" playOrder="8">
            <navLabel>
                <text>2 - DIE EHRE IST TOT</text>
            </navLabel>
            <content src="e9783641059446_c02.html"/>
        </navPoint>
        <navPoint id="toc9" playOrder="9">
            <navLabel>
                <text>3 - DIE STADT DER GLOCKEN</text>
            </navLabel>
            <content src="e9783641059446_c03.html"/>
        </navPoint>
        <navPoint id="toc10" playOrder="10">
            <navLabel>
                <text>4 - DIE ZERBROCHENE. EBENE</text>
            </navLabel>
            <content src="e9783641059446_c04.html"/>
        </navPoint>
        <navPoint id="toc11" playOrder="11">
            <navLabel>
                <text>5 - HÄRETISCH</text>
            </navLabel>
            <content src="e9783641059446_c05.html"/>
        </navPoint>
        <navPoint id="toc12" playOrder="12">
            <navLabel>
                <text>6 - BRÜCKE VIER</text>
            </navLabel>
            <content src="e9783641059446_c06.html"/>
        </navPoint>
        <navPoint id="toc13" playOrder="13">
            <navLabel>
                <text>7 - ALLES, WAS VERNÜNFTIG IST</text>
            </navLabel>
            <content src="e9783641059446_c07.html"/>
        </navPoint>
        <navPoint id="toc14" playOrder="14">
            <navLabel>
                <text>8 - NÄHER ZUR FLAMME</text>
            </navLabel>
            <content src="e9783641059446_c08.html"/>
        </navPoint>
        <navPoint id="toc15" playOrder="15">
            <navLabel>
                <text>9 - VERDAMMNIS</text>
            </navLabel>
            <content src="e9783641059446_c09.html"/>
        </navPoint>
        <navPoint id="toc16" playOrder="16">
            <navLabel>
                <text>10 - GESCHICHTEN ÜBER CHIRURGEN</text>
            </navLabel>
            <content src="e9783641059446_c10.html"/>
        </navPoint>
        <navPoint id="toc17" playOrder="17">
            <navLabel>
                <text>11 - TROPFEN</text>
            </navLabel>
            <content src="e9783641059446_c11.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc18" playOrder="18">
        <navLabel>
            <text>ZWISCHENSPIELE</text>
        </navLabel>
        <content src="e9783641059446_p02.html"/>
        <navPoint id="toc19" playOrder="19">
            <navLabel>
                <text>Z-1 - ISCHIKK</text>
            </navLabel>
            <content src="e9783641059446_c12.html"/>
        </navPoint>
        <navPoint id="toc20" playOrder="20">
            <navLabel>
                <text>Z-2 - NAN BALAT</text>
            </navLabel>
            <content src="e9783641059446_c13.html"/>
        </navPoint>
        <navPoint id="toc21" playOrder="21">
            <navLabel>
                <text>Z-3 - DER SEGEN DER UNWISSENHEIT</text>
            </navLabel>
            <content src="e9783641059446_c14.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc22" playOrder="22">
        <navLabel>
            <text>ZWEITER TEIL - Die leuchtenden Stürme</text>
        </navLabel>
        <content src="e9783641059446_p03.html"/>
        <navPoint id="toc23" playOrder="23">
            <navLabel>
                <text>12 - EINHEIT</text>
            </navLabel>
            <content src="e9783641059446_c15.html"/>
        </navPoint>
        <navPoint id="toc24" playOrder="24">
            <navLabel>
                <text>13 - ZEHN HERZSCHLÄGE</text>
            </navLabel>
            <content src="e9783641059446_c16.html"/>
        </navPoint>
        <navPoint id="toc25" playOrder="25">
            <navLabel>
                <text>14 - ZAHLTAG</text>
            </navLabel>
            <content src="e9783641059446_c17.html"/>
        </navPoint>
        <navPoint id="toc26" playOrder="26">
            <navLabel>
                <text>15 - DER KÖDER</text>
            </navLabel>
            <content src="e9783641059446_c18.html"/>
        </navPoint>
        <navPoint id="toc27" playOrder="27">
            <navLabel>
                <text>16 - KOKONS</text>
            </navLabel>
            <content src="e9783641059446_c19.html"/>
        </navPoint>
        <navPoint id="toc28" playOrder="28">
            <navLabel>
                <text>17 - EIN BLUTROTER SONNENUNTERGANG</text>
            </navLabel>
            <content src="e9783641059446_c20.html"/>
        </navPoint>
        <navPoint id="toc29" playOrder="29">
            <navLabel>
                <text>18 - DER GROSSPRINZ DES KRIEGES</text>
            </navLabel>
            <content src="e9783641059446_c21.html"/>
        </navPoint>
        <navPoint id="toc30" playOrder="30">
            <navLabel>
                <text>19 - DER STURZ DER STERNE</text>
            </navLabel>
            <content src="e9783641059446_c22.html"/>
        </navPoint>
        <navPoint id="toc31" playOrder="31">
            <navLabel>
                <text>20 - SCHARLACHROT</text>
            </navLabel>
            <content src="e9783641059446_c23.html"/>
        </navPoint>
        <navPoint id="toc32" playOrder="32">
            <navLabel>
                <text>21 - WARUM MENSCHEN LÜGEN</text>
            </navLabel>
            <content src="e9783641059446_c24.html"/>
        </navPoint>
        <navPoint id="toc33" playOrder="33">
            <navLabel>
                <text>22 - AUGEN, HÄNDE ODER KUGELN?</text>
            </navLabel>
            <content src="e9783641059446_c25.html"/>
        </navPoint>
        <navPoint id="toc34" playOrder="34">
            <navLabel>
                <text>23 - VIELSEITIG</text>
            </navLabel>
            <content src="e9783641059446_c26.html"/>
        </navPoint>
        <navPoint id="toc35" playOrder="35">
            <navLabel>
                <text>24 - DIE GALERIE DER LANDKARTEN</text>
            </navLabel>
            <content src="e9783641059446_c27.html"/>
        </navPoint>
        <navPoint id="toc36" playOrder="36">
            <navLabel>
                <text>25 - DER SCHLÄCHTER</text>
            </navLabel>
            <content src="e9783641059446_c28.html"/>
        </navPoint>
        <navPoint id="toc37" playOrder="37">
            <navLabel>
                <text>26 - STILLE</text>
            </navLabel>
            <content src="e9783641059446_c29.html"/>
        </navPoint>
        <navPoint id="toc38" playOrder="38">
            <navLabel>
                <text>27 - KLUFTDIENST</text>
            </navLabel>
            <content src="e9783641059446_c30.html"/>
        </navPoint>
        <navPoint id="toc39" playOrder="39">
            <navLabel>
                <text>28 - ENTSCHEIDUNG</text>
            </navLabel>
            <content src="e9783641059446_c31.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc40" playOrder="40">
        <navLabel>
            <text>ZWISCHENSPIELE</text>
        </navLabel>
        <content src="e9783641059446_p04.html"/>
        <navPoint id="toc41" playOrder="41">
            <navLabel>
                <text>Z-4 - RYSN</text>
            </navLabel>
            <content src="e9783641059446_c32.html"/>
        </navPoint>
        <navPoint id="toc42" playOrder="42">
            <navLabel>
                <text>Z-5 - DER SAMMLER AXIES</text>
            </navLabel>
            <content src="e9783641059446_c33.html"/>
        </navPoint>
        <navPoint id="toc43" playOrder="43">
            <navLabel>
                <text>Z-6 - EIN KUNSTWERK</text>
            </navLabel>
            <content src="e9783641059446_c34.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc44" playOrder="44">
        <navLabel>
            <text>DRITTER TEIL - Sterben</text>
        </navLabel>
        <content src="e9783641059446_p05.html"/>
        <navPoint id="toc45" playOrder="45">
            <navLabel>
                <text>29 - IRRMASSUNG</text>
            </navLabel>
            <content src="e9783641059446_c35.html"/>
        </navPoint>
        <navPoint id="toc46" playOrder="46">
            <navLabel>
                <text>30 - UNSICHTBARE FINSTERNIS</text>
            </navLabel>
            <content src="e9783641059446_c36.html"/>
        </navPoint>
        <navPoint id="toc47" playOrder="47">
            <navLabel>
                <text>31 - UNTER DER HAUT</text>
            </navLabel>
            <content src="e9783641059446_c37.html"/>
        </navPoint>
        <navPoint id="toc48" playOrder="48">
            <navLabel>
                <text>32 - SEITENTRAGEN</text>
            </navLabel>
            <content src="e9783641059446_c38.html"/>
        </navPoint>
        <navPoint id="toc49" playOrder="49">
            <navLabel>
                <text>33 - CYMATIK</text>
            </navLabel>
            <content src="e9783641059446_c39.html"/>
        </navPoint>
        <navPoint id="toc50" playOrder="50">
            <navLabel>
                <text>34 - STURMWAND</text>
            </navLabel>
            <content src="e9783641059446_c40.html"/>
        </navPoint>
        <navPoint id="toc51" playOrder="51">
            <navLabel>
                <text>35 - EIN LICHT ZU SEHEN</text>
            </navLabel>
            <content src="e9783641059446_c41.html"/>
        </navPoint>
        <navPoint id="toc52" playOrder="52">
            <navLabel>
                <text>36 - DIE LEKTION</text>
            </navLabel>
            <content src="e9783641059446_c42.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc53" playOrder="53">
        <navLabel>
            <text>SCHLUSSBEMERKUNG</text>
        </navLabel>
        <content src="e9783641059446_bm01.html"/>
    </navPoint>
    <navPoint id="toc54" playOrder="54">
        <navLabel>
            <text>ARS ARCANUM</text>
        </navLabel>
        <content src="e9783641059446_bm02.html"/>
    </navPoint>
    <navPoint id="toc55" playOrder="55">
        <navLabel>
            <text>DANKSAGUNG</text>
        </navLabel>
        <content src="e9783641059446_ack01.html"/>
    </navPoint>
    <navPoint id="toc56" playOrder="56">
        <navLabel>
            <text>Die Sturmlicht-Chroniken werden fortgesetzt in:</text>
        </navLabel>
        <content src="e9783641059446_tea01.html"/>
    </navPoint>
    <navPoint id="toc57" playOrder="57">
        <navLabel>
            <text>Copyright</text>
        </navLabel>
        <content src="e9783641059446_cop01.html"/>
    </navPoint>
</navMap>
</ncx>

I want to grab the names of all the HTML files, which are stored in the src attribute of elements like this one:

<content src="e9783641059446_cop01.html"/>

Currently I'm trying it this way:

$ncx = simplexml_load_file($file);
$items = $ncx->navMap->children();
foreach ($items as $it) {
    echo $it->content['src'];
}

The problem is that, as you might have noticed, the content nodes are not all at the same depth. Does anyone know how to handle that?

8
  • I can't quite understand your question. Do you want to get all the tags in the e9783641059446_cop01.html file? Commented Oct 17, 2016 at 10:39
  • I want to get all <content /> tags so I can read the src="" attribute. The tags I'm looking for always look like this: <content src="{some HTML-file}"/> Commented Oct 17, 2016 at 10:41
  • Why don't you use the DOMDocument class? Commented Oct 17, 2016 at 10:42
  • If you want all src attributes of the content elements, regardless of position in the doc, just use $ncx->xpath("//content/@src"). Commented Oct 17, 2016 at 10:43
  • Because I don't know how to, and SimpleXML was the first thing I found when I looked up how to search XML-structured files. I also need to process the result with PHP. Commented Oct 17, 2016 at 10:44

2 Answers

1

The XML declares a default namespace, so an XPath query has to use a prefix registered for it. Try:

$ncx = simplexml_load_file('test.xml');
$ncx->registerXPathNamespace('x', 'http://www.daisy.org/z3986/2005/ncx/');
foreach ($ncx->xpath('//x:content/@src') as $src) {
    echo $src, PHP_EOL;
}
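To illustrate why the prefix matters, here is a minimal sketch, assuming a trimmed-down inline copy of the NCX (the $xml string and its a.html/b.html values are made up for illustration and are not from the original file). In XPath 1.0 an unprefixed name only matches elements in no namespace, so the first query should come back empty, while the prefixed one reaches nested content nodes as well:

```php
<?php
// Hypothetical, shortened NCX document used only for this illustration.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<navMap>
    <navPoint id="toc1" playOrder="1">
        <content src="a.html"/>
        <navPoint id="toc2" playOrder="2">
            <content src="b.html"/>
        </navPoint>
    </navPoint>
</navMap>
</ncx>
XML;

$ncx = simplexml_load_string($xml);

// Unprefixed name: looks for <content> in no namespace, which this file does not have.
// Normalise a possible false/null result to an empty array.
$plain = $ncx->xpath('//content/@src') ?: [];

// Bind a prefix (the name "x" is arbitrary) to the NCX namespace URI.
$ncx->registerXPathNamespace('x', 'http://www.daisy.org/z3986/2005/ncx/');

// Cast each attribute node to a plain string so the result is easy to process.
$srcs = array_map('strval', $ncx->xpath('//x:content/@src'));

var_dump(count($plain));
print_r($srcs);
```

Collecting the values into an array of strings, rather than echoing them, is just one way to keep them available for further processing.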

Without XPath:

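$ncx = simplexml_load_file('test.xml', 'SimpleXMLElement', 0, 'http://www.daisy.org/z3986/2005/ncx/', false);
// Recurse so that navPoint elements nested inside other navPoints are visited too.
function printSrc(SimpleXMLElement $parent) {
    foreach ($parent->navPoint as $np) {
        echo $np->content->attributes()->src, PHP_EOL;
        printSrc($np);
    }
}
printSrc($ncx->navMap);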

0

the content nodes are not in the same depth level

You can also use the DOMDocument class to find the target tags easily. Note that getElementsByTagName() returns every element with the given tag name, regardless of depth, and getAttribute() returns the value of an attribute.

// load() is an instance method, so create the document first.
$dom = new DOMDocument();
$dom->load($file);
$contents = $dom->getElementsByTagName("content");
foreach ($contents as $content) {
    echo $content->getAttribute("src"), PHP_EOL;
}
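If you would rather match on the namespace explicitly, a variant using getElementsByTagNameNS() might look like the sketch below. It assumes a shortened inline NCX (the $xml string and its a.html/b.html values are hypothetical) and gathers the values into an array instead of echoing them:

```php
<?php
// Hypothetical, shortened NCX document used only for this illustration.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<navMap>
    <navPoint id="toc1" playOrder="1">
        <content src="a.html"/>
        <navPoint id="toc2" playOrder="2">
            <content src="b.html"/>
        </navPoint>
    </navPoint>
</navMap>
</ncx>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Match <content> elements by namespace URI and local name, at any depth.
$nodes = $dom->getElementsByTagNameNS('http://www.daisy.org/z3986/2005/ncx/', 'content');

$srcs = [];
foreach ($nodes as $content) {
    $srcs[] = $content->getAttribute('src');
}

print_r($srcs);
```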
