So I have XML in $xml. It looks like this:

http://localhost:8888/?purp=oclcn&xml=<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<record xmlns="http://www.loc.gov/MARC21/slim">
    <leader>00000cam a2200000 a 4500</leader>
    <controlfield tag="001">33333502</controlfield>
    <controlfield tag="008">951010s1996    vtua     b    001 0 eng  </controlfield>
    <datafield ind1=" " ind2=" " tag="010">
      <subfield code="a">   95045582 </subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">1858983274</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">9781858983271</subfield>
    </datafield>
    <datafield ind1="0" ind2="0" tag="245">
      <subfield code="a">Economic sociology /</subfield>
      <subfield code="c">edited by Richard Swedberg.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="260">
      <subfield code="a">Cheltenham, Glos, UK ;</subfield>
      <subfield code="a">Brookfield, Vt., US :</subfield>
      <subfield code="b">E. Elgar Pub. Co.,</subfield>
      <subfield code="c">©1996.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="300">
      <subfield code="a">xv, 574 pages :</subfield>
      <subfield code="b">illustrations ;</subfield>
      <subfield code="c">25 cm.</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">The international library of critical writings in sociology ;</subfield>
      <subfield code="v">5</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">An Elgar reference collection</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">A collection of journal articles previously published between 1940-1994.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Sociology.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique</subfield>
      <subfield code="x">Aspect sociologique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Sociologie.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902116</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902213</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociology.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst01123875</subfield>
    </datafield>
    <datafield ind1="1" ind2="7" tag="650">
      <subfield code="a">Economische sociologie.</subfield>
      <subfield code="2">gtt</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociologie économique.</subfield>
      <subfield code="2">ram</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Économie politique</subfield>
      <subfield code="x">Sociologie.</subfield>
      <subfield code="2">ram</subfield>
    </datafield>
    <datafield ind1="0" ind2="7" tag="650">
      <subfield code="a">Wirtschaftssoziologie.</subfield>
      <subfield code="2">swd</subfield>
    </datafield>
    <datafield ind1=" " ind2="4" tag="650">
      <subfield code="a">Sociologie.</subfield>
    </datafield>
    <datafield ind1=" " ind2="4" tag="650">
      <subfield code="a">Économie politique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="4" tag="650">
      <subfield code="a">Économie politique - Aspect sociologique.</subfield>
    </datafield>
    <datafield ind1="0" ind2="7" tag="650">
      <subfield code="a">Wirtschaftssoziologie.</subfield>
      <subfield code="0">(DE-588)4066514-8</subfield>
      <subfield code="2">gnd</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="700">
      <subfield code="a">Swedberg, Richard.</subfield>
    </datafield>
  </record>

I am trying to get the value of the "tag" attribute of every controlfield and datafield element. However, the foreach loop is not working: it echoes only 008hello and then stops. Why?

$dataf = $xml->getElementsByTagName("datafield");
$controlf = $xml->getElementsByTagName("controlfield");


        $count = $dataf->length + $controlf->length;

I put the contents of each DOMNodeList into an array so I can merge them together:

$DOMarray = array();

$i = 1;
while ($i <= $controlf->length) {
    $p = $controlf->item($i);
    $DOMarray[] = $p;
    $i++;
}

$i = 1;
while ($i <= $dataf->length) {
    $p = $dataf->item($i);
    $DOMarray[] = $p;
    $i++;
}

Now I wish to get the value of attribute tag of each element:

echo get_class($DOMarray[$number]);
echo sizeof($DOMarray);
foreach($DOMarray as $DOMe) {
    echo $DOMe->getAttribute("tag");
    echo "hello";
}
// echo $DOMarray[$number]->getAttribute("tag");
}
}

Please format your question. It is unreadable. Commented Jan 7, 2017 at 9:44

2 Answers

The problem is in the set up of your loops.

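Your data has two elements in the $controlf list, but because your counter starts at 1 you skip the first one (index 0). That is why you get 008 and not the first value, 001. The <= condition also lets the loop run one step too far: item($controlf->length) is past the end of the list and returns null, so a null ends up in $DOMarray. When the foreach reaches it, calling getAttribute() on null raises a fatal error, which is why the output stops after 008hello.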

$i = 1;
while ($i <= $controlf->length) {
    $p = $controlf->item($i);
    $DOMarray[] = $p;
    $i++;
}

To fix this, start your counter at 0 and use < rather than <=:

$i = 0;
while ($i < $controlf->length) {
    $p = $controlf->item($i);
    $DOMarray[] = $p;
    $i++;
}

In general, loops over a list use < because indices are zero-based while the length property is the actual number of elements. The length is therefore always 1 higher than the highest index.

Also, you may find using foreach a little cleaner in this case. The following is equivalent to the code above:

foreach ($controlf as $p){
    $DOMarray[] = $p;
}
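
To see the index behaviour in isolation, here is a minimal sketch. It assumes a cut-down stand-in for the question's XML with only the two controlfield elements; the variable names $first, $second and $pastEnd are made up for the illustration.

```php
<?php
// Minimal stand-in for the question's document: two controlfield elements.
$document = new DOMDocument();
$document->loadXML(
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    . '<controlfield tag="001">33333502</controlfield>'
    . '<controlfield tag="008">951010s1996</controlfield>'
    . '</record>'
);
$controlf = $document->getElementsByTagName('controlfield');

// Valid indices run from 0 to length - 1.
$first = $controlf->item(0)->getAttribute('tag');
$second = $controlf->item(1)->getAttribute('tag');

// item() returns null for an index past the end of the list. This is the
// value the original loop stored once $i reached $controlf->length.
$pastEnd = $controlf->item($controlf->length);

var_dump($first, $second, $pastEnd);
```

Calling getAttribute() on $pastEnd would stop the script with a fatal error, which matches the truncated output in the question.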

Your XML uses a namespace, so you should use the namespace-aware methods, too. That means getElementsByTagNameNS(). I suggest defining the namespaces in an associative array. There is no need to convert the DOMNodeList objects into arrays with a loop: DOMNodeList is traversable, and iterator_to_array() does exactly that job. But there is no need to build an array of nodes at all; you can iterate over each DOMNodeList directly and read the tag attribute.

$xmlns = [
  'slim' => 'http://www.loc.gov/MARC21/slim'
];
$document = new DOMDocument();
$document->loadXML($xml);

$result = [];
$nodes = $document->getElementsByTagNameNS($xmlns['slim'], 'controlfield');
foreach ($nodes as $node) {
  $result[] = $node->getAttribute('tag');
}
$nodes = $document->getElementsByTagNameNS($xmlns['slim'], 'datafield');
foreach ($nodes as $node) {
  $result[] = $node->getAttribute('tag');
}

var_dump($result);

Output:

array(29) {
  [0]=>
  string(3) "001"
  [1]=>
  string(3) "008"
  [2]=>
  string(3) "010"
  [3]=>
  string(3) "020"
  ...
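
If you do want a single merged array of nodes, as in the question, the iterator_to_array() route mentioned above could look roughly like this. It is a sketch only: the XML is a shortened stand-in for the real record, and $ns, $nodes and $tags are names chosen for the example.

```php
<?php
// Shortened stand-in for the MARC record in the question.
$xml = '<record xmlns="http://www.loc.gov/MARC21/slim">'
    . '<controlfield tag="001">33333502</controlfield>'
    . '<datafield ind1=" " ind2=" " tag="020"><subfield code="a">1858983274</subfield></datafield>'
    . '<datafield ind1="0" ind2="0" tag="245"><subfield code="a">Economic sociology /</subfield></datafield>'
    . '</record>';
$ns = 'http://www.loc.gov/MARC21/slim';

$document = new DOMDocument();
$document->loadXML($xml);

// Convert each DOMNodeList to a plain array, then merge the two arrays.
$nodes = array_merge(
    iterator_to_array($document->getElementsByTagNameNS($ns, 'controlfield')),
    iterator_to_array($document->getElementsByTagNameNS($ns, 'datafield'))
);

// Every entry is a real element node, so getAttribute() is safe to call.
$tags = [];
foreach ($nodes as $node) {
    $tags[] = $node->getAttribute('tag');
}

var_dump($tags);
```

This avoids manual index handling entirely, so the off-by-one problem from the question cannot occur.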

An even easier way is to use XPath. Attributes are nodes, and XPath allows you to fetch them directly.

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('slim', 'http://www.loc.gov/MARC21/slim');

$result = [];
$expression = '/slim:record/slim:controlfield/@tag|/slim:record/slim:datafield/@tag';
foreach ($xpath->evaluate($expression) as $attribute) {
  $result[] = $attribute->value;
}

var_dump($result);

XPath 1.0 does not have a concept of a default namespace, so you need to register a prefix for it. After that you can select nodes in the DOM with a location path. /slim:record/slim:controlfield fetches all controlfield elements that are children of the record root element. @tag fetches their tag attribute nodes. The | operator forms the union of the two node sets, so the expression returns the tag attributes of both the controlfield and the datafield elements.
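
The effect of the missing default namespace can be checked with a small sketch. It assumes a cut-down document with two controlfield elements; $withoutPrefix and $withPrefix are illustrative names.

```php
<?php
$document = new DOMDocument();
$document->loadXML(
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    . '<controlfield tag="001">33333502</controlfield>'
    . '<controlfield tag="008">951010s1996</controlfield>'
    . '</record>'
);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('slim', 'http://www.loc.gov/MARC21/slim');

// Unprefixed names only match elements in no namespace, so this matches nothing.
$withoutPrefix = $xpath->evaluate('count(/record/controlfield)');

// With the registered prefix, both controlfield elements are matched.
$withPrefix = $xpath->evaluate('count(/slim:record/slim:controlfield)');

var_dump($withoutPrefix, $withPrefix);
```

The prefix used in the expression does not have to match anything in the document; only the namespace URI it is registered for matters.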
