I want to parse this XML file and use XPath to retrieve some fields (this is how I see the XML file, by the way; you can download it from here):

I tried the following code to get the edition of the book:

$file_copac = "http://copac.ac.uk/search?isn=$isbn&rn=1&format=XML+-+MODS";
$xml = simplexml_load_file($file_copac) or die("cannot_get_data");
$temp = $xml->xpath('//edition');
var_dump($temp);

I also tried 'edition', but the result is an empty array for both:

array(0) { }

I tried the full path '/mods/originInfo[1]/edition', which ended with an XPath error. I solved the problem with this notation:

$edition = (string)$xml->mods->originInfo[1]->edition;

However, I wonder what the problem with XPath is.

  • There is no edition tag anywhere in the given XML... And originInfo only contains the issue date and place. Commented Nov 25, 2012 at 15:49
  • That XML has no <edition> element; I see only two edition="22" attributes. Commented Nov 25, 2012 at 15:50
  • There are three originInfo tags, and the second one has an edition tag in it, as far as I can see. Commented Nov 25, 2012 at 15:52
  • So post the relevant XML in your question; maybe the link you provided serves different content based on location, authentication, or other factors, so we are unable to see it. Commented Nov 25, 2012 at 15:57
  • I added a screenshot link to the post, thanks. Commented Nov 25, 2012 at 16:00

1 Answer

This looks like a problem with the default (unprefixed) namespace. A workaround:

$namespaces = $xml->getDocNamespaces(); 
$xml->registerXPathNamespace('__DEFAULT_NS__', $namespaces['']);
$r = $xml->xpath('//__DEFAULT_NS__:edition');

var_dump((string)$r[0]); //string(8) "8th. ed."
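
To see the behavior without depending on the remote service, here is a minimal sketch that uses an inline document. The XML below is an assumed stand-in for the COPAC response (a default namespace on the root and two originInfo elements), and the prefix `m` is an arbitrary name chosen for illustration. It compares an unprefixed query, a prefixed query, a full path, and a `local-name()` match.

```php
<?php
// Assumed stand-in for the real response; only the shape matters here.
$source = <<<XML
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <originInfo>
      <place>London</place>
    </originInfo>
    <originInfo>
      <edition>8th. ed.</edition>
    </originInfo>
  </mods>
</modsCollection>
XML;

$xml = simplexml_load_string($source);

// An unprefixed name in XPath 1.0 matches only elements in no namespace.
$plain = $xml->xpath('//edition');

// Bind a prefix of our own choosing to the default namespace URI.
$xml->registerXPathNamespace('m', 'http://www.loc.gov/mods/v3');
$prefixed = $xml->xpath('//m:edition');

// Full path: every step needs the prefix, and XPath positions start at 1,
// so the second originInfo is [2] here (SimpleXML's array index is [1]).
$fullPath = $xml->xpath('/m:modsCollection/m:mods/m:originInfo[2]/m:edition');

// Namespace-agnostic alternative: match on the local name only.
$byLocalName = $xml->xpath('//*[local-name()="edition"]');

var_dump(count($plain));            // int(0)
var_dump((string) $prefixed[0]);    // string(8) "8th. ed."
var_dump((string) $fullPath[0]);    // string(8) "8th. ed."
var_dump((string) $byLocalName[0]); // string(8) "8th. ed."
```

The prefix only has to be bound to the same URI the document declares; it does not need to match any prefix used in the document itself.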
4 Comments

Thanks for your effort. Could you explain it further? I somewhat sense this, but I get confused since the 'modsCollection' and 'mods' tags have namespace definitions.
Yes, they have a namespace definition, but with no prefix. XPath does not use the default namespace, so an unprefixed name matches only non-namespaced elements. To query elements from the default namespace you have to provide a prefix for it, and since the prefix here is empty, you have to create an alias for it and then use that alias in your query.
Is this a PHP-specific or SimpleXML problem? I can get values without any namespace definitions in a C# program. Thank you again for the clear explanation.
It's not only SimpleXML; in PHP, DOMDocument behaves exactly the same way, and from a quick search it appears that C# does as well - stackoverflow.com/questions/585812/… so it may depend on the implementation.
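
The DOMDocument behavior mentioned in the last comment can be checked with a short sketch along the same lines. The inline XML is again an assumed stand-in for the real response, and the prefix `m` is an arbitrary choice.

```php
<?php
// Assumed stand-in document with a default namespace on the root element.
$source = '<modsCollection xmlns="http://www.loc.gov/mods/v3">'
        . '<mods><originInfo><edition>8th. ed.</edition></originInfo></mods>'
        . '</modsCollection>';

$dom = new DOMDocument();
$dom->loadXML($source);

$xpath = new DOMXPath($dom);

// Without a registered prefix, the unprefixed name finds nothing.
$unprefixed = $xpath->query('//edition');

// After binding a prefix to the namespace URI, the element is found.
$xpath->registerNamespace('m', 'http://www.loc.gov/mods/v3');
$prefixed = $xpath->query('//m:edition');

var_dump($unprefixed->length);               // int(0)
var_dump($prefixed->item(0)->textContent);   // string(8) "8th. ed."
```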