How can I write a regular expression to retrieve values from an XML node? The node structure is very large, so it is not easy to traverse. I would rather read the file as plain text and write a regex that finds the matching elements.

<node1>
 <node2>str</node2>
 <node3>Text</node3>
 <myvalue>Here is the values string..</myvalue>
</node1>

The above is the pattern from which I want to retrieve the value of <myvalue></myvalue>, but in my XML many other nodes also contain a <myvalue> child. The only way to identify the node I want is the pattern above: only the value of <myvalue> changes, while the other child values, <node2>str</node2> and <node3>Text</node3>, are always the same.

So how can I write the regex for PHP?

  • Show a real example of the XML document, including the problematic areas (many myvalue nodes, complex structure, etc.). Commented Sep 26, 2010 at 19:49

3 Answers

Use an XML parser; regex is not appropriate for that kind of parsing.

PHP has several XML parsers you can use.

Here's a simple example with DOM that finds the myvalue elements located inside node1.

<?php
    $document = new DOMDocument();
    $document->loadXML(
        '<all>
            <myvalue>Elsewhere</myvalue>
            <node1>
                <node2>str</node2>
                <node3>Text</node3>
                <myvalue>Here is the values string..</myvalue>
            </node1>
        </all>');

    // Visit every node1 element, wherever it sits in the document.
    foreach ($document->getElementsByTagName('node1') as $node1) {
        // Look for myvalue only among the descendants of this node1.
        $myvalue = $node1->getElementsByTagName('myvalue');

        // Print the first myvalue, if this node1 has one.
        if ($myvalue->length > 0) {
            echo $myvalue->item(0)->textContent;
        }
    }
?>

3 Comments

But finding that node is a rather difficult task; that is why I prefer regex.
@coderex It is easier, and you are sure to get accurate results every time.
@coderex you can use XPath to search through the XML once you parse it, for example with SimpleXML: tuxradar.com/practicalphp/12/3/3
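
The XPath suggestion in the last comment could look roughly like the sketch below. It assumes the fixed siblings from the question (node2 = "str", node3 = "Text") are direct children of node1; the sample document and the variable names are made up for illustration.

```php
<?php
// Sample document (an assumption, modelled on the structure in the question).
$xml = <<<'XML'
<all>
    <node1>
        <node2>other</node2>
        <node3>Text</node3>
        <myvalue>Wrong parent</myvalue>
    </node1>
    <node1>
        <node2>str</node2>
        <node3>Text</node3>
        <myvalue>Here is the values string..</myvalue>
    </node1>
</all>
XML;

$root = simplexml_load_string($xml);

// Select myvalue only where the parent node1 also has the two fixed siblings.
$found = $root->xpath("//node1[node2='str' and node3='Text']/myvalue");

// Convert the matched elements to plain strings.
$values = [];
foreach ($found as $element) {
    $values[] = (string) $element;
}

print_r($values);
```

Because the predicate checks the sibling values, a myvalue under any other parent is ignored without further filtering in PHP.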
PHP has a SAX-based XML parser which will let you use a real XML parser without storing an entire DOM tree in memory. XMLReader lets you parse the file without even reading the entire file into memory. Using regex to dig into XML is going to be painful.
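
A minimal sketch of the XMLReader approach, assuming the same node1/node2/node3 layout as in the question. The inline sample string is only for illustration; for a large file you would call $reader->open() with a path instead. Each node1 subtree is handed to SimpleXML on its own, so only one small fragment is expanded at a time.

```php
<?php
// Hypothetical sample data; a real file would be opened with $reader->open($path).
$xml = '<all>
    <myvalue>Elsewhere</myvalue>
    <node1>
        <node2>str</node2>
        <node3>Text</node3>
        <myvalue>Here is the values string..</myvalue>
    </node1>
</all>';

$reader = new XMLReader();
$reader->XML($xml);

$values = [];
// Walk the document node by node instead of building the whole tree.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'node1') {
        // Expand just this node1 subtree so its children are easy to inspect.
        $node1 = simplexml_load_string($reader->readOuterXml());

        // Keep myvalue only when the fixed siblings match.
        if (isset($node1->myvalue)
            && (string) $node1->node2 === 'str'
            && (string) $node1->node3 === 'Text') {
            $values[] = (string) $node1->myvalue;
        }
    }
}
$reader->close();

print_r($values);
```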

If you insist on using a regular expression for this, try:

preg_match_all('/<myvalue>([\s\S]+?)<\/myvalue>/', $text, $matches);

3 Comments

But I also need to check this:
preg_match_all('/<node2>str<\/node2><node3>Text<\/node3><myvalue>([\s\S]+)<\/myvalue>/', $text, $matches);
But in the XML each node is followed by a newline character, I think, so my attempt fails in this case. So now I need to remove the spaces and newline characters.
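
Rather than stripping whitespace first, one option is to let the pattern tolerate it by putting \s* between the tags. The sketch below assumes node2, node3 and myvalue always appear in this order with nothing but whitespace between them; the sample text is made up. It remains fragile compared with a parser, since attributes, comments or reordered nodes would break it.

```php
<?php
// Sample input with newlines and indentation between the nodes (an assumption).
$text = <<<'XML'
<node1>
 <node2>str</node2>
 <node3>Text</node3>
 <myvalue>Here is the values string..</myvalue>
</node1>
<other>
 <myvalue>Elsewhere</myvalue>
</other>
XML;

// "~" is the delimiter, so "/" needs no escaping.
// \s* allows any whitespace (including newlines) between the tags.
// (.*?) is non-greedy and the "s" modifier lets "." match newlines.
$pattern = '~<node2>str</node2>\s*<node3>Text</node3>\s*<myvalue>(.*?)</myvalue>~s';

preg_match_all($pattern, $text, $matches);

// $matches[1] holds the captured myvalue contents.
print_r($matches[1]);
```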
