
I need to delete the offer element with city Moscow in the following XML:

<offer id="14305" available="true">
<param name="City">Moscow</param>
</offer>
<offer id="14306" available="true">
<param name="City">LA</param>
</offer>

How can I do it with PHP regular expressions?

I tried:

preg_replace('/<offer[^(>Moscow<).]+?<\/offer>/s', '', $string);

but without success.

I read your advice, and it is really helpful. But now I have a new problem with greediness:

<offer id="14305" available="true">
<param name="Color">Red</param>
<engine>XYZ</engine>
<param name="City">Moscow</param>
</offer>
<offer id="14306" available="true">
<param name="Color">Red</param>
<param name="City">LA</param>
</offer>
<offer id="14306" available="true">
<weight>1000</weight>
<param name="Color">Red</param>
<param name="City">LA</param>
</offer>

My regexp is too greedy :(

<offer.*?>\s*?<param.*?>\s*?Moscow\s*?<\/param>\s*?<\/offer>
  • Why not use PHP's XML functions to strip out the element? preg_replace is a lot more error-prone. Commented Mar 23, 2016 at 12:48
  • Regex is not the tool to parse XML. Commented Mar 23, 2016 at 13:11
  • Yes. File is too big. I will use SimpleXML. Commented Mar 24, 2016 at 8:28

2 Answers

2

Use this RegEx:

<offer.*?>\s*?<param.*?>\s*?Moscow\s*?<\/param>\s*?<\/offer>


How it works:

<offer.*?>    # Match the opening <offer> tag with optional attributes
\s*?          # Optional whitespace
<param.*?>    # Match the opening <param> tag with optional attributes
\s*?          # Optional Whitespace
Moscow        # Select Moscow text
\s*?          # Optional Whitespace
<\/param>     # Select closing </param>
\s*?          # Optional Whitespace
<\/offer>     # Select closing </offer>
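
For the updated sample in the question, where an offer has other children before the City param, a variation of the pattern may help. This is only a sketch, not the pattern above: it assumes offers are never nested and that the city always sits in <param name="City">. The variable names are made up for illustration.

```php
<?php
// Sample input taken from the updated question (offers with extra children).
$xml = <<<XML
<offer id="14305" available="true">
<param name="Color">Red</param>
<engine>XYZ</engine>
<param name="City">Moscow</param>
</offer>
<offer id="14306" available="true">
<param name="Color">Red</param>
<param name="City">LA</param>
</offer>
XML;

// (?:(?!</offer>).)*? consumes one character at a time but refuses to step
// over a closing </offer>, so a single match cannot span two offers.
// The ~ delimiter avoids escaping the slashes; the s flag lets . match newlines.
$pattern = '~<offer\b[^>]*>(?:(?!</offer>).)*?'
         . '<param name="City">\s*Moscow\s*</param>'
         . '(?:(?!</offer>).)*?</offer>\s*~s';

// preg_replace returns null on failure (for example when a PCRE limit is hit).
$result = preg_replace($pattern, '', $xml);

echo $result;
```

On very large inputs a pattern like this can run into PCRE backtracking limits, which is one more reason the comments under the question point to an XML parser.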

6 Comments

What about new lines and indentation?
@dotancohen You can just put \s* between each element to match any whitespace, no biggie.
@dotancohen I have updated it to work with newlines and indentation / whitespace, using \n?\s*? That should work, yes?
@Druzion: \s already matches newlines, so it's not necessary to put \n next to \s. You can just put \s* between your XML-elements and that will be enough. Your current solution won't match if there is trailing whitespace before the newline.
@klaar I suspected so, thanks! I'll update the answer
0
<offer[^>]*>[^<]*<param[^>]*>Moscow<\/param>[^<]*<\/offer>

3 Comments

This won't work if the offer elements contain anything other than what this pattern expects.
@klaar I didn't get what you mean
It will only match offer elements whose only child is the City param; it will not match offer elements that contain anything else. As Sebastiaan de Rooij said in a comment, it's better to let this be handled by a proper XML parser instead of using regexp.
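
To illustrate the parser route suggested here, a minimal sketch with PHP's DOM extension could look like the following. The fragment in the question has no single root element, so the sketch wraps it in a hypothetical <offers> root; that wrapper and the variable names are assumptions, not part of the original data.

```php
<?php
// Sample fragment from the updated question.
$fragment = <<<XML
<offer id="14305" available="true">
<param name="Color">Red</param>
<engine>XYZ</engine>
<param name="City">Moscow</param>
</offer>
<offer id="14306" available="true">
<param name="Color">Red</param>
<param name="City">LA</param>
</offer>
<offer id="14306" available="true">
<weight>1000</weight>
<param name="Color">Red</param>
<param name="City">LA</param>
</offer>
XML;

$doc = new DOMDocument();
// Assumption: wrap the fragment in a made-up <offers> root so it is well-formed.
$doc->loadXML('<offers>' . $fragment . '</offers>');

$xpath = new DOMXPath($doc);
// Every <offer> that has a <param name="City"> whose trimmed text is "Moscow".
$query = '//offer[param[@name="City"][normalize-space(.)="Moscow"]]';

// Copy the matches into an array before removing them, as a precaution.
foreach (iterator_to_array($xpath->query($query)) as $offer) {
    $offer->parentNode->removeChild($offer);
}

$output = $doc->saveXML($doc->documentElement);
echo $output;
```

Because the XPath query looks at the param's name attribute and text rather than at the raw markup, the other children of an offer (Color, engine, weight) do not affect the match.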
