I want to find and replace values like:

<TAG>heading<foo></foo></TAG><foo>juergen</foo>

Goal:

<TAG>heading</TAG><foo>juergen</foo>

I want to remove only the <foo> tags that sit between <TAG> and </TAG>. Any <foo> outside of <TAG> must stay.

Here is my attempt:

replaceAll("</?foo\\b[^>]*>", "");
  • Nice attempt. What's going wrong? Commented May 27, 2013 at 22:54
  • All foo tags are deleted, but I only need to delete the ones between <TAG> and </TAG>. Commented May 27, 2013 at 22:55
  • Use an XML parser for that problem. Regex is not the right tool for that job. Commented May 27, 2013 at 22:57
  • You aren't trying to parse HTML with regex, are you? Commented May 27, 2013 at 22:58
  • Please stop generically whining about things that very often have a perfectly valid use case. Parsing HTML with a regex is sometimes a good idea and sometimes not; stop trying to pass it off as evil by definition. Commented May 27, 2013 at 23:36

3 Answers

String result = searchText.replaceAll("(<f.*><.*o>)(?=<)", "");

3 Comments

The JavaScript alternative txt.replace(/(<f.*><.*o>)(?=<)/g,"") doesn't seem to remove anything: pastebin.com/N2EpQtJ9
@Mr.Russian it does not seem to handle this text: <TAG>keep!<foo>throw</foo>keep!</TAG><foo>keep!</foo>
@Isaac - your example is considerably more complex, but try this searchText.replace(/(<f.*>.*<.*o>)(?=(.*(?=\<\/T.*)(?=\<\/)))/gm, "")
Assuming that foo is empty, you can use:

<([^/][^>]*)></\1>

This searches for an opening tag with an adjacent closing tag of the same name.

You could augment it to allow for whitespace in the middle with:

<([^/][^>]*)>\s*</\1>
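
In case it helps, here is a minimal Java sketch of the whitespace-tolerant pattern above. The class and method names are made up for illustration, and the backslashes are doubled because the pattern lives in a Java string literal. Keep in mind that, as written, the pattern removes any empty element anywhere in the input, not only foo and not only inside <TAG>.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class EmptyElementRemover {
    // <([^/][^>]*)>  an opening tag, its name captured as group 1
    // \s*            optional whitespace between the two tags
    // </\1>          a closing tag with the same name as group 1
    private static final Pattern EMPTY_ELEMENT =
            Pattern.compile("<([^/][^>]*)>\\s*</\\1>");

    static String removeEmptyElements(String input) {
        return EMPTY_ELEMENT.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "<TAG>heading<foo></foo></TAG><foo>juergen</foo>";
        // prints <TAG>heading</TAG><foo>juergen</foo>
        System.out.println(removeEmptyElements(input));
    }
}
```

The non-empty <foo>juergen</foo> survives only because it has content; an empty <foo></foo> outside <TAG> would be removed as well.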

1 Comment

Thanks for your reply, but it doesn't work. The tag name is <foo>, and I only need to replace that one.

Possible duplicate: RegEx match open tags except XHTML self-contained tags

Otherwise, here is the regex. Do not even ask me to explain it, I barely understand it myself (this is JavaScript, so some corrections may be needed for Java):

var txt = "<TAG>a<foo>b</foo>c</TAG>d<foo>e</foo>f<TAG>g<foo>h</foo>i</TAG>j<TAG>k</TAG>";
var res = txt.replace(/(<TAG>.*?)<foo>.*?<\/foo>(.*?<\/TAG>)/gm,"$1$2");
//                     (   $1   )               (    $2    )
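
For Java, a rough sketch of one way to get the same effect is a two-step approach: find each <TAG>...</TAG> block first, then strip the foo elements inside that block only. This is a swapped-in technique, not a direct port of the regex above, and it assumes that <TAG> blocks do not nest and that <TAG> carries no attributes. The class and method names are hypothetical. Unlike the single-pass replace above, which removes at most one foo per match, this also handles several foo elements in one block.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class ScopedFooRemover {
    // One <TAG>...</TAG> block, matched lazily; DOTALL lets '.' cross line breaks.
    private static final Pattern TAG_BLOCK =
            Pattern.compile("<TAG>.*?</TAG>", Pattern.DOTALL);
    // A whole foo element including its content (assumption: foo does not nest).
    private static final Pattern FOO_ELEMENT =
            Pattern.compile("<foo\\b[^>]*>.*?</foo>", Pattern.DOTALL);

    static String removeFooInsideTag(String input) {
        Matcher block = TAG_BLOCK.matcher(input);
        StringBuffer out = new StringBuffer();
        while (block.find()) {
            // Clean only the text of the current <TAG> block.
            String cleaned = FOO_ELEMENT.matcher(block.group()).replaceAll("");
            // quoteReplacement keeps '$' and '\' in the content literal.
            block.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        block.appendTail(out); // copy everything after the last block unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        String txt = "<TAG>a<foo>b</foo>c</TAG>d<foo>e</foo>f<TAG>g<foo>h</foo>i</TAG>j<TAG>k</TAG>";
        // prints <TAG>ac</TAG>d<foo>e</foo>f<TAG>gi</TAG>j<TAG>k</TAG>
        System.out.println(removeFooInsideTag(txt));
    }
}
```

To drop only the foo tags and keep their content, the inner pattern could be swapped for the asker's original "</?foo\\b[^>]*>".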
