Suppose I have a String containing static tags that looks like this:

mystring = "[tag]some text[/tag] untagged text [tag]some more text[/tag]"

I want to remove everything between each tag pair. I've figured out how to do so by using the following regex:

mystring = mystring.replaceAll("(?<=\\[tag])(.*?)(?=\\[/tag])", "");

The result of which will be:

mystring = "[tag][/tag] untagged text [tag][/tag]"

However, I'm unsure how to accomplish the same goal if the opening tag is dynamic. Example:

mystring = "[tag parameter=\"123\"]some text[/tag] untagged text [tag parameter=\"456\"]some more text[/tag]"

The "value" of the parameter portion of the tag is dynamic. Somehow, I have to introduce a wildcard to my current regex, but I am unsure how to do this.

Essentially, replace the contents of all pairings of "[tag*]" and "[/tag]" with empty string.

An obvious solution would be to do something like this:

mystring = mystring.replaceAll("(?<=\\[tag)(.*?)(?=\\[/tag])", "");

However, I feel like that would be hacking around the problem because I'm not really capturing a full tag.

Could anyone provide me with a solution to this problem? Thanks!

  • Your regex already contains (.*?); that is what you need to put before the ]. Commented Mar 21, 2018 at 15:46
  • Thanks for your reply. I've tried this, but it does not compile. The error is "repetition not allowed inside lookbehind." Commented Mar 21, 2018 at 15:53
  • @LanceToth A lookbehind can't be variable-width. Commented Mar 21, 2018 at 15:58
  • Fair enough. If the tag is always dynamic, you could look for a ] preceded by anything but g (?<=[^g]\])(.*?)(?=\[\/tag]) (mind, I removed some of the escape characters) Commented Mar 21, 2018 at 16:02
  • If you changed square brackets to angle brackets and wrapped the string in a root tag you could use an XML parser. In fact, why don’t you just use XML instead of your only-slightly mutated form of XML? Commented Mar 21, 2018 at 16:13

2 Answers

I guess I've got it.

I thought long and hard about what @AshishMathew said, and yes, lookbehinds can't have variable length. But instead of replacing the match with nothing, we can replace it with a ], like so:

mystring = mystring.replaceAll("(?<=\\[tag)(.*?)(?=\\[/tag])", "]");

(?<=\\[tag) is the lookbehind, which matches the position right after [tag

(.*?) is everything between [tag and [/tag], including any parameters of the tag and its closing ], all of which is replaced by a single ]

When I tried this code by replacing the match with "", I got [tag[/tag] untagged text [tag[/tag] as the output. Hence, by replacing the match with a ] instead of nothing, you get the (hopefully) desired output.

So this is my lazy solution (pardon the regex pun) to the problem.
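
To make the behaviour concrete, here is a minimal, self-contained sketch of the replacement above applied to the sample string from the question. The class and method names are made up for illustration. Note that the tag parameters are dropped along with the content, because they sit after [tag and are therefore part of the match.

```java
final class BracketTrickDemo {
    // Same regex as in the answer: a fixed-width lookbehind for "[tag",
    // then a lazy match up to (but not including) "[/tag]".
    // The match swallows the parameters and the original "]", so we put one "]" back.
    static String stripWithBracket(String input) {
        return input.replaceAll("(?<=\\[tag)(.*?)(?=\\[/tag])", "]");
    }

    public static void main(String[] args) {
        String mystring = "[tag parameter=\"123\"]some text[/tag] untagged text "
                + "[tag parameter=\"456\"]some more text[/tag]";
        System.out.println(stripWithBracket(mystring));
        // prints: [tag][/tag] untagged text [tag][/tag]
    }
}
```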

11 Comments

This definitely works, and it sits somewhere between capturing the whole tag and the obvious solution I suggested, lol. I guess I can probably do this. If no one posts an answer that allows me to just replace with an empty string, I'll mark this one as the answer. Thank you for your help.
@Andrew Anytime! Also, I'll try to post an answer that works without replacing the match with ] :)
@Andrew I've refined the code a little, but still not sure how to do it without replacing with ]
@Andrew Here's a link to a question about lookbehinds with variable length, and hopefully you'll figure out a solution to the problem.
@Cofeehouse Coder Thank you!
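
For the empty-string replacement asked about in these comments, one possible route is a lookbehind with a bounded quantifier, which Java's regex engine accepts because the maximum length is known. This is only a sketch under assumptions not stated in the question: the limit of 100 characters for the tag parameters is an arbitrary, hypothetical cap, the parameters are assumed never to contain a ], and the class and method names are invented.

```java
final class BoundedLookbehindDemo {
    // Hypothetical upper bound on the length of the parameter part of "[tag ...]".
    static final int MAX_PARAM_LEN = 100;

    // Lookbehind: "[tag", up to MAX_PARAM_LEN characters that are not "]", then "]".
    // The bounded {0,N} quantifier keeps the lookbehind's maximum length finite.
    static String stripContents(String input) {
        String regex = "(?<=\\[tag[^\\]]{0," + MAX_PARAM_LEN + "}\\])(.*?)(?=\\[/tag])";
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String mystring = "[tag parameter=\"123\"]some text[/tag] untagged text "
                + "[tag parameter=\"456\"]some more text[/tag]";
        System.out.println(stripContents(mystring));
        // prints: [tag parameter="123"][/tag] untagged text [tag parameter="456"][/tag]
    }
}
```

Unlike the ] trick, this keeps the opening tag intact, parameters included, at the cost of the arbitrary length cap.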

I suggest matching the whole tag with its content and replacing it with the opening/closing tags without content:

mystring = mystring.replaceAll("\\[tag[^\\]]*\\][^\\[]*\\[/tag]", "[tag][/tag]");

Ideone test.

Note that I didn't bother preserving the tag attributes, since you mentioned in the comments on another answer that you didn't need them, but they could be kept by using a capturing group.
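
As a rough sketch of the capturing-group variant mentioned above: wrap the opening tag in a group and refer to it as $1 in the replacement. The class and method names here are made up for illustration, and the same limitation as the regex above applies: the content between the tags must not contain a [.

```java
final class KeepAttributesDemo {
    // Group 1 captures the whole opening tag, attributes included.
    // [^\[]* then consumes the content, which is assumed not to contain "[".
    static String stripKeepingAttributes(String input) {
        return input.replaceAll("(\\[tag[^\\]]*\\])[^\\[]*\\[/tag]", "$1[/tag]");
    }

    public static void main(String[] args) {
        String mystring = "[tag parameter=\"123\"]some text[/tag] untagged text "
                + "[tag parameter=\"456\"]some more text[/tag]";
        System.out.println(stripKeepingAttributes(mystring));
        // prints: [tag parameter="123"][/tag] untagged text [tag parameter="456"][/tag]
    }
}
```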

4 Comments

Thanks for your answer! The only issue with this is it will strip the contents of any tag, not specifically tags with the name "tag".
@Andrew I thought tag was a placeholder, I should have been more careful reading your question. Let me fix that :) Oh, and should I go along and just delete the tag as you suggested you were doing afterward in the other question's comments?
It's my mistake. I shouldn't have used the word "tag" because it made the situation confusing.
@Andrew Here, fixed. Much easier than what I had originally done ;)
