Creating an entire DOM for this is overkill: you'll hold the whole XML tree in memory, which can be heavy for large documents. I suggest one of the following:
- Parse with SAX or StAX, simply copying stuff to output unless you want it filtered out.
- Apply an XSLT transformation that copies everything by default, but has one or more templates that don't do anything with their input, thus filtering it out.
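For the first option, a minimal StAX sketch could look like the following. It assumes the filter is "drop every element whose local name matches a regular expression, together with its subtree"; the class name `StaxFilter`, the method `filter`, and the sample pattern `tmp\d+` are made up for illustration and are not from the question.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.regex.Pattern;

class StaxFilter {

    // Copies the XML event by event, skipping every element (and everything
    // inside it) whose local name fully matches namePattern.
    static String filter(String xml, Pattern namePattern) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);

        int skipDepth = 0; // > 0 while we are inside a filtered-out element
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (skipDepth > 0) {
                // Track nesting so we know when the filtered element ends.
                if (event.isStartElement()) {
                    skipDepth++;
                } else if (event.isEndElement()) {
                    skipDepth--;
                }
                continue;
            }
            if (event.isStartElement()) {
                String name = event.asStartElement().getName().getLocalPart();
                if (namePattern.matcher(name).matches()) {
                    skipDepth = 1; // start skipping this subtree
                    continue;
                }
            }
            writer.add(event); // everything else is copied unchanged
        }
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><keep>a</keep><tmp1>b<x/></tmp1><keep>c</keep></root>";
        System.out.println(filter(xml, Pattern.compile("tmp\\d+")));
    }
}
```

For real files you would pass streams instead of strings to the factories, so that only the current event is held in memory.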
The second option is the easiest, and in my experience XSLT in Java is fast and memory-efficient, especially for a simple use case like this.
These two templates will be what you need:
The default copy:
<xsl:template match="node()|@*">
  <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
</xsl:template>
The "filter":
<xsl:template match="*[your predicate here]">
  <!-- Intentionally empty: matching elements and their content are dropped -->
</xsl:template>
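To run such a stylesheet from Java, the standard `javax.xml.transform` API is enough. The sketch below is only an illustration: the class name `XsltFilter` is hypothetical, and the predicate `starts-with(local-name(), 'tmp')` is a stand-in for whatever your real predicate turns out to be.

```java
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

class XsltFilter {

    // Identity copy plus one empty template; the predicate is a placeholder.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"node()|@*\">"
          + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"*[starts-with(local-name(), 'tmp')]\"/>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the filtered result.
    static String filter(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        // Leave out the XML declaration to keep the sample output short.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(filter("<root><keep>a</keep><tmp1>b</tmp1></root>"));
    }
}
```

The empty template wins over the copy template because a pattern with a predicate has a higher default priority than `node()|@*`, so no explicit `priority` attribute is needed here.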
EDIT: I just noticed that you aren't filtering out specific names, but names that match a regular expression. XPath string functions (such as starts-with() or contains()) may be enough to build a predicate that selects the target nodes. If they aren't, Java String methods can be called from XSLT through extension functions. That makes this solution slightly more complicated, but it's still worth it for taking the XML parsing out of your hands.