
I am trying to unformat an XML document into a single line (using Java).

I am trying to use the following regex replacement:

input.replaceAll(">\\s+", ">").replaceAll("\\s+<", "<");

However, it also removes the whitespace at the start and end of element content, which is not what I want.

For example:

Scenario 01

Before: <AAA>{space}{space}{space}</AAA>

After: <AAA></AAA>

Scenario 02

Before: <AAA>{space}{space}123{space}{space}</AAA>

After: <AAA>123</AAA>

Scenario 03

Before: <AAA>{space}A{space}B{space}C{space}</AAA>

After: <AAA>A{space}B{space}C</AAA>

Is there any way to unformat the XML while avoiding the scenarios above?

  • Related: stackoverflow.com/a/1732454/18356 Commented Oct 1, 2020 at 6:54
  • It's not related. OP is asking to reformat XML by replacing line breaks, not parse anything. This is very possible with regex. Commented Oct 1, 2020 at 7:00
  • Processing XML using regular expressions is always dangerous. Much better to use an XML parser, every time. Commented Oct 1, 2020 at 8:27

2 Answers

1

A Saxon solution:

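import java.io.File;
import net.sf.saxon.s9api.*;

Processor p = new Processor(false);
DocumentBuilder db = p.newDocumentBuilder();
// ALL drops every whitespace-only text node while the tree is built
db.setWhitespaceStrippingPolicy(WhitespaceStrippingPolicy.ALL);
XdmNode doc = db.build(new File(...));      // input file
Serializer s = p.newSerializer(new File(...));  // output file
s.serialize(doc.asSource());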

This gives you quite a lot of control over the format of the output by setting properties on the Serializer object.
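For readers who cannot add Saxon, the same parser-based idea can be roughly sketched with the JDK's built-in DOM and Transformer classes. This is only an illustration under stated assumptions, not a replacement for the code above: the class name DomUnformat and both method names are hypothetical, and the rule used (drop a whitespace-only text node only when its parent also has element children) is a heuristic chosen to match the three scenarios in the question. It would also drop meaningful whitespace between inline elements in mixed content.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class DomUnformat {

    // Parses the XML, removes indentation-style text nodes, and
    // serializes the tree again without indentation or declaration.
    static String toSingleLine(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            stripIndentation(doc.getDocumentElement());

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not unformat XML", e);
        }
    }

    // Assumption: whitespace-only text is treated as indentation only
    // when the parent element also contains child elements.
    static void stripIndentation(Node parent) {
        boolean hasElementChild = false;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
            }
        }
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before removal
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripIndentation(child);
            } else if (hasElementChild
                    && child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            }
            child = next;
        }
    }
}
```

Under this heuristic, an element such as <AAA>{space}{space}{space}</AAA> keeps its spaces because it has no child elements, while the line breaks and indentation between sibling elements are dropped.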


0

This removes only vertical whitespace (line breaks such as "\n", "\r", or combinations of them) that directly follows the end of a tag or directly precedes the start of one. Spaces are left untouched.

input.replaceAll(">\\v+", ">").replaceAll("\\v+<", "<");

An excerpt from https://www.regular-expressions.info/shorthand.html says:

\v matches “vertical whitespace”, which includes all characters treated as line breaks in the Unicode standard. It is the same as [\n\cK\f\r\x85\x{2028}\x{2029}].
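One thing to watch with the \v replacement above: because it removes only the line breaks, any indentation spaces between tags stay in the output. A possible variation, offered as a sketch rather than a tested general solution, removes a whitespace run between ">" and "<" only when that run contains at least one line break. It assumes Java 8 or later (for \R), and the class and method names are hypothetical. Like any regex approach, it does not understand CDATA sections or comments, and it would still strip element content that consists solely of whitespace including a line break.

```java
// Hypothetical helper class; the names are illustrative only.
class XmlUnformat {

    // Removes a whitespace run that sits between '>' and '<' only if the
    // run contains a line break (\R). Runs made of spaces alone, and
    // whitespace next to text content, are left as they are.
    static String toSingleLine(String xml) {
        return xml.replaceAll("(?<=>)\\s*\\R\\s*(?=<)", "");
    }

    public static void main(String[] args) {
        String input = "<root>\n"
                + "  <AAA>   </AAA>\n"
                + "  <BBB>  123  </BBB>\n"
                + "  <CCC> A B C </CCC>\n"
                + "</root>";
        // prints: <root><AAA>   </AAA><BBB>  123  </BBB><CCC> A B C </CCC></root>
        System.out.println(toSingleLine(input));
    }
}
```

The lookbehind and lookahead keep the ">" and "<" characters out of the match, so only the whitespace between them is replaced.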
