
I have the following data in a variable.

<ctx>
  <PostCode>XXXXXX</PostCode>
  <Title1>Mr</Title1>
  <Name>John</Name>
  <Order1>£100.00</Order1>
  <Order2>£100.01</Order2>
  <Date>10/10/2010</Date>
</ctx>

Using the following regex:

var payload = ctx.toString().match(/Order[1-9]/g);

I get the following results:

Order1,Order1,Order2,Order2

How can I make it return only Order1, Order2? It currently also counts each closing tag. I can't use <Order[1-9]> (the opening tag) because my application does not allow me to capture the angle brackets <>. Basically, I need a regex that returns unique values.

The following regex seems to work to some extent: (Order[0-9])(?!.*\1)

https://regex101.com/r/6QhFBg/1
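
As a rough sketch of how that pattern behaves in JavaScript (the names sampleXml and lastOccurrences are made up for illustration), the negative lookahead rejects a match whenever the same name appears again later on the same line, so only the closing tag of each pair survives. This relies on each opening and closing tag sitting on one line, because `.` does not cross newlines unless the `s` flag is set.

```javascript
// Sample data from the question, one element per line.
const sampleXml = `<ctx>
  <PostCode>XXXXXX</PostCode>
  <Title1>Mr</Title1>
  <Name>John</Name>
  <Order1>£100.00</Order1>
  <Order2>£100.01</Order2>
  <Date>10/10/2010</Date>
</ctx>`;

// (?!.*\1) fails when the captured name occurs again later on the line,
// so the opening tag is skipped and only the last occurrence is kept.
// match() returns null when nothing matches, hence the fallback to [].
const lastOccurrences = sampleXml.match(/(Order[0-9])(?!.*\1)/g) || [];

console.log(lastOccurrences); // [ 'Order1', 'Order2' ]
```

If the same name could appear on several lines, adding the `s` flag (`/gs`) would let the lookahead see across line breaks.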

2 Answers

2

Never parse XML with regex. Wrong tool for the job – leads to brittle solutions.

Instead, use a real XML parser or XPath.

For example, this XPath,

//*[starts-with(local-name(), 'Order')]

will robustly select all elements whose name starts with "Order".

In JavaScript in the browser, XPath expressions are evaluated via document.evaluate:

var orders = document.evaluate("//*[starts-with(local-name(), 'Order')]", document,
                               null, XPathResult.ANY_TYPE, null);
var thisOrder = orders.iterateNext();

while (thisOrder) {
  console.log(thisOrder.textContent);
  thisOrder = orders.iterateNext();
}

See also How to use document.evaluate() and XPath to get a list of elements?

For parsing XML stored in a string, see for example:


1

let ctx = 
`<ctx>
  <PostCode>XXXXXX</PostCode>
  <Title1>Mr</Title1>
  <Name>John</Name>
  <Order1>£100.00</Order1>
  <Order2>£100.01</Order2>
  <Date>10/10/2010</Date>
</ctx>`;

let payload = ctx
  .match(/<Order[1-9]>/g) // e.g. <Order1>
  .map(o => o.substring(1, o.length - 1)); // trim first char `<` and last char `>`

console.log(payload);

4 Comments

Thanks, this would have worked if my application allowed me to capture the whole payload structure including tags and anchors <>, but I can't use <Order[1-9]>
Did you read the next line (.map(...))? It removes the < and >.
Is there a regex variant that matches Order[1-9] but only returns unique values?
Regex isn't going to do operations like 'find uniques' or 'map integers to double their value'. But that's where js comes in.
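
The split described in the last comment, regex for matching and JavaScript for the uniqueness step, might look like the sketch below. It assumes the payload is available as a string; xmlText, allOrders and uniqueOrders are illustrative names, not part of the original code.

```javascript
// Sample payload from the question, held as a string.
const xmlText = `<ctx>
  <PostCode>XXXXXX</PostCode>
  <Title1>Mr</Title1>
  <Name>John</Name>
  <Order1>£100.00</Order1>
  <Order2>£100.01</Order2>
  <Date>10/10/2010</Date>
</ctx>`;

// Match the names without angle brackets; opening and closing tags both match,
// so each name shows up twice. Fall back to [] if there is no match at all.
const allOrders = xmlText.match(/Order[1-9]/g) || [];

// A Set keeps one copy of each value, in first-seen order.
const uniqueOrders = [...new Set(allOrders)];

console.log(uniqueOrders); // [ 'Order1', 'Order2' ]
```

This keeps the regex itself free of `<` and `>`, which was the constraint stated in the question.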
