I currently have a large amount of data (500 MB each), which I parse with lodash and cheerio to extract parts of it. The problem with the new data is that some empty tags have been replaced incorrectly.

Example:

<apple></apple>

gets replaced by

</apple>

I want to make sure that the previous formatting stays the same. Is there a regex I can use to find these new empty tags and replace them with the old, correct format?

1 Answer

You probably mean that <apple></apple> is replaced by <apple/> (not </apple>).

<apple></apple> and <apple/> are equivalent in XML, and no compliant XML process will treat them differently, so you should not care which is used in your document.
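
As a quick illustration of that equivalence, here is a minimal sketch using Python's standard-library parser (chosen only so the example is self-contained; the question itself uses cheerio). The helper name `describe` is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET

def describe(xml_text):
    """Return the parsed view of a single element: (tag, text, child count)."""
    element = ET.fromstring(xml_text)
    # ElementTree reports the text of an empty element as None; normalize to "".
    return (element.tag, element.text or "", len(element))

long_form = describe("<apple></apple>")
short_form = describe("<apple/>")

# Both spellings produce the same parsed element.
print(long_form == short_form)  # True
```

Any difference between the two forms exists only at the text level, which is why a regex sees it but a parser does not.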

If you truly meant that <apple></apple> is replaced by </apple>, then you have a likely irreparably damaged file as you won't know whether any given end tag for apple should be associated with an empty or nonempty apple element.

For example, doing a string-level replace of </apple> with <apple></apple> for

<apple>one</apple>

would result in

<apple>one<apple></apple>

which would not be well-formed.
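
The same failure can be reproduced with a short sketch, again using Python's standard library purely for illustration. The helper `is_well_formed` is a hypothetical name introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

original = "<apple>one</apple>"
# Naive string-level repair: turn every end tag into an empty element.
patched = original.replace("</apple>", "<apple></apple>")

print(patched)                   # <apple>one<apple></apple>
print(is_well_formed(original))  # True
print(is_well_formed(patched))   # False
```

The replace cannot tell an orphaned end tag from one that legitimately closes a nonempty element, so it damages the elements that were intact.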

4 Comments

I currently have a few sections which are just </apple>, and I want to complete them. I can obviously check if they have any starting tag before them or not. So if they don't have a starting tag, I just want to add one before the ending tag, irrespective of whether anything was there in between.
"I can obviously check if they have any starting tag before them or not." No, in general, you cannot, and you've not stated restrictions which would allow you to make such a statement.
It's a statement I'm making: "I can".
Sorry, but the assertiveness of your statement and the naïveté of your query are incongruent. Good luck.