
I'm using a SAX parser to handle a pre-written XML file. I have no way of changing the XML, as it is held by another application, but I need to parse data from it. The XML file contains a tag <ERROR_TEXT/>, which is empty when no error has occurred. As a result, the parser gives me the next character after the tag closes, which is "\n". I have tried result.replaceAll("\n", ""); and result.replaceAll("\n", "");

How do I get SAX to recognize that this is an empty tag and return the value as ""?

3 Answers

2

You can do that. Suppose you have the XML and Java source below.

<ERROR_TEXT>easy</ERROR_TEXT><ERROR_TEXT/>

Java code

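private boolean isKeySet = false;
private String key = "";

// Called for text content; only the first chunk after a start tag is logged.
@Override
public void characters(
    char[] ch,
    int start,
    int length
) throws SAXException
{
    if (!isKeySet) {
        return;
    }
    isKeySet = false;
    // Build the value from the slice of the buffer that SAX reported.
    String value = new String(ch, start, length);
    logger.debug("key : [" + key + "], value : [" + value + "]");
}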
@Override
public void startElement(
    String uri,
    String localName,
    String qName,
    Attributes attrs
) throws SAXException
{
    key = qName;
    isKeySet = true;
}

@Override
public void endElement(
    String uri,
    String localName,
    String qName
) throws SAXException
{
    if (isKeySet) {
        isKeySet = false;
        logger.debug("key : [" + key + "](EMPTY!!!)");
    }
}

RESULT log:

key : [ERROR_TEXT], value : [easy]

key : [ERROR_TEXT](EMPTY!!!)

Call flow: startElement() -> characters() -> endElement() -> startElement() -> endElement() -> characters()

That's it! THE END


1

SAXParser delivers character data through the characters() event, which it calls whenever it encounters literal characters, including the whitespace between tags. Relying on that callback alone is unreliable, because it can fire after an open tag whether or not the element actually contains any data. You could use String.trim() and do a String.length() > 0 check before proceeding.
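A minimal sketch of that check, assuming you only want to know whether a characters() chunk carries real text. The class name BlankCheck and the helper hasContent are hypothetical names used for illustration only.

```java
public class BlankCheck {

    // Hypothetical helper: true only if the reported slice holds non-whitespace text.
    // String.trim() strips leading and trailing characters up to U+0020,
    // which includes '\n', '\r', '\t' and the space character.
    static boolean hasContent(char[] ch, int start, int length) {
        return new String(ch, start, length).trim().length() > 0;
    }

    public static void main(String[] args) {
        char[] newline = "\n".toCharArray();
        char[] text = "easy".toCharArray();
        System.out.println(hasContent(newline, 0, newline.length)); // prints false
        System.out.println(hasContent(text, 0, text.length));       // prints true
    }
}
```

Inside a handler, this kind of check would typically be the first statement of characters(), so that whitespace-only chunks are skipped.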

2 Comments

Thanks, it worked for me. But I still think it should just return an empty string if there's no data.
@Frederic 2018 me agrees with you :)
0

You don't. It is SAX's job to parse the data, not to make decisions about what the content of that data is supposed to be. In your parse handler, store the character data for each element, and when you go to process that element, call trim() on the data. If the result is blank and the tag is an ERROR_TEXT tag, you know there is no error.
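One way this could look, as a rough sketch rather than a definitive handler: buffer the text between the start and end of ERROR_TEXT and trim it when the element closes. The class name ErrorTextDemo, the helper parseErrorText, and the RESPONSE root element are assumptions made up for the example; only ERROR_TEXT comes from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ErrorTextDemo {

    /** Collects the text of ERROR_TEXT; an empty tag yields "". */
    static class ErrorTextHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        private boolean inErrorText = false;
        String errorText = null; // stays null if the element never appears

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("ERROR_TEXT".equals(qName)) {
                inErrorText = true;
                buffer.setLength(0); // start with a clean buffer
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Only text inside ERROR_TEXT is kept; the newline after the tag is ignored.
            if (inErrorText) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("ERROR_TEXT".equals(qName)) {
                errorText = buffer.toString().trim();
                inErrorText = false;
            }
        }
    }

    // Hypothetical convenience method: parses the XML string and returns the ERROR_TEXT value.
    static String parseErrorText(String xml) {
        try {
            ErrorTextHandler handler = new ErrorTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.errorText;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String empty = "<RESPONSE><ERROR_TEXT/>\n</RESPONSE>";
        String filled = "<RESPONSE><ERROR_TEXT>easy</ERROR_TEXT>\n</RESPONSE>";
        System.out.println("[" + parseErrorText(empty) + "]");  // prints []
        System.out.println("[" + parseErrorText(filled) + "]"); // prints [easy]
    }
}
```

Because the value is decided in endElement() rather than in characters(), it does not matter whether the parser reports zero, one, or several chunks of text for the element.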

5 Comments

string.trim() won't delete \n. The string appears as "\n" when I debug it.
The SAX parser isn't recognising the empty tag; instead it returns the newline character after it.
It should return a start element, an end element, and a number of blank characters in the middle. Is that not what you are getting? If you want to check for \n characters, do a replace for those and for spaces, then do a trim.
No, the tag looks like <ERROR_TEXT/> and the SAX parser is not treating it as <ERROR_TEXT></ERROR_TEXT>. I want it to give me a null, but instead it gives me the first character after <ERROR_TEXT/>, which happens to be \n.
You cannot change what it gives you. Why is ignoring a \n a problem? Are you using a default handler or your own? If you are using your own, it is easy to establish that the tag is empty. If not, it shouldn't be hard to ignore whitespace when you are looking for a string. If it is a major problem for you, use a DOM parser instead of SAX.
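For comparison, a small sketch of the DOM route mentioned in the last comment, assuming the document is small enough to load into memory. The class name DomErrorText, the helper errorText, and the RESPONSE root element are hypothetical; getTextContent() is the standard DOM Level 3 method, which returns an empty string for an element with no children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomErrorText {

    // Hypothetical helper: returns the text of the first ERROR_TEXT element,
    // "" if the element is empty, or null if it is absent.
    static String errorText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("ERROR_TEXT");
            if (nodes.getLength() == 0) {
                return null;
            }
            return nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + errorText("<RESPONSE><ERROR_TEXT/>\n</RESPONSE>") + "]"); // prints []
    }
}
```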
