I am using Python's ElementTree library to parse an XML file that I exported from MySQL Query Browser. When I export the result set to XML, the file includes a strange character that shows up in my editor as the letters "BS" highlighted in a green rounded rectangle (see screenshot). I iterate through the file and try to replace the character manually, but it must not be matching, because after I do this:

for lines in file:
    lines.replace("<Weird Char>", "").strip();

I get an error from the parse method. However, if I remove the character by hand in WordPad, Notepad, etc., the parse call works correctly. I am looking for a way to strip out the character without having to do it manually.

Any help would be great. I have included two screenshots: one of how the character appears in my editor, and another of how it appears in Chrome.

Thanks

[Screenshot from my editor] [Screenshot from Chrome]

EDIT: You will probably have to zoom in on the images, sorry.

  • python parse.py zombie.xml: Unexpected error opening <open file 'zombie.xml', mode 'r' at 0x7ff3e458>: [Errno 9] Bad file descriptor. This is the actual error message. Commented Jun 13, 2011 at 15:08

1 Answer

The backspace character (ASCII 8) is not a valid XML character. Escaping it as &#08; will not help with an XML 1.0 parser, which rejects the numeric reference as well, so the character has to be removed or replaced before parsing. I'm surprised MySQL Query Browser writes it out unchanged, but I'm not familiar with MySQL. You can also check your data and clean it up with an UPDATE statement to get rid of that character if it is not valid data for the table.

As far as stripping it out in Python, this should work:

lines = lines.replace("\b", "")
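For a fuller picture, here is a minimal sketch of cleaning the text before handing it to ElementTree. It removes the control characters instead of escaping them, because XML 1.0 parsers (including the expat parser behind ElementTree) reject U+0008 even when it is written as a numeric reference. Note also that str.replace returns a new string rather than changing the original, so the result has to be kept. The helper names and the sample string are hypothetical and stand in for the real export.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids: everything below 0x20
# except tab (0x09), line feed (0x0A) and carriage return (0x0D).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_invalid_xml_chars(text):
    """Return text with XML 1.0-invalid control characters removed."""
    return _INVALID_XML_CHARS.sub("", text)


def parse_cleaned(xml_text):
    """Clean the raw XML text, then parse it into an Element."""
    return ET.fromstring(strip_invalid_xml_chars(xml_text))


# Hypothetical sample standing in for the exported file's contents.
sample = "<rows><row>zom\bbie</row></rows>"
root = parse_cleaned(sample)
print(root.find("row").text)
```

For a real file, the same idea would apply to the whole contents read into a string first. Whether dropping the character is acceptable depends on whether it carries any meaning in the table.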

2 Comments

Is it the backspace character? That is what I thought originally, but when I pasted it into Chrome it gave me that weird symbol. Also, is &#08 the encoding for backspace?
I don't know what other character it would be. I don't use the same editor as you, but if you open your XML in a hex editor, you'll probably see the BS character in there (ASCII 8, byte 0x08). And yes, &#08; is the numeric reference for the backspace character (don't forget the semicolon).
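If a hex editor is not at hand, a short script can do the same check. This is only a sketch, assuming the file is ASCII or UTF-8 encoded. The function name and the sample bytes are made up for illustration.

```python
def find_control_bytes(data):
    """Return (offset, byte value) pairs for control bytes XML 1.0 forbids."""
    allowed = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return
    return [(i, b) for i, b in enumerate(data) if b < 0x20 and b not in allowed]


# Hypothetical sample bytes; for a real file, read it with open(path, "rb").
sample = b"<row>zom\x08bie</row>\n"
for offset, value in find_control_bytes(sample):
    print(f"offset {offset}: 0x{value:02x}")
```

An entry with value 0x08 would confirm that the odd symbol really is a backspace.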
