I have a file named **.bxl and I am trying to read it in Python as follows:

import chardet

# bxl_filename is the path to the .bxl file
with open(bxl_filename, 'rb') as bxl_file:
    bxl_str = bxl_file.readlines()[0]
the_encoding = chardet.detect(bxl_str)['encoding']
bxl_str = bxl_str.decode(the_encoding)

When I print bxl_str, it looks fine: [screenshot of the print(bxl_str) output]. However, it does not look the same when I display the variable directly: [screenshot of the directly displayed value].

What I want is a string that looks just like the print(bxl_str) result. Can anyone help? Much appreciated! Link for the file

  • You have a problem here: bxl_file.readlines()[0], but you want your string to look like this: bxl_str.decode(the_encoding) Commented Jul 27, 2019 at 6:14
  • @PySaad Sorry, I do not understand what you mean. Could you explain, please? Commented Jul 27, 2019 at 6:17
  • That looks like a possibly fixed-length binary format. I.e. it’s not meant to be read directly as plain text, but should be parsed to some degree or another. Commented Jul 27, 2019 at 6:19
  • @deceze I have added the file to the description. Could you please have a look? Thank you. Commented Jul 27, 2019 at 6:27
  • Without knowledge of the purpose of the file, I don't think anyone can tell you anything more than "this is not an encoding; you're doing it wrong". Probably print on your system doesn't produce any visible output for control characters; you could approximate this in Python with something like splitting on control characters and assuming the extracted fragments can simply be decoded as ASCII. See also the Unix strings utility. Commented Jul 27, 2019 at 6:38
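The approach suggested in the last comment (pull out the printable fragments and decode them as ASCII, roughly what the Unix strings utility does) could be sketched as below. The function name extract_text_fragments, the min_len parameter, and the sample bytes are all made up for illustration; this is an approximation under the assumption that the readable parts are plain ASCII, not a parser for the actual format.

```python
import re

def extract_text_fragments(data: bytes, min_len: int = 1) -> list:
    """Return runs of printable ASCII found in data, decoded to str."""
    # [\x20-\x7e] covers space through '~', i.e. printable ASCII only.
    pattern = re.compile(rb'[\x20-\x7e]{' + str(min_len).encode('ascii') + rb',}')
    return [m.decode('ascii') for m in pattern.findall(data)]

# Made-up sample bytes (hypothetical), not the real file contents.
sample = b'\x01' + b'\x00' * 40 + b'Class' + b'\x00' * 40 + b'0\x00Undefined'
fragments = extract_text_fragments(sample)
print(fragments)
```

Raising min_len filters out short accidental matches, at the cost of dropping genuine one-character values.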

2 Answers

The behavior you are experiencing is a result of the fact that when you enter a variable name in the interpreter, it displays the variable's repr, whereas print() uses its str. The two hold the same characters in this scenario, but with print() the unprintable characters such as \x00 and \x01 are not shown as escapes and appear as something else (I'm guessing white space).
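A small sketch of the difference, using a made-up string rather than the real file contents (the variable names here are only for illustration):

```python
s = 'Class\x00\x00Freeway\x01'  # hypothetical sample with control characters

shown_by_interpreter = repr(s)  # what the interpreter echoes for a bare variable
written_by_print = str(s)       # what print() writes out

print(shown_by_interpreter)          # control characters appear as \x00, \x01 escapes
print(written_by_print == s)         # str() of a str is the same string
print(len(written_by_print) == len(s))  # nothing was removed or replaced
```

So the string itself is identical in both cases; only the way it is displayed differs.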

One option, if you don't care about the spacing:

''.join(x for x in bxl_str if x.isprintable())

Or if you do care about spacing:

spaced_str = ''
for char in bxl_str:
    if char.isprintable():
        spaced_str += char
    else:
        spaced_str += ' '

Or in a more pythonic way (thank you Dan):

''.join(char if char.isprintable() else ' ' for char in bxl_str)

2 Comments

It’s not that str replaces the characters with white space, it’s that the terminal doesn’t print those unprintable characters (well, it does, it just doesn’t look like anything), while repr explicitly visualizes unprintable characters.
Nothing pythonic about map. Rather use a comprehension: ''.join(char if char.isprintable() else ' ' for char in bxl_str)

What tripleee said in a comment holds, but here is a guess...

It looks like some sort of "data dictionary", i.e. something that gives descriptive names to otherwise opaque/numeric values. There seems to be some sort of sectioning going on, with 37 to 44 NUL characters between distinct sets of values. Each set of values seems to have key/value pairs that are also separated by NUL characters, and I'd use the following code to interpret this file:

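import re

with open('Links.bxl', encoding='ascii') as fd:
    buf = fd.read()

# a run of 32 or more NUL characters is treated as a section separator
for section in re.split('\x00{32,}', buf):
    print('Next Section')
    # within a section, fields are separated by single NUL characters
    section = section.split('\x00')
    # print consecutive fields as key => value pairs
    for i in range(1, len(section), 2):
        print(' {0!r} => {1!r}'.format(section[i-1], section[i]))
    # an odd number of fields leaves one unpaired field at the end
    if len(section) % 2 == 1:
        print(repr(section[-1]))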

The first few lines of output from the above are:

Next Section
'\x01'
Next Section
'Class'
Next Section
 '0' => 'Undefined'
 '11' => 'Freeway'
 '21' => 'Expressway'
 '31' => 'Major Arterial'
 '41' => 'Minor Arterial'
 '51' => 'Local Street'
 '61' => 'Access Road'
 '81' => 'Ramp'
 '82' => 'UnderRoad'
 '83' => 'TAZ_Link'
 '84' => 'Interchange_Ramp'
 '85' => 'UnderLine'
 '' => 'Disabled_AB'
Next Section
 '0' => 'No'
 '1' => 'Yes'
 '' => 'Disabled_BA'
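If the pairs are needed as data rather than printed text, the same splitting could be collected into dictionaries. This is only a sketch built on the same guess about the layout; parse_sections and the sample string are hypothetical, and an unpaired trailing field in a section is simply dropped here.

```python
import re

def parse_sections(buf: str) -> list:
    """Split on long NUL runs, then pair up the NUL-separated fields."""
    sections = []
    for raw in re.split(r'\x00{32,}', buf):
        fields = raw.split('\x00')
        # zip pairs fields 0/1, 2/3, ...; an unpaired last field is ignored
        sections.append(dict(zip(fields[0::2], fields[1::2])))
    return sections

# Made-up sample text (hypothetical), shaped like the guessed layout.
sample = '\x01' + '\x00' * 40 + 'Class' + '\x00' * 40 + '0\x00Undefined\x0011\x00Freeway'
parsed = parse_sections(sample)
print(parsed)
```

Sections that contain a single field, such as a lone name, come out as empty dictionaries with this pairing rule.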
