-1

I'm trying to write my own Python version of a binary STL file reader. According to Wikipedia, a binary STL file contains:

  • an 80-character (byte) header, which is generally ignored.
  • a 4-byte unsigned integer indicating the number of triangular facets in the file.
  • Each triangle is described by twelve 32-bit floating-point numbers: three for the normal and then three for the X/Y/Z coordinate of each vertex – just as with the ASCII version of STL. After these follows a 2-byte ("short") unsigned integer that is the "attribute byte count" – in the standard format, this should be zero because most software does not understand anything else. --Floating-point numbers are represented as IEEE floating-point numbers and are assumed to be little-endian--
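The layout in the list above fixes the size of every record, which gives a quick sanity check on a file. A minimal sketch of that arithmetic; the constant names and the expected_file_size helper are hypothetical and only for illustration:

```python
HEADER_SIZE = 80          # bytes, generally ignored
COUNT_SIZE = 4            # unsigned 32-bit number of triangles
FLOATS_PER_TRIANGLE = 12  # 3 for the normal + 3 for each of the 3 vertices
FLOAT_SIZE = 4            # one 32-bit floating-point number
ATTRIBUTE_SIZE = 2        # the "attribute byte count"

# Size of one triangle record: 12 * 4 + 2 = 50 bytes
TRIANGLE_SIZE = FLOATS_PER_TRIANGLE * FLOAT_SIZE + ATTRIBUTE_SIZE


def expected_file_size(triangle_count):
    """Size in bytes of a binary STL file holding triangle_count facets."""
    return HEADER_SIZE + COUNT_SIZE + triangle_count * TRIANGLE_SIZE


print(TRIANGLE_SIZE)          # 50
print(expected_file_size(2))  # 184
```

Comparing this value with the actual size of the file on disk is one way to check that the triangle count was decoded with the right byte order.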

Here is my code :

#! /usr/bin/env python3

with open("stlbinaryfile.stl","rb") as fichier :
    head=fichier.read(80)        # 80-byte header
    nbtriangles=fichier.read(4)  # raw bytes of the triangle count
    print(nbtriangles)

The output is :

b'\x90\x08\x00\x00'

It represents an unsigned integer, and I need to convert it without using any package (struct, stl, ...). Are there any (basic) rules for doing this? I don't know what \x means. How does \x90 represent one byte?

Most of the answers on Google mention "C structs", but I don't know anything about C.

Thank you for your time.

8
  • 2
    Why the restriction on using import struct? Commented Dec 7, 2016 at 0:05
  • It should be original. I have to start from zero, with only basic functions. Commented Dec 7, 2016 at 14:30
  • 1
    I would argue that struct.unpack is a "basic function." More to the point, it is part of the standard library, available in every Python installation. Commented Dec 7, 2016 at 16:31
  • But if I use it, the project I work on will have no meaning. The purpose is to create an STL binary file reader without using struct.unpack, int.from_bytes, etc. All that I need is the rules for converting \x##\x##... when the type is known. Commented Dec 7, 2016 at 16:54
  • 1
    Also, putting together the floating-point values by hand will be way harder than putting together the integers. Commented Dec 7, 2016 at 17:08

2 Answers

1

Since you're using Python 3, you can use int.from_bytes. I'm guessing the value is stored little-endian, so you'd just do:

 nbtriangles = int.from_bytes(fichier.read(4), 'little')

Change the second argument to 'big' if it's supposed to be big-endian.
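To make the effect of the byte-order argument concrete, here is a small sketch using the four bytes printed in the question; the variable names are only for illustration:

```python
raw = b'\x90\x08\x00\x00'  # the four bytes printed in the question

little = int.from_bytes(raw, 'little')  # first byte is the least significant
big = int.from_bytes(raw, 'big')        # first byte is the most significant

print(little)  # 2192
print(big)     # 2416443392
```

The little-endian reading gives a plausible triangle count, while the big-endian reading gives a value in the billions, which is a useful hint when the byte order is in doubt.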

Mind you, the normal way to parse a fixed width type is the struct module, but apparently you've ruled that out.

For the confusion over the repr, bytes objects will display ASCII printable characters (e.g. a) or standard ASCII escapes (e.g. \t) if the byte value corresponds to one of them. If it doesn't, it uses \x##, where ## is the hexadecimal representation of the byte value, so \x90 represents the byte with value 0x90, or 144. You need to combine the byte values at offsets to reconstruct the int, but int.from_bytes does this for you faster than any hand-rolled solution could.
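A short sketch of the repr rules just described, using one printable byte, one byte with a standard escape, and one byte with neither (the sample values are chosen only for illustration):

```python
data = bytes([97, 9, 144])  # 'a', a tab, and a byte with no printable form

print(data)          # b'a\t\x90'
print(list(data))    # [97, 9, 144]
print(data[2])       # 144, indexing a bytes object yields an int
print(hex(data[2]))  # 0x90
```

Whatever the repr looks like, each element is still just an integer from 0 to 255.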

Update: Since apparently int.from_bytes isn't "basic" enough, here are a couple of more complex solutions that use only top-level built-ins (no alternate constructors). For little-endian, you can do this:

def int_from_bytes(inbytes):
    res = 0
    for i, b in enumerate(inbytes):
        res |= b << (i * 8)  # Adjust each byte individually by 8 times position
    return res

You can use the same solution for big-endian by adding reversed to the loop, making it enumerate(reversed(inbytes)), or you can use this alternative solution that handles the offset adjustment a different way:

def int_from_bytes(inbytes):
    res = 0
    for b in inbytes:
        res <<= 8  # Adjust bytes seen so far to make room for new byte
        res |= b   # Mask in new byte
    return res

Again, this big-endian solution can trivially work for little-endian by looping over reversed(inbytes) instead of inbytes. In both cases inbytes[::-1] is an alternative to reversed(inbytes) (the former makes a new bytes in reversed order and iterates that, the latter iterates the existing bytes object in reverse, but unless it's a huge bytes object, enough to strain RAM if you copy it, the difference is pretty minimal).
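As a sanity check, the two approaches can be placed side by side and compared with the built-in. This is only a sketch: the _le and _be suffixes are hypothetical names added here so that both functions can coexist in one script.

```python
def int_from_bytes_le(inbytes):
    """Little-endian: shift each byte by 8 bits times its position."""
    res = 0
    for i, b in enumerate(inbytes):
        res |= b << (i * 8)
    return res


def int_from_bytes_be(inbytes):
    """Big-endian: make room for each new byte, then merge it in."""
    res = 0
    for b in inbytes:
        res <<= 8
        res |= b
    return res


raw = b'\x90\x08\x00\x00'

# Both hand-rolled versions should agree with the built-in.
assert int_from_bytes_le(raw) == int.from_bytes(raw, 'little')
assert int_from_bytes_be(raw) == int.from_bytes(raw, 'big')
# Reversing the input makes the big-endian loop read little-endian data.
assert int_from_bytes_be(raw[::-1]) == int_from_bytes_le(raw)

print(int_from_bytes_le(raw))  # 2192
```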


6 Comments

Thanks @ShadowRanger for your help. I know it sounds strange, but I should use basic functions only, so I have to discover the rules of the conversion to create my own "int.from_bytes". For example, to manually convert b'\x90\x08\x00\x00' (which I know to be an unsigned integer), what should I do? 144+8+0+0=152? I didn't get the meaning of OFFSET; where can I find it? I also have to convert some 32-bit floating-point numbers from the \x## format (I have edited my question, see the 3rd item in the list).
@Pylint424: int.from_bytes is a "basic" function (it's exactly as basic as int itself), so it would help if you define "basic" so I know what's available. Regardless, the offset is about shifting the bits; assuming little endian, the first byte requires no shifting before combining, the second requires a shift of 8, the third a shift of 16, etc. 144 + (8 << 8) == 2192. Converting binary representation of a floating point value without assistance from struct module is much more annoying, I have no idea what sort of assignment would encourage you to do that in Python.
@Pylint424: I added alternatives to int.from_bytes. For floating point, I'd suggest asking a new question; floating point is a totally different animal, and it wasn't part of your original question at all.
This is what I was looking for; now I have 3 methods to do the conversion. I still have a small problem: int_from_bytes(b'*\x1c\x02\x00') returns 138282, so * is interpreted as 48? Thanks for your time, I'll ask another question.
@Pylint424: Read the original part of my answer. The repr of bytes will use the character corresponding to the ASCII value of that byte if the byte is printable ASCII, for brevity (it's one character in the repr instead of four for an escape). If you check a table of ASCII values, you'll see that * has an ordinal of 42 (I have no idea where you got 48, it's not 48).
0

The typical way to interpret an integer is to use struct.unpack, like so:

import struct

with open("stlbinaryfile.stl","rb") as fichier :
    head=fichier.read(80)
    nbtriangles=fichier.read(4)
    print(nbtriangles)
    nbtriangles=struct.unpack("<I", nbtriangles)[0]  # unpack returns a tuple; take its only element
    print(nbtriangles)

If you are allergic to import struct, then you can also compute it by hand:

def unsigned_int(s):
    result = 0
    for ch in s[::-1]:
        result *= 256
        result += ch
    return result

...
nbtriangles = unsigned_int(nbtriangles)

As for what you are seeing when you print b'\x90\x08\x00\x00': you are printing a bytes object, which is a sequence of integers in the range 0 to 255. The first integer has the value 144 (decimal) or 90 (hexadecimal). When a bytes object is printed, that value is shown as \x90. The second has the value eight, shown as \x08. The third and fourth integers are both zero, shown as \x00.

If you would like to see a more familiar representation of the integers, try:

print(list(nbtriangles))

[144, 8, 0, 0]

To compute the 32-bit integer represented by these four 8-bit integers, you can use this formula:

total = byte0 + (byte1*256) + (byte2*256*256) + (byte3*256*256*256)

Or, in hex:

total = byte0 + (byte1*0x100) + (byte2*0x10000) + (byte3*0x1000000)

Which results in:

0x00000890
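Plugging the four values from the question into the formula gives the following; this is a small sketch where the byte0 to byte3 names simply mirror the formula above:

```python
byte0, byte1, byte2, byte3 = list(b'\x90\x08\x00\x00')  # 144, 8, 0, 0

total = byte0 + (byte1 * 256) + (byte2 * 256 * 256) + (byte3 * 256 * 256 * 256)

print(total)                     # 2192
print('0x{:08x}'.format(total))  # 0x00000890
```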

Perhaps you can see the similarities to decimal, where the string "1234" represents the number:

4 + 3*10 + 2*100 + 1*1000

4 Comments

That's what I am talking about, but unsigned_int() converts b'*\x1c\x02\x00' (another example) to 138282; * represents 48, and I don't know why. Now I have to interpret the rest of the file (see the 3rd item in the list in my question)...
For the first point, the 50 bytes are: ( b'\x9a' b'\xa3' b'\x14' b'\xbe' b'\x05' b'$' b'\x85' b'\xbe' b'N' b'b' b't' b'?' b'\xcd' b'\xa6' b'\x04' b'\xc4' b'\xfb' b';' b'\xd4' b'\xc1' b'\x84' b'w' b'\x81' b'A' b'\xcd' b'\xa6' b'\x04' b'\xc4' b'\xa5' b'\x15' b'\xd3' b'\xc1' b'\xb2' b'\xc7' b'\x81' b'A' b'\xef' b'\xa6' b'\x04' b'\xc4' b'\x81' b'\x14' b'\xd3' b'\xc1' b'Y' b'\xc7' b'\x81' b'A' b'\x00' b'\x00' ). How can I convert this by hand (some bytes don't start with \x)? Thank you for your time.
If you want to see the decimal equivalents of the integers in the byte array, print them as a list: list(b'*\x1c\x02\x00') == [42, 28, 2, 0]. If you want to convert that by hand, don't use print(nbtriangles). Instead, use print(list(nbtriangles)) or for digit in nbtriangles: print(digit)
Yes, yes, yes, but I really don't know how to convert it if it represents a floating-point number.
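For the floating-point case raised in these comments, one possible hand-rolled approach is sketched below. It assumes the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits) stored little-endian, as the format description in the question states; float32_from_bytes is a hypothetical name and this is not code from either answer.

```python
def float32_from_bytes(inbytes):
    """Decode 4 little-endian bytes as an IEEE 754 single-precision float."""
    # Assemble the 32-bit pattern exactly as for an unsigned integer.
    bits = 0
    for i, b in enumerate(inbytes):
        bits |= b << (i * 8)

    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 23) & 0xFF  # 8 exponent bits, biased by 127
    fraction = bits & 0x7FFFFF      # 23 fraction bits

    if exponent == 0xFF:            # all ones: infinity or NaN
        return sign * float('inf') if fraction == 0 else float('nan')
    if exponent == 0:               # all zeros: zero or a subnormal number
        return sign * fraction * 2.0 ** -149
    # Normal number: an implicit leading 1 sits in front of the fraction.
    return sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


print(float32_from_bytes(b'\x00\x00\x80\x3f'))  # 1.0
print(float32_from_bytes(b'\x00\x00\x00\xc0'))  # -2.0
```

Under these assumptions, each 50-byte triangle record would be cut into twelve 4-byte slices passed to this function, with the last 2 bytes decoded as an unsigned integer.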
