STL binary file reader with Python

Question

I'm trying to write my own Python reader for binary STL files. According to Wikipedia, a binary STL file contains:

an 80-character (byte) header, which is generally ignored.
a 4-byte unsigned integer indicating the number of triangular facets in the file.
Each triangle is described by twelve 32-bit floating-point numbers: three for the normal and then three for the X/Y/Z coordinates of each vertex, just as with the ASCII version of STL. After these follows a 2-byte ("short") unsigned integer that is the "attribute byte count"; in the standard format, this should be zero because most software does not understand anything else. Floating-point numbers are represented as IEEE floating-point numbers and are assumed to be little-endian.
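The layout described above fixes the size of every record, so the byte offsets can be worked out before reading anything. The following is only a sketch of that arithmetic; the constant and function names are made up for illustration and are not part of any STL library.

```python
# Sizes taken from the format description above (all names are hypothetical).
HEADER_SIZE = 80           # header bytes, generally ignored
COUNT_SIZE = 4             # unsigned 32-bit triangle count
FLOAT_SIZE = 4             # one 32-bit IEEE float
FLOATS_PER_TRIANGLE = 12   # 3 for the normal + 3 vertices * 3 coordinates
ATTRIBUTE_SIZE = 2         # the "attribute byte count"

# 12 floats of 4 bytes each, plus the 2-byte attribute field.
TRIANGLE_SIZE = FLOATS_PER_TRIANGLE * FLOAT_SIZE + ATTRIBUTE_SIZE


def triangle_offset(index):
    """Byte offset of triangle number `index` (0-based) from the file start."""
    return HEADER_SIZE + COUNT_SIZE + index * TRIANGLE_SIZE


def expected_file_size(triangle_count):
    """Total size in bytes of a file holding `triangle_count` triangles."""
    return triangle_offset(triangle_count)


print(TRIANGLE_SIZE, triangle_offset(0), expected_file_size(2192))
```

Comparing expected_file_size(count) with the real file size is a cheap sanity check that the count was decoded with the right byte order.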

Here is my code :

#! /usr/bin/env python3

with open("stlbinaryfile.stl","rb") as fichier :

    head=fichier.read(80)
    nbtriangles=fichier.read(4)
    print(nbtriangles)

The output is :

b'\x90\x08\x00\x00'

It represents an unsigned integer. I need to convert it without using any package (struct, stl, ...). Are there any (basic) rules for doing this? I don't know what \x means. How does \x90 represent one byte?

Most of the answers on Google mention "C structs", but I don't know anything about C.

Thank you for your time.

It should be original. I have to start from zero, with only basic functions.
– machine424, Commented Dec 7, 2016 at 14:30
I would argue that struct.unpack is a "basic function." More to the point, it is part of the standard library, available in every Python installation.
– Robᵩ, Commented Dec 7, 2016 at 16:31
But if I use it, the project I work on will have no meaning. The purpose is to create an STL binary file reader without using struct.unpack, int.from_bytes, ... All I need is the rules for converting \x##\x##... when the type is known.
– machine424, Commented Dec 7, 2016 at 16:54
Also, putting together the floating-point values by hand will be way harder than putting together the integers.
– Robᵩ, Commented Dec 7, 2016 at 17:08

ShadowRanger · Accepted Answer · 2016-12-07 23:03:45Z

1

Since you're using Python 3, you can use int.from_bytes. I'm guessing the value is stored little-endian, so you'd just do:

 nbtriangles = int.from_bytes(fichier.read(4), 'little')

Change the second argument to 'big' if it's supposed to be big-endian.

Mind you, the normal way to parse a fixed width type is the struct module, but apparently you've ruled that out.

For the confusion over the repr, bytes objects will display ASCII printable characters (e.g. a) or standard ASCII escapes (e.g. \t) if the byte value corresponds to one of them. If it doesn't, it uses \x##, where ## is the hexadecimal representation of the byte value, so \x90 represents the byte with value 0x90, or 144. You need to combine the byte values at offsets to reconstruct the int, but int.from_bytes does this for you faster than any hand-rolled solution could.
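As a small illustration of the repr rules just described, the sketch below looks at the same bytes in three ways. It uses only built-ins; the variable names are arbitrary.

```python
raw = b'\x90\x08\x00\x00'

# Iterating or indexing a bytes object yields plain integers in 0..255.
as_ints = list(raw)

# bytes.hex() shows every byte as two hex digits, with no ASCII shortcuts.
as_hex = raw.hex()

# A printable byte and its \x## escape are the same byte; only the repr differs.
same_byte = b'\x2a' == b'*'
star_value = b'*\x1c\x02\x00'[0]

print(as_ints, as_hex, same_byte, star_value)
```

So a byte that shows up as a plain character in the repr is still just one integer, and indexing gives it back directly.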

Update: Since apparently int.from_bytes isn't "basic" enough, here are a couple of solutions that are more complex but use only top-level built-ins (not alternate constructors). For little-endian, you can do this:

def int_from_bytes(inbytes):
    res = 0
    for i, b in enumerate(inbytes):
        res |= b << (i * 8)  # Adjust each byte individually by 8 times position
    return res

You can use the same solution for big-endian by adding reversed to the loop, making it enumerate(reversed(inbytes)), or you can use this alternative solution that handles the offset adjustment a different way:

def int_from_bytes(inbytes):
    res = 0
    for b in inbytes:
        res <<= 8  # Adjust bytes seen so far to make room for new byte
        res |= b   # Mask in new byte
    return res

Again, this big-endian solution can trivially work for little-endian by looping over reversed(inbytes) instead of inbytes. In both cases inbytes[::-1] is an alternative to reversed(inbytes) (the former makes a new bytes in reversed order and iterates that, the latter iterates the existing bytes object in reverse, but unless it's a huge bytes object, enough to strain RAM if you copy it, the difference is pretty minimal).
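To check that the variants described above agree, here is a sketch that runs them side by side on the bytes from the question. The two functions are copies of the ones above, renamed (the new names are illustrative only) so both can live in one file.

```python
def int_from_little_endian(inbytes):
    res = 0
    for i, b in enumerate(inbytes):
        res |= b << (i * 8)  # byte i contributes at bit position 8 * i
    return res


def int_from_big_endian(inbytes):
    res = 0
    for b in inbytes:
        res = (res << 8) | b  # make room for the new byte, then add it
    return res


raw = b'\x90\x08\x00\x00'

little = int_from_little_endian(raw)
via_reversed = int_from_big_endian(reversed(raw))  # iterate in reverse, no copy
via_slice = int_from_big_endian(raw[::-1])         # reversed copy of the bytes
builtin = int.from_bytes(raw, 'little')            # reference result

print(little, via_reversed, via_slice, builtin)
```

All four values should be identical; the built-in is included only as a cross-check.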

edited Dec 7, 2016 at 23:03

answered Dec 7, 2016 at 0:10


6 Comments

machine424 Over a year ago

Thanks @ShadowRanger for your help. I know it sounds strange, but I should use basic functions only, so I have to discover the conversion rules in order to create my own "int.from_bytes". For example, to manually convert b'\x90\x08\x00\x00' (which I know to be an unsigned integer), what should I do? 144+8+0+0=152? I didn't get the meaning of OFFSET; where can I find it? I also have to convert some 32-bit floating-point numbers from the \x## format (I have edited my question, 3rd item of the list).

ShadowRanger Over a year ago

@Pylint424: int.from_bytes is a "basic" function (it's exactly as basic as int itself), so it would help if you define "basic" so I know what's available. Regardless, the offset is about shifting the bits; assuming little endian, the first byte requires no shifting before combining, the second requires a shift of 8, the third a shift of 16, etc. 144 + (8 << 8) == 2192. Converting binary representation of a floating point value without assistance from struct module is much more annoying, I have no idea what sort of assignment would encourage you to do that in Python.

ShadowRanger Over a year ago

@Pylint424: I added alternatives to int.from_bytes. For floating point, I'd suggest asking a new question; floating point is a totally different animal, and it wasn't part of your original question at all.

machine424 Over a year ago

This is what I was looking for; now I have 3 methods to do the conversion. I still have a small problem: int_from_bytes(b'*\x1c\x02\x00') returns 138282, so * is interpreted as 48? Thanks for your time, I'll ask another question.

ShadowRanger Over a year ago

@Pylint424: Read the original part of my answer. The repr of bytes will use the character corresponding to the ASCII value of that byte if the byte is printable ASCII, for brevity (it's one character in the repr instead of four for an escape). If you check a table of ASCII values, you'll see that * has an ordinal of 42 (I have no idea where you got 48, it's not 48).
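The arithmetic behind that reply can be checked directly. This sketch assumes little-endian order, as elsewhere in this thread, and compares the total with 42 and with 48 as the first byte.

```python
raw = b'*\x1c\x02\x00'

# Unpacking a bytes object gives its four integer values.
b0, b1, b2, b3 = raw

# Little-endian: byte 0 is unshifted, byte 1 shifts by 8, byte 2 by 16, ...
with_42 = b0 + (b1 << 8) + (b2 << 16) + (b3 << 24)

# What the total would be if the first byte really were 48.
with_48 = 48 + (b1 << 8) + (b2 << 16) + (b3 << 24)

print(b0, with_42, with_48)
```

The value reported in the comment above, 138282, matches the total computed with 42 and not the one computed with 48.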


Robᵩ · Accepted Answer · 2016-12-07 17:12:52Z

0

The typical way to interpret an integer is to use struct.unpack, like so:

import struct

with open("stlbinaryfile.stl","rb") as fichier :
    head=fichier.read(80)
    nbtriangles=fichier.read(4)
    print(nbtriangles)
    nbtriangles=struct.unpack("<I", nbtriangles)[0]  # unpack returns a 1-tuple; take its only item
    print(nbtriangles)

If you are allergic to import struct, then you can also compute it by hand:

def unsigned_int(s):
    result = 0
    for ch in s[::-1]:
        result *= 256
        result += ch
    return result

...
nbtriangles = unsigned_int(nbtriangles)

As for what you are seeing when you print b'\x90\x08\x00\x00': you are printing a bytes object, which is an array of integers in the range [0-255]. The first integer has the value 144 (decimal) or 90 (hexadecimal). When printing a bytes object, that value is represented by the string \x90. The 2nd has the value eight, represented by \x08. The 3rd and 4th integers are both zero; they are represented by \x00.

If you would like to see a more familiar representation of the integers, try:

print(list(nbtriangles))

[144, 8, 0, 0]

To compute the 32-bit integer represented by these four 8-bit integers, you can use this formula:

total = byte0 + (byte1*256) + (byte2*256*256) + (byte3*256*256*256)

Or, in hex:

total = byte0 + (byte1*0x100) + (byte2*0x10000) + (byte3*0x1000000)

Which results in:

0x00000890

Perhaps you can see the similarities to decimal, where the string "1234" represents the number:

4 + 3*10 + 2*100 + 1*1000
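The decimal analogy can be made concrete with one helper that works for any base. This is a sketch, with a made-up function name, that takes the digits least significant first, matching the little-endian byte order.

```python
def from_digits(digits, base):
    """Combine digits given least-significant first into one integer."""
    total = 0
    place = 1            # value of the current position: 1, base, base**2, ...
    for d in digits:
        total += d * place
        place *= base
    return total


decimal_value = from_digits([4, 3, 2, 1], 10)             # "1234", lowest digit first
byte_value = from_digits([144, 8, 0, 0], 256)             # the four bytes as base-256 digits
from_bytes_value = from_digits(b'\x90\x08\x00\x00', 256)  # a bytes object works as well

print(decimal_value, byte_value, hex(from_bytes_value))
```

A bytes object can be passed as is, because iterating over it already yields the integers 0..255.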

edited Dec 7, 2016 at 17:12

answered Dec 7, 2016 at 0:10


4 Comments

machine424 Over a year ago

That's what I am talking about, but unsigned_int() converts b'*\x1c\x02\x00' (another example) to 138282, so * represents 48 and I don't know why. Now I have to interpret the rest of the file (see my question, 3rd item in the list) ...

machine424 Over a year ago

, for the first point the 50-bytes are : ( b'\x9a' b'\xa3' b'\x14' b'\xbe' b'\x05' b'$' b'\x85' b'\xbe' b'N' b'b' b't' b'?' b'\xcd' b'\xa6' b'\x04' b'\xc4' b'\xfb' b';' b'\xd4' b'\xc1' b'\x84' b'w' b'\x81' b'A' b'\xcd' b'\xa6' b'\x04' b'\xc4' b'\xa5' b'\x15' b'\xd3' b'\xc1' b'\xb2' b'\xc7' b'\x81' b'A' b'\xef' b'\xa6' b'\x04' b'\xc4' b'\x81' b'\x14' b'\xd3' b'\xc1' b'Y' b'\xc7' b'\x81' b'A' b'\x00' b'\x00' ) How can I convert this by hand ( some bytes don't start with \x ) Thank you for your time.

Robᵩ Over a year ago

If you want to see the decimal equivalents of the integers in the byte array, print them as a list: list(b'*\x1c\x02\x00') == [42, 28, 2, 0]. If you want to convert that by hand, don't print (nbtriangles). Instead print(list(nbtriangles)) or for digit in nbtriangles: print(digit)

machine424 Over a year ago

Yes, yes, yes, but I really don't know how to convert it if it represents a floating-point number.
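For the floating-point values, one possible by-hand approach is to rebuild the 32-bit pattern first and then split it into the three fields of an IEEE 754 single-precision number: 1 sign bit, 8 exponent bits (bias 127) and 23 fraction bits. The sketch below assumes little-endian storage, as stated in the question; the function name is hypothetical, and neither answer above proposes this code.

```python
def float32_from_bytes(raw):
    """Decode 4 little-endian bytes as an IEEE 754 single-precision float."""
    # Rebuild the 32-bit pattern, exactly as for the unsigned integer.
    bits = raw[0] | (raw[1] << 8) | (raw[2] << 16) | (raw[3] << 24)

    sign = -1.0 if bits >> 31 else 1.0   # top bit
    exponent = (bits >> 23) & 0xFF       # next 8 bits, biased by 127
    fraction = bits & 0x7FFFFF           # low 23 bits

    if exponent == 0xFF:                 # all ones: infinity or NaN
        return sign * float('inf') if fraction == 0 else float('nan')
    if exponent == 0:                    # zero and subnormals: no implicit 1
        return sign * fraction * 2.0 ** -149
    # Normal numbers: implicit leading 1 in front of the fraction.
    return sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


# First 12 of the 50 bytes quoted in the comment above: the facet normal.
normal_bytes = b'\x9a\xa3\x14\xbe\x05$\x85\xbeNbt?'
nx = float32_from_bytes(normal_bytes[0:4])
ny = float32_from_bytes(normal_bytes[4:8])
nz = float32_from_bytes(normal_bytes[8:12])
length_squared = nx * nx + ny * ny + nz * nz

print(nx, ny, nz, length_squared)
```

Bytes shown as plain characters ($, N, b, t, ?) need no special handling, since indexing the bytes object returns their integer values. A squared length close to 1 suggests these 12 bytes really are a unit normal.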
