I am writing a NumPy-based .PLY importer. I am only interested in binary files, and in vertices, faces and vertex colors. My target data format is a flattened list of x,y,z floats for the vertex data and a flattened list of r,g,b,a integers for the color data.

 [x0,y0,z0,x1,y1,z1....xn,yn,zn] 
 [r0,g0,b0,a0,r1,g1,b1,a1....rn,gn,bn,an] 

This allows me to use fast built-in C++ methods to construct the mesh in the target program (Blender).

I am using a modified version of this code to read the data into NumPy arrays:

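import numpy as np

valid_formats = {'binary_big_endian': '>', 'binary_little_endian': '<'}
ply = open(filename, 'rb')
ply.readline()  # skip the "ply" magic line that starts every PLY file
# get binary_little/big or ascii from the "format ..." line
fmt = ply.readline().split()[1].decode()
# get the byte-order prefix for building the numpy dtypes
ext = valid_formats[fmt]
# end_header = (byte offset just past the "end_header" line, found while parsing the header)
# points_size = (vertex count, previously read in from header)
ply.seek(end_header)
v_dtype = [('x', ext + 'f4'), ('y', ext + 'f4'), ('z', ext + 'f4'),
           ('red', 'u1'), ('green', 'u1'), ('blue', 'u1'), ('alpha', 'u1')]
points_np = np.fromfile(ply, dtype=v_dtype, count=points_size)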

The results are:

print(points_np.shape)
print(points_np[0:3])
print(points_np.ravel()[0:3])

>>>(158561,)
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
     (20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
     (20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]
>>>[ (20.781816482543945, 11.767952919006348, 15.565438270568848, 206, 216, 186, 255)
     (20.679922103881836, 11.754084587097168, 15.560364723205566, 189, 196, 157, 255)
     (20.72969627380371, 11.823691368103027, 15.51106071472168, 192, 193, 157, 255)]

So ravel (I've also tried flatten, reshape, etc.) does not work, and I presume it is because the fields have mixed data types (float, float, float, int, int, int, int).

What I have tried

- Vectorizing a function that just pulls out the xyz and rgb separately into a new array.
- stack, vstack, etc.
- List comprehension (yuck).
- Using astype on the verts data, but that seems to return only the first element.

Things like these take 1 to 10s of seconds to execute, compared to hundredths of a second to read in the data.

Related questions I looked at: "convert to structured array"; "accessing first element of each element"; "Most efficient way to map function over numpy array".

What I want to Try/Would Like to Know

Is there a better way to read the data in the first place so I don't lose all this time reshaping, flattening, etc.? Perhaps by telling np.fromfile to skip over the color data on one pass and then come back and read it again?

Is there a NumPy trick I don't know for reshaping/flattening data of this kind?

  • With a shape of (158561,) your array is already "flat", that is, 1d. So it's a waste of your time to try to change that with ravel, reshape, etc. You haven't taken seriously the meaning of array shape. Commented May 13, 2020 at 17:26
  • 1
    Given the file structure, using fromfile with that compound dtype is the only way. Now you have a 1d structured array. The next question is - what do you need to do with that. The dtype defines fields, which you can access individually or in subsets, points_np['x'] or points_np[['x','y']]. Commented May 13, 2020 at 17:27
  • numpy.org/doc/1.18/user/basics.rec.html Commented May 13, 2020 at 17:30
  • "which you can access individually or in subsets" Thank you, this was the hole in my understanding of the data I was getting back from "fromfile" Commented May 13, 2020 at 17:55
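
The field access described in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the two sample vertices are made up and stand in for the array returned by np.fromfile, and the names verts, colors, packed_dtype and verts_alt are hypothetical. It uses structured_to_unstructured from numpy.lib.recfunctions (available in NumPy 1.16 and later) to turn same-typed field subsets into plain arrays, and alternatively reinterprets the records with a dtype that groups the fields into subarrays.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Same layout as the question's dtype: three float32 fields, then four uint8 fields.
v_dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
                    ('red', 'u1'), ('green', 'u1'), ('blue', 'u1'), ('alpha', 'u1')])

# Stand-in for the result of np.fromfile (values are made up for illustration).
points_np = np.array([(1.0, 2.0, 3.0, 10, 20, 30, 255),
                      (4.0, 5.0, 6.0, 40, 50, 60, 255)], dtype=v_dtype)

# Option 1: select the fields that share a type, convert to a plain (n, k) array, flatten.
verts = structured_to_unstructured(points_np[['x', 'y', 'z']]).ravel()
colors = structured_to_unstructured(
    points_np[['red', 'green', 'blue', 'alpha']]).ravel()

# Option 2 (assumption: records are packed, 16 bytes each): view the same bytes
# with a dtype that groups the fields into subarrays, then flatten each group.
packed_dtype = np.dtype([('xyz', '<f4', (3,)), ('rgba', 'u1', (4,))])
verts_alt = points_np.view(packed_dtype)['xyz'].ravel()

print(verts)   # flat x0, y0, z0, x1, y1, z1 as float32
print(colors)  # flat r0, g0, b0, a0, r1, ... as uint8
```

If the second layout suits the file, the same packed_dtype could presumably be passed straight to np.fromfile so that no conversion step is needed afterwards; ravel still copies in both options because the selected fields are not contiguous in memory.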
