Using Python 3, I am trying to process data stored in a four-column text file. The first column is the x index, the second is the y index, and the third is the z (depth) index. The fourth column is the data value. The values in the text file look like this:

0 0 0 0.0
1 0 0 0.0
2 0 0 2.0
0 1 0 0.0
1 1 0 0.0
2 1 0 2.0
0 2 0 0.0
1 2 0 0.0
2 2 0 2.0
0 0 1 0.0
1 0 1 0.0
2 0 1 2.0
0 1 1 0.0
1 1 1 0.0
2 1 1 2.0
0 2 1 0.0
1 2 1 0.0
2 2 1 2.0

Is there a way to construct a 3D NumPy array with shape (2,3,3) from these rows, like this?

[[[0 0 0]
  [0 0 0]
  [2 2 2]],
 [[0 0 0]
  [0 0 0]
  [2 2 2]]]

While this example has 18 rows to be shaped into a (2,3,3) array, my actual data has 512x512x49 (12845056) rows, and I'd like to shape them into a (512,512,49) array. A solution that parses that many rows efficiently would be appreciated, but I understand Python has some fundamental speed limitations.

This is what I have tried so far:

import numpy as np
f = "file_path.txt"
data = np.loadtxt(f)
data = data.reshape((512,512,49))

but this gives the following error:

ValueError: cannot reshape array of size 51380224 into shape (512,512,49)

I was surprised by this error since 51380224 is not equal to the number of rows in my loaded array (12845056). Also, I suspect numpy needs information that the first, second, and third columns are not values, but indices along which to shape the values in the fourth column. I am not sure how to achieve this, and am open to solutions in either numpy or pandas.

  • somehow you've loaded 4x as many elements as expected. Check data.shape and data.dtype. How about a reshape to (512,512,49,4)? Commented Apr 1, 2022 at 20:31
  • data.shape returns (12845056, 4) and data.dtype returns dtype('float64'). The 4x number of rows may be due to numpy incorrectly interpreting my four-column data set. data.reshape(512,512,49,4) returns an array with shape (512,512,49,4), but I still don't think the data are in the proper structure; I accessed the first ''depth'' of the data.reshape(512,512,49,4) using data[:,:,0,0], but it does not look correct. Commented Apr 2, 2022 at 22:32
  • If the file has 128... rows and 4 columns, then numpy has properly loaded it. The interpretation of those columns is another matter. Commented Apr 2, 2022 at 23:29
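
To illustrate the point made in these comments: the size reported in the error is the total element count (rows times columns), not the row count, and 12845056 * 4 = 51380224. A minimal sketch, using an 18-row zero array as a stand-in for the loaded file (the stand-in is an assumption, not the real data):

```python
import numpy as np

# Stand-in for the loaded file: 18 rows, 4 columns (x, y, z, value).
data = np.zeros((18, 4))

print(data.shape)  # (18, 4)
print(data.size)   # 72, i.e. 18 rows * 4 columns

# Reshaping all 72 elements into 2*3*3 = 18 slots fails ...
try:
    data.reshape((2, 3, 3))
except ValueError as exc:
    print(exc)

# ... whereas the value column alone has exactly one element per row.
values = data[:, 3]
print(values.size)  # 18
```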

3 Answers


With your sample file:

In [94]: txt = """0 0 0 0.0
    ...: 1 0 0 0.0
    ...: 2 0 0 2.0
    ...: 0 1 0 0.0
    ...: 1 1 0 0.0
    ...: 2 1 0 2.0
    ...: 0 2 0 0.0
    ...: 1 2 0 0.0
    ...: 2 2 0 2.0
    ...: 0 0 1 0.0
    ...: 1 0 1 0.0
    ...: 2 0 1 2.0
    ...: 0 1 1 0.0
    ...: 1 1 1 0.0
    ...: 2 1 1 2.0
    ...: 0 2 1 0.0
    ...: 1 2 1 0.0
    ...: 2 2 1 2.0""".splitlines()

The straightforward load detects 4 columns and makes all values float:

In [95]: data = np.genfromtxt(txt)
In [96]: data.shape
Out[96]: (18, 4)

We could work from those, converting the float indices to integer. Or we can load the file in 2 steps:

In [103]: indices = np.genfromtxt(txt, usecols=[0,1,2], dtype=int)
In [104]: values = np.genfromtxt(txt, usecols=[3])

and use those values to fill in an array:

In [105]: res = np.zeros((2,3,3),float)
In [107]: res[indices[:,2],indices[:,0],indices[:,1]] = values
In [108]: res
Out[108]: 
array([[[0., 0., 0.],
        [0., 0., 0.],
        [2., 2., 2.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [2., 2., 2.]]])
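
If the file is complete and its rows are already sorted with x varying fastest, then y, then z (as in the sample), the index columns may not be needed at all. The following is only a sketch under that assumption; the grid sizes nx, ny, nz are assumed to be known in advance, and the values array is a hand-built copy of the sample's fourth column:

```python
import numpy as np

# Sample values column in file order: x varies fastest, then y, then z.
values = np.array([0.0, 0.0, 2.0] * 6)

nx, ny, nz = 3, 3, 2  # grid sizes; assumed known in advance

# Reshape so the slowest-varying index (z) comes first: arr[z, y, x].
arr = values.reshape(nz, ny, nx)

# Swap the last two axes to match the res[z, x, y] layout used above.
res = arr.transpose(0, 2, 1)
print(res)
```

This avoids the fancy-indexed assignment, but it silently produces a wrong result if rows are missing or ordered differently, so the index-based fill is the safer choice when the ordering is not guaranteed.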

1 Comment

This worked, thank you!! Since I am loading my data using np.loadtxt, I extended your solution to my dataset by replacing your indices and values lines with: indices = data[:,0:3].astype('int'), values = data[:,3], and res = np.zeros((49,512,512)). These variables produce the structure I am looking for, thank you again!

NumPy itself does not have a dedicated function for this case, but the solution is easy.
Once the file is loaded into data, all you need to do is use the first, second and third columns as indices.
Warning: while Cartesian coordinates are written as (x,y,z), array axes are conventionally ordered the other way round: the first dimension is z, the second is y and the third is x, i.e. data[z,y,x].
With that in mind, you can use the columns in data as indices:

# Assuming each row of data is one (x, y, z, value) record
x, y, z = data[:,0:3].astype(np.int32).T  # transpose so the three index columns unpack as integers
extracted_data = np.zeros((49,512,512))  # axis order (z, y, x): 49 depths, 512 y, 512 x
extracted_data[z, y, x] = data[:,3]  # fill each cell with its value

That should do the trick for you.
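
As a small check of the unpacking step, here is a sketch on the 18-row sample, rebuilt in code rather than read from a file (the names rows and grid are illustrative only). The slice data[:, 0:3] has shape (18, 3), so it must be transposed before it can be unpacked into three arrays:

```python
import numpy as np

# Rebuild the 18-row sample: x varies fastest, then y, then z.
rows = [(x, y, z, 2.0 if x == 2 else 0.0)
        for z in range(2) for y in range(3) for x in range(3)]
data = np.array(rows)  # shape (18, 4), dtype float64

# data[:, 0:3] has shape (18, 3); transposing gives 3 rows to unpack.
x, y, z = data[:, 0:3].astype(np.int32).T

grid = np.zeros((2, 3, 3))  # (nz, ny, nx) to match grid[z, y, x]
grid[z, y, x] = data[:, 3]
print(grid[0])
```

Note that with the [z, y, x] order the 2.0 values end up in the last column of each depth slice; indexing as grid[z, x, y] instead puts them in the last row, which is the layout shown in the question.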

2 Comments

Thanks for the answer attempt. Unfortunately, I get the following error after the very first line you gave: ValueError: too many values to unpack (expected 3). I think this has to do with trying to set three variables, x, y, z, equal to a structure with two columns, data[:,0:3]. I'm not sure how to proceed.
@TWalker fixed the answer, should work fine now.

Though I selected @hpaulj's answer as the solution, here is a full working example for clarity:

import numpy as np
f = "file_path.txt"
data = np.loadtxt(f)
indices = data[:,0:3].copy().astype('int')
values = data[:,3].copy()
res = np.zeros((49,512,512),float)
res[indices[:,2],indices[:,0],indices[:,1]] = values
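
One optional sanity check, sketched here on the small sample rather than the full file, is to confirm that every cell of the grid is addressed exactly once, so that no cell is left at its initial zero and none is overwritten. The counts array is a helper introduced only for this illustration:

```python
import numpy as np

# Small stand-in for the x, y, z index columns of the sample file.
indices = np.array([(x, y, z)
                    for z in range(2) for y in range(3) for x in range(3)])

counts = np.zeros((2, 3, 3), dtype=int)
# np.add.at accumulates correctly even when an index appears more than once.
np.add.at(counts, (indices[:, 2], indices[:, 0], indices[:, 1]), 1)

print((counts == 1).all())  # True when every cell is written exactly once
```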
