Creating numpy array with empty columns using genfromtxt

Question

I am importing data using numpy.genfromtxt, and I would like to add a field of values derived from some of the fields already in the dataset. Since the result is a structured array, the simplest and most efficient way to add a new column seems to be numpy.lib.recfunctions.append_fields(). I found a good description of this library HERE.

Is there a way of doing this without copying the array, perhaps by forcing genfromtxt to create an empty column to which I can append derived values?
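For context, here is a minimal sketch of the append_fields approach described above. The sample data and the field names a, b, c, d are assumptions made up for illustration, and io.StringIO stands in for a real file. It shows that append_fields returns a new array instead of modifying the one that was read in.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical sample data and field names, for illustration only.
raw = io.StringIO("1,11,1.1\n2,22,2.2\n3,33,3.3\n")
arr = np.genfromtxt(raw, delimiter=",", names=["a", "b", "c"])

# append_fields builds and returns a new structured array;
# `arr` itself keeps its original three fields.
out = rfn.append_fields(arr, "d", arr["a"] * arr["b"], usemask=False)

print(arr.dtype.names)
print(out.dtype.names)
print(out["d"])
```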

The first parameter to genfromtxt can be a generator, within which you can create an empty column on each line of your file while reading it in.
– mtadd, Commented Apr 10, 2013 at 5:29
mtadd, I've just run into this problem again, and I'm wondering if you could illustrate what you are referring to in an answer. Thanks!
– ryanjdillon, Commented Apr 8, 2014 at 19:53

mtadd · Accepted Answer · 2014-04-08 22:11:57Z

Here's a simple example that uses a generator to add a field to the data while genfromtxt reads it.

Our example data file will be data.txt with the contents:

1,11,1.1
2,22,2.2
3,33,3.3

Reading it directly gives:

In [19]: np.genfromtxt('data.txt',delimiter=',')
Out[19]:
array([[  1. ,  11. ,   1.1],
       [  2. ,  22. ,   2.2],
       [  3. ,  33. ,   3.3]])

If we make a generator such as:

def genfield():
    # Close the file once the generator is exhausted.
    with open('data.txt') as f:
        for line in f:
            yield '0,' + line

which prepends a comma-delimited 0 to each line of the file, then:

In [22]: np.genfromtxt(genfield(),delimiter=',')
Out[22]:
array([[  0. ,   1. ,  11. ,   1.1],
       [  0. ,   2. ,  22. ,   2.2],
       [  0. ,   3. ,  33. ,   3.3]])

You can do the same thing with a generator expression:

In [26]: np.genfromtxt(('0,'+line for line in open('data.txt')),delimiter=',')
Out[26]:
array([[  0. ,   1. ,  11. ,   1.1],
       [  0. ,   2. ,  22. ,   2.2],
       [  0. ,   3. ,  33. ,   3.3]])
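To tie this back to the question, the same generator idea can be combined with named fields so the placeholder column is filled in place afterwards. The following is only a sketch: the field names (derived, a, b, c) are hypothetical, io.StringIO stands in for data.txt, and it assumes a reasonably recent NumPy that accepts str lines from a generator.

```python
import io
import numpy as np

# Stand-in for data.txt so the sketch is self-contained.
raw = io.StringIO("1,11,1.1\n2,22,2.2\n3,33,3.3\n")

def add_placeholder(lines):
    """Prepend a zero-valued placeholder column to every line."""
    for line in lines:
        yield "0," + line

# Field names are hypothetical; "derived" is the placeholder column.
arr = np.genfromtxt(add_placeholder(raw), delimiter=",",
                    names=["derived", "a", "b", "c"])

# Write the derived values into the existing field of the same array.
arr["derived"] = arr["a"] + arr["b"]
print(arr["derived"])
```

Because the placeholder field already exists in the array that genfromtxt returns, the assignment writes into that array and no call to append_fields is needed.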


1 Comment

ryanjdillon Over a year ago

Brilliant. If only genfromtxt could take a regex for the delimiter, it would now be a perfect tool for me.
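One possible workaround for the regex wish, sketched under assumptions: since genfromtxt accepts a generator, each line can be normalized with re.sub before it is parsed. The input format and the pattern below are invented for illustration and are not from the original thread.

```python
import io
import re
import numpy as np

# Hypothetical input with inconsistent delimiters.
raw = io.StringIO("1; 11 | 1.1\n2 ;22| 2.2\n3;33|3.3\n")

# A ';' or '|' with optional surrounding whitespace counts as one delimiter.
delim = re.compile(r"\s*[;|]\s*")

def normalize(lines):
    """Rewrite every regex-matched delimiter as a plain comma."""
    for line in lines:
        yield delim.sub(",", line.strip())

arr = np.genfromtxt(normalize(raw), delimiter=",")
print(arr)
```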

Saullo G. P. Castro · Accepted Answer · 2013-05-09 20:47:14Z

I was trying to make genfromtxt read this:

11,12,13,14,15
21,22,
31,32,33,34,35
41,42,43,,45

using:

import numpy as np
print(np.genfromtxt('tmp.txt', delimiter=',', filling_values='0'))

but it did not work. I had to change the input, adding commas to represent the empty columns:

11,12,13,14,15
21,22,,,
31,32,33,34,35
41,42,43,,45

then it worked, returning:

[[ 11.  12.  13.  14.  15.]
 [ 21.  22.   0.   0.   0.]
 [ 31.  32.  33.  34.  35.]
 [ 41.  42.  43.   0.  45.]]
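If editing the file by hand is not an option, the padding could also be done on the fly with a generator. This is a sketch, not part of the original answer: pad_rows is a hypothetical helper, the expected column count is assumed to be known in advance, and io.StringIO stands in for tmp.txt.

```python
import io
import numpy as np

# Stand-in for tmp.txt, including the short second row.
raw = io.StringIO("11,12,13,14,15\n21,22,\n31,32,33,34,35\n41,42,43,,45\n")

def pad_rows(lines, ncols, delimiter=","):
    """Append delimiters to short rows so every row has ncols fields."""
    for line in lines:
        line = line.rstrip("\r\n")
        missing = (ncols - 1) - line.count(delimiter)
        yield line + delimiter * max(missing, 0)

# Empty fields are treated as missing and replaced by filling_values.
arr = np.genfromtxt(pad_rows(raw, ncols=5), delimiter=",", filling_values=0)
print(arr)
```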


3 Comments

ryanjdillon Over a year ago

Thanks Saullo. What I am actually looking for is to have an additional row, that does not exist in the data file that I am reading in.

Saullo G. P. Castro Over a year ago

@shootingstars to add additional rows you can use np.vstack((a, np.zeros((num_rows, a.shape[1]))))
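A small sketch of that vstack call, using a made-up 2x3 float array. Note that np.vstack returns a new array, and this form assumes a plain homogeneous array, not a structured one.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)  # hypothetical 2x3 array
num_rows = 2                      # number of zero rows to add

# Stack a block of zeros with matching column count below `a`.
padded = np.vstack((a, np.zeros((num_rows, a.shape[1]))))
print(padded.shape)
```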

ryanjdillon Over a year ago

My problem is that I call this with one of the fields being a datetime object, which prevents both the stack and numpy.lib.recfunctions.append_fields from merging the arrays.
