Saving data after loop with numpy

Question

My code looks like this at the moment:

new_table = np.zeros(shape=(4,1),dtype=object) 

for i in y:   
    # ... some calculation that produces result
    new_table = np.append(new_table, np.array([result]), axis=0)

Printing new_table gives this result:

array([[0],
       [0],
       [0],
       [0],
       [(1, 61.087293, 33.429379, 0.42581059018640416)],
       [(1, 61.087293, 33.429379, 0.3203261022508016)],
       [(1, 61.087293, 33.429379, 0.45689267865065536)]], dtype=object)

But the output should not contain those 4 zeros at the beginning of the array.

I am not sure what I am doing wrong. Also, is it possible to add column names to new_table, and if so, how?

Thanks.

"output should be without those 4 zeros at the beginning of the array" - it looks like "some calculation that produce result" is responsible for this, but you didn't show it...
– ForceBru, Commented Jun 12, 2022 at 17:08
The calculation produces only results of this form: 1, 61.087293, 33.429379, 0.42581059018640416 (one row with 4 numbers; they are not zero). The problem is that I do not know how to store this in one table :)
– user16454053, Commented Jun 12, 2022 at 17:14
Wait, but initially new_table is a 4 by 1 matrix of zeros, so you created the zeros yourself in new_table = np.zeros(shape=(4,1),dtype=object)
– ForceBru, Commented Jun 12, 2022 at 17:16
We can help you more if you help us by showing the calculations.
– lemon, Commented Jun 12, 2022 at 17:16
Also note that the four elements given by the calculation seem to be stored inside a tuple. I wouldn't use numpy if I'm not using its structure: I'd say it is better to use lists in this case, or to remove the tuple structure.
– lemon, Commented Jun 12, 2022 at 17:18

Bastian Venthur · Accepted Answer · 2022-06-13 06:54:38Z

1

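The problem is that you generate the (4,1) array of zeros and then append more rows to it, so the zeros stay at the top and the results end up below them. Either start with an empty table and append to that (with axis=0 the empty array needs a matching shape, e.g. np.empty((0, 4)) if each result is a row of four numbers; a bare np.array([]) has shape (0,) and np.append rejects the dimension mismatch), or create the table at its final size and change the values in place.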
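A minimal sketch of both options, assuming each result is a row of four numbers. The results list is a hypothetical stand-in for the calculation that the question does not show; its values are copied from the printed output in the question.

```python
import numpy as np

# Hypothetical stand-ins for the results of the calculation in the loop.
results = [
    (1, 61.087293, 33.429379, 0.42581059018640416),
    (1, 61.087293, 33.429379, 0.3203261022508016),
    (1, 61.087293, 33.429379, 0.45689267865065536),
]

# Option 1: start from zero rows and four columns, then append one row at a time.
appended = np.empty((0, 4))
for result in results:
    appended = np.append(appended, np.array([result]), axis=0)

# Option 2: create the table at its final size and overwrite each row in place.
in_place = np.zeros((len(results), 4))
for idx, result in enumerate(results):
    in_place[idx] = result

print(appended.shape)                      # (3, 4)
print(np.array_equal(appended, in_place))  # True
```

Option 2 only works when the number of results is known before the loop starts; otherwise option 1 or a plain Python list is the easier route.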

edited Jun 13, 2022 at 6:54

answered Jun 12, 2022 at 17:13


2 Comments

user16454053 Over a year ago

OK, could you please show me how to create an empty table? I tried, but it does not work. Could you also show me how to change the values in the table in place? Thanks :)

Bastian Venthur Over a year ago

See my updated answer, cheers!

Mad Physicist · Accepted Answer · 2022-06-13 07:03:09Z

1

Start with an empty array of the required shape. If your data is rows:

new_table = np.empty((0, 4)) 
for i in y:   
    ...
    new_table = np.append(new_table, np.array([result]), axis=0)

Keep in mind that this keeps reallocating the entire array over and over, which is very inefficient. You're much better off skipping the initial array, accumulating the snippets in a list, and stacking it later:

table_list = []
for i in y:
    ...  # compute result
    table_list.append(result)
new_table = np.stack(table_list, axis=0)
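As a rough end-to-end sketch of the list-then-stack approach, here is a runnable version that uses the three result rows printed in the question. The second half touches on the column-name part of the question: a structured dtype is one possible way to attach names, not something the answer above proposes, and the field names (id, x, y, score) are hypothetical because the question never says what the columns mean.

```python
import numpy as np

# Accumulate one tuple per iteration in a plain list.
table_list = []
for weight in (0.42581059018640416, 0.3203261022508016, 0.45689267865065536):
    result = (1, 61.087293, 33.429379, weight)  # stand-in for the real calculation
    table_list.append(result)

# Build the 2-D array once, after the loop.
new_table = np.stack(table_list, axis=0)
print(new_table.shape)  # (3, 4)

# Assumption: hypothetical field names, chosen only for illustration.
named_dtype = [("id", "i8"), ("x", "f8"), ("y", "f8"), ("score", "f8")]
named_table = np.array(table_list, dtype=named_dtype)
print(named_table.dtype.names)  # ('id', 'x', 'y', 'score')
print(named_table["score"])     # one column, selected by name
```

Note that the structured array is one-dimensional (one record per row), so columns are selected by field name rather than by a second index.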

answered Jun 13, 2022 at 7:03


code-lukas · Accepted Answer · 2022-06-13 16:08:19Z

If you are working with large data sets, it might make more sense to preallocate the array and then set the values, as opposed to appending to a growing array or list. I compared @Mad Physicist's solution to a different approach.

import timeit
import numpy as np

y = np.random.randint(0, 100, 10000)    # dummy data

starttime1 = timeit.default_timer()
new_table = np.zeros((len(y), 4))

for idx, i in enumerate(y):
    # ... some dummy operation
    new_table[idx] = (i, i**2, i**3, i**4)

print(f"Preallocating : {timeit.default_timer() - starttime1} s")

table_list = []
starttime2 = timeit.default_timer()

for i in y:
    table_list.append((i, i**2, i**3, i**4))
new_table = np.stack(table_list, axis=0)

print(f"np.stack : {timeit.default_timer() - starttime2} s")

It seems that the first way outperforms the second one. I didn't benchmark this properly, but I assume that the time saved is even more significant for larger data / arrays.

Preallocating : 0.01815319999999998 s
np.stack : 0.026264800000000033 s
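For a somewhat more careful comparison, both variants could be wrapped in functions and timed over several runs with timeit.repeat. This is only a sketch: the function names are made up for illustration, the dummy operation is the one from the code above, and the timings depend on the machine, so no numbers are shown here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 100, 2000)  # dummy data, smaller than above to keep the run short

def fill_preallocated(values):
    """Allocate the full table first, then write each row in place."""
    table = np.zeros((len(values), 4))
    for idx, i in enumerate(values):
        table[idx] = (i, i**2, i**3, i**4)
    return table

def fill_by_stacking(values):
    """Collect rows in a list, then build the array once at the end."""
    rows = []
    for i in values:
        rows.append((i, i**2, i**3, i**4))
    return np.stack(rows, axis=0)

for func in (fill_preallocated, fill_by_stacking):
    # Each timing covers 3 calls; take the best of 5 repeats to reduce noise.
    times = timeit.repeat(lambda: func(y), number=3, repeat=5)
    print(f"{func.__name__}: best of 5 = {min(times) / 3:.6f} s per call")
```

Taking the minimum over several repeats is a common way to reduce the influence of background load on a single measurement.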
