I am analyzing food expenditures for 5 people, each belonging to a different age group. I want to create a file in .npz format with two variables, 'age' and 'person', where 'person' is an array containing a list of arrays.

I created a list of 5 members, 'person', and a list of the 5 corresponding age groups, 'age'. However, when I load the .npz file, 'person' comes back as a single combined array of shape (5, 7).

import numpy as np

person1 = np.array([(1, 2, 3, 4),
                    (4, 5, 6, 5),
                    (7, 8, 9, 6),
                    (9, 6, 5, 4),
                    (6, 5, 4, 3),
                    (6, 5, 4, 3),
                    (4, 3, 5, 7)],
                    dtype=[('BF', '<f8'), ('Lunch', '<f8'), ('Snacks', '<f8'), ('Dinner', '<f8')])
person2 = person1
person3 = person1
person4 = person1
person5 = person1

person = [person1, person2, person3, person4, person5]

age = [10, 20, 30, 40, 50]

np.savez('test.npz', age=age, person=person)

with np.load('test.npz', allow_pickle=False) as data:
    list_person = data['person']
    age_group = data['age']
    # df = pd.DataFrame(list_person)
    # df.to_excel('test.xlsx', index=True)

I expect 'list_person' to be an array of shape (5,), each element of which is an array of shape (7, 4), so that exporting to Excel gives (5, 1) data.

1 Answer

np.savez converts every list input to an array, so that is what you see on load:

In [105]: np.array(person).shape
Out[105]: (5, 7)
In [106]: np.array(person).dtype
Out[106]: dtype([('BF', '<f8'), ('Lunch', '<f8'), ('Snacks', '<f8'), ('Dinner', '<f8')])
In [107]: np.array(age).shape
Out[107]: (5,)
In [108]: np.array(age).dtype
Out[108]: dtype('int64')

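person is a list of five references to the same person1 array, which has shape (7,), so np.array stacks them into a single (5, 7) array. The 4 fields belong to the dtype; they are not a dimension, which is why no element has shape (7, 4).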
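If a (5,) array whose elements are separate arrays is really what is wanted, one option is to build an object-dtype array by hand so NumPy has nothing to stack. The following is only a sketch under some assumptions: the file is written to a temporary directory (the `path` name is illustrative), and loading needs allow_pickle=True because object arrays are stored with pickle, unlike the allow_pickle=False call in the question.

```python
import os
import tempfile

import numpy as np

person1 = np.array([(1, 2, 3, 4), (4, 5, 6, 5), (7, 8, 9, 6), (9, 6, 5, 4),
                    (6, 5, 4, 3), (6, 5, 4, 3), (4, 3, 5, 7)],
                   dtype=[('BF', '<f8'), ('Lunch', '<f8'),
                          ('Snacks', '<f8'), ('Dinner', '<f8')])
age = [10, 20, 30, 40, 50]

# Pre-allocate a 1-D object array and fill it element by element,
# so the five structured arrays are kept as separate objects.
person = np.empty(len(age), dtype=object)
for i in range(len(age)):
    person[i] = person1.copy()

# Illustrative location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'test.npz')
np.savez(path, age=age, person=person)

# Object arrays are pickled on save, so pickle must be allowed on load.
with np.load(path, allow_pickle=True) as data:
    list_person = data['person']
    age_group = data['age']

print(list_person.shape)     # shape of the outer object array
print(list_person[0].shape)  # shape of one person's structured array
```

Each element is still a (7,) structured array with 4 fields, not a (7, 4) array. Allowing pickle on load should be reserved for files from a trusted source.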

In [112]: df = pd.DataFrame(person1)
In [113]: df
Out[113]:
    BF  Lunch  Snacks  Dinner
0  1.0    2.0     3.0     4.0
1  4.0    5.0     6.0     5.0
2  7.0    8.0     9.0     6.0
3  9.0    6.0     5.0     4.0
4  6.0    5.0     4.0     3.0
5  6.0    5.0     4.0     3.0
6  4.0    3.0     5.0     7.0

Trying to make a DataFrame from the (5, 7) array produces an error. Flattening it to (35,) first does work.
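The flattening step can be sketched with NumPy alone. The `age_per_row` label array is an addition for illustration, not part of the original answer; it assumes the rows should stay traceable to their age group after flattening.

```python
import numpy as np

person1 = np.array([(1, 2, 3, 4), (4, 5, 6, 5), (7, 8, 9, 6), (9, 6, 5, 4),
                    (6, 5, 4, 3), (6, 5, 4, 3), (4, 3, 5, 7)],
                   dtype=[('BF', '<f8'), ('Lunch', '<f8'),
                          ('Snacks', '<f8'), ('Dinner', '<f8')])
age = [10, 20, 30, 40, 50]

# Same stacking that np.savez performs on the list input.
stacked = np.array([person1] * len(age))

# Row-major flattening: the first 7 records belong to the first person,
# the next 7 to the second, and so on.
flat = stacked.ravel()

# Hypothetical helper column: repeat each age once per record of that person.
age_per_row = np.repeat(age, stacked.shape[1])

print(stacked.shape, flat.shape, age_per_row.shape)
```

The flattened `flat` array is the kind of 1-D structured array that pd.DataFrame accepts, as with person1 above, and `age_per_row` could be attached as an extra column.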
