How can I convert a pandas DataFrame into a NumPy structured array with column names, like the following?
array([('Heidi Mitchell', '[email protected]', 74, 52, 'female', '1121', 'cancer', '03/06/2018'),
('Kimberly Kent', 'wilsoncarla@mitchell-gree', 63, 51, 'male', '2003', 'cancer', '16/06/2017')],
dtype=[('name', '<U16'), ('email', '<U25'), ('age', '<i4'), ('weight', '<i4'), ('gender', '<U10'), ('zipcode', '<U6'), ('diagnosis', '<U6'), ('dob', '<U16')])
This is my pandas DataFrame df:
col1 col2
3 5
3 1
4 5
1 5
2 2
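For reproducibility, a frame like the one above can be built as follows. This is a minimal sketch that assumes the columns hold plain integers and the index is the default one.

```python
import pandas as pd

# Rebuild the two-column example frame shown above (default RangeIndex assumed).
df = pd.DataFrame({"col1": [3, 3, 4, 1, 2], "col2": [5, 1, 5, 5, 2]})

print(df.to_string(index=False))
```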
I tried to convert it as follows:
import numpy as np
dt = np.dtype([('col1', np.int32), ('col2', np.int32)])
arr = np.array(df.values, dtype=dt)
But it gives me the following output:
array([[(3, 5), (3, 1)],
...
dtype=[('col1', '<i4'), ('col2', '<i4')])
For some reason, the rows of data are grouped, e.g. [(3, 5), (3, 1)], instead of forming a flat array of records: [(3, 5), (3, 1), (4, 5), (1, 5), (2, 2)].
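A likely cause is that df.values is a 2-D array of shape (5, 2), so NumPy keeps that shape and presumably treats each cell, rather than each row, as one record. The sketch below shows two approaches that should give a 1-D structured array instead: DataFrame.to_records with index=False, and building the array from a list of row tuples. The frame is rebuilt here so the snippet is self-contained, and the variable names rec and arr are illustrative only.

```python
import numpy as np
import pandas as pd

# Example frame from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({"col1": [3, 3, 4, 1, 2], "col2": [5, 1, 5, 5, 2]})

# Option 1: let pandas build a record array, one record per row.
# index=False drops the index; column_dtypes forces int32 for each column.
rec = df.to_records(
    index=False,
    column_dtypes={"col1": np.int32, "col2": np.int32},
)

# Option 2: pass a list of row tuples, so each tuple maps onto one record.
dt = np.dtype([("col1", np.int32), ("col2", np.int32)])
rows = list(df.itertuples(index=False, name=None))
arr = np.array(rows, dtype=dt)

print(rec.shape, arr.shape)  # both are 1-D, with one entry per row
print(arr["col1"])           # fields are accessible by column name
```

Option 1 returns a numpy.recarray, which also allows attribute access such as rec.col1. Option 2 returns a plain structured ndarray, at the cost of iterating over the rows in Python.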