Change pandas DataFrame to numpy array but keeping column names

Question

I have a pandas DataFrame built from the sklearn.datasets Boston house price data and am trying to convert it to a numpy array while keeping the column names. Here is the code I tried:

from sklearn import datasets ## imports datasets from scikit-learn
import numpy as np
import pandas as pd

data = datasets.load_boston() ## loads Boston dataset from datasets library

df = pd.DataFrame(data.data, columns=data.feature_names)
X = df.to_numpy()
print(X.dtype.names)

However, this prints None, so the column names are not kept. Does anyone understand why?

Thanks

Why do you expect column names to be retained when you access the underlying array instead of the DataFrame? You can store the column names in a dictionary/array if you want access to them later.
– anky, Commented May 5, 2020 at 18:24
I assumed the code would create a structured array from the pandas DataFrame. I followed this answer to get there: stackoverflow.com/questions/7561017/…
– geds133, Commented May 5, 2020 at 18:25
@geds133 No, the corresponding method is to_records. to_numpy doesn't yield a structured array.
– user2285236, Commented May 5, 2020 at 18:28
I see. There is a question on Stack Overflow that suggests this is the case; I shall comment and ask for a correction. Many thanks.
– geds133, Commented May 5, 2020 at 18:30
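
To illustrate the to_records suggestion from the comments, here is a minimal sketch. It assumes a small stand-in DataFrame with made-up values (the column names are borrowed from the Boston data only for familiarity), so it runs without the dataset itself.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame; the values are made up for illustration.
df = pd.DataFrame({"CRIM": [0.1, 0.2], "ZN": [18.0, 0.0], "INDUS": [2.31, 7.07]})

plain = df.to_numpy()              # ordinary 2-D float array, no field names
rec = df.to_records(index=False)   # record array with one named field per column

print(plain.dtype.names)   # None
print(rec.dtype.names)     # ('CRIM', 'ZN', 'INDUS')
print(rec["ZN"])           # a column can be looked up by its name
```

Passing index=False leaves the DataFrame index out of the result; without it, the index is included as an extra field.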

K.J Fogang Fokoa · Accepted Answer · 2020-05-06 13:12:02Z

0

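Try this:

# feature_names is a 1-D array of column names; turn it into a single header row
w = data.feature_names.reshape(1, -1)
# stack the header row on top of the data (mixing names and numbers makes X a string array)
X = np.vstack((w, data.data))
print(X)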
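
A caveat on the stacking approach, sketched with a small stand-in array rather than the Boston data (the names and values below are hypothetical): because the header row holds strings, the stacked result is a string array, so the numeric rows have to be cast back to float before doing arithmetic.

```python
import numpy as np

# Hypothetical stand-ins for data.feature_names and data.data.
names = np.array(["CRIM", "ZN", "INDUS"])
values = np.array([[0.1, 18.0, 2.31],
                   [0.2, 0.0, 7.07]])

# Header row on top, data rows underneath.
X = np.vstack((names.reshape(1, -1), values))

print(X.dtype.kind)   # 'U': every entry, including the numbers, is now a string
header = X[0]                     # the column names
numbers = X[1:].astype(float)     # cast the data rows back to floats
```

If the names are only needed for lookup, keeping them in a separate list next to the float array, or using a record array, avoids the round trip through strings.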
