
I am trying to convert a single column of a dataframe to a numpy array. Converting the entire dataframe has no issues.

df

  viz  a1_count  a1_mean     a1_std
0   0         3        2   0.816497
1   1         0      NaN        NaN 
2   0         2       51  50.000000

Both of these calls work fine:

X = df.as_matrix()
X = df.as_matrix(columns=df.columns[1:])

However, when I try:

y = df.as_matrix(columns=df.columns[0])

I get:

TypeError: Index(...) must be called with a collection of some kind, 'viz' was passed

3 Answers

The problem here is that you're passing a single element, which in this case is just the string label of that column. If you wrap it in a list with a single element, then it works:

In [97]:
y = df.as_matrix(columns=[df.columns[0]])
y

Out[97]:
array([[0],
       [1],
       [0]], dtype=int64)

Here is what you're passing:

In [101]:
df.columns[0]

Out[101]:
'viz'

So it's equivalent to this:

y = df.as_matrix(columns='viz')

which results in the same TypeError.

The docs show the expected parameters:

DataFrame.as_matrix(columns=None)
    Convert the frame to its Numpy-array representation.

    Parameters:
        columns : list, optional, default None
            If None, return all columns, otherwise, returns specified columns.
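For readers on a recent pandas: as_matrix was deprecated in pandas 0.23 and removed in 1.0, so the calls in this thread only run on older versions. A minimal sketch of the same selections using to_numpy() (available since pandas 0.24) might look like the following. The frame is rebuilt from the question's sample data, and the variable names y_2d and y_1d are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_mean": [2, np.nan, 51],
    "a1_std": [0.816497, np.nan, 50.0],
})

# Selecting with a one-element list of labels keeps the result 2-D,
# matching what as_matrix(columns=[df.columns[0]]) returned
y_2d = df[[df.columns[0]]].to_numpy()

# Selecting with a bare label gives a Series, so the array is 1-D
y_1d = df[df.columns[0]].to_numpy()

# All remaining columns, as in as_matrix(columns=df.columns[1:])
X = df[df.columns[1:]].to_numpy()

print(y_2d.shape, y_1d.shape, X.shape)
```

Whether you want the 2-D column vector or the flat 1-D array depends on what consumes y afterwards; the original as_matrix call always returned the 2-D form.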


3 Comments

Thanks for the explanation. Why did it work for columns=df.columns[1:] without encapsulating that in a list?
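Because the slice returns an array-like Index, whereas selecting a single element returns just that one object (here, the string 'viz').
Because df.columns[1:] is already a collection (a pandas Index, which is list-like), in the same way that slicing a list returns a list. Consider: li = [1,2,3,4]; li[1:] -> [2,3,4], while li[1] -> 2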
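The difference the comments describe can be checked directly. This is a small sketch, assuming an empty frame with the question's column names; pd.api.types.is_list_like is used here only as a convenient way to show which value counts as a collection.

```python
import pandas as pd

# Empty frame with the same column labels as in the question
df = pd.DataFrame(columns=["viz", "a1_count", "a1_mean", "a1_std"])

single = df.columns[0]   # integer indexing returns one label
sliced = df.columns[1:]  # slicing returns a new Index

print(type(single).__name__)   # str
print(type(sliced).__name__)   # Index

# Strings are deliberately not treated as list-like by pandas
print(pd.api.types.is_list_like(single))  # False
print(pd.api.types.is_list_like(sliced))  # True
```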

as_matrix expects a list for the columns keyword and df.columns[0] isn't a list. Try df.as_matrix(columns=[df.columns[0]]) instead.


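Using the Index tolist method works as well, provided you slice the columns so that you get an Index rather than a single string label (a plain string has no tolist method):

df.as_matrix(columns=df.columns[:1].tolist())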

When selecting multiple columns, for example the first ten, the command

df.as_matrix(columns=[df.columns[0:10]])

does not work, because it wraps the Index inside another list instead of passing a flat list of column names. However, using

df.as_matrix(columns=df.columns[0:10].tolist())

works well.
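On pandas versions where as_matrix no longer exists, the same multi-column selection could be sketched as below. It assumes the question's four-column sample frame, so the slice is 0:2 rather than 0:10; the names by_list, by_index and by_position are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_mean": [2, np.nan, 51],
    "a1_std": [0.816497, np.nan, 50.0],
})

first_two = df.columns[0:2]  # an Index holding the first two labels

# A flat list of labels, as in the answer above
by_list = df[first_two.tolist()].to_numpy()

# The Index itself is also accepted as a column selector
by_index = df[first_two].to_numpy()

# Purely positional selection, with no labels involved
by_position = df.iloc[:, 0:2].to_numpy()

print(by_list)
```

All three selections should give the same 3 x 2 array; the positional form is handy when the column labels are not known in advance.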
