So, given:
>>> df = pd.DataFrame({'a':[1,2,3,4], 'b':[5,6,7,8], 'c':[9,10,11,12]})
>>> i = 1
>>> df
   a  b   c
0  1  5   9
1  2  6  10
2  3  7  11
3  4  8  12
>>> df.iloc[:, i: i + 1]
   b
0  5
1  6
2  7
3  8
>>> np.array(df.iloc[:, i: i + 1])
array([[5],
       [6],
       [7],
       [8]])
You could use the .squeeze method, which removes every axis of length one from your array:
>>> np.array(df.iloc[:, i: i + 1]).squeeze()
array([5, 6, 7, 8])
Although I'd probably just use:
>>> df.iloc[:, i: i + 1].values.squeeze()
array([5, 6, 7, 8])
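As a rough sketch of what squeeze does in general (the array names here are made up for illustration): it drops every axis of length one unless you name a specific axis, and it leaves longer axes alone.

```python
import numpy as np

# Shape (4, 1), like the single-column slice above.
col = np.array([[5], [6], [7], [8]])
# Shape (1, 4, 1): two axes of length one.
nested = np.zeros((1, 4, 1))

squeezed_col = col.squeeze()        # shape (4,)
squeezed_nested = nested.squeeze()  # shape (4,): both length-one axes are dropped
only_last = nested.squeeze(axis=2)  # shape (1, 4): only the named axis is dropped
```

So on a single-column slice, squeeze() and reshape(-1) happen to give the same result, but they are not doing the same thing.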
Or alternatively, you could always use .reshape, which should be your first instinct when you want to reshape an array:
>>> np.array(df.iloc[:, i: i + 1]).reshape(-1)
array([5, 6, 7, 8])
Note, these will behave differently if you accidentally take an extra column, so:
>>> np.array(df.iloc[:, i: i + 2])
array([[ 5,  9],
       [ 6, 10],
       [ 7, 11],
       [ 8, 12]])
With reshape:
>>> np.array(df.iloc[:, i: i + 2]).reshape(-1)
array([ 5,  9,  6, 10,  7, 11,  8, 12])
With squeeze:
>>> np.array(df.iloc[:, i: i + 2]).squeeze()
array([[ 5,  9],
       [ 6, 10],
       [ 7, 11],
       [ 8, 12]])
Ideally, you'd probably just want that to fail, so if you want to program defensively, use reshape with explicit parameters instead of -1:
>>> np.array(df.iloc[:, i: i + 1]).reshape((df.shape[0],))
array([5, 6, 7, 8])
>>> np.array(df.iloc[:, i: i + 2]).reshape((df.shape[0],))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot reshape array of size 8 into shape (4,)
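Another way to get the same fail-fast behaviour, as a sketch rather than the one right answer, is to pass an explicit axis to squeeze, which raises a ValueError when that axis is not of length one. The helper name column_block_to_1d below is hypothetical, not part of NumPy or pandas.

```python
import numpy as np
import pandas as pd

def column_block_to_1d(block):
    """Flatten a single-column 2-D block to 1-D (hypothetical helper)."""
    arr = np.asarray(block)
    # squeeze(axis=1) raises ValueError unless axis 1 has length exactly one.
    return arr.squeeze(axis=1)

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8], 'c': [9, 10, 11, 12]})
i = 1

one_column = column_block_to_1d(df.iloc[:, i: i + 1])

# Accidentally taking an extra column now fails instead of passing through.
try:
    column_block_to_1d(df.iloc[:, i: i + 2])
    raised = False
except ValueError:
    raised = True
```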
However, you could avoid all of this by not doing an unnecessary slice in the first place, so:
>>> df.iloc[:, i: i + 1]
   b
0  5
1  6
2  7
3  8
>>> df.iloc[:, i]
0    5
1    6
2    7
3    8
Name: b, dtype: int64
The latter gives you a Series, which is already one-dimensional, so you could just use:
>>> df.iloc[:, i].values
array([5, 6, 7, 8])
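If you are on a reasonably recent pandas (0.24 or later, as far as I know), to_numpy() does the same job as .values here. A minimal sketch, using the same example frame:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8], 'c': [9, 10, 11, 12]})

# Integer position and column label both select a one-dimensional Series.
by_position = df.iloc[:, 1].to_numpy()
by_label = df['b'].to_numpy()
```

Either way you get a one-dimensional array, with no squeeze or reshape needed.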
Also, you can already apply a NumPy function along any axis in pandas using the apply method, so subsetting and putting the output into a NumPy array seems wasteful.
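For example, something like the following sketch applies NumPy reductions directly to the DataFrame. The choice of np.sum and np.max is only for illustration; I don't know which function you actually need.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8], 'c': [9, 10, 11, 12]})

# axis=0 passes each column to the function: one result per column.
col_sums = df.apply(np.sum, axis=0)
# axis=1 passes each row to the function: one result per row.
row_max = df.apply(np.max, axis=1)
```

Both results are Series indexed by column name and row label respectively, so you keep the labels instead of throwing them away.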