I wrote the following code to normalize a few columns of a data frame:
import pandas as pd
train = pd.read_csv('test1.csv')
header = train.columns.values
print(train)
print(header)
inputs = header[0:3]
trainArr = train.as_matrix(inputs)
print(inputs)
trainArr[inputs] = trainArr[inputs].apply(lambda x: (x - x.mean()) / (x.max() - x.min()))
The data frame and the printed headers look like this:
v1 v2 v3 result
0 12 31 31 0
1 34 52 4 1
2 32 4 5 1
3 7 89 2 0
['v1' 'v2' 'v3' 'result']
['v1' 'v2' 'v3']
However, I got the following error:
trainArr[inputs] = trainArr[inputs].apply(lambda x: (x - x.mean()) / (x.max() - x.min()))
IndexError: arrays used as indices must be of integer (or boolean) type
Does anyone have any idea what I missed here? Thanks!
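For context on what goes wrong: `train.as_matrix(inputs)` returns a plain NumPy ndarray, which has no column labels and no `.apply` method, so indexing it with an array of strings like `['v1' 'v2' 'v3']` raises the `IndexError` above. (Note also that `DataFrame.as_matrix` was deprecated and later removed from pandas; `.to_numpy()` is the replacement.) A minimal sketch of the intended normalization, done directly on the DataFrame and assuming the same column names as in the printed output:

```python
import pandas as pd

# Sample data matching the frame printed above
train = pd.DataFrame({
    'v1': [12, 34, 32, 7],
    'v2': [31, 52, 4, 89],
    'v3': [31, 4, 5, 2],
    'result': [0, 1, 1, 0],
})

inputs = ['v1', 'v2', 'v3']

# apply() on a DataFrame operates column-wise, so each x here is a Series;
# (x - mean) / (max - min) mean-centers and range-scales each column.
train[inputs] = train[inputs].apply(lambda x: (x - x.mean()) / (x.max() - x.min()))

print(train)
```

If a NumPy array is needed afterwards, convert at the end with `train[inputs].to_numpy()` rather than converting first and losing the labels.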