I am trying to do feature selection with the scikit-learn library. My data is simple: rows are samples and columns are features. The original class labels are X and Y, but I converted them to numbers for linear regression (X to 0, Y to 1).
G1 G2 G3 ... Gn Class
1.0 4.0 5.0 ... 1.0 0
4.0 5.0 9.0 ... 1.0 0
9.0 6.0 3.0 ... 2.0 1
...
I used sklearn.linear_model.LinearRegression(), and it performed well. Now I am using the coef_ values for feature selection, and I have 2 questions.
Is it right to use the coef_ values of the features? Or are there better parameters for feature selection in LinearRegression()?
In addition, is there some kind of rule for choosing a proper threshold (for example, a minimum value of coef_ for feature selection)?
lasso:)
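A minimal sketch of what the Lasso suggestion could look like in practice. The data here is synthetic (the real G1..Gn matrix is not shown), and the alpha value is an arbitrary assumption that would normally be tuned, for example with LassoCV. The features are standardized first so that coefficient magnitudes are comparable across columns; the L1 penalty then drives some coefficients to exactly zero, which gives a selection rule without a hand-picked coef_ threshold.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the G1..Gn matrix (assumption: the real data is not shown).
rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
X = rng.normal(size=(n_samples, n_features))

# By construction, the 0/1 class depends only on features 0 and 3.
signal = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=n_samples)
y = (signal > 0).astype(float)

# Standardize so that coefficient sizes are comparable between features.
X_scaled = StandardScaler().fit_transform(X)

# alpha=0.05 is an illustrative value, not a recommendation.
lasso = Lasso(alpha=0.05).fit(X_scaled, y)

# Features whose coefficient was not shrunk to exactly zero are kept.
selected = np.flatnonzero(lasso.coef_ != 0)
print("selected feature indices:", selected)
print("coefficients:", np.round(lasso.coef_, 3))
```

Since the class label is binary, an L1-penalized LogisticRegression may be a more natural fit than a regression on 0/1 targets; the selection idea is the same.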