
I want to use a feature selection method where combinations of features, or "between-features" interactions, are considered for a simple linear regression.

SelectKBest only looks at each feature against the target, one at a time, and ranks them by their Pearson's r values. While this is quick, I'm afraid it ignores important interactions between features.

Recursive Feature Elimination first uses ALL my features, fits a Linear Regression model, and then kicks out the feature with the smallest absolute coefficient. I'm not sure whether this accounts for "between-features" interactions... I don't think so, since it simply kicks out the feature with the smallest coefficient, one at a time, until it reaches your designated number of features.
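For reference, a minimal sketch of the two approaches described above, on synthetic data (the dataset and the choice of k=5 are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=100, n_features=10, random_state=0)

# SelectKBest: scores each feature against y independently (univariate).
kbest = SelectKBest(score_func=f_regression, k=5).fit(X, y)

# RFE: fits a model on all features, drops the one with the smallest
# absolute coefficient, and repeats until 5 features remain.
rfe = RFE(LinearRegression(), n_features_to_select=5).fit(X, y)

print(kbest.get_support())  # boolean mask of the kept features
print(rfe.get_support())
```

Note that the two masks need not agree, which is exactly the concern: the univariate scores never see the features together.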

What I'm looking for, from the seasoned feature selection scientists out there, is a method to find the best subset or combination of features. I read through all of the Feature Selection documentation and can't find a method that describes what I have in mind.

Any tips will be greatly appreciated!

  • First, you should look at the variance-covariance plots. These will give you a sense of the pairwise correlation between your features. Commented Sep 10, 2016 at 21:55
  • This would be a great sklearn library contribution, but I'm afraid it doesn't seem to exist at the moment. Elastic Net, while great, is clearly not the same as RFE by p-value in a multivariate regression; you are right that currently the nearest fit is GenericUnivariateSelect(f_classif, 'fwe', param=0.5) # keep features with p < 0.5 Commented Oct 22, 2021 at 19:45

2 Answers


I want to use a feature selection method where combinations of features, or "between-features" interactions, are considered for a simple linear regression.

For this case, you might consider using Lasso (or, actually, the elastic net refinement). Lasso attempts to minimize linear least squares, but with absolute-value penalties on the coefficients. Some results from convex-optimization theory (mainly on duality) show that this constraint takes into account "between-feature" interactions and removes the weaker of correlated features. Since Lasso is known to have some shortcomings (it is constrained in the number of features it can pick, for example), a newer variant is elastic net, which penalizes both absolute-value terms and squared terms of the coefficients.

In sklearn, sklearn.linear_model.ElasticNet implements this. Note that this algorithm requires you to tune the penalties, which you'd typically do using cross validation. Fortunately, sklearn also contains sklearn.linear_model.ElasticNetCV, which allows very efficient and convenient searching for the values of these penalty terms.
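A minimal sketch of how this might look; the data is synthetic, and the l1_ratio grid and cv value are illustrative choices, not recommendations:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=1.0, random_state=0)

# Cross-validated search over the two penalty settings:
# alpha (overall strength) and l1_ratio (mix of L1 vs L2).
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9, 1.0], cv=5, random_state=0)
model.fit(X, y)

# Features whose coefficients were driven to (near) zero are effectively
# deselected; the survivors are your chosen subset.
selected = np.flatnonzero(np.abs(model.coef_) > 1e-6)
print("kept features:", selected)
print("chosen alpha:", model.alpha_, "l1_ratio:", model.l1_ratio_)
```

Because the penalty is applied to all coefficients jointly, correlated features compete with each other during the fit, which is the "between-features" behavior the question asks about.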




I believe you have to generate your combinations first and only then apply the feature selection step. You may use http://scikit-learn.org/stable/modules/preprocessing.html#generating-polynomial-features for the feature combinations.
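A minimal sketch of this two-step idea on synthetic data (degree=2 and k=5 are illustrative): expand the features into pairwise interaction terms first, then run an ordinary selection step over the expanded set.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

pipe = make_pipeline(
    # 5 original features + C(5, 2) = 10 pairwise products = 15 candidates
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    # the univariate selector now scores interaction terms too
    SelectKBest(score_func=f_regression, k=5),
    LinearRegression(),
)
pipe.fit(X, y)
print(pipe.predict(X[:3]))
```

The selector itself is still univariate, but because the candidates include products of features, an interaction can now be kept on its own merits.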
