I have a dependent variable y and 6 independent variables. I want to fit a linear regression to them, and I am using the sklearn library to do it.

The problem is that some of my independent variables have pairwise correlations above 0.5, so I can't have them in my model at the same time.

I searched the internet but didn't find any way to select the best set of independent variables for the linear regression and output which variables were selected.

1
  • One possibility is to first try a fit with all variables, and then remove from the regression the variable with the least significance and then re-run to see what happens to the fitting results. This test is easy to perform and might help in your analytical work. Commented Mar 7, 2019 at 11:23

2 Answers

2

If you see correlation between some of your independent variables, you should consider removing some of them.

I see you are working with scikit-learn. If you don't want to do the feature selection manually, you could use one of the feature selection methods in scikit-learn's feature_selection module. There are many ways to remove features automatically, and you should cross-validate to determine which one is best for your problem.
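As one possible illustration of the feature_selection module, here is a minimal sketch using SequentialFeatureSelector (available in scikit-learn 0.24 and later) with cross-validation. The data are synthetic, the names x0..x5 are made up, and n_features_to_select=2 is an assumption chosen for the example; it is only one of several selectors in the module, not necessarily the best for your data.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (an assumption): 6 features, x1 correlated with x0,
# and y depending only on x0 and x3.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 6))
X[:, 1] = X[:, 0] + 0.5 * rng.normal(size=n)
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=n)
feature_names = np.array([f"x{i}" for i in range(6)])

# Greedy forward selection: at each step, add the feature that gives the
# best 5-fold cross-validated RMSE.
selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=2,               # assumed for this example
    direction="forward",
    scoring="neg_root_mean_squared_error",
    cv=5,
)
selector.fit(X, y)

selected = list(feature_names[selector.get_support()])
print("selected features:", selected)

# Fit the final regression on the selected columns only.
model = LinearRegression().fit(selector.transform(X), y)
print("coefficients:", model.coef_)
```

get_support() returns a boolean mask over the columns, which is how the selected variables can be reported.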

2 Comments

I know I shouldn't use two correlated variables together, but I don't know which of them should be dropped to get the best regression line. I followed the link to the sklearn documentation but didn't find anything that deals with correlation.
You don't know that beforehand. You can only find out by doing cross validation.
1

You are probably looking for k-fold cross-validation.

The idea is to pick candidate subsets of your features (randomly, for instance) and have a way to validate them against each other.

For each feature selection, you train your model on (k-1) partitions of your data and validate it against the remaining partition. You repeat this so that each partition serves once as the validation set, then take the average of your score (MAE or RMSE, for instance).

That score is an objective figure for comparing your models, i.e. your feature selections.
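Here is a rough sketch of that comparison with scikit-learn. Since 6 features give only 63 non-empty subsets, this version tries all of them instead of sampling subsets at random; the data are synthetic, the names x0..x5 are made up, and k=5 with RMSE as the score are assumptions.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (an assumption): 6 features, x1 correlated with x0,
# and y depending only on x0 and x3.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 6))
X[:, 1] = X[:, 0] + 0.5 * rng.normal(size=n)
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.normal(size=n)
names = [f"x{i}" for i in range(6)]

# Use the same 5 folds for every subset so the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = []  # (mean CV RMSE, feature names) for every subset
for k in range(1, len(names) + 1):
    for subset in combinations(range(len(names)), k):
        scores = cross_val_score(
            LinearRegression(), X[:, list(subset)], y,
            scoring="neg_root_mean_squared_error", cv=cv,
        )
        results.append((-scores.mean(), [names[i] for i in subset]))

results.sort(key=lambda r: r[0])          # lowest RMSE first
best_rmse, best_features = results[0]
print(f"best subset: {best_features}, CV RMSE = {best_rmse:.3f}")
```

With many more features the number of subsets grows too quickly for this, and random sampling or a greedy search becomes the practical choice.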
