1

I'm trying to fit a linear regression model using a greedy feature selection algorithm. To be a bit more specific, I have four sets of data:

X_dev, y_dev, X_test, y_test, the first two being the features and labels for the training set and the latter two for the test set. The size of the matrices are (900, 126), (900, ), (100, 126), and (100, ), respectively.

What I mean by "greedy feature selection" is that I would first fit 126 models using one feature each from the X_dev set, choose the best one, then run models using the first one and each of the remaining 125 models. The selection continues until I have obtained 100 of the features that perform best among the original 126.

The problem I'm facing is regarding implementation in Python. The code that I have is for fitting a single feature first:

lin_reg.fit(X_dev[:, 0].reshape(-1, 1), y_dev)
lin_pred = lin_reg.predict(X_test)

Because the dimensions don't match ((100, 126) and (1, )) I'm getting a dimension mismatch error.

How should I fix this? I'm trying to predict how the model performs when using the single feature.

Thank you.

3
  • Post the rest of your code Commented Nov 1, 2018 at 4:18
  • Also, there is undoubtedly an implementation of this already built for you in scikit or some other library. Find it and save yourself the hassle Commented Nov 1, 2018 at 4:18
  • Unfortunately, this is basically the entire code that I have atm. The rest are the usual import statements for NumPy and sklearn with the np.load statements for the data. I'll take your advice though and look for an implementation. Commented Nov 1, 2018 at 4:44

1 Answer 1

1

Apply the same transformation to X_test

lin_reg.fit(X_dev[:, 0].reshape(-1, 1), y_dev)
lin_pred = lin_reg.predict(X_test[:, 0].reshape(-1, 1))

I also don’t think the reshape is necessary.

lin_reg.fit(X_dev[:, 0], y_dev)
lin_pred = lin_reg.predict(X_test[:, 0])

Should work as well

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks @John H your suggestion helped me solve it. The second method you proposed is the one I initially tried, but got the error message ValueError: Expected 2D array, got 1D array instead.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.