5

While practicing Simple Linear Regression Model I got this error:

ValueError: Expected 2D array, got scalar array instead:
array=60.
Reshape your data either using array.reshape(-1, 1) if your data has a single 
feature or array.reshape(1, -1) if it contains a single sample.

This is my code (Python 3.7):

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
data = pd.read_csv("hw_25000.csv")


hgt = data.Height.values.reshape(-1,1)
wgt = data.Weight.values.reshape(-1,1)

regression = LinearRegression()
regression.fit(hgt,wgt)

print(regression.predict(60))

2 Answers 2

31

Short answer:

regression.predict([[60]])

Long answer: regression.predict takes a 2d array of values you want to predict on. Each item in the array is a "point" you want your model to predict on. Suppose we want to predict on the points 60, 52, and 31. Then we'd say regression.predict([[60], [52], [31]])

The reason we need a 2d array is because we can do linear regression in a higher dimension space than just 2d. For example, we could do linear regression in a 3d space. Suppose we want to predict "z" for a given data point (x, y). Then we'd need to say regression.predict([[x, y]]).

Taking this example further, we could predict "z" for a set of "x" and "y" points. For example, we want to predict the "z" values for each of the points: (0, 2), (3, 7), (10, 8). Then we would say regression.predict([[0, 2], [3, 7], [10, 8]]) which fully demonstrates the need for regression.predict to take a 2d array of values to predict on points.

Sign up to request clarification or add additional context in comments.

1 Comment

that was an amazing response
2

The ValueError is fairly clear, predict expects a 2D array but you passed a scalar.

hgt = np.random.randint(50, 70, 10).reshape(-1, 1)
wgt = np.random.randint(90, 120, 10).reshape(-1, 1)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

regression = LinearRegression()
regression.fit(hgt,wgt)

regression.predict([[60]])

You get

array([[105.10013717]])

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.