18

I have a dataset split across two .csv files (dataTrain.csv and dataTest.csv) with this format:

Temperature(K),Pressure(ATM),CompressibilityFactor(Z)
273.1,24.675,0.806677258
313.1,24.675,0.888394713
...,...,...

I am able to build a regression model and make predictions with this code:

import pandas as pd
from sklearn import linear_model

dataTrain = pd.read_csv("dataTrain.csv")
dataTest = pd.read_csv("dataTest.csv")
# print(dataTrain.head())

# scikit-learn expects a 2-D feature array, hence reshape(-1, 1)
x_train = dataTrain['Temperature(K)'].to_numpy().reshape(-1, 1)
y_train = dataTrain['CompressibilityFactor(Z)']

x_test = dataTest['Temperature(K)'].to_numpy().reshape(-1, 1)
y_test = dataTest['CompressibilityFactor(Z)']

ols = linear_model.LinearRegression()
model = ols.fit(x_train, y_train)

print(model.predict(x_test)[0:5])

However, what I want to do is multivariable regression, so that the model is CompressibilityFactor(Z) = intercept + coef*Temperature(K) + coef*Pressure(ATM), with a separate coefficient for each variable.

How can I do that in scikit-learn?

1
  • Just include both Temperature and Pressure in your x_train and x_test: x_train = dataTrain[["Temperature(K)", "Pressure(ATM)"]], and then the same for x_test. Commented Feb 5, 2017 at 18:50

2 Answers

17

If your code above works for the univariate case, try this:

import pandas as pd
from sklearn import linear_model

dataTrain = pd.read_csv("dataTrain.csv")
dataTest = pd.read_csv("dataTest.csv")
# print(dataTrain.head())

x_train = dataTrain[['Temperature(K)', 'Pressure(ATM)']].to_numpy().reshape(-1,2)
y_train = dataTrain['CompressibilityFactor(Z)']

x_test = dataTest[['Temperature(K)', 'Pressure(ATM)']].to_numpy().reshape(-1,2)
y_test = dataTest['CompressibilityFactor(Z)']

ols = linear_model.LinearRegression()
model = ols.fit(x_train, y_train)

print(model.predict(x_test)[0:5])
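
Since the CSV files are not available here, the following is a minimal self-contained sketch of the same idea on synthetic data. The numbers, the linear relationship used to generate them, and the names `train`, `features`, and `x_new` are assumptions made only for illustration. It also shows that selecting two columns from a DataFrame already gives a 2-D input, so it can be passed to `fit` directly.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for dataTrain.csv (values are made up for illustration).
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "Temperature(K)": rng.uniform(270.0, 400.0, size=50),
    "Pressure(ATM)": rng.uniform(1.0, 50.0, size=50),
})
# Hypothetical noise-free linear relationship, chosen so the fit is easy to check.
train["CompressibilityFactor(Z)"] = (
    0.5 + 0.001 * train["Temperature(K)"] - 0.002 * train["Pressure(ATM)"]
)

features = ["Temperature(K)", "Pressure(ATM)"]
x_train = train[features]                     # already 2-D: one column per feature
y_train = train["CompressibilityFactor(Z)"]

model = LinearRegression().fit(x_train, y_train)

# Predict for one new (temperature, pressure) pair, using the same column names.
x_new = pd.DataFrame({"Temperature(K)": [300.0], "Pressure(ATM)": [10.0]})
prediction = model.predict(x_new[features])
print(prediction)
```

Because the synthetic target is exactly linear in both features, the prediction should agree with 0.5 + 0.001*300 - 0.002*10 up to floating-point error. Real compressibility data will not be fit this cleanly.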

1 Comment

DataFrames don't have a reshape method. To run the above code I had to use .values first, e.g. x_train = dataTrain[['Temperature(K)', 'Pressure(ATM)']].values.reshape(-1,2).
0

That's correct: you need to use .values.reshape(-1,2) (or .to_numpy().reshape(-1,2) in newer pandas versions).

In addition if you want to know the coefficients and the intercept of the expression:

CompressibilityFactor(Z) = intercept + coef*Temperature(K) + coef*Pressure(ATM)

you can get them with:

coefficients = model.coef_      # one coefficient per feature, in the column order of x_train
intercept = model.intercept_    # scalar intercept term
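
As a rough check of what these attributes mean, the sketch below fits a model on made-up data and rebuilds the predictions by hand from `coef_` and `intercept_`. The data, the generating coefficients (0.5, 0.001, -0.002), and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up feature matrix: column 0 is temperature, column 1 is pressure.
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.uniform(270.0, 400.0, size=30),
    rng.uniform(1.0, 50.0, size=30),
])
# Hypothetical noise-free target so the recovered values are easy to verify.
y = 0.5 + 0.001 * X[:, 0] - 0.002 * X[:, 1]

model = LinearRegression().fit(X, y)

# coef_ follows the column order of X.
coef_temperature, coef_pressure = model.coef_
intercept = model.intercept_

# Rebuild the predictions from the fitted expression and compare with predict().
manual = intercept + coef_temperature * X[:, 0] + coef_pressure * X[:, 1]
matches = np.allclose(manual, model.predict(X))
print(matches)
```

The order of the entries in `coef_` follows the order of the feature columns passed to `fit`, so it is worth keeping that column list in one place.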
