
I have this very simple problem, but somehow I have not found a solution for it yet:

I have two curves, A1 = [1,2,3] and A2 = [4,5,6].

I want to fit those curves to another curve B1 = [4,5,3] with linear regression, so that B1 = a*A1 + b*A2.

This can easily be done with sklearn's LinearRegression, but sklearn does not give you the standard deviation of the fitting parameters.

I tried using statsmodels, but somehow I can't get the format right:

import numpy as np
import statsmodels.api as sm

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([4, 5, 3])

ols = sm.OLS(a, b)

Error: ValueError: endog and exog matrices are different sizes

  • Do you want the STD of the predictions or of the coefficients? Commented Aug 13, 2021 at 14:23

2 Answers


If your formula is B1 = a*A1 + b*A2, then the array b is your endogenous variable and the array a holds your exogenous variables. You need to transpose your exogenous array so that each row is an observation:

ols = sm.OLS(b, a.T)
res = ols.fit()
res.summary()

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.250
Model:                            OLS   Adj. R-squared:                 -0.500
Method:                 Least Squares   F-statistic:                    0.3333
Date:                Sat, 14 Aug 2021   Prob (F-statistic):              0.667
Time:                        05:48:08   Log-Likelihood:                -3.2171
No. Observations:                   3   AIC:                             10.43
Df Residuals:                       1   BIC:                             8.631
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1            -2.1667      1.462     -1.481      0.378     -20.749      16.416
x2             1.6667      0.624      2.673      0.228      -6.257       9.590
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   3.000
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.531
Skew:                           0.707   Prob(JB):                        0.767
Kurtosis:                       1.500   Cond. No.                         12.3
==============================================================================
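
Note that res.summary() renders on its own only in an interactive session such as IPython or Jupyter; in a plain script, echoing res just shows the RegressionResultsWrapper repr. Use print(), or read the pieces you need directly from the results object:

print(res.summary())   # writes the table above to stdout
res.params             # fitted coefficients
res.bse                # standard errors of the coefficients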

From sklearn:

from sklearn.linear_model import LinearRegression
reg = LinearRegression(fit_intercept=False).fit(a.T,b)
reg.coef_
array([-2.16666667,  1.66666667])
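
Note that fit_intercept=False is what makes the two fits comparable: sm.OLS does not add a constant automatically, so both models fit B1 = a*A1 + b*A2 with no intercept. If you do want an intercept on the statsmodels side, pass sm.add_constant(a.T) as the exogenous matrix.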

Comments

  • When I follow your method and write out res, I just get: <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x7f38c12e5240>
  • By the way, is there a way to enforce positive fitting parameters with statsmodels?

Assume you have a data matrix X where each row is an observation, and a column vector y. The OLS solution is given by

beta_hat = (X'X)^(-1)*X'*y

where X' is the transpose of X and (X'X)^(-1) the matrix inverse of X'X.

The covariance matrix of the coefficients is given by

Var(beta_hat) = s^2*(X'X)^(-1)

with s^2 being the estimated residual variance (residual sum of squares divided by the residual degrees of freedom).

With that you can use any linear-algebra tool, e.g. NumPy, to do the calculations easily, as in the sketch below.
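
A minimal NumPy sketch of those formulas, using the arrays from the question (the variable names here are my own):

import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]]).T   # each row is an observation
y = np.array([4, 5, 3])

# beta_hat = (X'X)^(-1) X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# s^2 = residual sum of squares / residual degrees of freedom
resid = y - X @ beta_hat
dof = X.shape[0] - X.shape[1]            # n observations minus k parameters
s2 = resid @ resid / dof

# Var(beta_hat) = s^2 * (X'X)^(-1); standard errors are the sqrt of the diagonal
cov_beta = s2 * np.linalg.inv(X.T @ X)
std_err = np.sqrt(np.diag(cov_beta))

print(beta_hat)   # approx [-2.167, 1.667]
print(std_err)    # approx [1.462, 0.624], matching the statsmodels summary above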

