
I am unable to get this linear regression code to converge and cannot work out what is going wrong. Can anybody help? The process:

  1. Collect x and y from the dataset
  2. Create x_updated by adding a column of 1s at the front
  3. Apply gradient descent to the squared-error loss (I wrote separate code for calculating the gradient and the loss; see the formula after this list)
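For reference, the gradient in step 3 corresponds to the squared-error loss

$$L(w) = \tfrac{1}{2}\lVert Xw - y \rVert_2^2, \qquad \nabla L(w) = X^\top X\, w - X^\top y,$$

which is exactly what `calc_naive_loss_gradient` in the code below returns.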
import pandas as pd
import numpy as np

class LinearReg:
    
    def __init__(self,with_reg=False,learning_rate=0.001,stopping_threshold=0.0001,iterations=100000):
        """
        Initialise the constructor of the linear regression
        """
        
        #check whether to fit the regularised-loss model or not
        self.with_reg=with_reg
        
        #stopping rule for the gradient descent
        self.stopping_threshold=stopping_threshold
        self.iterations=iterations
        
        
        #define the learning rate required for gradient descent
        self.learning_rate=learning_rate
        
        
    def calc_naive_loss_gradient(self,weight_vector):
        """
        Calculate the gradient of the non-regularised
        loss with the given x, y and weights
        """
        first_comp=np.dot(np.dot(self.x_updated.transpose(),self.x_updated),weight_vector)
        second_comp=np.dot(self.x_updated.transpose(),self.y)
        
        return(first_comp-second_comp)
    
    
    
    def calc_naive_loss(self,weight_vector):
        """
        Calculate the naive loss function value
        """
        return(np.sqrt(np.sum(np.dot(self.x_updated,weight_vector)**2)))
    
    
    
    def gradient_decent(self,weight_vector_new,weight_vector_old):
        """
        Function to apply gradient descent to the loss function
        """
        print('Weight vector old: {}'.format(weight_vector_old))
        
        while(True):
            weight_vector_old=weight_vector_new.copy()            
            weight_vector_new=weight_vector_old-self.learning_rate*self.calc_naive_loss_gradient(weight_vector_old)
            
            
            print('Updated loss: {}'.format(self.calc_naive_loss(weight_vector_new)))
            dist_weights=np.sqrt(np.sum((weight_vector_new-weight_vector_old)**2))
            if(dist_weights<self.stopping_threshold):
                break
                
        return(weight_vector_new)
    
    
    def fit(self,x,y):
        """
        Function to fit the linear regression
        """
        #define a column vector of 1s
        one_vector=np.ones(x.shape[0]).reshape(x.shape[0],1)

        #concatenate x with the column of 1s
        self.x_updated=np.concatenate((one_vector,x),axis=1)
        self.y=y
        
        #initialise a random weight
        weight_vector=np.random.uniform(0,1,self.x_updated.shape[1])
        
        #run gradient descent to get the best weights
        best_weight=self.gradient_decent(weight_vector_new=weight_vector.copy(),
                                         weight_vector_old=weight_vector.copy())
        
        print('Best loss: {}'.format(self.calc_naive_loss(weight_vector=best_weight)))



a = LinearReg()

from sklearn.datasets import make_regression
x, y = make_regression(n_features=5, n_samples=2010)

a.fit(x, y)

3 Answers


Update the error and the weights inside the gradient descent function, then loop over iterations inside fit. Check the code below.

import numpy as np

class LinearReg:
    
    def __init__(self, with_reg=False, learning_rate=0.0001,
                 stopping_threshold=1e-8, iterations=100000):
        self.stopping_threshold = stopping_threshold
        self.iterations = iterations
        self.learning_rate = learning_rate
          
          
    def gradient_descent(self):
        direction = self.x_updated.T @ (self.y - self.x_updated @ self.weights)
        new = self.weights + self.learning_rate * direction
        self._error = np.linalg.norm(new - self.weights)
        self.weights = new
    
    
    def fit(self,x,y, intercept = False):
        self.x_updated = np.c_[np.ones((y.size, 1)), x] if intercept else x 
        self.y=y
        self.weights = np.random.uniform(0, 1, self.x_updated.shape[1]) 
        for it in range(self.iterations):
            self.gradient_descent()
            if self._error<self.stopping_threshold:
                print(f"Took {it} iterations to converge")
                break
        print(self.weights)
    
    

from sklearn.datasets import make_regression

x, y = make_regression(n_features=5, n_samples=2010)
LinearReg(learning_rate=0.0005).fit(x, y)
Took 9 iterations to converge
[44.49799439 48.81286468 96.08803245 93.87028819 84.4267467 ]

# compare:    
from sklearn.linear_model import LinearRegression
print(LinearRegression(fit_intercept = False).fit(x, y).coef_)
[44.49799439 48.81286468 96.08803245 93.87028819 84.4267467 ]

11 Comments

Thanks for the answer. But can you elaborate on why my code fails to give correct results?
@Heisenberg you are subtracting the gradient instead of adding. That is the main problem. Then you just need to clean it up.
weight_vector_new=weight_vector_old-self.learning_rate*self.calc_naive_loss_gradient(weight_vector_old) Isn't this the correct formulation? If not, can you correct the equation?
@Heisenberg you should subtract the negative, so it becomes an addition.
Still not clear, sir. I have to move against the direction of the slope: if the slope is positive I have to move in the negative direction, and if it is negative I have to move in the positive direction. Hence it should be a subtraction, right? Or is the calculation of the gradient wrong?
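For reference, the two update rules being debated here are algebraically identical:

$$w + \eta\, X^\top (y - Xw) \;=\; w - \eta\,\bigl(X^\top X\, w - X^\top y\bigr),$$

so the sign alone does not distinguish the two versions; the working example above also uses a smaller learning rate (0.0005 instead of 0.001) and a fixed iteration cap.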

This can help you create something very close to the LinearRegression class provided by sklearn.

import numpy as np

class LinearRegression():
  betas = None

  def fit(self, x, y, learning_rate = 0.001):
    beta_0 = beta_1 = 0
    n = len(x)

    while True:
      y_pred = beta_0 + (beta_1 * x)
      # mean squared error, rounded so the loop can stop at exactly 0
      cost = round(np.mean((y - y_pred) ** 2), 5)
      # partial derivatives of the MSE with respect to each coefficient
      beta_0_d = (-2/n) * sum(y - y_pred)
      beta_1_d = (-2/n) * sum(x * (y - y_pred))
      # move each coefficient against its gradient
      beta_0 = beta_0 - (beta_0_d * learning_rate)
      beta_1 = beta_1 - (beta_1_d * learning_rate)
      if cost == 0:
        break
    self.betas = (round(beta_0, 2), round(beta_1, 2))

  def predict(self, x):
    return self.betas[0] + self.betas[1] * x
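A minimal usage sketch of this class (the one-dimensional x, the noiseless y, and the larger learning_rate of 0.5 are assumptions made for the example, not part of the answer):

import numpy as np

# illustrative, noiseless 1-D data so the rounded MSE can actually reach 0
x = np.linspace(0, 1, 50)
y = 2 + 3 * x

model = LinearRegression()
model.fit(x, y, learning_rate=0.5)   # 0.5 is stable here because x lies in [0, 1]
print(model.betas)                   # roughly (2.0, 3.0)
print(model.predict(0.5))            # roughly 3.5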

For more details, you can check out this article on gradient descent by ml-concepts.com, which explains the whole process of creating the linear regression algorithm from scratch.

If you are only interested in the code then you can refer to this Google Colab notebook instead.

(Full disclosure - I am a part of the ml-concepts.com team)

1 Comment

Thanks for the answer. But can you elaborate on why my code fails to give correct results?

The code is alright except for the formula used to calculate the loss function. The correct code for the loss is:

    def calc_naive_loss(self, weight_vector):
        """
        Calculate the naive loss function value
        """
        return np.sum((self.y - self.x_updated @ weight_vector) ** 2)
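In equation form this is the usual squared-error loss

$$L(w) = \lVert y - Xw \rVert_2^2 = \sum_i \bigl(y_i - x_i^\top w\bigr)^2,$$

whereas the original calc_naive_loss computed $\lVert Xw \rVert_2$ and never used y, so the value it printed was not a measure of fit.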
    

