
I'm working on machine-learning code that calculates the cost function and runs gradient descent. I wrote each function separately, as shown:

def costFunction(theta, X, y):
    m = y.size

    J = (1/m) * (np.dot(-y, np.log(sigmoid(np.dot(X, theta)))) - np.dot((1-y), np.log(1 - sigmoid(np.dot(X, theta)))))

    return J

def gradiantDescent(alpha, theta, X, y, num_itr):
    m = y.shape[0]
    J_history = []
    theta = theta.copy()

    for _ in range(num_itr):
        tempZero = theta[0]

        theta -= (alpha/m) * np.dot(X.T, sigmoid(np.dot(X, theta)) - y)
        theta[0] = tempZero - (alpha/m) * np.sum(sigmoid(np.dot(X, theta)) - y)

        J_history.append(costFunction(theta, X, y))

    return theta, J_history

and when I call costFunction on its own, it works as I expected:

intial_theta = np.zeros(X.shape[1])

J = costFunction(intial_theta, X, y)

print(J) # works as expected

but when I call it inside the gradiantDescent function, every value in J_history comes out as nan:

theta, Jvec = gradiantDescent(0.05, intial_theta, X, y, 500)

print(Jvec)  # all values are nan

So how can I fix it?
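One common way J turns into nan in this kind of loop is that theta diverges, sigmoid saturates at exactly 0 or 1, and np.log then produces -inf (and 0 * -inf gives nan in the dot product). A guarded variant of the cost that clips the sigmoid output illustrates this; the eps value and costFunctionSafe name are illustrative choices, not from the original post:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def costFunctionSafe(theta, X, y, eps=1e-12):
    m = y.size
    # Clip the hypothesis away from exactly 0 and 1 so np.log never
    # sees 0, which is what produces -inf/nan once theta grows large.
    h = np.clip(sigmoid(np.dot(X, theta)), eps, 1 - eps)
    return (1/m) * (np.dot(-y, np.log(h)) - np.dot(1 - y, np.log(1 - h)))
```

With theta at zero this matches the usual log(2) cost; with an extreme theta it stays finite instead of returning nan.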

  • When you call gradiantDescent, do you call costFunction before that? Commented Dec 2, 2020 at 20:24
  • No, first I call gradiantDescent and it doesn't work as I expected, so I called costFunction separately to see if it works, and it works correctly as shown above. @illusion Commented Dec 2, 2020 at 20:32
  • In your code only theta[0] is getting updated. Shouldn't it run for all the thetas in the theta array? Commented Dec 2, 2020 at 20:48
  • Or rather, you are updating only one weight. Shouldn't you update all of them? Commented Dec 2, 2020 at 20:52
  • It already runs for all thetas: the RHS of theta = ... is an array of shape (5,), so the whole theta array is updated on every iteration. I update theta[0] separately because it should take a different value than the other indices. Commented Dec 2, 2020 at 20:54

2 Answers


Try this in your gradiantDescent function:

for _ in range(num_itr):
    theta = theta - (alpha / m) * np.dot(X.T, (np.dot(X, theta) - y))
    J_history.append(costFunction(theta, X, y))
return theta, J_history

You get nan values because some intermediate calculation is going wrong: once theta diverges, sigmoid(np.dot(X, theta)) saturates at exactly 0 or 1 and the np.log calls in costFunction return -inf, which turns J into nan.
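Note that the snippet above is the linear-regression update; for logistic regression the sigmoid has to be applied to X·theta inside the gradient. A self-contained sketch of the single vectorized update for the logistic case (the sigmoid helper and the toy data in the usage note are assumptions layered on this answer, not the asker's exact code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def costFunction(theta, X, y):
    m = y.size
    h = sigmoid(np.dot(X, theta))
    return (1/m) * (np.dot(-y, np.log(h)) - np.dot(1 - y, np.log(1 - h)))

def gradiantDescent(alpha, theta, X, y, num_itr):
    m = y.shape[0]
    J_history = []
    theta = theta.copy()
    for _ in range(num_itr):
        # One vectorized update covers every component of theta,
        # including the intercept, since X's first column is all ones.
        theta = theta - (alpha/m) * np.dot(X.T, sigmoid(np.dot(X, theta)) - y)
        J_history.append(costFunction(theta, X, y))
    return theta, J_history
```

On a small separable dataset the recorded cost decreases monotonically instead of going to nan, because theta is updated once per iteration with a consistent gradient.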


  • You're right, the theta values were going wrong; you alerted me to this point. I tried to find the error and it was the minus operand '-'. I changed it to numpy.subtract() and it works correctly now. Thanks for your effort.

The minus operand was the error in calculating theta; use numpy.subtract(arr1, arr2) instead. Old code:

theta -= (alpha/m) * (np.dot(X.T , (sigmoid(np.dot(X,theta))-y)))

New code:

np.subtract( theta ,(alpha/m) * (np.dot(X.T , (sigmoid(np.dot(X,theta))-y))) )
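For what it's worth, this change does more than swap operators: np.subtract(a, b) returns a new array and does not modify theta in place, so unless the result is assigned back to theta, the variable is left untouched. A small sketch (the array values are illustrative) showing the difference between the two forms:

```python
import numpy as np

theta = np.array([1.0, 2.0])
grad = np.array([0.5, 0.5])

# In-place subtraction mutates the array immediately.
theta_inplace = theta.copy()
theta_inplace -= grad               # theta_inplace is now [0.5, 1.5]

# np.subtract without assignment returns a new array
# and leaves the original operand unchanged.
result = np.subtract(theta, grad)   # new array [0.5, 1.5]
# theta is still [1.0, 2.0]
```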

