I'm working on machine-learning code that calculates the cost function and runs gradient descent. I wrote each function separately, as shown:
    def costFunction(theta, X, y):
        m = y.size
        h = sigmoid(np.dot(X, theta))
        J = (1/m) * (np.dot(-y, np.log(h)) - np.dot((1 - y), np.log(1 - h)))
        return J
    def gradiantDescent(alpha, theta, X, y, num_itr):
        m = y.shape[0]
        J_history = []
        theta = theta.copy()
        for _ in range(num_itr):
            tempZero = theta[0]
            theta -= (alpha / m) * np.dot(X.T, (sigmoid(np.dot(X, theta)) - y))
            theta[0] = tempZero - (alpha / m) * np.sum(sigmoid(np.dot(X, theta)) - y)
            J_history.append(costFunction(theta, X, y))
        return theta, J_history
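As an aside on the update above: if X already contains a leading column of ones for the intercept, then the first component of `np.dot(X.T, error)` is exactly `np.sum(error)`, so the intercept is already handled by the vectorized line, and the separate `theta[0]` assignment recomputes its gradient with the *already updated* theta. A minimal sketch of a single step without that extra line (the name `gradient_step` and the toy data are mine, for illustration only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(alpha, theta, X, y):
    # One simultaneous gradient-descent update for logistic regression.
    # Because X's first column is all ones, (X.T @ error)[0] == error.sum(),
    # so this single line also updates the intercept theta[0] correctly.
    m = y.size
    error = sigmoid(X @ theta) - y
    return theta - (alpha / m) * (X.T @ error)

# Toy data: first column of ones is the intercept feature.
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 1.0, 1.0])
theta = gradient_step(0.1, np.zeros(2), X, y)
```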
When I call costFunction on its own, it works as I expect:
    intial_theta = np.zeros(X.shape[1])
    J = costFunction(intial_theta, X, y)
    print(J)  # works as expected
But when I call it inside gradiantDescent, every entry of J_history is nan:
    theta, Jvec = gradiantDescent(0.05, intial_theta, X, y, 500)
    print(Jvec)  # all values are nan
How can I fix this?
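For context on where nan typically comes from here: once theta grows enough that the sigmoid saturates to exactly 0.0 or 1.0 in float64, the cost evaluates log(0) = -inf, and a term like 0 * -inf yields nan. A common workaround is to clip the sigmoid output away from 0 and 1 before taking the log. A sketch (the name `cost_function_stable` and the `eps` value are my choices, not from the original code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function_stable(theta, X, y, eps=1e-12):
    # Clip h away from exactly 0 and 1 so np.log never sees 0:
    # log(0) = -inf, and 0 * -inf evaluates to nan in NumPy.
    m = y.size
    h = np.clip(sigmoid(X @ theta), eps, 1 - eps)
    return (1 / m) * (-(y @ np.log(h)) - ((1 - y) @ np.log(1 - h)))

# Demo: a theta large enough that sigmoid saturates to 1.0 / 0.0,
# which would make the unclipped cost return nan.
X = np.array([[1.0, 50.0], [1.0, -50.0]])
y = np.array([1.0, 0.0])
theta = np.array([0.0, 1.0])
print(cost_function_stable(theta, X, y))  # finite, not nan
```

Feature scaling and a smaller alpha also help keep X @ theta from saturating in the first place.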
In gradiantDescent, do you call costFunction before that?