
I'm trying to build my own implementation of the neural network backpropagation algorithm. The code I have written for training is this so far:

import numpy as np

def train(x, labels, n):
    lam = 0.5
    w1 = np.random.uniform(0, 0.01, (20, 120))   # hidden-layer weights
    w2 = np.random.uniform(0, 0.01, 20)           # output-layer weights
    for i in xrange(n):
        w1 = w1/np.linalg.norm(w1)
        w2 = w2/np.linalg.norm(w2)
        for j in xrange(x.shape[0]):
            y1 = np.zeros((600))                  # network output
            d1 = np.zeros((20))
            p = np.mat(x[j, :])
            a = np.dot(w1, p.T)                   # activation
            z = 1/(1 + np.exp((-1)*a))
            y1[j] = np.dot(w2, z)
            for k in xrange(20):
                d1[k] = z[k]*(1 - z[k])*(y1[j] - labels[j])*np.sum(w2)  # delta update rule
                w1[k, :] = w1[k, :] - lam*d1[k]*x[j, :]                 # weight update
                w2[k] = w2[k] - lam*(y1[j] - labels[j])*z[k]
            E = 1/2*pow((y1[j] - labels[j]), 2)                         # mean squared error
        print E
    return 0

Number of input units: 120, hidden units: 20, output units: 1, training samples: 600.

x is a 600×120 training set with zero mean and unit variance (max value 3.28, min value -4.07). The first 200 samples belong to class 1, the second 200 to class 2, and the last 200 to class 3. labels holds the class label assigned to each sample, and n is the number of iterations required for convergence. Each sample has 120 features.
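
For anyone who wants to reproduce this, inputs with the stated shapes can be generated synthetically (this is placeholder data matching the description above, not the actual dataset):

import numpy as np

x = np.random.randn(600, 120)                 # 600 samples x 120 features
x = (x - x.mean(axis=0)) / x.std(axis=0)      # zero mean, unit variance per feature
labels = np.repeat([1, 2, 3], 200)            # three classes of 200 samples each
n = 100                                       # arbitrary iteration count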

I have initialized the weights between 0 and 0.01, and the input data is scaled to have zero mean and unit variance, yet the code still throws an overflow warning, and 'a', i.e. the activation values, end up as NaN. I can't understand what the problem is.

Every sample has 120 elements. A sample row of x :

[ 0.80145231  1.29567936  0.91474224  1.37541992  1.16183938  1.43947296
  1.32440357  1.43449479  1.32742415  1.40533852  1.28817561  1.37977183
  1.2290933   1.34720161  1.15877069  1.29699635  1.05428735  1.21923531
  0.92312685  1.1061345   0.66647463  1.00044203  0.34270708  1.05589558
  0.28770958  1.21639524  0.31522575  1.32862243  0.42135899  1.3997094
  0.5780146   1.44444501  0.75872771  1.47334256  0.95372771  1.48878048
  1.13968139  1.49119962  1.33121905  1.47326017  1.47548571  1.4450047
  1.58272343  1.39327328  1.62929132  1.31126604  1.62705274  1.21790335
  1.59951034  1.12756958  1.56253815  1.04096709  1.52651382  0.95942134
  1.48875633  0.87746762  1.45248623  0.78782313  1.40446404  0.68370011
  • Can you give example inputs (x,labels,n)? Commented Apr 17, 2014 at 9:06
  • You appear to be using np.dot to multiply a numpy array and a numpy matrix - probably not good practice (see this). Could p be an array instead? I don't know if this is the cause of your problem. Commented Apr 17, 2014 at 9:12
  • I did that for debugging. I had earlier implemented it with p as an array. Still not working. Commented Apr 17, 2014 at 9:22
  • Are you able to say what x,labels and n are? Commented Apr 17, 2014 at 9:24
  • Thanks, what size/shape is the x array? Example inputs in the question would be good :-) Commented Apr 17, 2014 at 9:34

2 Answers


Overflow

The logistic sigmoid function is prone to overflow in NumPy as the signal strength increases. Try adding the following line before computing the exponential:

signal = np.clip( signal, -500, 500 )

This will limit the values in the NumPy array to the given interval. In turn, this prevents the floating-point overflow inside the sigmoid's np.exp call. I find ±500 to be a convenient signal saturation level.

>>> arr
array([[-900, -600, -300],
       [   0,  300,  600]])
>>> np.clip( arr, -500, 500)
array([[-500, -500, -300],
       [   0,  300,  500]])

Implementation

This is the snippet I'm using in my projects:

def sigmoid_function( signal ):
    # Prevent overflow.
    signal = np.clip( signal, -500, 500 )
    
    # Calculate activation signal
    signal = 1.0/( 1 + np.exp( -signal ))
    
    return signal
#end
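
A quick check of the clipped version on extreme inputs (a sketch assuming NumPy is imported as np and sigmoid_function is defined as above):

extreme = np.array([-1000.0, 0.0, 1000.0])
print(sigmoid_function(extreme))   # roughly [7e-218, 0.5, 1.0], no overflow warning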

Why does the Sigmoid function overflow?

As training progresses, the network's outputs sharpen, so the sigmoid signal converges on 1 from below or 0 from above as the accuracy approaches perfection. E.g., either 0.99999999999... or 0.00000000000000001...

To push the output towards 0, np.exp(-signal) must be evaluated on a very large positive argument; once that argument exceeds roughly 709 the result no longer fits in a 64-bit float, and NumPy reports an overflow.
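
A minimal demonstration of that failure mode (assuming NumPy imported as np):

a = -1000.0                 # a strongly negative activation
1.0 / (1 + np.exp(-a))      # np.exp(1000.0) overflows float64 -> inf, warning raised, result 0.0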

Note: the warning itself can be silenced globally by setting:

np.seterr( over='ignore' )
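
If you prefer to silence it only around the sigmoid call rather than globally, the equivalent context manager keeps the suppression local:

with np.errstate( over='ignore' ):
    signal = 1.0/( 1 + np.exp( -signal ))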


This code reproduces PyTorch's numerically stable sigmoid:

import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    # Split the input by sign so that np.exp is only ever called
    # on non-positive arguments, which cannot overflow.
    positives = x >= 0
    negatives = ~positives

    exp_x_neg = np.exp(x[negatives])

    y = x.copy()
    y[positives] = 1 / (1 + np.exp(-x[positives]))   # exp of a non-positive number
    y[negatives] = exp_x_neg / (1 + exp_x_neg)       # algebraically identical, overflow-safe

    return y

Test against PyTorch:

import torch

values = np.random.randint(-500000, 500000,
                           size=(1,3,512,512)).astype(np.float32) / 100.0

x_np = sigmoid(values)
x_t = torch.sigmoid(torch.tensor(values))

err_per_element = np.abs((x_np - x_t.cpu().numpy())).sum() / np.prod(values.shape)
print(err_per_element) # 6.039030965114082e-12

