
I'm trying to create a confusion matrix in NumPy, so I initialize a zero-filled matrix of size n x n, where n is the number of classes in my classifier.

After that I iterate over every output (which is a word, a number, or whatever, but is always contained in self.clases__), get the indices in the set of classes that correspond to the real label and to the output, and try to increase the value at that position by 1.

This raises an IndexError. Here's the output for a simple example:

In[99]: e.score([[0.1,0.5,0.7,0.8],[0.4,0.5,0.1,0.1,0.9],[0.1,0.9],
[0.1,0.5,0.8,0.2,0.9]],['Clase 1','Clase 2','Clase 2','Clase 1'])

Clase 1 -> 0
Clase 1 -> 0
(2, 2)
Clase 1 -> 0
Clase 2 -> 1
(2, 2)
Traceback (most recent call last):

File "<ipython-input-99-2b9291b14506>", line 2, in <module>
  e.score([[0.1,0.5,0.7,0.8],[0.4,0.5,0.1,0.1,0.9],[0.1,0.9], 
  [0.1,0.5,0.8,0.2,0.9]],['Clase 1','Clase 2','Clase 2','Clase 1'])

File "redes.py", line 156, in score
conf_matrix[i][j] += 1

IndexError: index 1 is out of bounds for axis 0 with size 1

Here's the code:

def score(self,X,Y):
    salidas_obtenidas = self.clasifica(X)

    conf_matrix = zeros((len(self.clases__),len(self.clases__)),dtype=int)

    indexador = array(range(0,len(self.clases__)))

    for k,obt in enumerate(salidas_obtenidas):
        i = indexador[self.clases__ == obt]
        j = indexador[self.clases__ == Y[k]]
        print("%s -> %d" % (obt,i))
        print("%s -> %d" % (Y[k],j))
        print(conf_matrix.shape)
        conf_matrix[i][j] += 1

    return conf_matrix

The thing that has me stumped is that the printed shape is obviously big enough to be indexed, and yet the exception is raised. I use NumPy's array and zeros functions.

Also, to my even greater confusion, conf_matrix doesn't update. If I print it at any point in the execution, it just prints a zero-filled matrix, even though it should have at least one position with a 1 in it before the exception is raised.

That makes me suspect it's not actually accessing the real matrix, but I have no idea why it wouldn't.

I'm using numpy version 1.14.2, and Python 3.6.

Any help is appreciated!

EDIT

I added some code so that hopefully someone can reproduce the error without needing the rest of the code:

import numpy as np
salidas_obtenidas = np.array(['Clase 1', 'Clase 2', 'Clase 1', 'Clase 2'])
salidas_obtenidas = salidas_obtenidas.reshape(len(salidas_obtenidas),1)
clases = np.array(['Clase 1','Clase 2'],dtype='<U7')
Y = ['Clase 1','Clase 2','Clase 2','Clase 1']

conf_matrix = np.zeros((len(clases),len(clases)),dtype=int)

indexador = np.array(range(0,len(clases)))

for k,obt in enumerate(salidas_obtenidas):
   i = indexador[clases == obt]
   j = indexador[clases == Y[k]]
   print("%s -> %d" % (obt,i))
   print("%s -> %d" % (Y[k],j))
   print(conf_matrix.shape)
   conf_matrix[i][j] += 1

print(conf_matrix)
  • I don't understand the error either, and I can't reproduce it on my computer because I don't have all the code. I can tell you what I would do. I'm using PyCharm as an IDE and it has a wonderful debug option. You can set a breakpoint where the execution will stop, and you will be able to see the exact state of the program with all its variables and perform all sorts of operations on it. I would suggest you try that if you haven't yet. You can find PyCharm community edition here, free and open source: jetbrains.com/pycharm/download. Disclaimer: I am not connected to PyCharm in any way. Commented Apr 9, 2018 at 10:04
  • @GianlucaMicchi I'll prepare some sample code for reproducibility. Thanks for the advice. I'm using Spyder myself, which I'm sure has a debug option as well. I'll try that asap, thank you. Commented Apr 9, 2018 at 10:48

2 Answers


My reputation is too low to comment, hence writing here. In your case conf_matrix itself seems legitimate, but the indices i and j are obtained in a rather complicated way. Possible causes:

  1. i or j exceeds the limits (this doesn't seem to be the case, judging by the printed messages).
  2. The datatype of i is not int. Although the print command prints i, still check whether it is a list or similar. I did a short experiment at the command prompt below. Usually the index error reports size 1 when there is an underlying issue with the datatype:

>>> conf_matrix[0][1]
1
>>> conf_matrix[[0]][1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index 1 is out of bounds for axis 0 with size 1

Please let us know if this is the case. If I were you, I would check whether indexador is producing correct indices, to make sure it is not returning a range or something else unexpected.

Also, I would remove the line "conf_matrix[i][j] += 1" and run the for loop to see all the k, obt and i, j pairs it produces. Please share the output.
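
The datatype check suggested above could look roughly like this. It is only a sketch: the arrays mirror the names from the question, and i_scalar is a name introduced here for illustration.

```python
import numpy as np

# Stand-ins for self.clases__ and indexador from the question.
clases = np.array(['Clase 1', 'Clase 2'])
indexador = np.arange(len(clases))

# Boolean-mask indexing always returns an array, even when only one element matches.
i = indexador[clases == 'Clase 2']
print(type(i), i.shape)

# Taking element 0 gives a NumPy integer scalar, which is what a matrix index should be.
i_scalar = i[0]
print(type(i_scalar), np.ndim(i_scalar))
```

If the first print reports an ndarray with a non-empty shape, the index is not a plain integer, which matches cause 2.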


1 Comment

Indeed the problem was 2, i and j were not type int, indexador was outputting arrays, not numbers. Kinda mad at myself for not seeing it sooner.

I found the issue!

i = indexador[self.clases__ == obt]
j = indexador[self.clases__ == Y[k]]

With this code, i and j are actually arrays of shape (1,), and I was using them as if they were simply scalars.

In case someone runs into the same issue, it's fixed by changing those lines to:

i = indexador[self.clases__ == obt][0]
j = indexador[self.clases__ == Y[k]][0]

so that you grab as i and j the number inside the array of shape (1,) instead of the whole array.
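
For anyone wondering why the matrix never changed before the exception: as far as I understand NumPy's indexing rules, indexing with an integer array is "advanced indexing" and returns a copy, so conf_matrix[i] with i of shape (1,) is a new array of shape (1, n). The following is a minimal sketch of that behaviour and of the fix, using stand-in data modelled on the example in the question (obtenidas, row and error_message are names made up for this illustration):

```python
import numpy as np

conf_matrix = np.zeros((2, 2), dtype=int)
i = np.array([0])  # what indexador[mask] returns: an array of shape (1,)
j = np.array([1])

# Indexing with an integer array returns a copy of shape (1, 2), not a view.
row = conf_matrix[i]
print(row.shape, np.shares_memory(row, conf_matrix))

# The second [j] then indexes axis 0 of that copy, which has size 1,
# so index 1 is out of bounds.
try:
    conf_matrix[i][j] += 1
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# Corrected loop: scalar indices in a single indexing operation.
clases = np.array(['Clase 1', 'Clase 2'])
indexador = np.arange(len(clases))
obtenidas = ['Clase 1', 'Clase 2', 'Clase 1', 'Clase 2']
Y = ['Clase 1', 'Clase 2', 'Clase 2', 'Clase 1']

for obt, real in zip(obtenidas, Y):
    i = indexador[clases == obt][0]
    j = indexador[clases == real][0]
    conf_matrix[i, j] += 1

print(conf_matrix)
```

Writing conf_matrix[i, j] instead of conf_matrix[i][j] also avoids building an intermediate row on every iteration.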
