I'm trying to use the SVM from the sklearn library to perform some image recognition, but when I call the fit method, I get a "ValueError: setting an array element with a sequence." error. My code is as follows.

My testing.py file:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm
from imageToNumberArray import imageToNumberArray

classAndValuesFile = "../Classes_Values.txt"
classesFiles = "../"

testImage = "ImageToPerformTestOn.png"

x = []
y = []

def main():
    i = 0
    with open(classAndValuesFile) as f:
        for line in f:
            splitter = line.split(",", 2)
            x.append(imageToNumberArray(classesFiles + splitter[0]))
            y.append(splitter[1].strip())

    clf = svm.SVC(gamma=0.001, C=100)
    clf.fit(x,y)
    #print clf.predict(testImage)

main()

The imageToNumberArray file is:

from PIL import Image
from numpy import array


def imageToNumberArray(path):
    img = Image.open(path)
    arr = array(img)
    return arr

And I'm getting the following error:

Traceback (most recent call last):
  File "D:\Research\project\testing.py", line 30, in <module>
    main()
  File "D:\Research\project\testing.py", line 23, in main
    clf.fit(x,y)
  File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 139, in fit
    X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
  File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 344, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.

If I comment out the clf.fit line, the script runs without errors.

Also, if I print the shapes of all the arrays in x, I get something like this (some are 2D, some are 3D):

(59, 58, 4)
(49, 27, 4)
(570, 400, 3)
(471, 364)
(967, 729)
(600, 600, 3)
(325, 325, 3)
(386, 292)
(86, 36, 4)
(49, 26, 4)
(578, 244, 3)
(300, 300)
(995, 557, 3)
(1495, 677)
(400, 400, 3)
(200, 230, 3)
(74, 67, 4)
(49, 34, 4)
(240, 217, 3)
(594, 546, 4)
(387, 230, 3)
(297, 273, 4)
(400, 400, 3)
(387, 230, 3)
(86, 62, 4)
(50, 22, 4)
(499, 245, 3)
(800, 566, 4)
(1050, 750, 3)
(400, 400, 3)
(499, 245, 3)
(74, 53, 4)
(47, 26, 4)
(592, 348, 4)
(1050, 750, 3)
(1600, 1600)
(320, 320)
(84, 54, 4)
(47, 25, 4)
(600, 294, 3)
(400, 400, 3)
(1050, 750, 3)
(1478, 761)
(504, 300, 3)
(53, 84, 4)
(36, 42, 4)
(315, 600, 4)
(223, 425, 3)
(194, 325, 3)

The first two numbers are the height and width of the image in pixels; the third, when present, is the number of channels.

What can I do to get rid of this error?

  • You almost definitely want to extract features from your images before doing any kind of machine learning (although I know KNN can work well for digit recognition). Check this out: codeproject.com/Articles/619039/… Commented Aug 24, 2015 at 16:09
  • Perhaps this can help you. Commented Aug 24, 2015 at 16:12

1 Answer

You seem to be confused about how an SVM works. In short, x has to be a single two-dimensional array of shape (n_samples, n_features), whereas in your case it is a list of matrices with different shapes. An SVM cannot be fit on such data. First, find a meaningful (for your data) way to represent each image as a fixed-size vector; this step is usually called feature extraction. One basic approach is to represent each image as a histogram, or as a bag of visual words.
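
As a rough illustration of the histogram idea, here is a minimal sketch that turns arrays of any shape into vectors of the same length. The function name histogram_features, the choice of 32 bins, and the random stand-in images are assumptions made for this example, not part of the original code; it also assumes 8-bit pixel values in the range 0 to 255.

```python
import numpy as np


def histogram_features(arr, bins=32):
    """Map an image array of any height/width to a fixed-length vector.

    Hypothetical helper: assumes 8-bit pixel values (0..255).
    """
    arr = np.asarray(arr, dtype=np.float64)
    if arr.ndim == 3:
        # Ignore a possible alpha channel, then average the colour
        # channels so every image becomes a 2D grayscale array.
        arr = arr[:, :, :3].mean(axis=2)
    hist, _ = np.histogram(arr, bins=bins, range=(0, 255))
    # Normalise by the pixel count so image size does not dominate.
    return hist / max(hist.sum(), 1)


# Random stand-ins for images with shapes like those in the question.
rng = np.random.default_rng(0)
shapes = [(59, 58, 4), (570, 400, 3), (471, 364)]
images = [rng.integers(0, 256, size=s) for s in shapes]

# Every row now has the same length, so X is a proper 2D array.
X = np.array([histogram_features(img) for img in images])
print(X.shape)  # (3, 32)
```

With real data, each array returned by imageToNumberArray would go through such a function before being appended to x, and clf.fit(x, y) would then receive one row per image. A plain intensity histogram throws away all spatial information, so treat it as a starting point rather than a good feature set.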
