Python SVM setting an array element with a sequence error

Question

I'm trying to use the SVM from the sklearn library to perform some image recognition, but when I call the fit method, I get a "ValueError: setting an array element with a sequence." error. My code is as follows.

My testing.py file:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import svm
from imageToNumberArray import imageToNumberArray

classAndValuesFile = "../Classes_Values.txt"
classesFiles = "../"

testImage = "ImageToPerformTestOn.png"

x = []
y = []

def main():
    i = 0
    with open(classAndValuesFile) as f:
        for line in f:
            splitter = line.split(",", 2)
            x.append(imageToNumberArray(classesFiles + splitter[0]))
            y.append(splitter[1].strip())

    clf = svm.SVC(gamma=0.001, C=100)
    clf.fit(x,y)
    #print clf.predict(testImage)

The imageToNumberArray file is:

from PIL import Image
from numpy import array


def imageToNumberArray(path):
    img = Image.open(path)
    arr = array(img)
    return arr

And I'm getting the following error:

Traceback (most recent call last):
  File "D:\Research\project\testing.py", line 30, in <module>
    main()
  File "D:\Research\project\testing.py", line 23, in main
    clf.fit(x,y)
  File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 139, in fit
    X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
  File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 344, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.

If I comment out the clf.fit line, it works just fine.

Also, if I print the shapes of all the arrays in x, I get something like this (some are 2D, some are 3D):

(59, 58, 4)
(49, 27, 4)
(570, 400, 3)
(471, 364)
(967, 729)
(600, 600, 3)
(325, 325, 3)
(386, 292)
(86, 36, 4)
(49, 26, 4)
(578, 244, 3)
(300, 300)
(995, 557, 3)
(1495, 677)
(400, 400, 3)
(200, 230, 3)
(74, 67, 4)
(49, 34, 4)
(240, 217, 3)
(594, 546, 4)
(387, 230, 3)
(297, 273, 4)
(400, 400, 3)
(387, 230, 3)
(86, 62, 4)
(50, 22, 4)
(499, 245, 3)
(800, 566, 4)
(1050, 750, 3)
(400, 400, 3)
(499, 245, 3)
(74, 53, 4)
(47, 26, 4)
(592, 348, 4)
(1050, 750, 3)
(1600, 1600)
(320, 320)
(84, 54, 4)
(47, 25, 4)
(600, 294, 3)
(400, 400, 3)
(1050, 750, 3)
(1478, 761)
(504, 300, 3)
(53, 84, 4)
(36, 42, 4)
(315, 600, 4)
(223, 425, 3)
(194, 325, 3)

The first two numbers are the height and width of the image in pixels; the third, where present, is the number of color channels.

What can I do to get rid of this error?

You almost definitely want to extract features from your images before doing any kind of machine learning (although I know KNN can work well for digit recognition). Check this out: codeproject.com/Articles/619039/…
– Ryan, commented Aug 24, 2015 at 16:09

lejlot · Accepted Answer · 2015-08-25 22:53:48Z

You seem to be confused about how SVM works. In short, x has to be a single two-dimensional array with one row per sample and one column per feature, while in your case it is a list of matrices with different shapes. NumPy cannot pack such a list into a rectangular float array, which is what raises the error inside check_array, so SVM will never run on this data. First, find a meaningful (for your data) way to represent each image as a fixed-size vector, a step usually called feature extraction. One basic approach is to represent each image as a histogram or as a bag of visual words.
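As a rough illustration of the histogram idea, here is a minimal sketch, not a recommendation for your particular data. The helper names to_grayscale and histogram_features and the choice of 32 bins are made up for this example, and it assumes 8-bit grayscale, RGB, or RGBA images (values from 0 to 255). Random arrays stand in for the output of imageToNumberArray so the snippet runs on its own.

```python
import numpy as np


def to_grayscale(arr):
    """Reduce an image array to 2-D (height, width) float values."""
    arr = np.asarray(arr, dtype=np.float64)
    if arr.ndim == 2:
        # Already a single-channel image.
        return arr
    # Assumption: channels are RGB or RGBA; ignore alpha, average the colors.
    return arr[..., :3].mean(axis=2)


def histogram_features(arr, bins=32):
    """Map an image of any size to a fixed-length intensity histogram."""
    gray = to_grayscale(arr)
    # Assumption: pixel values lie in 0..255 (8-bit images).
    hist, _ = np.histogram(gray, bins=bins, range=(0, 255))
    # Normalize so the feature does not depend on the number of pixels.
    return hist.astype(np.float64) / hist.sum()


# Stand-ins for the arrays returned by imageToNumberArray;
# the shapes are taken from the list in the question.
rng = np.random.default_rng(0)
images = [
    rng.integers(0, 256, size=(59, 58, 4), dtype=np.uint8),
    rng.integers(0, 256, size=(570, 400, 3), dtype=np.uint8),
    rng.integers(0, 256, size=(471, 364), dtype=np.uint8),
]

# One row per image, one column per histogram bin.
X = np.array([histogram_features(img) for img in images])
print(X.shape)  # (3, 32)
```

Because every row now has the same length, X is an ordinary 2-D array and can be passed to clf.fit(X, y) together with the label list. An intensity histogram discards all spatial information, so treat it as a baseline only.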
