I'm trying to reproduce a tutorial seen here.

Everything works perfectly until I call the .fit method on my training set.

Here is a sample of my code:

# TRAINING PART

train_dir = 'pdf/learning_set'
dictionary = make_dic(train_dir)

train_labels = np.zeros(20)
train_labels[17:20] = 1
train_matrix = extract_features(train_dir)
model1 = MultinomialNB()
model1.fit(train_matrix, train_labels)


# TESTING PART

test_dir = 'pdf/testing_set'
test_matrix = extract_features(test_dir)
test_labels = np.zeros(8)
test_labels[4:7] = 1
result1 = model1.predict(test_matrix)
print(confusion_matrix(test_labels, result1))

Here is my Traceback:

Traceback (most recent call last):
  File "ML.py", line 65, in <module>
    model1.fit(train_matrix, train_labels)
  File "/usr/local/lib/python3.6/site-packages/sklearn/naive_bayes.py", line 579, in fit
    X, y = check_X_y(X, y, 'csr')
  File "/usr/local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 552, in check_X_y
    check_consistent_length(X, y)
  File "/usr/local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 173, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [23, 20]

How can I solve this issue? I'm working on Ubuntu 16.04 with Python 3.6.

1 Answer

ValueError: Found input variables with inconsistent numbers of samples: [23, 20]

That means you have 23 training vectors (train_matrix has 23 rows) but only 20 training labels (train_labels is an array of 20 values).

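Change train_labels = np.zeros(20) to train_labels = np.zeros(23) and it should work. Also check that train_labels[17:20] = 1 still marks the right documents, since each label has to line up with the row of train_matrix it describes.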
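To avoid hard-coding the count at all, you can size the label array from the feature matrix. This is only a rough sketch: make_labels is a hypothetical helper, the np.ones matrix stands in for whatever extract_features(train_dir) returns, and the choice of rows 20 to 22 as the positive class is an assumption you would replace with the real positions in your own file order.

```python
import numpy as np


def make_labels(n_samples, positive_rows):
    """Build a 0/1 label vector with one entry per row of the feature matrix.

    Hypothetical helper, not part of the tutorial or of scikit-learn.
    """
    labels = np.zeros(n_samples)
    labels[list(positive_rows)] = 1
    return labels


# Stand-in for extract_features(train_dir): 23 documents, 5 made-up features.
train_matrix = np.ones((23, 5))

# Size the labels from the matrix instead of writing 20 or 23 by hand.
# Which rows are positive is assumed here; it must match your file order.
train_labels = make_labels(train_matrix.shape[0], range(20, 23))

# Cheap sanity check to run before calling model1.fit(...).
assert train_matrix.shape[0] == train_labels.shape[0]
```

With this, adding or removing files in the training folder changes both lengths together, so the only thing left to maintain by hand is which rows are positive. The same pattern would apply to test_matrix and test_labels.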


1 Comment

Thanks a lot, it works perfectly! That was a silly mistake, haha.
