I am porting some code from Python 2 to Python 3. In my code, I define a data type for strings as:

import numpy as np

MAX_WORD_LENGTH = 32
DT_WORD = np.dtype([('word', str('U') + str(MAX_WORD_LENGTH))])

Which shows up as:

>>> DT_WORD.descr
[('word', '<U32')]

Now, when I create a basic numpy array, I get no errors:

>>> import numpy as np
>>> np.array(['a', 'b', 'c', 'd'])
array(['a', 'b', 'c', 'd'],
      dtype='<U1')

But when I introduce my data type,

>>> np.array(['a','b','c','d'], dtype=DT_WORD)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'

What does this error mean? All strings in Python 3 are Unicode by default, so explicitly declaring the data type as Unicode shouldn't raise an error. How do I define my data type so that it accepts Unicode strings in both Python 2 and 3?

1 Answer

I eventually figured it out:

When you use a labelled dtype, the array is actually a structured array. Structured arrays are created from a list of tuples, one tuple per record (not simply a list of values). So:

np.array(['a','b','c','d'], dtype=DT_WORD)

Should be:

np.array([('a',), ('b',), ('c',), ('d',)], dtype=DT_WORD)

More concisely, if X is a list of strings, you can use:

np.array(list(zip(X)), dtype=DT_WORD)

This works in both Python 2 and 3.
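
As a rough sketch of the zip approach end to end (the list name words is just a placeholder, and the dtype is written with a plain 'U' prefix rather than the str('U') form from the question):

```python
import numpy as np

MAX_WORD_LENGTH = 32
DT_WORD = np.dtype([('word', 'U' + str(MAX_WORD_LENGTH))])

# Hypothetical input: any list of strings.
words = ['a', 'b', 'c', 'd']

# zip() over a single list yields 1-tuples: ('a',), ('b',), ...
# Each tuple becomes one record of the structured array.
records = list(zip(words))
arr = np.array(records, dtype=DT_WORD)

# Indexing by field name returns an ordinary unicode array.
word_column = arr['word']

print(arr.shape)
print(word_column.tolist())
```

The list() call matters in Python 3, where zip returns an iterator rather than a list.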

The original code also raises a TypeError in Python 2:

np.array(['a','b','c','d'], dtype=DT_WORD)
# Raises:
# TypeError: expected a readable buffer

So my question was partly incorrect in the first place. The problem had less to do with the Python version than with the distinction between plain arrays and structured arrays.
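
To make that distinction concrete, here is a small comparison. It assumes that a plain 'U32' dtype (no field name) would be enough if you only need one string per element; the variable names are illustrative.

```python
import numpy as np

# Plain array: the dtype is a bare unicode type, so a flat list of strings works.
plain = np.array(['a', 'b', 'c', 'd'], dtype='U32')

# Structured array: the dtype has a named field, so each element is a tuple.
structured = np.array([('a',), ('b',), ('c',), ('d',)],
                      dtype=[('word', 'U32')])

# Only the structured dtype has field names.
print(plain.dtype.names)
print(structured.dtype.names)

# Pulling out the field gives the same values as the plain array.
print(np.array_equal(structured['word'], plain))
```

If you never need more than the one field, the plain dtype avoids the tuple wrapping altogether.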

3 Comments

Looks like "bytes-like object" is the Py3 equivalent of a "readable buffer". I don't know why they don't give a better error message. Trying to fill a structured array without the tuples must be a common enough error.
np.array([b'ab', b'b', b'c'],dtype=[('word','U3')]) works, but the b'ab' is rendered as a Chinese character. See docs.python.org/3/glossary.html#term-bytes-like-object. But why does a tuple work?
stackoverflow.com/questions/33532609/… - I gave a similar answer here; but didn't dig further.
