Update: In the latest version of numpy (e.g., v1.8.1), this is no longer an issue. All the methods mentioned here now work as expected.

Original question: Using the object dtype to store a string array is sometimes convenient, especially when one needs to modify the contents of a large array without knowing the maximum string length in advance, e.g.,

>>> import numpy as np
>>> a = np.array([u'abc', u'12345'], dtype=object)

At some point, one might want to convert the dtype back to unicode or str. However, simple conversion will truncate the string at length 4 or 1 (why?), e.g.,

>>> b = np.array(a, dtype=unicode)
>>> b
array([u'abc', u'1234'], dtype='<U4')
>>> c = a.astype(unicode)
>>> c
array([u'a', u'1'], dtype='<U1')

Of course, one can always iterate over the entire array explicitly to determine the max length,

>>> d = np.array(a, dtype='<U{0}'.format(np.max([len(x) for x in a])))
>>> d
array([u'abc', u'12345'], dtype='<U5')

Yet, this is a little bit awkward in my opinion. Is there a better way to do this?

Edit to add: According to this closely related question,

>>> len(max(a, key=len))

is another way to find out the longest string length, and this step seems to be unavoidable...
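
For anyone who still needs the explicit route, the two steps above (find the longest string, then cast to that width) can be wrapped up as a minimal sketch. It assumes Python 3 and a 1-D or N-D object array holding only strings; object_to_unicode is a hypothetical name of mine, not a numpy function.

```python
import numpy as np


def object_to_unicode(arr):
    """Hypothetical helper: cast an object array of strings to '<U{n}'."""
    # Scan once for the longest string. Fall back to width 1 so the
    # dtype string is never '<U0' (empty input or only empty strings).
    width = max((len(s) for s in arr.ravel()), default=0)
    return arr.astype('<U{0}'.format(max(width, 1)))


a = np.array(['abc', '12345'], dtype=object)
d = object_to_unicode(a)
print(d.dtype)
```

The scan over the array is still there; it is just hidden inside the helper.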

  • Not a solution, but max(len(x) for x in a) is probably faster than constructing a list and calling np.max. Commented Apr 17, 2013 at 15:21
  • I edited the question just before your comment :D max(a, key=len) is even faster. Commented Apr 17, 2013 at 16:05

2 Answers

I know this is an old question but in case anyone comes across it and is looking for an answer, try

c = a.astype('U')

and you should get the result you expect:

c = array([u'abc', u'12345'], dtype='<U5')
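
A self-contained version of the same idea, as a sketch assuming Python 3 and a reasonably recent numpy (where plain str literals are already unicode):

```python
import numpy as np

a = np.array(['abc', '12345'], dtype=object)

# 'U' with no width asks numpy to pick the itemsize itself,
# so the longest element decides the final dtype.
c = a.astype('U')

print(c.dtype.kind)      # kind of the resulting dtype
print(c.dtype.itemsize)  # bytes per element (4 bytes per character)
```

Nothing gets truncated, because the width is taken from the data rather than from the first element.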

At least in Python 3.5 Jupyter 4 I can use:

a = np.array([u'12345', u'abc'], dtype=object)
b = a.astype(str)
b

works just fine for me and returns:

array(['12345', 'abc'], dtype='<U5')

2 Comments

It seems that if the array was initialised with dtype == np.str_, using astype(str) will not convert the dtype.
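
One possible reading of the comment above, as a sketch rather than a confirmed explanation: when the array already has a fixed-width unicode dtype, astype(str) leaves that width alone, so it will not shrink an over-wide array. The '<U10' starting dtype below is my own assumption for illustration; passing an explicit width is one way around it.

```python
import numpy as np

# Already a fixed-width unicode array, wider than its contents need.
a = np.array(['abc', '12345'], dtype='<U10')

# Casting to the unsized str type keeps the existing width here.
b = a.astype(str)

# To shrink, measure the longest element and request that width explicitly.
n = int(np.char.str_len(a).max())
c = a.astype('<U{0}'.format(n))

print(a.dtype, b.dtype, c.dtype)
```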
