10

I would like to convert a NumPy array of integers representing ASCII codes to the corresponding string. For example, ASCII code 97 corresponds to the character "a". I tried:

from numpy import *
a=array([97, 98, 99])
c = a.astype('string')
print c

which gives:

['9' '9' '9']

but I would like to get the string "abc".

6 Answers

11
print("".join(chr(item) for item in a))

output

abc

1 Comment

Thanks Ashoka for the nice solution. I was too focused on trying to use a NumPy function, but this seems like an elegant solution.
11

Another solution that does not involve leaving the NumPy world is to view the data as strings:

arr = np.array([97, 98, 99], dtype=np.uint8).view('S3').squeeze()

or if your numpy array is not 8-bit integers:

arr = np.array([97, 98, 99]).astype(np.uint8).view('S3').squeeze()

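In both cases, however, you have to put the right length in the data type (e.g. 'S3' for 3-character strings). The result is a zero-dimensional NumPy array of bytes; under Python 3, calling .item().decode('ascii') on it yields the str 'abc'.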
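The length does not have to be hard-coded. As a minimal sketch, assuming a one-dimensional array that holds only valid ASCII codes (the helper name codes_to_str is hypothetical, not part of NumPy), the dtype string can be built from the array size:

```python
import numpy as np

def codes_to_str(codes):
    """Convert a 1-D array of ASCII codes to a str (hypothetical helper)."""
    # One byte per code, in a contiguous buffer so it can be reinterpreted
    codes = np.ascontiguousarray(codes, dtype=np.uint8)
    if codes.size == 0:
        return ""
    # View the whole buffer as a single fixed-width byte string of length N.
    # Note: NumPy's 'S' dtype drops trailing NUL bytes (code 0).
    as_bytes = codes.view("S%d" % codes.size)[0]
    return as_bytes.decode("ascii")

print(codes_to_str(np.array([97, 98, 99])))
```

Because the view reinterprets the existing buffer, no Python-level loop over the elements is needed.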


7

Create an array of bytes and decode the byte representation using the ASCII codec:

np.array([98,97,99], dtype=np.int8).tostring().decode("ascii")

Note that tostring is badly named: it actually returns bytes, which happen to be a str in Python 2. In Python 3 you get the bytes type back, which needs to be decoded.


6
import numpy as np
np.array([97, 98, 99], dtype='b').tobytes().decode("ascii")

Output:

'abc'

Data type objects (dtype)

tostring() is deprecated since version 1.19.0. Use tobytes() instead.
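When the array uses the default integer dtype rather than an 8-bit one, it has to be narrowed before calling tobytes(). The following is only a sketch; the range check is an added assumption about what counts as valid input, not something the answer above requires:

```python
import numpy as np

a = np.array([97, 98, 99])  # default integer dtype, not 8-bit

# Guard against values outside the 7-bit ASCII range before narrowing,
# since astype(np.uint8) would otherwise silently wrap them around
if a.size and (a.min() < 0 or a.max() > 127):
    raise ValueError("array contains non-ASCII codes")

s = a.astype(np.uint8).tobytes().decode("ascii")
print(s)  # abc
```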


1
from numpy import array

a = array([97, 98, 99])
print("{0:c}{1:c}{2:c}".format(a[0], a[1], a[2]))

Of course, join and a list comprehension can be used here as well.

2 Comments

But this only works for len(a) == 3, which seems very fragile.
@jonrsharpe I should've mentioned that I just wanted to show the format() method, which could be used inside a loop.
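For an array of any length, the format() approach discussed in these comments can be applied per element and joined. A minimal sketch (the four-element array is just illustrative input):

```python
import numpy as np

a = np.array([97, 98, 99, 100])

# "{:c}" formats an integer as the character with that code point;
# int(x) converts each NumPy scalar to a plain Python int first
s = "".join("{:c}".format(int(x)) for x in a)
print(s)  # abcd
```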
1

Solutions that rely on Python loops or string formatting will be slow for large datasets. If you know that all of your data are ASCII, a faster approach could be to use fancy indexing:

import numpy as np
a = np.array([97, 98, 99])
np.array([chr(x) for x in range(128)])[a]  # range(128) so that code 127 is covered too
# array(['a', 'b', 'c'], dtype='<U1')

An advantage is that it works for arbitrarily shaped arrays.
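To illustrate the point about shapes, here is a small sketch using a lookup table covering codes 0 to 127 and a 2-D input; the variable names table, codes, chars and rows are illustrative only:

```python
import numpy as np

# Lookup table: index i holds the character for ASCII code i (0..127)
table = np.array([chr(i) for i in range(128)])

codes = np.array([[72, 105, 33],
                  [97, 98, 99]])

chars = table[codes]  # same shape as codes, one character per element
rows = ["".join(row) for row in chars]  # one string per row
print(rows)  # ['Hi!', 'abc']
```

The indexing step itself is vectorized; only the final join per row runs in Python.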

