Format numeric array to string-like array

Question

I'd like to have Numpy efficiently convert each element of a numeric array (e.g. float32) to a formatted, string-like array. I can get the result I expect by iterating over the elements and building a list:

import numpy as np
a = (10 ** np.arange(-5, 6, 2, dtype='d') * 3.14159).astype('f')
# array([3.14159e-05, 3.14159e-03, 3.14159e-01, 3.14159e+01, 3.14159e+03,
#        3.14159e+05], dtype=float32)

# Good conversion to a list
print([str(x) for x in a])
# ['3.14159e-05', '0.00314159', '0.314159', '31.4159', '3141.59', '314159.0']
print(list(map(lambda x: str(x), a)))  # also does the same

# Expected result: a string-like Numpy array
print(repr(np.array([str(x) for x in a])))
# array(['3.14159e-05', '0.00314159', '0.314159', '31.4159', '3141.59',
#        '314159.0'], dtype='<U11')

However, this example doesn't easily scale to multidimensional arrays, since map() or list comprehensions don't understand how additional dimensions work. I'd like a result provided as a Numpy array with a string-like datatype, as shown above.
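One way to extend the list comprehension to any number of dimensions, as a minimal sketch: flatten the array, format each element, then restore the shape. The helper name `str_elements` and the 2x3 test array are only illustrative, and this still loops in Python, so it is not fast.

```python
import numpy as np

def str_elements(arr):
    """Apply str() to each element of arr, keeping its shape (hypothetical helper)."""
    arr = np.asarray(arr)
    # arr.flat yields NumPy scalars (e.g. np.float32), so str() sees the
    # original precision rather than a Python float.
    flat = [str(x) for x in arr.flat]
    return np.array(flat).reshape(arr.shape)

a = (10 ** np.arange(-5, 6, 2, dtype='d') * 3.14159).astype('f')
a2 = a.reshape(2, 3)
s2 = str_elements(a2)
print(repr(s2))
```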

Typically, numpy.vectorize could be used to do this; however, none of my attempts with Numpy 1.15 returns the expected result:

# Bad conversions with np.vectorize, all show the same result
f = np.vectorize(lambda x: str(x))
f = np.vectorize('%s'.__mod__)  # equivalent; gives same result
f = np.vectorize(lambda x: '{!s}'.format(x))  # also same, but modern formatter
print(f(a))
# array(['3.141590059385635e-05', '0.003141589928418398',
#        '0.31415900588035583', '31.4158992767334', '3141.590087890625',
#        '314159.0'], dtype='<U21')

(These results are bad because Numpy appears to have upgraded the datatype from float32 to Python's native double precision, similar to [str(x) for x in a.tolist()].)
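The difference can be reproduced without np.vectorize, which suggests it comes from the scalar type that str() receives and not from the formatter. A small check, assuming the array `a` from the question (the variable names are only illustrative):

```python
import numpy as np

a = (10 ** np.arange(-5, 6, 2, dtype='d') * 3.14159).astype('f')

x32 = a[0]          # NumPy float32 scalar
x64 = float(x32)    # same value widened to a Python float (double precision)

short = str(x32)    # text formatted from the float32 scalar
long_ = str(x64)    # text formatted from the double

print(type(x32).__name__, short)
print(type(x64).__name__, long_)

# Widening does not change the value, only how many digits str() prints.
same_value = np.float32(x64) == x32
print(same_value)
```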

Any ideas on how to either use map()/list comprehensions on arbitrary dimension Numpy arrays and/or fix np.vectorize to achieve an equivalent result?

Numpy has a string type. Does a.astype('|S10') work for you? Note you can change the string length, and my example assumes 10 characters is enough.
– svohara, Commented Oct 29, 2018 at 3:55
@svohara you are on to something, although more than 10 chars are needed; a.astype(str) gives 32 (either '<U32' or '|S32', depending on which Python version)
– Mike T, Commented Oct 29, 2018 at 4:03

ZisIsNotZis · Accepted Answer · 2018-10-29 05:26:46Z

1

How about np.char.mod?

import numpy as np
np.char.mod('%.2f', np.random.rand(8, 8))

It outputs something like this (the values are random):

array([['0.04', '0.86', '0.74', '0.45', '0.30', '0.09', '0.65', '0.58'],
       ['0.96', '0.58', '0.41', '0.29', '0.26', '0.54', '0.01', '0.59'],
       ['0.38', '0.86', '0.37', '0.14', '0.32', '0.57', '0.19', '0.28'],
       ['0.91', '0.80', '0.78', '0.39', '0.67', '0.51', '0.16', '0.70'],
       ['0.61', '0.12', '0.89', '0.68', '0.01', '0.23', '0.57', '0.18'],
       ['0.71', '0.29', '0.08', '0.01', '0.86', '0.03', '0.79', '0.75'],
       ['0.44', '0.84', '0.89', '0.75', '0.48', '0.88', '0.69', '0.20'],
       ['0.36', '0.69', '0.12', '0.60', '0.16', '0.39', '0.15', '0.02']],
      dtype='<U4')
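Applied to the float32 array from the question, a format with a fixed number of significant digits gets close to the str() output. This is only a sketch: the '%.6g' format is my choice and not part of the answer above, and unlike str() it drops the trailing '.0' on whole numbers.

```python
import numpy as np

a = (10 ** np.arange(-5, 6, 2, dtype='d') * 3.14159).astype('f')
a2 = a.reshape(2, 3)  # any shape works; the format is applied elementwise

# '%.6g' keeps at most 6 significant digits per element
s2 = np.char.mod('%.6g', a2)
print(repr(s2))
```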

answered Oct 29, 2018 at 5:26

ZisIsNotZis


kuzand · Accepted Answer · 2018-10-29 09:30:37Z

0

You could simply use astype with dtype 'str'

a.astype(dtype=str)

# array(['3.14159e-05', '0.00314159', '0.314159', '31.4159', '3141.59',
#       '314159.0'], dtype='<U32')
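astype works the same way on arrays of any shape, which covers the multidimensional case from the question. A quick check, assuming the array `a` from the question; the string width (32 in the output above) may differ between NumPy versions, so the sketch only inspects the dtype kind.

```python
import numpy as np

a = (10 ** np.arange(-5, 6, 2, dtype='d') * 3.14159).astype('f')
a2 = a.reshape(2, 3)

s2 = a2.astype(str)            # shape is preserved; dtype is a fixed-width unicode type
back = s2.astype(np.float32)   # parse the strings back to float32

print(s2.shape, s2.dtype.kind)
# The strings should carry enough digits to recover the float32 values.
print(np.allclose(back, a2, rtol=1e-6))
```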

Edit: just saw your comment that you have figured it out by yourself. Nevertheless I will keep my answer.

edited Oct 29, 2018 at 9:30

answered Oct 29, 2018 at 9:22

kuzand
