Numpy automatically converting array of strings to array of numbers

Question

I have an array of strings that is composed of number-like strings such as 010. I am trying to build a 2D numpy array by creating an empty numpy array, and then filling in the rows with my array of strings. However, it seems like whenever I assign a row in the numpy array, it converts the number-like strings into numbers. The main issue with this behavior is that I am losing leading zeroes from my strings.

I wrote a simple example to show what is happening:

import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]
np_arr = np.empty((num_rows, len(arr)), dtype=str)

for i in range(len(np_arr)):
    np_arr[i] = arr

print(np_arr)

The resulting output is:

[['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']]

vs. the expected output:

[['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']]

I do not understand this behavior and am hoping to find a solution to my problem and understand if this type conversion is being done by numpy or by Python. I have tried quite a few variations to this small example but have not found a working solution.

Thanks!

AJH · Accepted Answer · 2022-04-08 18:36:09Z

2

Here's a solution:

import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]

# Turn your list into a NumPy array of strings; NumPy sizes the dtype to fit ('<U3' here).
n = np.array(arr, dtype=str)

# Repeat the row as many times as needed, then reshape the flat result into 2D.
n = np.tile(n, num_rows).reshape(num_rows, len(n))

print(n)

Let me know if you have any questions.

A note for the future: in many cases you can replace Python for loops with NumPy functions, which tend to be faster due to vectorisation.
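As a small variation on the same idea, `np.tile` also accepts a tuple of repetitions, which should make the separate `reshape` unnecessary. This is only a sketch of that alternative, reusing the names from the question; the variable `tiled` is introduced here for illustration.

```python
import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]

# Let NumPy infer a string dtype wide enough for the data.
n = np.array(arr)

# reps=(num_rows, 1): repeat the row num_rows times along a new first axis,
# and once along the existing axis, giving a 2D result directly.
tiled = np.tile(n, (num_rows, 1))

print(tiled.shape)
print(tiled)
```

Whether this reads better than `tile` followed by `reshape` is a matter of taste; both build the same array without a Python loop.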


4 Comments

Brent Over a year ago

Thanks for the response. Is there a variation to this answer that works with more than 1 array? In the example I just repeated the same array num_rows times, but in practice I will have many different arrays I want to fill the numpy array with. I tried converting each array I have to a numpy array using np.array(...) like you suggested, but if I don't use that tile function then I get the same results as in the question.

AJH Over a year ago

Could you post some of your other arrays please, so I know what format they're in?

RuthC Over a year ago

If the arrays are all the same shape, have a look at numpy.stack

Brent Over a year ago

@AJH they are all identical to the one I have in my example, just with different values. @RuthC numpy.stack looks like a nice way to solve this problem too!
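The `numpy.stack` suggestion from the comments might look roughly like the following. The `rows` values are made up for illustration and stand in for the "many different arrays"; the only assumption is that every row has the same length.

```python
import numpy as np

# Hypothetical input rows, all the same length.
rows = [
    ["010", "011", "111", "100", "001"],
    ["000", "101", "110", "011", "001"],
    ["111", "010", "001", "100", "000"],
]

# Convert each row to a string array, then stack them along a new first axis.
stacked = np.stack([np.array(r, dtype=str) for r in rows])

print(stacked.shape)
print(stacked)
```

Because each row is converted with `np.array`, NumPy picks a string width that fits the data, so the leading zeroes are kept.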

lemon · Accepted Answer · 2022-04-08 18:40:34Z

1

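The issue is in the type of the array, not a conversion to numbers. With np.empty, dtype=str creates a one-character string type (<U1), so every string you assign is truncated to its first character: "010" becomes '0', which only looks like a number that lost its leading zeroes. You need to set an array-protocol type string that is wide enough for your data, like <U3: if you change dtype=str to dtype='<U3' it will work.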
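A minimal sketch of that fix, applied to the loop from the question. Deriving the width from the data rather than hard-coding `'<U3'` is an addition of this example, as are the names `truncated` and `width`.

```python
import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]

# dtype=str on an empty array means one-character strings,
# so longer strings are cut to their first character on assignment.
truncated = np.empty((num_rows, len(arr)), dtype=str)
truncated[0] = arr

# Size the string dtype from the longest input string (3 here, i.e. '<U3').
width = max(len(s) for s in arr)
np_arr = np.empty((num_rows, len(arr)), dtype=f"<U{width}")
for i in range(num_rows):
    np_arr[i] = arr

print(truncated.dtype, truncated[0])
print(np_arr.dtype, np_arr[0])
```

If later rows may contain longer strings than the first ones, the width has to cover the longest string across all of them, otherwise the same truncation comes back.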


1 Comment

Brent Over a year ago

This is a good generalized solution that fixed my problem. Thank you.
