I have an array of strings that is composed of number-like strings such as 010. I am trying to build a 2D numpy array by creating an empty numpy array, and then filling in the rows with my array of strings. However, it seems like whenever I assign a row in the numpy array, it converts the number-like strings into numbers. The main issue with this behavior is that I am losing leading zeroes from my strings.

I wrote a simple example to show what is happening:

import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]
np_arr = np.empty((num_rows, len(arr)), dtype=str)

for i in range(len(np_arr)):
    np_arr[i] = arr

print(np_arr)

The resulting output is:

[['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']
 ['0' '0' '1' '1' '0']]

vs. the expected output:

[['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']
 ['010' '011' '111' '100' '001']]

I do not understand this behavior and am hoping to find a solution to my problem and understand if this type conversion is being done by numpy or by Python. I have tried quite a few variations to this small example but have not found a working solution.

Thanks!

2 Answers

Here's a solution:

import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]

# Turn your array into a numpy array with dtype string.
n = np.array(arr, dtype=str)

# Repeat the row as many times as needed.
n = np.tile(n, num_rows).reshape(num_rows, len(n))

Let me know if you have any questions.

A note for the future is that in most cases, you can replace for loops with NumPy functions, which tend to be faster due to vectorisation.
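
For the repeated-row case, the reshape step can probably be folded into the tile call by passing a tuple of repetitions. This is a minimal sketch reusing the example data from the question; the names `row` and `tiled` are only illustrative.

```python
import numpy as np

num_rows = 5
arr = ["010", "011", "111", "100", "001"]

# Without an explicit dtype, np.array picks a string width that fits
# the longest item in the list (3 characters here).
row = np.array(arr)

# reps=(num_rows, 1): repeat the row num_rows times down, once across.
tiled = np.tile(row, (num_rows, 1))

print(tiled.dtype)
print(tiled.shape)
print(tiled[0])
```

Because the width is inferred from the data, the leading zeroes survive without hard-coding a length.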

4 Comments

Thanks for the response. Is there a variation to this answer that works with more than 1 array? In the example I just repeated the same array num_rows times, but in practice I will have many different arrays I want to fill the numpy array with. I tried converting each array I have to a numpy array using np.array(...) like you suggested, but if I don't use that tile function then I get the same results as in the question.
Could you post some of your other arrays please, so I know what format they're in?
If the arrays are all the same shape, have a look at numpy.stack
@AJH they are all identical to the one I have in my example, just with different values. @RuthC numpy.stack looks like a nice way to solve this problem too!
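
A rough sketch of the numpy.stack suggestion, assuming several equal-length lists with different values. The contents of `rows` are made up for illustration and are not from the original post.

```python
import numpy as np

# Hypothetical input: several equal-length lists of number-like strings.
rows = [
    ["010", "011", "111", "100", "001"],
    ["000", "101", "110", "001", "011"],
    ["111", "010", "000", "100", "101"],
]

# Convert each list on its own, then stack the results as rows
# (axis=0 is the default for np.stack).
stacked = np.stack([np.array(r) for r in rows])

print(stacked.dtype)
print(stacked)
```

When all the lists are available up front, `np.array(rows)` should give the same 2D result in one call. Stacking is mainly useful when the rows already exist as separate arrays.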

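The issue is the array's dtype. With dtype=str and no length, NumPy creates a 1-character string dtype (<U1), so every string assigned into the array is truncated to its first character; nothing is converted to a number. You need an array-protocol type string with an explicit width, such as <U3: if you change dtype=str to dtype='<U3', it will work.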
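
The effect of the dtype can be checked directly. The sketch below compares an array created with dtype=str against one with an explicit width. Computing the width from the data (the `width` variable) is an addition for illustration, not part of the original answer.

```python
import numpy as np

arr = ["010", "011", "111", "100", "001"]

# dtype=str with no length resolves to a 1-character unicode dtype.
narrow = np.empty((2, len(arr)), dtype=str)
narrow[:] = arr  # each string is silently cut to its first character

# Size the dtype from the data instead of hard-coding 3.
width = max(len(s) for s in arr)
wide = np.empty((2, len(arr)), dtype=f"U{width}")
wide[:] = arr  # broadcasting fills every row with the full strings

print(narrow.dtype, narrow[0])
print(wide.dtype, wide[0])
```

Deriving the width from the longest string avoids truncation if longer values show up later, as long as it is computed over every string that will be stored.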

1 Comment

This is a good generalized solution that fixed my problem. Thank you.
