The following is a simplified version of my problem. I want to create a NumPy array of shape (N, 1) whose values are strings. However, when I assign a string to an element, only the first character of the string is stored.

What am I doing wrong here?

>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=str)
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['r'], dtype='<U1')

2 Answers

dtype=str does not specify a length, so np.empty falls back to '<U1' and each element can hold only one character. You can set the maximum length explicitly with np.dtype('U100'). In Un, U stands for Unicode and n is the maximum number of characters per element.

Try the code below:

>>> import numpy as np
>>> N = 23000
>>> Y = np.empty((N, 1), dtype=np.dtype('U100'))
>>> Y.shape
(23000, 1)
>>> for i in range(N):
...     Y[i] = "random string"
...
>>> Y[10]
array(['random string'], dtype='<U100')
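A fixed width also means that longer strings are silently truncated. The sketch below is only an illustration, not part of the original answer: it uses a small N and np.full, which infers the width from the fill value (here 13 characters), and then assigns a longer string to show the truncation.

```python
import numpy as np

N = 5  # small N, chosen only for illustration

# np.full infers the dtype from the fill value: 13 characters -> 'U13'
Y = np.full((N, 1), "random string")
width = Y.dtype.itemsize // 4  # each Unicode character takes 4 bytes

# Assigning a longer string keeps only the first 13 characters
Y[0, 0] = "a much longer random string"
truncated = Y[0, 0]

print(Y.dtype.kind, width)  # U 13
print(repr(truncated))      # 'a much longer'
```

If the maximum length is not known in advance, choose n generously or check string lengths before assigning.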

1 Comment

honestly, who decided that str_ has length 0 -.-

Even though you specify dtype=str in np.empty, the resulting array can hold only one character per element, as you can see when you inspect Y.

import numpy as np
N = 23000
Y = np.empty((N, 1), dtype=str)
Y

Output:

array([[''],
       [''],
       [''],
       ...,
       [''],
       [''],
       ['']], dtype='<U1')

The dtype is "U1", which means it is a Unicode string of length 1.

You can change it to a wider string type, for example:

Y = np.empty((N, 1), dtype='U25')

Output for Y[10]:

array(['random string'], dtype='<U25')

I picked 25 arbitrarily; you can use any number there. The 25 in U25 means a Unicode string of at most 25 characters.
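If the strings vary widely in length, one alternative to a fixed width is an object array, which stores references to ordinary Python str objects and so never truncates. This is a sketch under that assumption, not something the answers above propose; the names Y_obj and lengths are made up for the example, and object arrays give up some of NumPy's speed and memory compactness.

```python
import numpy as np

N = 5  # small N, chosen only for illustration

# dtype=object stores references to Python objects, so there is no width limit
Y_obj = np.empty((N, 1), dtype=object)
for i in range(N):
    # strings of different lengths, none of them truncated
    Y_obj[i, 0] = "random string " * (i + 1)

lengths = [len(s) for s in Y_obj[:, 0]]
print(lengths)  # [14, 28, 42, 56, 70]
```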
