
I have created a 2d Numpy string array like so:

import numpy as np

a = np.full((2, 3), '#', dtype=np.str_)
print(repr(a))

The output is:

array([['#', '#', '#'],
       ['#', '#', '#']], dtype='<U1')

I would like to pad it with '?' on all sides with a width of 1. I'm expecting output as:

array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')

I tried the following:

b = np.pad(a, ((1, 1), (1, 1)), 'constant', constant_values=(('?', '?'), ('?', '?')))

But that gives the following error:

File "<stdin>", line 1, in <module>
File "/usr/lib/python3/dist-packages/numpy/lib/arraypad.py", line 1357, in pad
    cast_to_int=False)
File "/usr/lib/python3/dist-packages/numpy/lib/arraypad.py", line 1069, in _normalize_shape
    return tuple(tuple(axis) for axis in arr.tolist())
AttributeError: 'tuple' object has no attribute 'tolist'

Similar code works for integers. What am I doing wrong for strings?

  • This is simply a NumPy bug: its implementation assumes overly specific conditions about the array, and it emits an unhelpful message when it cannot continue. As @Kasramvd has shown, you can work around it by writing your own padding function. Commented Mar 1, 2018 at 12:53
  • I get a different error in 1.14. Notice that some of the modes involve maximum and interpolation; the 'constant' mode may just be a special case of one of those. np.pad is overkill for a simple padding like this. Commented Mar 1, 2018 at 17:21
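
Because the failure seems to depend on the NumPy version, one defensive option is to try the direct call and fall back to plain slice assignment if it raises. The sketch below is an illustration, not something posted in this thread: pad_str_constant is a hypothetical helper name, and the idea that some NumPy releases accept a string for constant_values is an assumption that has not been checked against every version.

```python
import numpy as np

def pad_str_constant(arr, width, fill):
    """Pad `arr` by `width` cells of `fill` on every side (hypothetical helper)."""
    try:
        # Assumption: some NumPy releases accept a string fill value here.
        return np.pad(arr, width, mode='constant', constant_values=fill)
    except (AttributeError, TypeError, ValueError):
        # Releases like the one in the question raise instead, so build the
        # result by hand: a block of `fill` with `arr` copied into the middle.
        out = np.full(tuple(n + 2 * width for n in arr.shape), fill, dtype=arr.dtype)
        out[tuple(slice(width, width + n) for n in arr.shape)] = arr
        return out

a = np.full((2, 3), '#')
b = pad_str_constant(a, 1, '?')
print(b.shape)  # (4, 5)
```

Either branch keeps the dtype of the input, so a fill string longer than the array's item size would be truncated.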

2 Answers


You can't pad your array with string literals. Instead, as mentioned in the documentation, you can pass np.pad a custom function such as pad_with:

In [79]: def pad_with(vector, pad_width, iaxis, kwargs):
    ...:     pad_value = kwargs.get('padder', '?')
    ...:     vector[:pad_width[0]] = pad_value
    ...:     vector[-pad_width[1]:] = pad_value
    ...:     return vector
    ...: 

In [80]: np.pad(a, 1, pad_with)
Out[80]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')

Note that the line pad_value = kwargs.get('padder', '?') in pad_with supplies a default padding value in case no padder argument is passed to np.pad. You can pass the intended padder as a keyword argument:

In [82]: np.pad(a, 1, pad_with, padder='*')
Out[82]: 
array([['*', '*', '*', '*', '*'],
       ['*', '#', '#', '#', '*'],
       ['*', '#', '#', '#', '*'],
       ['*', '*', '*', '*', '*']], dtype='<U1')
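
For use outside an IPython session, the same callback can be written as a plain script. This is a sketch rather than the answer's exact code: the zero-width guard is an addition (vector[-0:] would otherwise select the whole vector), and the names padded, starred and sides_only are only illustrative.

```python
import numpy as np

def pad_with(vector, pad_width, iaxis, kwargs):
    """Callback for np.pad: overwrite both ends of a 1-D slice in place."""
    pad_value = kwargs.get('padder', '?')
    before, after = pad_width
    if before > 0:
        vector[:before] = pad_value
    if after > 0:
        # Guard added in this sketch: vector[-0:] would cover the whole vector.
        vector[-after:] = pad_value
    return vector

a = np.full((2, 3), '#')
padded = np.pad(a, 1, pad_with)                      # default '?' border
starred = np.pad(a, 1, pad_with, padder='*')         # custom border character
sides_only = np.pad(a, ((0, 0), (1, 1)), pad_with)   # pad columns only
print(padded)
```

Without the guard, the columns-only call would overwrite the original '#' cells while processing axis 0.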

Even if you can get pad to work, it would be faster to insert a into a blank b. pad is set up for complex padding patterns and does the job iteratively: row by row, column by column.

In [29]: a = np.full((2,3),'#')
In [30]: a
Out[30]: 
array([['#', '#', '#'],
       ['#', '#', '#']], dtype='<U1')
In [31]: b = np.full((4,5),'?')
In [32]: b
Out[32]: 
array([['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
In [33]: b[1:-1,1:-1] = a
In [34]: b
Out[34]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
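
The fill-then-assign steps above could be wrapped in a small reusable function. The following is a minimal sketch under stated assumptions: pad_by_assignment is a hypothetical name, and pad_width is assumed to hold one (before, after) pair per axis, mirroring the form np.pad accepts.

```python
import numpy as np

def pad_by_assignment(arr, pad_width, fill):
    """Return a copy of `arr` surrounded by `fill` (hypothetical helper).

    pad_width holds one (before, after) pair per axis of `arr`.
    """
    new_shape = tuple(n + before + after
                      for n, (before, after) in zip(arr.shape, pad_width))
    # Start from a block made entirely of the fill value.
    out = np.full(new_shape, fill, dtype=arr.dtype)
    # Copy the original array into the interior region in one assignment.
    interior = tuple(slice(before, before + n)
                     for n, (before, _) in zip(arr.shape, pad_width))
    out[interior] = arr
    return out

a = np.full((2, 3), '#')
b = pad_by_assignment(a, ((1, 1), (1, 1)), '?')
c = pad_by_assignment(a, ((0, 2), (1, 0)), '?')  # asymmetric padding
print(b)
```

Computing the interior slice from the widths, rather than hard-coding 1:-1, also covers a width of zero, where a slice ending in -0 would be empty.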

Here's the clever pad_with solution, with an added print so we can see how often it is called:

In [36]: def pad_with(vector, pad_width, iaxis, kwargs):
    ...:     print(vector)
    ...:     pad_value = kwargs.get('padder', '?')
    ...:     vector[:pad_width[0]] = pad_value
    ...:     vector[-pad_width[1]:] = pad_value
    ...:     return vector
    ...: 
In [37]: np.pad(a,1,pad_with)
['' '' '' '']
['' '#' '#' '']
['' '#' '#' '']
['' '#' '#' '']
['' '' '' '']
['?' '?' '?' '?' '?']
['' '#' '#' '#' '']
['' '#' '#' '#' '']
['?' '?' '?' '?' '?']
Out[37]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
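
The printed trace can also be turned into a count. The sketch below records each callback invocation instead of printing it; counting_pad_with and calls are illustrative names, not part of NumPy.

```python
import numpy as np

calls = []  # one (axis, vector length) entry per callback invocation

def counting_pad_with(vector, pad_width, iaxis, kwargs):
    """Same padding callback as above, but it logs each call."""
    calls.append((iaxis, len(vector)))
    vector[:pad_width[0]] = '?'
    vector[-pad_width[1]:] = '?'
    return vector

a = np.full((2, 3), '#')
np.pad(a, 1, counting_pad_with)
print(len(calls))  # 9: five length-4 vectors along axis 0, four length-5 along axis 1
```

The number of Python-level calls grows with the size of the padded array, whereas the slice assignment is a single vectorized operation.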
