I have a numpy ndarray with 6 elements:

['\tblah blah' '"""123' 'blah' '"""' '\t456' '78\t9']

I am trying to replace all tab characters \t with 4 spaces each so that the numpy array would now be:

['    blah blah' '"""123' 'blah' '"""' '    456' '78    9']

I have considered re.sub but cannot figure out how to apply it to a NumPy ndarray. Any suggestions/help please?

1 Answer

You could use NumPy's core.defchararray module, which handles string-related operations, and in this case call its replace function, like so -

np.core.defchararray.replace(arr,'\t', '    ')

Sample run -

In [44]: arr
Out[44]: 
array(['\tblah blah', '"""123', 'blah', '"""', '\t456', '78\t9'], 
      dtype='|S10')

In [45]: np.core.defchararray.replace(arr,'\t', '    ')
Out[45]: 
array(['    blah blah', '"""123', 'blah', '"""', '    456', '78    9'], 
      dtype='|S13')
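For reference, here is a minimal self-contained sketch of the same replacement. It assumes a recent NumPy and Python 3, and it goes through np.char, the public alias for the same string functions, since the np.core path is treated as private in newer releases. The variable names arr and out are only illustrative.

```python
import numpy as np

# Same six strings as in the question.
arr = np.array(['\tblah blah', '"""123', 'blah', '"""', '\t456', '78\t9'])

# Element-wise str.replace: every tab becomes four spaces.
# np.char.replace returns a new array; arr itself is not modified.
out = np.char.replace(arr, '\t', '    ')

print(out.tolist())
# ['    blah blah', '"""123', 'blah', '"""', '    456', '78    9']
```

The result is a new array whose string width grows as needed to hold the longer strings, which is why the sample run above reports a wider dtype for the output than for the input.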

5 Comments

Quick follow-up; is it possible to get the number of the replacements, i.e. in this case 3?
@nk-fford One solution would be: np.core.defchararray.not_equal(output, arr).sum().
I'm confused about what output and arr are in this case. Could you give a one-liner to explain how this count works, please?
@nk-fford output is the result of np.core.defchararray.replace(arr,'\t', '    '). Basically, we are counting the elements that the replace method changed.
I see, I was doing everything in one line and got confused. Thanks
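To make the counting from the comments concrete, here is a small sketch, again assuming a recent NumPy and the np.char alias. Note that comparing the output with the input counts the elements that changed, which equals the number of replacements only when no element holds more than one tab. Summing np.char.count is one way to count the tabs themselves. The second array, multi, is a hypothetical example added only to show the difference.

```python
import numpy as np

arr = np.array(['\tblah blah', '"""123', 'blah', '"""', '\t456', '78\t9'])
out = np.char.replace(arr, '\t', '    ')

# Number of elements that differ after the replacement.
changed_elements = int(np.char.not_equal(out, arr).sum())

# Number of tab characters across all elements.
tab_count = int(np.char.count(arr, '\t').sum())

print(changed_elements, tab_count)  # 3 3

# Hypothetical array with two tabs in a single element:
# one element changes, but two replacements are made.
multi = np.array(['a\tb\tc', 'no tabs'])
multi_out = np.char.replace(multi, '\t', '    ')
multi_changed = int(np.char.not_equal(multi_out, multi).sum())
multi_tabs = int(np.char.count(multi, '\t').sum())

print(multi_changed, multi_tabs)  # 1 2
```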