Slice 1D Array in Numpy without loop

Question

I have an array x as shown below:

x=np.array(["83838374747412E61E4C202C004D004D004D020202C3CF",
            "8383835F6260127314A0127C078E07090705023846C59F",
            "83838384817E14231D700FAC09BC096808881E1C1BC68F",
            "8484835C535212600F860A1612B90FCF0FCF012A2AC6BF",
            "848484787A7A1A961BAC1E731086005D005D025408C6CF",
            "8484845050620C300D500A9313E613E613012A2A5CC4BF",
            "838383757C7CF18F02192653070D03180318080101BE6F",
            "8584845557570F090E830F4309E5080108012A2A2AC6DF",
            "85858453536B07D608B3124C102A102A1026010101C61F",
            "83838384848411A926791C162048204820484D4444C3BF"], dtype=object)

These are concatenated hex values that I need to slice in order to convert to integers and then apply conversion factors. I want an array such as:

[83,83,83,84,84,84,83,85,85,83]

This would be the equivalent of x[:,0:2], but I cannot slice that way because the array has shape (10,). I am trying to do something similar to what a character array would do in MATLAB. I will be doing this over millions of rows, which is why I am trying to avoid a loop.

Is there any comma missing by any chance between the rows of the x array?
– Dalek, Commented Oct 20, 2014 at 19:28

Community · Accepted Answer · 2017-05-23 12:07:17Z

If you're just after the first two characters from each hex value, one option is to recast your array to a dtype of '|S2' (on Python 3 the result holds byte strings such as b'83'; use 'U2' instead to get str values):

>>> x.astype('|S2')
array(['83', '83', '83', '84', '84', '84', '83', '85', '85', '83'], 
  dtype='|S2')

This idea can be generalised to return the first n characters from each string.
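
As a rough sketch of that generalisation (the helper name first_n_chars and the shortened sample rows are made up for illustration, and 'U' is used rather than 'S' so that Python 3 returns str values):

```python
import numpy as np

# Hypothetical sample: the first 16 characters of three rows from the question.
x = np.array(["83838374747412E6",
              "8383835F62601273",
              "8484835C53521260"], dtype=object)

def first_n_chars(arr, n):
    # Casting to a fixed-width unicode dtype truncates every string to n characters.
    return arr.astype(f"U{n}")

first_two = first_n_chars(x, 2)
first_six = first_n_chars(x, 6)
```

This only ever gives a prefix; it cannot start the slice part-way through the string.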

Arbitrary slicing of string arrays is much more difficult to do in NumPy. Answers on this Stack Overflow page explain why it isn't the best tool for strings but show what can be possible.
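
One trick that may work when every string has the same length is to reinterpret the fixed-width buffer as single characters and slice columns. This is only a sketch under that equal-length assumption; the sample rows and the width value are illustrative, not taken from a real pipeline:

```python
import numpy as np

# Hypothetical sample: shortened rows from the question, all the same length.
rows = np.array(["838383747474",
                 "8383835F6260",
                 "8484835C5352"], dtype=object)
width = 12  # assumption: every string has exactly this many ASCII characters

# Fixed-width bytes, then reinterpret the buffer as one character per cell.
chars = rows.astype(f"S{width}").view("S1").reshape(len(rows), width)

# Column slicing now behaves like x[:, 2:9] on a 2D character array.
block = np.ascontiguousarray(chars[:, 2:9])

# Glue each row's 7 characters back into one string per row.
sliced = block.view("S7").ravel().astype("U7")
```

The copy made by np.ascontiguousarray is needed because a column slice is not contiguous in memory, and view can only regroup contiguous bytes.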

Alternatively, the Pandas library facilitates fast vectorized operations (being built on top of NumPy). It has a number of very useful string operations which make slicing a whole lot simpler than plain NumPy:

>>> import pandas as pd
>>> s = pd.Series(x)
>>> s.str.slice(2, 9)
0    8383747
1    83835F6
2    8383848
3    84835C5
4    8484787
5    8484505
6    8383757
7    8484555
8    8584535
9    8383848
dtype: object
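
To get from the sliced strings to the integers the question asks for, something along these lines should work. It is a sketch on shortened sample rows, and note that map with a Python function still calls int once per element under the hood:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the first 16 characters of three rows from the question.
x = np.array(["83838374747412E6",
              "8383835F62601273",
              "848484787A7A1A96"], dtype=object)

s = pd.Series(x)

# Take the first two hex characters of each row, then parse them as base 16.
first_byte = s.str.slice(0, 2).map(lambda h: int(h, 16))
```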

edited May 23, 2017 at 12:07

answered Oct 20, 2014 at 19:29

Alex Riley

1 Comment

user3338505 Over a year ago

Thanks, that slice is exactly what I was looking for! Combining it with intHex = vectorize(int) and xIntForm = intHex(xArray,16) on the pandas series did the conversion.

mrcl · Accepted Answer · 2014-10-21 03:19:33Z

Here is a pythonic way of doing it

Consider part of your string

x = "83838374747412E61E4C202C004D004D004D020202C3CF8383835F626012"

You can combine map, join, zip and iter to make it work

from numpy import array, vectorize
xArray = array(list(map(''.join, zip(*[iter(x)]*2))))  # list() is needed on Python 3, where map is lazy
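
In case the zip(*[iter(x)]*2) idiom looks opaque, here is a plain-Python sketch of what it does, using a shortened sample string (the variable names are only illustrative):

```python
x = "83838374747412E6"

it = iter(x)
# zip pulls from the same iterator twice per tuple, so it yields
# consecutive, non-overlapping pairs of characters.
pairs = list(zip(it, it))  # equivalent to list(zip(*[iter(x)] * 2))

# Join each pair back into a two-character hex string.
chunks = ["".join(p) for p in pairs]

# Parse each chunk as a base-16 integer.
values = [int(c, 16) for c in chunks]
```

If the string had an odd number of characters, zip would silently drop the last one.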

Then you can convert your hex values to integers by using a vectorized form of int

intHex   = vectorize(int)
xIntForm = intHex(xArray,16)

I am not sure about the performance of the vectorize function though, which is part of numpy.
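
For what it's worth, the NumPy documentation describes vectorize as a convenience that is essentially a for loop, so it is unlikely to be fast over millions of rows. A possible alternative, sketched here under the assumptions that all rows have the same even length and contain only uppercase hex digits, is to do the conversion with arithmetic on the ASCII codes (sample rows and names are illustrative):

```python
import numpy as np

# Hypothetical sample rows; assumed equal length and uppercase hex only.
rows = np.array(["8383835F", "848484C6", "0A1612B9"], dtype=object)
width = 8  # assumption: number of hex characters per row

# One ASCII code per character; widen so the subtractions below cannot wrap.
codes = (rows.astype(f"S{width}")
             .view(np.uint8)
             .reshape(len(rows), width)
             .astype(np.int64))

# '0'-'9' map to 0-9 and 'A'-'F' map to 10-15.
nibbles = np.where(codes >= ord("A"), codes - ord("A") + 10, codes - ord("0"))

# Combine each high/low nibble pair into one byte value.
values = nibbles[:, 0::2] * 16 + nibbles[:, 1::2]
```

Each row of values then holds the integer for every two-character hex pair, so values[:, 0] is the first byte of every row. Lowercase digits would need an extra branch in the np.where step.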

Cheers

answered Oct 21, 2014 at 3:19

mrcl

1 Comment

user3338505 Over a year ago

Thanks for helping, I used the pandas method above and then the vectorize to convert.
