0

With numpy's where() function I'm getting the indices of an array by applying a mask. For example, an index list, where (1, 2, 3, 10, 11, 12, 13) where True:

[1, 2, 3, 10, 11, 12, 13]

Is there a way to transform this (or directly get it from a similar function) in separate ranges, for my example:

[ range(1, 4), range(10, 14)  ]
8
  • I am assuming you are not interested in simply iterating it? Commented Nov 10, 2020 at 11:16
  • 2
    Does this answer your question? Identify groups of continuous numbers in a list Commented Nov 10, 2020 at 11:19
  • Could be. Thanks. Commented Nov 10, 2020 at 11:20
  • @FlavioMoraes It's not quite the same because it doesn't discuss a variant of identical problem on numpy. Commented Nov 10, 2020 at 11:46
  • I believe any algo will encounter ambiguity. Eg, with [1, 3, 15, 5, 6], one possible answer could be [range(1,4,2), range(15,16,1), range(5,7,1)], and another possible answer could be [range(1,4,2), range(15,4,-10), range(6,7,1)]. Are you ok with either of these answers? Commented Nov 10, 2020 at 11:55

2 Answers 2

2

Assuming all the ranges include unit differences only, it's worth to start with a search of indexes where to split:

split_idx = np.flatnonzero(np.diff(ls, prepend = ls[0], append=ls[-1])!=1)

Note that it is improved slightly in order to insert starting and ending indexes as well. A further work could be done like so:

bounding_points = np.transpose([split_idx[:-1], split_idx[1:]])
out = [range(ls[n], ls[m-1]+1) for n, m in bounding_points.tolist()]

Sample run:

>>> ls = [1, 2, 3, 10, 11, 12, 13, 16, 18, 19]
>>> split_idx
array([ 0,  3,  7,  8, 10], dtype=int32)
>>> bounding_points
array([[ 0,  3],
       [ 3,  7],
       [ 7,  8],
       [ 8, 10]], dtype=int32)
>>> out
[range(1, 4), range(10, 14), range(16, 17), range(18, 20)]
Sign up to request clarification or add additional context in comments.

5 Comments

I believe you may be able to get the split points also by passing n=2 to np.diff(), and comparing result of np.diff() to 0 rather than 1. That would probably make the solution work even if the range steps are greater than 1. I would try it out myself, but my numpy is old, and doesn't even accept the prepend arg !!
@fountainhead prepend and append arguments are actually good for a demonstration purpose only. They are not efficient. You could do it better using np.r_[item1, arr1, arr2, ..., arrn, item2]. And you can't pass n=2 inside np.diff because it has a different meaning (read the docs).
Ok, I'll check again. I thought passing n=2 would produce second order differences (differences of differences). (Hence that suggestion)
@fountainhead I like extensible functions in answers, but I think extending to sequences with step > 1 is adding a bit too much complexity based on the question
@DanielF -- I believe right now the assumption of step = 1 is based on OP's example rather than explicit statement of fact. Anyways, I'll leave this here. Thanks.
1

reading the link I sent you I got this answer:

from operator import itemgetter
from itertools import groupby

data = [1, 2, 3, 10, 11, 12, 13]
ranges = []
for key, group in groupby(enumerate(data), lambda x: x[0] - x[1]):
    group = list(map(itemgetter(1), group))
    if len(group) > 1:
        ranges.append(range(group[0], group[-1]+1))
    else:
        ranges.append(group[0])

print(ranges)

the output was

[range(1, 4), range(10, 14)]

2 Comments

Will it identify ranges with step > 1 ?
no, because it is based on grouping consecutive numbers

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.