The objective is to build an array of all triples (x, y, z) that satisfy the conditions x >= y and y >= z.

A naive approach that does the job is a nested for loop, as shown below:

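import numpy as np

tot_length = 200
steps = 0.1
start_val = 0.0
list_no = np.arange(start_val, tot_length, steps)

# Start from an empty (0, 3) array so no spurious [0, 0, 0] row is included.
a = np.empty(shape=(0, 3))
for x in list_no:
    for y in list_no:
        for z in list_no:
            if (x >= y) and (y >= z):
                # np.append copies the whole array on every call.
                a = np.append(a, [[x, y, z]], axis=0)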

This raises no memory error, but execution is very slow.

Another approach is the code below. However, it only works as long as tot_length is less than 100. Beyond that, memory issues arise, as reported here

import numpy as np

tot_length = 200
steps = 0.1
start_val = 0.0
list_no = np.arange(start_val, tot_length, steps)
# Full Cartesian product of list_no with itself three times, one triple per row.
arr = np.meshgrid(*[list_no for _ in range(3)])
a = np.array(list(map(np.ravel, arr))).transpose()
num_rows, num_cols = a.shape

# Column indices grouped in threes; with 3 columns this is just [[0, 1, 2]].
a_list = np.arange(num_cols).reshape((-1, 3))
for x in range(len(a_list)):
    a = a[(a[:, a_list[x, 0]] >= a[:, a_list[x, 1]]) & (a[:, a_list[x, 1]] >= a[:, a_list[x, 2]])]

I would appreciate any suggestion that balances execution time against memory use. Suggestions using Pandas are also welcome if that makes things work.

To check whether a proposed solution produces the intended output, the following parameters

tot_length=3
steps=1
start_val=1

should produce the output

1   1   1
2   1   1
2   2   1
2   2   2
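
The check described above can be sketched as a small brute-force reference. The function name `triples_reference` is hypothetical (it does not appear in the question), and the approach is only practical for tiny inputs like this test case.

```python
import numpy as np


def triples_reference(start_val, tot_length, steps):
    """Brute-force reference: every (x, y, z) with x >= y >= z."""
    values = np.arange(start_val, tot_length, steps)
    rows = [
        (x, y, z)
        for x in values
        for y in values
        for z in values
        if x >= y >= z
    ]
    return np.array(rows)


# Expected output copied from the question for the small test parameters.
expected = np.array([[1, 1, 1], [2, 1, 1], [2, 2, 1], [2, 2, 2]])
result = triples_reference(start_val=1, tot_length=3, steps=1)
print(np.array_equal(result, expected))  # True
```

Any faster candidate solution can be compared against this reference on small parameters before it is run at full size.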
  • For tot_length=200, you are looking at about 30GB memory allocation for a, which is not small. Commented Oct 20, 2020 at 17:27
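
The figure in this comment can be reproduced with a short calculation. This is a sketch under two assumptions: that list_no holds n = 2000 distinct values (200 / 0.1), and that each entry is stored as an 8-byte float64. The number of triples with x >= y >= z drawn from n distinct values is n(n+1)(n+2)/6. The helper names are hypothetical.

```python
def count_ordered_triples(n):
    """Number of (x, y, z) with x >= y >= z drawn from n distinct values."""
    return n * (n + 1) * (n + 2) // 6


def estimate_bytes(n, itemsize=8, columns=3):
    """Bytes needed for the result array, assuming float64 and 3 columns."""
    return count_ordered_triples(n) * itemsize * columns


n = 2000  # assumed length of np.arange(0.0, 200, 0.1)
print(count_ordered_triples(n))  # 1335334000 rows
print(estimate_bytes(n) / 1e9)   # roughly 32 (decimal GB)
```

So the final array alone needs on the order of 30 GB, regardless of how it is built.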

2 Answers

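import numpy as np

tot_length = 200
steps = 0.1
list_no = np.arange(0.0, tot_length, steps)

# Collect rows in a Python list; appending to a list is cheap.
a = list()
for x in list_no:
    for y in list_no:
        # list_no is ascending, so every later y also exceeds x.
        if y > x:
            break

        for z in list_no:
            # Likewise, every later z also exceeds y.
            if z > y:
                break

            a.append([x, y, z])

# Convert once at the end instead of growing an array row by row.
a = np.array(a)
# if needed, a.transpose()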

2 Comments

I don't see how this is different from OP's solution.
This avoids calling np.append, which is slower than using a list and converting to array at the end
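
A minimal sketch of why this matters: np.append does not grow an array in place. It allocates a new array and copies the old contents on every call, whereas a Python list grows in place and is converted once at the end. The variable names below are illustrative only.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.append(a, [[1.0, 2.0, 3.0]], axis=0)

# np.append returns a new array; the original is left untouched.
print(a.shape)                 # (2, 3)
print(b.shape)                 # (3, 3)
print(np.shares_memory(a, b))  # False

# A list grows in place and is converted to an array a single time.
rows = [[0.0, 0.0, 0.0]]
rows.append([1.0, 2.0, 3.0])
result = np.array(rows)
print(result.shape)            # (2, 3)
```

Because each np.append call copies everything accumulated so far, the total work grows roughly quadratically with the number of rows.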

Does something like this work?

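import numpy as np

tot_length = 200
steps = 0.1
list_no = np.arange(0.0, tot_length, steps)
# sparse=True returns three small broadcastable arrays instead of full 3-D grids.
x, y, z = np.meshgrid(*[list_no for _ in range(3)], sparse=True)
# nonzero() returns a tuple of integer index arrays, not the values themselves.
a = ((x >= y) & (y >= z)).nonzero()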

This still uses about 8 GB of memory for the intermediate boolean array, but it avoids the repeated calls to np.append, which are slow.
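
One possible way to avoid the full boolean mask, offered as a sketch rather than as part of this answer: build the result one x value at a time with np.tril_indices. The generator name `iter_triple_blocks` is hypothetical, and the code assumes the input values are sorted in ascending order, as np.arange produces them.

```python
import numpy as np


def iter_triple_blocks(values):
    """Yield one (k, 3) block per x, holding all rows with x >= y >= z.

    Assumes `values` is sorted ascending, so index order matches value order.
    """
    values = np.asarray(values)
    for i in range(len(values)):
        # Lower-triangle indices of an (i+1, i+1) grid: iy >= iz, both <= i.
        iy, iz = np.tril_indices(i + 1)
        block = np.empty((iy.size, 3), dtype=values.dtype)
        block[:, 0] = values[i]
        block[:, 1] = values[iy]
        block[:, 2] = values[iz]
        yield block


small = np.concatenate(list(iter_triple_blocks(np.arange(1, 3, 1))))
print(small.tolist())  # [[1, 1, 1], [2, 1, 1], [2, 2, 1], [2, 2, 2]]
```

Each block can be processed or written to disk before the next one is built, so peak memory is bounded by the largest block rather than by an N x N x N mask. The full result for tot_length=200 is still very large if it is concatenated in memory.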

1 Comment

Hi Eric, I just noticed that all the values in a are rounded. Is this expected? Is it because nonzero() returns a non-floating-point type? For example, if I set tot_length=0.3, steps=0.1, start_val=0.1, your proposed solution returns rounded values instead of decimal values.
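
On the question raised in this comment: nonzero() returns integer index arrays, one per axis, rather than the values themselves, so the indices have to be mapped back through list_no. The sketch below is one way to do that. The function name `ordered_triples` is hypothetical, and it passes indexing="ij" because np.meshgrid's default "xy" indexing swaps the first two axes.

```python
import numpy as np


def ordered_triples(values):
    """Return rows (x, y, z) with x >= y >= z, as values rather than indices."""
    values = np.asarray(values)
    # indexing="ij" keeps axis 0 for x, axis 1 for y, axis 2 for z.
    x, y, z = np.meshgrid(values, values, values, indexing="ij", sparse=True)
    ix, iy, iz = np.nonzero((x >= y) & (y >= z))
    # Map the integer indices back to the original values.
    return np.column_stack((values[ix], values[iy], values[iz]))


print(ordered_triples(np.arange(1, 3, 1)).tolist())
# [[1, 1, 1], [2, 1, 1], [2, 2, 1], [2, 2, 2]]
print(ordered_triples(np.array([0.1, 0.2])).dtype)  # float64
```

The result keeps the dtype of the input, so float steps such as 0.1 come back as floats. The intermediate boolean mask is still N x N x N, so the memory caveat above applies unchanged.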
