I have a NumPy array A of shape (8760, 12): all the hours of 12 years. I need to sort the values within each month (730 hours) of each year in the array. I haven't found a way to do this inside the array, so my approach was to take out each month, sort it, and then build the entire 2D array again. I was thinking of something along the lines of the code below, but it isn't working.

total=np.zeroes([8760,12])
for j in range(1,12):
    for i in range (1,12):
        #here i take out every month of every year
        month=A[730*(i-1):-730*(12-i),(j-1):-(12-j)]
        #here I sort the data
        month_sorted=np.sort(month,axis=0,kind='quicksort')
        #here I try to add the sorted months back into 1 big array
        np.concatenate(total,month_sorted,axis=0)
    np.concatenate(total,month_sorted,axis=1)

Concatenate doesn't work on arrays of different sizes.

And I don't really have a way to place the month of year 2 in row 2 of my array. I guess it should be done with indexing idx or iloc or something like that.

EDIT: My values are integers.

The result should have the values ordered from low to high within each block of 730 values (the hours in a month) per row. So imagine I had 3 years instead of 12, and 9 hours instead of 8760, which have to be sorted in blocks of 3 hours instead of 730 hours. The array looks like this:

[[30,40,10,20,50,60,80,200,100]
[8,20,5,6,8,1,5,3,2]
[520,840,600,525,430,20,1,506,703]]

And should be converted into :

[[10,30,40,20,50,60,80,100,200]
[5,8,20,1,6,8,2,3,5]
[520,600,840,20,430,525,1,506,703]]

So my current code takes out the first part (30,40,10) and sorts it as 10,30,40. The part I can't solve is how to build the big array again from all the smaller ones in the two loops.

  • What are the values of the array: integers, dates? Please add a small sample array representing your input data and the expected output. Commented Jan 13, 2019 at 13:52

2 Answers

You can use slice indexing and assignment instead of concatenate if you create the output array first.

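import numpy as np

A = np.random.randint(0, 99, (8760, 12))
total = np.zeros((8760, 12), dtype=A.dtype)  # same integer dtype as A
for j in range(12):      # years (columns)
    for i in range(12):  # months (blocks of 730 rows)
        total[730*i:730*(i+1), j] = np.sort(A[730*i:730*(i+1), j])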
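
As a quick check, here is a minimal sketch that runs the same loop on the small sample from the question. It assumes the sample is transposed first so that it matches the hours-by-years layout of A; the names sample, A_small, block and result are only illustrative.

```python
import numpy as np

# Sample from the question, laid out as years x hours (3 x 9).
sample = np.array([
    [30, 40, 10, 20, 50, 60, 80, 200, 100],
    [8, 20, 5, 6, 8, 1, 5, 3, 2],
    [520, 840, 600, 525, 430, 20, 1, 506, 703],
])

A_small = sample.T                     # hours x years, like the (8760, 12) array
block = 3                              # stands in for the 730 hours of a month
n_blocks = A_small.shape[0] // block

result = np.zeros_like(A_small)
for j in range(A_small.shape[1]):      # each year (column)
    for i in range(n_blocks):          # each block of `block` hours
        rows = slice(block * i, block * (i + 1))
        result[rows, j] = np.sort(A_small[rows, j])

# Transpose back to years x hours to compare with the question's expected output.
print(result.T)
```

Transposed back, the result should match the expected array given in the question.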

If you want the same thing starting from no array and using concatenate-like functions, I would do it like this:

total2=None
for j in range(12):
    app1 = None
    for i in range (12):
        app = np.sort(A[730*i:730*(i+1),j])
        if app1 is None:
            app1 = app
        else:
            app1 = np.hstack((app1,app))
    if total2 is None:
        total2 = app1
    else:
        total2 = np.vstack((total2,app1))
total2 = np.transpose(total2)

EDIT to answer a comment (how to apply the same sorting to a different array):

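bs = 3  # block size (730 for the full data)
B2 = np.empty_like(B)  # B is the second array, same shape as A
for j in range(A.shape[1]):
    for i in range(A.shape[0] // bs):
        # order that sorts this block of A, as positions within the block
        A2_order = np.argsort(A[bs * i : bs * (i + 1), j])
        # apply the same order to the matching block of B
        B2[bs * i : bs * (i + 1), j] = B[A2_order + i * bs, j]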
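
The same reordering can probably be done without the loops. The following is only a sketch, assuming A and B have the same shape and the number of rows is a multiple of bs; it swaps in np.take_along_axis for the manual index arithmetic, and the small random arrays are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
bs = 3                                   # block size (730 in the original problem)
A = rng.integers(0, 100, size=(9, 2))    # hypothetical small stand-in for A
B = rng.integers(0, 100, size=(9, 2))    # hypothetical second array, same shape

# Group consecutive rows into blocks: shape (n_blocks, bs, n_columns).
A_blocks = A.reshape(-1, bs, A.shape[1])
B_blocks = B.reshape(-1, bs, B.shape[1])

# Order that sorts A within each block (per column), reused to reorder B.
order = np.argsort(A_blocks, axis=1, kind="stable")
B2 = np.take_along_axis(B_blocks, order, axis=1).reshape(B.shape)
```

A stable sort is used here so that ties in A are resolved the same way every time; with the default sort kind, tied values may be ordered differently.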

4 Comments

method 1 works perfectly. And you make it seem so easy :). Thanks!
And if I wanted to sort another matrix in the same way how would I do that? A2_order[3 * i : 3 * (i + 1), j] = np.argsort(A[3 * i : 3 * (i + 1), j]) A2_order = np.array(A2_orden, dtype=int) B2=np.array(list(map(lambda x, y: y[x], A2_order, B))). I have this code, but because the sorting is in the matrix it doesn't give the required results.
@DavidDeclercq This would actually be a separate question; anyway, I edited the answer to include it.
wow you are awesome. Thank you so much! My mistake was putting A2_order in slices as well. Thanks

You can avoid looping altogether.

First transpose and reshape the array so that the array indices go from coarse to fine (year->month->hour).

A = np.transpose(A)
A = np.reshape(A, [12, 12, 730])

Now you can select all hours of a month as A[year, month].

Conveniently, the np.sort function by default sorts along the last axis of the array, so you can just call

A = np.sort(A)

and now each list of A[year, month] entries will be sorted.
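
If the result is needed in the original (8760, 12) layout, the reshape and transpose can be undone afterwards. This is a minimal sketch of the full round trip, assuming random integer data as a stand-in for the real array; sorted_blocks and total are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 99, size=(8760, 12))   # hours x years, as in the question

# Transpose to years x hours, split each year into 12 months of 730 hours,
# then sort along the last axis (the hours within a month).
sorted_blocks = np.sort(A.T.reshape(12, 12, 730))

# Undo the reshape and the transpose to get back to hours x years.
total = sorted_blocks.reshape(12, 8760).T

print(total.shape)  # (8760, 12)
```

Here A itself is left unchanged, so it can be compared against total if needed.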

2 Comments

Why is it necessary to first transpose?
It has to do with NumPy storing data in row-major format, i.e. the elements along a row are stored together in memory. In the [8760, 12] shape, the measurements of one year are not stored together in memory, but are scattered throughout. After transposing to shape [12, 8760], the 8760 entries of the first year lie together along row 0, the next year along row 1, and so on. With the years grouped this way, np.reshape can work its magic: after the reshape, each year is still kept together, but is now split up further into groups of months.
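
A tiny illustration of the difference, assuming a made-up array with 2 years of 6 hours each and "months" of 3 hours (values in the second year are offset by 100 so they are easy to spot):

```python
import numpy as np

# Column 0 is year 0, column 1 is year 1; shape (6, 2) is hours x years.
A = np.array([[0, 100],
              [1, 101],
              [2, 102],
              [3, 103],
              [4, 104],
              [5, 105]])

wrong = A.reshape(2, 2, 3)     # no transpose: each block mixes both years
right = A.T.reshape(2, 2, 3)   # transpose first: indices are [year, month, hour]

print(wrong[0, 0].tolist())    # [0, 100, 1]
print(right[0, 0].tolist())    # [0, 1, 2]
```

Without the transpose, the first block interleaves hours from both years, so sorting it would not sort a single month.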
