I want to split a numpy array into three different arrays based on a logical comparison. The numpy array I want to split is called x. Its shape is as follows, but its entries vary: (In response to Saullo Castro's comment I included a slightly different array x.)
array([[ 0.46006547, 0.5580928 , 0.70164242, 0.84519205, 1.4 ],
[ 0.00912908, 0.00912908, 0.05 , 0.05 , 0.05 ]])
The values of this array are monotonically increasing along columns. I also have two other arrays called lowest_gridpoints and highest_gridpoints. The entries of these arrays also vary, but the shape is always identical to the following:
array([ 0.633, 0.01 ]), array([ 1.325, 0.99 ])
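To make the comparison concrete, here is a minimal sketch of how x can be compared against the two bound arrays in one step. It assumes, as the loop code in this question does, that entry i of each bound array applies to row i of x; the names below and above are only illustrative.

```python
import numpy as np

x = np.array([[0.46006547, 0.5580928, 0.70164242, 0.84519205, 1.4],
              [0.00912908, 0.00912908, 0.05, 0.05, 0.05]])
lowest_gridpoints = np.array([0.633, 0.01])
highest_gridpoints = np.array([1.325, 0.99])

# Reshape the bounds from (2,) to (2, 1) so they broadcast across the columns of x.
# Assumption: bound i belongs to row i of x.
below = x < lowest_gridpoints[:, None]    # boolean array, same shape as x
above = x > highest_gridpoints[:, None]   # boolean array, same shape as x

print(below)
print(above)
```

Each column of below and above then says which entries of that column violate the lower or upper bound.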
The selection procedure I want to apply is as follows:
- All columns containing values lower than any value in lowest_gridpoints should be removed from x and constitute the array temp1.
- All columns containing values higher than any value in highest_gridpoints should be removed from x and constitute the array temp2.
- All columns of x that are included in neither temp1 nor temp2 constitute the array x_new.
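The three rules above can be sketched with boolean column masks. This is only one possible reading, under two assumptions: entry i of each bound array applies to row i of x, and a column that is both too low and too high goes to temp2. The function name split_columns is hypothetical.

```python
import numpy as np

def split_columns(x, lowest_gridpoints, highest_gridpoints):
    """Hypothetical helper: split the columns of x into (temp1, temp2, x_new)."""
    # A column is "too high" if any of its entries exceeds the bound for its row.
    too_high = np.any(x > highest_gridpoints[:, None], axis=0)
    # A column is "too low" if any entry is below its bound; "too high" wins
    # on conflict (assumption), which keeps temp1 and temp2 mutually exclusive.
    too_low = np.any(x < lowest_gridpoints[:, None], axis=0) & ~too_high
    temp1 = x[:, too_low]
    temp2 = x[:, too_high]
    x_new = x[:, ~(too_low | too_high)]
    return temp1, temp2, x_new

x = np.array([[0.46006547, 0.5580928, 0.70164242, 0.84519205, 1.4],
              [0.00912908, 0.00912908, 0.05, 0.05, 0.05]])
lowest_gridpoints = np.array([0.633, 0.01])
highest_gridpoints = np.array([1.325, 0.99])

temp1, temp2, x_new = split_columns(x, lowest_gridpoints, highest_gridpoints)
print(temp1.shape, temp2.shape, x_new.shape)
```

With the example arrays, the first two columns land in temp1, the last column in temp2, and the middle two in x_new. When no column violates a bound, temp1 and temp2 come back with shape (2, 0) and x_new equals x.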
The following code I wrote achieves the task.
if np.any( x[:,-1] > highest_gridpoints ) or np.any( x[:,0] < lowest_gridpoints ):
    for idx, sample, in enumerate(x.T):
        if np.any( sample > highest_gridpoints):
            max_idx = idx
            break
        elif np.any( sample < lowest_gridpoints ):
            min_idx = idx
    temp1, temp2 = np.array([[],[]]), np.array([[],[]])
    if 'min_idx' in locals():
        temp1 = x[:,0:min_idx+1]
    if 'max_idx' in locals():
        temp2 = x[:,max_idx:]
    if 'min_idx' in locals() or 'max_idx' in locals():
        if 'min_idx' not in locals():
            min_idx = -1
        if 'max_idx' not in locals():
            max_idx = x.shape[1]
        x_new = x[:,min_idx+1:max_idx]
However, I suspect that this code is very inefficient because of the heavy use of loops. Additionally, I think the syntax is bloated.
Does anyone have an idea for code that achieves the task outlined above more efficiently or more concisely?
[] for me... it would be nice to have a different input that can be used for comparisons...
lowest_gridpoints and another value higher than the one in highest_gridpoints? Also, did you mean monotonically increasing along the rows?
np.argsort(x[i] + [lowest_gridpoints[i]])[-1]. This will give you the index of the first element larger than lowest_gridpoints[i]. Do it for all i and get the maximum (minimum for the highest_gridpoints)
temp1 and temp2 to be mutually exclusive. In my code, this is guaranteed by the break command after if np.any( sample > highest_gridpoints):. In doubt, I classify columns of x to para2 instead of para1. I meant monotonically increasing along the second dimension of np.arrays, so that x[0,i] >= x[0,j] for i > j. I hope (and think) this refers to columns.
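The idea from the comments, finding per row the index of the first element past a bound and then taking the maximum (for the lower bounds) or minimum (for the upper bounds), might be sketched as follows. This sketch swaps in np.searchsorted for the argsort expression and relies on every row of x being sorted in non-decreasing order. The function name split_sorted_columns is hypothetical, and "too high" is assumed to take priority, as with the break in the loop code.

```python
import numpy as np

def split_sorted_columns(x, lowest_gridpoints, highest_gridpoints):
    """Hypothetical helper; assumes each row of x is sorted in non-decreasing order."""
    # Per row: how many leading entries are strictly below the lower bound.
    lo_counts = [np.searchsorted(row, lo, side="left")
                 for row, lo in zip(x, lowest_gridpoints)]
    # Per row: index of the first entry strictly above the upper bound.
    hi_starts = [np.searchsorted(row, hi, side="right")
                 for row, hi in zip(x, highest_gridpoints)]
    hi_start = min(hi_starts)               # first column that is too high in any row
    lo_end = min(max(lo_counts), hi_start)  # too-high columns take priority (assumption)
    temp1 = x[:, :lo_end]
    temp2 = x[:, hi_start:]
    x_new = x[:, lo_end:hi_start]
    return temp1, temp2, x_new

x = np.array([[0.46006547, 0.5580928, 0.70164242, 0.84519205, 1.4],
              [0.00912908, 0.00912908, 0.05, 0.05, 0.05]])
lowest_gridpoints = np.array([0.633, 0.01])
highest_gridpoints = np.array([1.325, 0.99])

temp1, temp2, x_new = split_sorted_columns(x, lowest_gridpoints, highest_gridpoints)
print(temp1.shape, temp2.shape, x_new.shape)
```

Because the rows are sorted, the three results are plain slices of x, so no column-by-column loop is needed.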