If you've got a fancy GPU, I can tell you how to compute the top k, for a huge k, of huge n instances all at the same time: spread them out on a texture, per instance, and additively blend each value onto the texture, using its "height" as the position along the texture.
But note you have to know, or guess, an acceptable range for the heights, or you won't spread them across the full detail you could have had.
Values that land on the same position pile up: you should get a 2 if there are 2 on it, 10 if there are 10 on it, and so on across all instances. (Just say it's all on an 8192x8192 texture, with 64x64 of these "height" boxes.) Slots with a count of 0 just get skipped.
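To make the counting step concrete, here is a minimal CPU sketch in plain Python of what the additive blend would compute for one instance. The function name `build_histogram`, the equal-width bins, and the clamping of out-of-range values are my assumptions for illustration; on a GPU the inner loop would be the blend or an atomic add.

```python
def build_histogram(values, lo, hi, num_bins):
    """Count how many values fall into each of num_bins equal-width bins over [lo, hi]."""
    counts = [0] * num_bins
    scale = num_bins / (hi - lo)
    for v in values:
        b = int((v - lo) * scale)
        # Assumption: values outside the guessed range are clamped to the end bins.
        b = min(max(b, 0), num_bins - 1)
        counts[b] += 1  # stands in for the additive blend
    return counts


counts = build_histogram([0.1, 0.15, 0.9, 0.5], lo=0.0, hi=1.0, num_bins=4)
```

If the guessed range is too wide, most values end up in a handful of bins and the threshold you get later is coarser than it needed to be.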
Then build a mipped add hierarchy, except you do it like a binary tree: treat each box as if it were 1-dimensional, take 2 neighbouring numbers from the previous level and add them together, and keep doing that for every binary mip until a single total is left.
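A rough sketch of that hierarchy in plain Python, assuming the counts for one instance are already in a flat list whose length is a power of two. The name `build_sum_pyramid` and the layout (finest level first, single total last) are mine, not anything standard.

```python
def build_sum_pyramid(counts):
    """Return a list of levels: levels[0] is counts, each later level sums adjacent pairs."""
    n = len(counts)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of bins must be a power of two")
    levels = [list(counts)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each entry of the new level is the sum of two neighbours in the previous one.
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


pyramid = build_sum_pyramid([2, 0, 1, 1])
```

On the GPU each level would be one pass (or one step of a compute shader), with every output entry independent of the others.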
Then use these mips (which now hold collected counts) to discover the approximate location of k, using all the mips in the process. Do this in one final thread: start at the coarsest mips, where each step rules out a huge chunk of the range, then work down through the more detailed mips to find the per-pixel position that k sits at.
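The walk down the mips could look something like this in plain Python. It assumes `levels` is a list of lists with the finest counts first and the single total last, and that higher bins hold larger values; `find_kth_largest_bin` is a name I made up for the sketch.

```python
def find_kth_largest_bin(levels, k):
    """Return the index of the finest-level bin that contains the k-th largest value."""
    total = levels[-1][0]
    if not 1 <= k <= total:
        raise ValueError("k must be between 1 and the total count")
    idx = 0
    # Go from the level just below the total down to the finest level.
    for level in reversed(levels[:-1]):
        left, right = 2 * idx, 2 * idx + 1
        if level[right] >= k:
            idx = right          # the k-th largest is in the upper half
        else:
            k -= level[right]    # skip everything in the upper half
            idx = left
    return idx


# Hand-built pyramid for the counts [2, 0, 1, 1].
levels = [[2, 0, 1, 1], [2, 2], [4]]
bin_for_top_2 = find_kth_largest_bin(levels, 2)
```

The threshold height is then roughly `lo + bin * (hi - lo) / num_bins`, which is why the result is only as precise as one bin width.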
It makes more sense to do this if it's all instanced again, so it's one thread per threshold discovery. (Just say you were running an ANN 128x128 times at once (translational invariance, anyone?), then it makes perfect sense.)
Each thread ends up with the threshold height for its count, but it's approximate, so you get an approximate k for each of the n lists.
You can do a little more work to get the exact k, which you might need in a similarity match, but if you can get away with it being approximate, like when you only want the top ~k activations, then don't worry about it.
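One possible version of that "little more work", as a plain Python sketch: find the bin the k-th largest value falls in, then sort only the values inside that bin. This is my guess at a refinement, not something spelled out above, and `exact_kth_largest` is a hypothetical name. For brevity it scans the bins from the top instead of walking the mips.

```python
def exact_kth_largest(values, lo, hi, num_bins, k):
    """Return the exact k-th largest value, sorting only the bin it falls in."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    scale = num_bins / (hi - lo)

    def bin_of(v):
        # Same binning as the histogram pass, with out-of-range values clamped.
        return min(max(int((v - lo) * scale), 0), num_bins - 1)

    counts = [0] * num_bins
    for v in values:
        counts[bin_of(v)] += 1

    # Scan from the highest bin down until the k-th largest is inside the bin.
    remaining = k
    target = num_bins - 1
    while counts[target] < remaining:
        remaining -= counts[target]
        target -= 1

    # Only the values in the target bin need an exact ordering.
    candidates = sorted((v for v in values if bin_of(v) == target), reverse=True)
    return candidates[remaining - 1]


third = exact_kth_largest([0.1, 0.15, 0.9, 0.5], lo=0.0, hi=1.0, num_bins=4, k=3)
```

The extra cost depends on how many values share the final bin, so a well-chosen range keeps it small.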