Let us consider a set of data with, say, 10 values. We need to estimate its characterstics by Monte Carlo method, that is with a large number of randomly generated permutated sets.
If we'd consider generation of all possible permutations, that would give too big number, almost impractical - even for 10 values n! is 3628800. We can generate a (relatively small) subset of permutations randomly, but this does not take into account that permutations are not equally "distinct".
Here we come to the question: is there an algorithm of permutations generation which would count permutations' dissimilarity and produce most noticable permutations first (so that we can stop it after, say, 1000 cycles)?
I don't have specific requirements on the "similarity" measure, so if anything like a distance, information gain, entropy, Spearman, etc. are involved, that's would be ok.
Currently I'm doing one random permutation like so (pseudocode):
void permutate(const int n, int &out[])
{
for(int k = 0; k < n; ++k) out[k] = k;
for(int k = 0; k < n - 1; ++k)
{
int i1 = rand() % (n - k - 1) + 1;
swap(out, k, k + i1);
}
}
and run this predefined number of times.
Here the probability of swapping neighbouring values is the same as swapping distant values.