59

I'm looking for a function in ANSI C that would randomize an array just like PHP's shuffle() does. Is there such a function or do I have to write it on my own? And if I have to write it on my own, what's the best/most performant way to do it?

My ideas so far:

  • Iterate through the array for, say, 100 times and exchange a random index with another random index
  • Create a new array and fill it with random indices from the first one checking each time if the index is already taken (performance = 0 complexity = serious)
5

12 Answers 12

62

Pasted from Asmodiel's link to Ben Pfaff's Writings, for persistence:

#include <stdlib.h>

/* Arrange the N elements of ARRAY in random order.
   Only effective if N is much smaller than RAND_MAX;
   if this may not be the case, use a better random
   number generator. */
void shuffle(int *array, size_t n)
{
    if (n > 1) 
    {
        size_t i;
        for (i = 0; i < n - 1; i++) 
        {
          size_t j = i + rand() / (RAND_MAX / (n - i) + 1);
          int t = array[j];
          array[j] = array[i];
          array[i] = t;
        }
    }
}

EDIT: And here's a generic version that works for any type (int, struct, ...) through memcpy. With an example program to run, it requires VLAs, not every compiler supports this so you might want to change that to malloc (which will perform badly) or a static buffer large enough to accommodate any type you throw at it:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* compile and run with
 * cc shuffle.c -o shuffle && ./shuffle */

#define NELEMS(x)  (sizeof(x) / sizeof(x[0]))

/* arrange the N elements of ARRAY in random order.
 * Only effective if N is much smaller than RAND_MAX;
 * if this may not be the case, use a better random
 * number generator. */
static void shuffle(void *array, size_t n, size_t size) {
    char tmp[size];
    char *arr = array;
    size_t stride = size * sizeof(char);

    if (n > 1) {
        size_t i;
        for (i = 0; i < n - 1; ++i) {
            size_t rnd = (size_t) rand();
            size_t j = i + rnd / (RAND_MAX / (n - i) + 1);

            memcpy(tmp, arr + j * stride, size);
            memcpy(arr + j * stride, arr + i * stride, size);
            memcpy(arr + i * stride, tmp, size);
        }
    }
}

#define print_type(count, stmt) \
    do { \
    printf("["); \
    for (size_t i = 0; i < (count); ++i) { \
        stmt; \
    } \
    printf("]\n"); \
    } while (0)

struct cmplex {
    int foo;
    double bar;
};

int main() {
    srand(time(NULL));

    int intarr[] = { 1, -5, 7, 3, 20, 2 };

    print_type(NELEMS(intarr), printf("%d,", intarr[i]));
    shuffle(intarr, NELEMS(intarr), sizeof(intarr[0]));
    print_type(NELEMS(intarr), printf("%d,", intarr[i]));

    struct cmplex cmparr[] = {
        { 1, 3.14 },
        { 5, 7.12 },
        { 9, 8.94 },
        { 20, 1.84 }
    };

    print_type(NELEMS(intarr), printf("{%d %f},", cmparr[i].foo, cmparr[i].bar));
    shuffle(cmparr, NELEMS(cmparr), sizeof(cmparr[0]));
    print_type(NELEMS(intarr), printf("{%d %f},", cmparr[i].foo, cmparr[i].bar));

    return 0;
}
Sign up to request clarification or add additional context in comments.

9 Comments

you could also do array[i] += array[j]; array[j] = array[i] - array[j]; array[i] -= array[j]; if you're not worry about int overflows. I don't want to confuse any new to the language about why XOR'ing works though...
@Hyperboreus - Are you kidding? "Allocating" integers on the stack is as simple as performing addition/subtraction on a register. That itself is going to be fast enough, but further, a decent optimizer will only do that addition/subtraction once for this code, not for every iteration. (Compile this with optimization turned on and look at the disassembly for yourself. I did so with gcc -S and there were exactly two modifications of the stack pointer, once at the start of the function and once at the end.) There is nothing you save by having t and j scoped earlier in the function.
Hm, just in a sandbox swapping two integers in a loop, the difference in time is 20% between assigning it to a temp variable compared to xoring it. Why is that so then? (compiled with gcc)
Note: The formula i + r / (RAND_MAX / (n - i) + 1) introduces additional bias. e.g. j(i=32,n=61,RM=2147483647) --> { with 2147483648 different r, j= 32 to 60 occurs 74051161 each, 61 occurs only 74051140 }. TBD worst case i,n,RAND_MAX. With i+ rnd%(n-i) { j= 32 to 39 occur 74051161 each, j = 40 to 61 occurs 74051160, the worst case distribution for various i,n,RAND_MAX being at most 1 different. As other posts refer to this popular answer, felt this bias was important to note.
@PaulStelian: If RAND_MAX is just 32767, you need to get yourself a better PRNG. One simple step up is the drand48() family of functions; that's a POSIX standard set of functions. You might find you have random() and srandom(), or arc4random(), or maybe you can use /dev/random or /dev/urandom as a source of random values. There are lots of possibilities — but what you're asking is really a new question (or should be asked in a new question).
|
18

The following code ensures that the array will be shuffled based on a random seed taken from the usec time. Also this implements the Fisher–Yates shuffle properly. I've tested the output of this function and it looks good (even expectation of any array element being the first element after shuffle. Also even expectation for being the last).

void shuffle(int *array, size_t n) {    
    struct timeval tv;
    gettimeofday(&tv, NULL);
    int usec = tv.tv_usec;
    srand48(usec);


    if (n > 1) {
        size_t i;
        for (i = n - 1; i > 0; i--) {
            size_t j = (unsigned int) (drand48()*(i+1));
            int t = array[j];
            array[j] = array[i];
            array[i] = t;
        }
    }
}

5 Comments

I would use an int, not size_t, in this case because n represents the number of ints, not the size of the memory block. I prefer using size_t only for sizes in bytes.
@Mk12 The number of elements and the sizeof an array can be much more than INT_MAX. Using size_t here is more robust and portable approach.
Nice, so little code. Is it quick and simple to get this working with Microsoft's C library?
I had to add the following, without quotes, above my #include statements: "#define _XOPEN_SOURCE" Otherwise, I was getting: "implicit declaration of function 'srand48' is invalid in C99"
8

I’ll just echo Neil Butterworth’s answer, and point out some trouble with your first idea:

You suggested,

Iterate through the array for, say, 100 times and exchange a random index with another random index

Make this rigorous. I'll assume the existence of randn(int n), a wrapper around some RNG, producing numbers evenly distributed in [0, n-1], and swap(int a[], size_t i, size_t j),

void swap(int a[], size_t i, size_t j) {
  int temp = a[i]; a[i] = a[j]; a[j] = temp;
}

which swaps a[i] and a[j]. Now let’s implement your suggestion:

void silly_shuffle(size_t n, int a[n]) {
    for (size_t i = 0; i < n; i++)
        swap(a, randn(n), randn(n)); // swap two random elements
}

Notice that this is not any better than this simpler (but still wrong) version:

void bad_shuffle(size_t n, int a[n]) {
    for (size_t i = 0; i < n; i++)
        swap(a, i, randn(n));
}

Well, what’s wrong? Consider how many permutations these functions give you: With n (or 2×_n_ for silly_shuffle) random selections in [0, n-1], the code will “fairly” select one of _n_² (or 2×_n_²) ways to shuffle the deck. The trouble is that there are n! = _n_×(n-1)×⋯×2×1 possible arrangements of the array, and neither _n_² nor 2×_n_² is a multiple of n!, proving that some permutations are more likely than others.

The Fisher-Yates shuffle is actually equivalent to your second suggestion, only with some optimizations that change (performance = 0, complexity = serious) to (performance = very good, complexity = pretty simple). (Actually, I’m not sure that a faster or simpler correct version exists.)

void fisher_yates_shuffle(size_t n, int a[n]) {
    for (size_t i = 0; i < n; i++)
        swap(a, i, i+randn(n-1-i)); // swap element with random later element
}

ETA: See also this post on Coding Horror.

Comments

7

There isn't a function in the C standard to randomize an array.

  • Look at Knuth - he has algorithms for the job.
  • Or look at Bentley - Programming Pearls or More Programming Pearls.
  • Or look in almost any algorithms book.

Ensuring a fair shuffle (where every permutation of the original order is equally likely) is simple, but not trivial.

7 Comments

Really equally likely is very difficult. For example, your random number generator has to have a multiple of N! states.
@Paul: As long as your PRNG "random number between 1 and N" wrapper is correct (uniform distribution), it's easy. However people often screw this one up and create bias.
@Paul Hankin: Is that because you need to generate random numbers from 0 to i where i goes from n to 1?
@ninjalj: No, absolutely not. That's the naive broken algorithm everyone uses. Anything with floating point in it is going to be hell to get right, so the first step to fixing it would be to switch to integers. Then discard any results larger than the largest multiple of 10, minus 1 (call rand again if you get a value you have to discard). There are ways to save and reuse this entropy rather than completely discarding it, but that's more work, and likely worthless when it's just pseudo-random anyway.
@R. glibc rand() has only 2^32 different states, so it can generate at most 2^32 different shuffles of a pack of cards whatever you do. 52! is more like 2^225, so you actually generate a tiny, tiny fraction of all the possibilities.
|
6

The function you are looking for is already present in the standard C library. Its name is qsort. Random sorting can be implemented as:

int rand_comparison(const void *a, const void *b)
{
    (void)a; (void)b;

    return rand() % 2 ? +1 : -1;
}

void shuffle(void *base, size_t nmemb, size_t size)
{
    qsort(base, nmemb, size, rand_comparison);
}

The example:

int arr[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

srand(0); /* each permutation has its number here */

shuffle(arr, 10, sizeof(int));

...and the output is:

3, 4, 1, 0, 2, 7, 6, 9, 8, 5

10 Comments

Does this guarantee that all permutations are equally likely? I think that's unlikely. A Fisher-Yates shuffle does guarantee that all permutations are equally likely, assuming an unbiassed PRNG.
@JonathanLeffler This is probably hopeless since there is no guarantee on the qsort() algorithm and the quality of rand() in the C standard.
What is use of (void)a; (void)b;?
C lib specifies "When the same objects (consisting of size bytes, irrespective of their current positions in the array) are passed more than once to the comparison function, the results shall be consistent with one another. That is, for qsort they shall define a total ordering on the array,". This answer's rand_comparison() fails to provide a total ordering resulting in undefined behavior including a potential infinite loop.
qsort() with a comparison function that returns a random result got Microsoft in trouble back when they were having legal issues around bundling Internet Explorer with Windows. They used this general approach to offer what was supposed to be a randomly-ordered selection of available browsers, but it turned out that IE was positioned first disproportionately often. Enough so that it was quickly noticed. In the end, it appears that this was an honest mistake, but it was a PR and maybe legal problem for MS at the time.
|
4

Here a solution that uses memcpy instead of assignment, so you can use it for array over arbitrary data. You need twice the memory of original array and the cost is linear O(n):

void main ()
{
    int elesize = sizeof (int);
    int i;
    int r;
    int src [20];
    int tgt [20];

    for (i = 0; i < 20; src [i] = i++);

    srand ( (unsigned int) time (0) );

    for (i = 20; i > 0; i --)
    {
        r = rand () % i;
        memcpy (&tgt [20 - i], &src [r], elesize);
        memcpy (&src [r], &src [i - 1], elesize);
    }
    for (i = 0; i < 20; printf ("%d ", tgt [i++] ) );
}

1 Comment

You could also do this in-place using void * pointers to lower the additional memory requirement and limit copying to single values -- if it is an array of structs on the stack this would reduce the quantity of copies being made. For even lower space requirements, shuffle offsets on the original memory position, permitting the use of ints or smaller (an unsigned short still manages up to 65.5k).
1

Just run the following code first and modify it for your needs:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define arr_size 10

// shuffle array
void shuffle(int *array, size_t n) {
    if (n > 1) {
        for (size_t i = 0; i < n - 1; i++) {
          size_t j = i + rand() / (RAND_MAX / (n - i) + 1);
          int t = array[j];
          array[j] = array[i];
          array[i] = t;
        }
    }
}

// display array elements
void display_array(int *array, size_t n){
    for (int i = 0; i < n; i++)
        printf("%d ", array[i]);
}

int main() {
    srand(time(NULL));       // this line is necessary
    int numbers[arr_size] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    
    printf("Given array:    ");
    display_array(numbers, arr_size);
    shuffle(numbers, arr_size); 
    printf("\nShuffled array: ");
    display_array(numbers, arr_size);

    return 0;
}

You would have something like:

enter image description here

You get different shuffled arrays every time you run the code:

enter image description here

1 Comment

Does this algorithm have equal probabilities for all combinations or is some result more likely (Even in a single position)?
0

Assuming you may want to just access an array randomly instead of actually shuffling it, you can use the degenerative case of a linear congruential pseudo-random number generator

X_n+1 = (a Xn+c) mod N
where a is coprime to N
generates a random cycle over all values 0:N

Naturally you could store this sequence in an empty array.

uint32_t gcd ( uint32_t a, uint32_t b )
{
  if ( a==0 ) return b;
  return gcd ( b%a, a );
}

 uint32_t get_coprime(uint32_t r){  
     uint32_t min_val = r>>1;  
     for(int i =0;i<r*40;i++){  
         uint64_t sel = min_val + ( rand()%(r-min_val ));  
         if(gcd(sel,r)==1)  
             return sel;  
     }  
     return 0;  
}

uint32_t next_val(uint32_t coprime, uint32_t cur, uint32_t N)
{     
   return (cur+coprime)%N;   
}


// Example output Array A in random order
void shuffle(float * A, uint32_t N){
  uint32_t coprime = get_coprime(N);
  cur = rand()%N;
  for(uint32_t i = 0;i<N;i++){
     printf("%f\n",A[cur]);
     cur = next_val(coprime, cur, N);
}

2 Comments

I'm puzzled about the references to r in get_coprime() — should they be references to N? Also, don't you need a modulus operation in next_val() to prevent values going out of range? Or do you have to use next = next_val(coprime, next) % N;?
You should probably show how the code would be used. It seems likely to me that the gcd() function should be static. Presumably, the operations are similar to: uint32_t cp = get_coprime(N); uint32_t first = rand() % N; uint32_t cur = first; do { …use array[cur]…; cur = next_val(cp, cur); } while (cur != first);.
-1

The same answer like Nomadiq but the Random is kept simple. The Random will be the same if you call the function one after another:

#include <stdlib.h>
#include <time.h>

void shuffle(int aArray[], int cnt){
    int temp, randomNumber;
    time_t t;
    srand((unsigned)time(&t));
    for (int i=cnt-1; i>0; i--) {
        temp = aArray[i];
        randomNumber = (rand() % (i+1));
        aArray[i] = aArray[randomNumber];
        aArray[randomNumber] = temp;
    }
}

2 Comments

Welcome to Stack Overflow. If you decide to answer an older question that has well established and correct answers, adding a new answer late in the day may not get you any credit. If you have some distinctive new information, or you're convinced the other answers are all wrong, by all means add a new answer, but 'yet another answer' giving the same basic information a long time after the question was asked usually won't earn you much credit.
-1

I saw the answers and I've discovered an easy way to do it

#include <stdio.h>
#include <conio.h>
#include <time.h>

int main(void){

    int base[8] = {1,2,3,4,5,6,7,8}, shuffled[8] = {0,0,0,0,0,0,0,0};
    int index, sorted, discart=0;

    srand(time(NULL));
    for(index = 0; index<8; index++){
        discart = 0;
        while(discart==0){
            sorted = rand() % 8;
            
            if (shuffled[sorted] == 0){
                //This here is just for control of what is happening
                printf("-------------\n");
                printf("index: %i\n sorted: %i \n", index,sorted);
                printf("-------------\n");

                shuffled[sorted] = base[index];
                discart= 1;
            }
        }
    }

    //This "for" is just to exibe the sequence of items inside your array
    for(index=0;index<8; index++){
        printf("\n----\n");
        printf("%i", shuffled[index]);
    }

    return 0;
}

Notice that this method doesn't allow duplicated items. And at the end you can use either numbers and letters, just replacing them into the string.

1 Comment

Welcome to Stack Overflow. If you decide to answer an older question that has well established and correct answers, adding a new answer late in the day may not get you any credit. If you have some distinctive new information, or you're convinced the other answers are all wrong, by all means add a new answer, but 'yet another answer' giving the same basic information a long time after the question was asked usually won't earn you much credit.
-1

In the code example, I have a function that takes as parameters a pointer to an int ordered_array and a pointer to int shuffled_array and a number representing the length of both arrays. It picks in each loop a random number from the ordered_array and inserts it into the shuffled array.

void shuffle_array(int *ordered_array, int *shuffled_array, int len){
    int index;

    for(int i = 0; i < len; i++){
        index = (rand() % (len - i));

        shuffled_array[i] = ordered_array[index];

        ordered_array[index] = ordered_array[len-i];
    }
}

1 Comment

Note that this code has a curious interface. It destroys the ordered_array while creating the shuffled_array. Consequently, the code can't have the const attribute applied to either array argument. It also does not conserve the data in the input array. That is, not all the elements of the input array are represented in the output array. This makes the algorithm worthless.
-2

I didn't see it among answers so I propose this solution if it can help anybody:

static inline void shuffle(size_t n, int arr[])
{
    size_t      rng;
    size_t      i;
    int         tmp[n];
    int         tmp2[n];

   memcpy(tmp, arr, sizeof(int) * n);
    bzero(tmp2, sizeof(int) * n);
    srand(time(NULL));
    i = 0;
    while (i < n)
    {
        rng = rand() % (n - i);
        while (tmp2[rng] == 1)
            ++rng;
        tmp2[rng] = 1;
        arr[i] = tmp[rng];
        ++i;
    }
}

1 Comment

When I tested this code on an array of 20 elements, the last element was never swapped, and the second last seldom swapped. When I tested it on an array of 10 elements, 60% of the time the last element was unchanged, and 60% of the time the second last element was unchanged. This does not seem like a good shuffle. (It also uses a lot of extra storage space, with two extra arrays of the same size as the array that is being shuffled. That too is not good.) You should not call srand() in the shuffle function: srand() — why call it only once.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.