It looks like you have a Java (or D) background, since you wrote int [10] A, which you may read as "A is a variable referring to an object of type array-of-10-ints". In C things are a little different. C has only individual primitive variables, and they may be arranged in runs of contiguous memory, which is then called an array. That's why the syntax int A[10]; was chosen: it declares A as automatically allocated storage for 10 integers, and in most expressions A converts to a pointer to the first of them.
The next problem with your program, though the compiler will not report it, is passing an integer through a void* parameter. In C every pointer can be converted to some integer, but not every integer value corresponds to a valid pointer. Pointers in C are abstract things: most implementations use the memory address as their integer representation, but it would be equally valid to map them through an injective lookup table (LUT). So some integer values may not map to any pointer and thus would not survive the round trip. You should really pass a real pointer.
Also, your use of threads is suboptimal. In your case you're thrashing cache lines like mad, and the switching overhead will eat up any gain from parallelization. As a rule of thumb, create no more than 'number of CPU cores' + 2 threads. Taking all this into account, you could do the following:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>      /* time(), used to seed rand() */
#include <pthread.h>

/* some prime number, to show what happens with uneven work sizes */
#define ELEMENTS 17

int M;
int A[ELEMENTS];

/* describes the slice of A one thread works on */
struct WorkletInfo {
    int *data_base; /* first element of the slice */
    int count;      /* number of elements in the slice */
};

void *worklet(void *t)
{
    int i;
    struct WorkletInfo *wi = t; /* note: no cast here */

    if(!wi)
        return 0;

    /* note: for a large M the product can overflow int */
    for(i=0; i < wi->count; i++)
        wi->data_base[i] *= M;

    return wi; /* handed back to main() through pthread_join */
}

#define N_CPUS 4
#define NUM_THREADS (N_CPUS+1)

int main(int argc, char *argv[])
{
    int t, ec;
    pthread_t th[NUM_THREADS];
    int remaining_elements;
    int const elements_per_worklet = ELEMENTS/NUM_THREADS;

    srand(time(0));
    M=rand();

    /* you must initialize the data a thread will access before starting the thread */
    for(t = 0; t < ELEMENTS; t++)
        A[t]=t+1; /* 0 * x = 0, so give it some nonzero value */

    /* elements left over after the even split; the first threads take one extra each */
    remaining_elements = ELEMENTS - (elements_per_worklet*NUM_THREADS);

    ec = 0;
    for(t = 0; t < NUM_THREADS; t++)
    {
        struct WorkletInfo *wi = malloc(sizeof(*wi));
        if(!wi) {
            perror("malloc");
            return 1;
        }
        wi->data_base = &A[ec];
        wi->count = elements_per_worklet;
        if(t<remaining_elements) {
            wi->count++;
        }
        ec += wi->count;
        if(pthread_create(&th[t], NULL, worklet, wi) != 0) {
            fprintf(stderr, "pthread_create failed for worklet %d\n", t);
            return 1;
        }
    }

    ec = 0;
    for(t = 0; t < NUM_THREADS; t++)
    {
        int i;
        struct WorkletInfo *wi;
        void *retval;

        pthread_join(th[t], &retval);
        if(!retval)
            continue;

        wi = retval;
        for(i = 0; i < wi->count; i++, ec++)
            printf("worklet %d, worklet element %d => total element %d: %d\n",
                   t, i, ec,
                   wi->data_base[i]);
        free(wi);
    }

    printf("Thread main exit\n");
    return 0;
}
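The output of this program looks like this (M is random, so the values differ from run to run; the negative values are products that overflowed int):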
worklet 0, worklet element 0 => total element 0: 300787748
worklet 0, worklet element 1 => total element 1: 601575496
worklet 0, worklet element 2 => total element 2: 902363244
worklet 1, worklet element 0 => total element 3: 1203150992
worklet 1, worklet element 1 => total element 4: 1503938740
worklet 1, worklet element 2 => total element 5: 1804726488
worklet 1, worklet element 3 => total element 6: 2105514236
worklet 2, worklet element 0 => total element 7: -1888665312
worklet 2, worklet element 1 => total element 8: -1587877564
worklet 2, worklet element 2 => total element 9: -1287089816
worklet 3, worklet element 0 => total element 10: -986302068
worklet 3, worklet element 1 => total element 11: -685514320
worklet 3, worklet element 2 => total element 12: -384726572
worklet 3, worklet element 3 => total element 13: -83938824
worklet 4, worklet element 0 => total element 14: 216848924
worklet 4, worklet element 1 => total element 15: 517636672
worklet 4, worklet element 2 => total element 16: 818424420
Thread main exit