In addition to the existing answer: you have a number of problems, but the one that breaks your outlier identification is that you are using '&&' instead of '||'. That prevents any outlier from being found because your test condition always evaluates false, e.g.
if((measurements[i] < (0.5*median)) && (measurements[i] > (1.5*median))){
(the array element can never be both less than (0.5*median) and greater than (1.5*median) at the same time)
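To make the difference concrete, here is a minimal sketch of the two tests side by side. The helper names is_outlier_and and is_outlier_or are made up for illustration (they are not from your code), the 0.5 and 1.5 factors are the ones from your condition, and a positive median is assumed:

```c
#include <stdbool.h>

/* hypothetical helper: the original test using &&. for a positive median
 * no value can satisfy both inequalities, so this always returns false.
 */
bool is_outlier_and (double v, double median)
{
    return v < 0.5 * median && v > 1.5 * median;
}

/* hypothetical helper: the corrected test using ||. true when the value
 * lies below half the median OR above one-and-a-half times the median.
 */
bool is_outlier_or (double v, double median)
{
    return v < 0.5 * median || v > 1.5 * median;
}
```

With a median of 22.25, for example, is_outlier_or (2.5, 22.25) is true while is_outlier_and (2.5, 22.25) is false.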
Beyond the identification of the outliers, as noted in the comments and in @paddy's answer, you don't need to copy or allocate in your outlier removal function. Instead, remove each outlier with memmove by shuffling all elements above it down by one. Before returning from the function, if outliers were removed, you can (optionally) realloc once at the end to trim the allocation size.
(which really isn't needed unless you are working on a memory limited embedded system or have millions of elements you are dealing with)
Tidying up your removal function and passing the address of your array from main() to allow reallocation in the function without having to assign the return, you could do something like:
/* remove outliers from array 'a' given 'median'.
 * takes address of array 'a', address of number of elements 'n',
 * and median 'median' to remove outliers. a is reallocated following
 * removal and n is updated to reflect the number of elements that
 * remain. returns pointer to reallocated array on success, NULL otherwise.
 */
double *rmoutliers (double **a, size_t *n, double median)
{
    size_t i = 0, nelem = *n;   /* index, save initial number of elements */

    while (i < *n)              /* loop over all elements identifying outliers */
        if ((*a)[i] < 0.5 * median || (*a)[i] > 1.5 * median) {
            if (i < *n - 1)     /* if not end, use memmove to remove */
                memmove (&(*a)[i], &(*a)[i+1],
                         (*n - i - 1) * sizeof **a);    /* elements after i */
            (*n)--;             /* decrement number of elements */
        }
        else                    /* otherwise, increment index */
            i++;

    if (*n < nelem) {           /* if outliers removed */
        void *dbltmp = realloc (*a, *n * sizeof **a);   /* realloc */
        if (!dbltmp) {          /* validate reallocation */
            perror ("realloc-a");
            return NULL;
        }
        *a = dbltmp;            /* assign reallocated block to array */
    }

    return *a;                  /* return array */
}
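If your arrays grow large, note that each memmove shifts the whole tail of the array, so many outliers mean many shifts. One alternative, which is not what the function above does, is a single-pass compaction with a read index and a write index. A minimal sketch, assuming the same 0.5/1.5 test (the name compact_outliers is hypothetical, and no reallocation is done):

```c
#include <stddef.h>

/* hypothetical alternative to rmoutliers(): copy each kept value down to
 * the next free slot in a single pass. returns the number of elements
 * kept; the allocation itself is left untouched.
 */
size_t compact_outliers (double *a, size_t n, double median)
{
    size_t keep = 0;                        /* write index */

    for (size_t i = 0; i < n; i++)          /* read index */
        if (!(a[i] < 0.5 * median || a[i] > 1.5 * median))
            a[keep++] = a[i];               /* not an outlier, keep it */

    return keep;
}
```

It would be called as n = compact_outliers (array, n, median); and you could still realloc afterwards if you want to trim the block.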
Next, do not roll your own sort function. The C library provides qsort, which is far less likely to contain errors than your own (and will usually be much faster as well). All you need to do is write a qsort compare function that receives pointers to two elements from your array and returns a negative value (e.g. -1) if the first sorts before the second, 0 if the elements are equal, and a positive value (e.g. 1) if the second sorts before the first. For numeric comparisons, you can return the difference of the two inequalities to avoid potential overflow, e.g.
/* qsort compare to sort numbers in ascending order without overflow */
return (a > b) - (a < b);
Noting that a and b will be pointers to double (or float) in your case, to compare doubles, the proper casts before dereference would be:
/* qsort compare function for doubles (ascending) */
int cmpdbl (const void *a, const void *b)
{
    return (*((const double *)a) > *((const double *)b)) -
           (*((const double *)a) < *((const double *)b));
}
That's the only challenge to using qsort. After that, sorting your array in ascending order takes nothing more than:
qsort (array, n, sizeof *array, cmpdbl); /* use qsort to sort */
(done...)
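To see why the inequality form is preferable to the common subtraction shortcut, here is a small sketch. Both function names are hypothetical and only for illustration. With doubles, converting the difference to int truncates toward zero, so values less than 1 apart compare as equal (and with int operands the subtraction itself can overflow):

```c
/* hypothetical flawed compare: the double difference is converted to int,
 * so any difference smaller than 1.0 in magnitude truncates to 0
 */
int cmp_subtract (const void *a, const void *b)
{
    return (int)(*(const double *)a - *(const double *)b);
}

/* compare using the difference of two inequalities: always -1, 0 or 1 */
int cmp_inequal (const void *a, const void *b)
{
    const double x = *(const double *)a, y = *(const double *)b;

    return (x > y) - (x < y);
}
```

For 1.5 and 1.0, cmp_subtract reports the two values as equal, while cmp_inequal correctly orders them.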
Putting it all together, a short example that just reads your arrays as lines of input (1024 chars max), converts each value to a double using sscanf, and stores any number of values in a dynamically sized array before sorting, grabbing the median and calling your removal function, could be written as follows.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 1024   /* max characters to read per-line (per-array) */
#define MAXD    8   /* initial number of doubles to allocate */

/* qsort compare function for doubles (ascending) */
int cmpdbl (const void *a, const void *b)
{
    return (*((const double *)a) > *((const double *)b)) -
           (*((const double *)a) < *((const double *)b));
}

/* remove outliers from array 'a' given 'median'.
 * takes address of array 'a', address of number of elements 'n',
 * and median 'median' to remove outliers. a is reallocated following
 * removal and n is updated to reflect the number of elements that
 * remain. returns pointer to reallocated array on success, NULL otherwise.
 */
double *rmoutliers (double **a, size_t *n, double median)
{
    size_t i = 0, nelem = *n;   /* index, save initial number of elements */

    while (i < *n)              /* loop over all elements identifying outliers */
        if ((*a)[i] < 0.5 * median || (*a)[i] > 1.5 * median) {
            if (i < *n - 1)     /* if not end, use memmove to remove */
                memmove (&(*a)[i], &(*a)[i+1],
                         (*n - i - 1) * sizeof **a);    /* elements after i */
            (*n)--;             /* decrement number of elements */
        }
        else                    /* otherwise, increment index */
            i++;

    if (*n < nelem) {           /* if outliers removed */
        void *dbltmp = realloc (*a, *n * sizeof **a);   /* realloc */
        if (!dbltmp) {          /* validate reallocation */
            perror ("realloc-a");
            return NULL;
        }
        *a = dbltmp;            /* assign reallocated block to array */
    }

    return *a;                  /* return array */
}

int main (void) {

    char buf[MAXC];
    int arrcnt = 1;

    while (fgets (buf, MAXC, stdin)) {      /* read line of data into buf */
        int offset = 0, nchr = 0;
        size_t n = 0, ndbl = MAXD, size;
        double *array = malloc (ndbl * sizeof *array),  /* allocate */
                dbltmp, median;

        if (!array) {   /* validate initial allocation */
            perror ("malloc-array");
            return 1;
        }

        /* parse into doubles, store in dbltmp (should use strtod) */
        while (sscanf (buf + offset, "%lf%n", &dbltmp, &nchr) == 1) {
            if (n == ndbl) {    /* check if reallocation required */
                void *tmp = realloc (array, 2 * ndbl * sizeof *array);
                if (!tmp) {     /* validate */
                    perror ("realloc-array");
                    break;
                }
                array = tmp;    /* assign reallocated block */
                ndbl *= 2;      /* update allocated number of doubles */
            }
            array[n++] = dbltmp;    /* assign to array, increment index */
            offset += nchr;         /* update offset in buffer */
        }

        if (!n) {       /* no values on line, nothing to sort or index */
            free (array);
            continue;
        }

        qsort (array, n, sizeof *array, cmpdbl);    /* use qsort to sort */
        median = array[n / 2];  /* get median (upper-middle value if n is even) */

        /* output original array and number of values */
        printf ("\narray[%d] - %zu values\n\n", arrcnt++, n);
        for (size_t i = 0; i < n; i++) {
            if (i && i % 10 == 0)
                putchar ('\n');
            printf (" %5.2f", array[i]);
        }
        printf ("\n\nmedian: %5.2f\n\n", median);

        size = n;   /* save original number of doubles in array in size */
        if (!rmoutliers (&array, &n, median))   /* remove outliers */
            return 1;

        if (n < size) { /* check if outliers removed */
            printf ("%zu outliers removed - %zu values\n\n", size - n, n);
            for (size_t i = 0; i < n; i++) {
                if (i && i % 10 == 0)
                    putchar ('\n');
                printf (" %5.2f", array[i]);
            }
            printf ("\n\n");
        }
        else    /* otherwise warn no outliers removed */
            fputs ("warning: no outliers found.\n\n", stderr);

        free (array);   /* don't forget to free what you allocate */
    }
}
(note: you should really use strtod as sscanf provides no error handling beyond reporting success/failure of the conversion, but that is for another day or left to you as an exercise)
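If you want a starting point for that exercise, here is one way the conversion loop might look with strtod. This is only a sketch: the helper name parse_doubles and its fixed-size output buffer are assumptions made for illustration, not part of the program above, and you would still need to combine it with the realloc scheme used in main().

```c
#include <errno.h>
#include <stdlib.h>

/* hypothetical helper: parse up to 'max' doubles from string 's' into
 * 'out' using strtod. stops at the first token that is not a number or
 * that is out of range. returns the number of values stored.
 */
size_t parse_doubles (const char *s, double *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        char *end;
        double d;

        errno = 0;                  /* reset errno before each conversion */
        d = strtod (s, &end);

        if (end == s)               /* nothing converted, done */
            break;
        if (errno == ERANGE)        /* overflow/underflow reported */
            break;

        out[n++] = d;               /* store value */
        s = end;                    /* continue after the converted text */
    }

    return n;
}
```

The two checks (end == s and errno) are what sscanf cannot give you: you can tell "no number here" apart from "number out of range", and end tells you exactly where parsing stopped.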
Example Input File
Note: I didn't use the size: X information in my data file. It wasn't needed. I just used a dynamic allocation scheme to size the arrays as needed. The format of the input file I used contained the measurement values for each array on a separate line, e.g.
23.0 21.5 27.6 2.5 19.23 21.0 23.5 24.6 19.5 19.23 26.01 22.5 24.6 20.15 ... 18.93
11.12 10.32 9.91 14.32 12.32 20.37 13.32 11.57 2.32 13.32 11.22 12.32 ... 13.32
Example Use/Output
$ ./bin/rmoutliers <dat/outlierdata.txt
array[1] - 25 values
2.50 5.23 18.00 18.23 18.93 19.23 19.23 19.50 19.73 20.15
21.00 21.50 22.25 22.50 22.50 23.00 23.26 23.50 24.50 24.60
24.60 26.01 26.60 27.60 45.50
median: 22.25
3 outliers removed - 22 values
18.00 18.23 18.93 19.23 19.23 19.50 19.73 20.15 21.00 21.50
22.25 22.50 22.50 23.00 23.26 23.50 24.50 24.60 24.60 26.01
26.60 27.60
array[2] - 20 values
2.32 8.32 9.91 10.16 10.32 10.91 11.12 11.22 11.57 12.32
12.32 12.58 12.91 13.32 13.32 13.32 14.32 14.56 20.37 35.32
median: 12.32
3 outliers removed - 17 values
8.32 9.91 10.16 10.32 10.91 11.12 11.22 11.57 12.32 12.32
12.58 12.91 13.32 13.32 13.32 14.32 14.56
(note: in any code that dynamically allocates memory, you should run the program through a memory error checking program such as valgrind on Linux; other OSes have similar tools. It's simple: just add valgrind to the start of your command, e.g. valgrind ./bin/rmoutliers <dat/outlierdata.txt, and confirm you have freed all the memory you have allocated and that there are no memory errors.)
Look things over and let me know if you have questions.
Memory Use/Error Check
In your comment you seem concerned that what I do above may leak memory. That's not the case. As mentioned above, you can verify the memory use and check for any memory errors with tools such as valgrind, e.g.
$ valgrind ./bin/rmoutliers <dat/outlierdata.txt
==28383== Memcheck, a memory error detector
==28383== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28383== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==28383== Command: ./bin/rmoutliers
==28383==
array[1] - 25 values
2.50 5.23 18.00 18.23 18.93 19.23 19.23 19.50 19.73 20.15
21.00 21.50 22.25 22.50 22.50 23.00 23.26 23.50 24.50 24.60
24.60 26.01 26.60 27.60 45.50
median: 22.25
3 outliers removed - 22 values
18.00 18.23 18.93 19.23 19.23 19.50 19.73 20.15 21.00 21.50
22.25 22.50 22.50 23.00 23.26 23.50 24.50 24.60 24.60 26.01
26.60 27.60
array[2] - 20 values
2.32 8.32 9.91 10.16 10.32 10.91 11.12 11.22 11.57 12.32
12.32 12.58 12.91 13.32 13.32 13.32 14.32 14.56 20.37 35.32
median: 12.32
3 outliers removed - 17 values
8.32 9.91 10.16 10.32 10.91 11.12 11.22 11.57 12.32 12.32
12.58 12.91 13.32 13.32 13.32 14.32 14.56
==28383==
==28383== HEAP SUMMARY:
==28383== in use at exit: 0 bytes in 0 blocks
==28383== total heap usage: 8 allocs, 8 frees, 1,208 bytes allocated
==28383==
==28383== All heap blocks were freed -- no leaks are possible
==28383==
==28383== For counts of detected and suppressed errors, rerun with: -v
==28383== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you note above, there were "8 allocations and 8 frees" associated with the memory used above, e.g.:
==28383== total heap usage: 8 allocs, 8 frees, 1,208 bytes allocated
You can also confirm that all memory was freed and there were no leaks in the next line:
==28383== All heap blocks were freed -- no leaks are possible
And finally you can confirm there were no memory errors associated with the use of memory during the program execution:
==28383== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If there is a part of the code that you are having trouble following where the memory is freed, let me know and I'm happy to help further.
memmove (&array[i], &array[i+1], (n - i - 1) * sizeof *array); n--;
If it is the last element, then array[n-- - 1] = 0
Use realloc to resize an existing allocation. There is no need to have two arrays. Simply condense the array down by shuffling values to remove the outliers, and realloc to resize it (if you want to; there is often less need for this, as you can simply leave the extra space available for future array growth).