0

I'm trying to generate N random floats between 0 and 1, where N is specified by the user. Then I need to find the mean and the variance of the generated numbers. I'm struggling with computing the variance.

I first tried using plain variables, but have since changed my code to use an array.

#include <cstdlib>
#include <ctime>
#include <cmath>
using namespace std;


int main(){
  int N, i;
  float random_numbers[i], sum, mean, variance, r;
  cout << "Enter an N value" << endl;
  cin >> N;
  sum = 0;
  variance = 0;

  for (i = 0; i < N; i++) {
    srand(i + 1);
    random_numbers[i] = ((float) rand() / float(RAND_MAX));
    sum += random_numbers[i];
    cout << random_numbers[i] << endl;
    mean= sum / N;
    variance += pow(random_numbers[i]-mean,2);

    }
  variance = variance / N;
  cout << " The sum of random numbers is " << sum << endl;
  cout << " The mean is " << mean << endl;
  cout << " The variance is " << variance << endl;

} 

The mean and sum are currently correct; however, the variance is not.


  • Since C++11, there is a <random> header which provides an easier interface for generating random numbers, along with distributions which allow you to control the std-dev/variance of those random numbers. Commented Oct 17, 2019 at 13:55
  • Your array is invalid. Use std::vector for dynamic "arrays". This is also not really random. And there are a few more problems. Commented Oct 17, 2019 at 13:55
  • srand should be called only once. Commented Oct 17, 2019 at 13:57
  • 1. You declare an array of length i, but i is uninitialized at that point. 2. There is no need to call srand() multiple times; once is enough (perhaps before the for loop starts?). Commented Oct 17, 2019 at 13:57

2 Answers

1

The mean you calculate inside the loop is a "running mean", i.e. for each new incoming number you calculate the mean up to that point. For the variance, however, your formula is incorrect. This:

variance += pow(random_numbers[i]-mean,2);

would be correct if mean was the final value, but as it is the running mean the result for variance is incorrect. You basically have two options. Either you use the correct formula (search for "variance single pass algorithm" or "running variance") or you first calculate the mean and then set up a second loop to calculate the variance (for this case your formula is correct).
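As a rough sketch of the two-loop option, the functions below first compute the final mean and only then accumulate the squared deviations. The function names (`generate_uniform`, `mean_of`, `variance_two_pass`) and the `seed` parameter are made up for illustration, and the use of `std::vector` and `<random>` follows the suggestions in the comments rather than the original code.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw n values uniformly from [0, 1) using the C++11 <random> facilities.
// The seed parameter is a hypothetical convenience for reproducible runs.
std::vector<double> generate_uniform(std::size_t n, unsigned seed) {
    std::mt19937 engine(seed);  // the engine is seeded once, not once per draw
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> values(n);
    for (double& v : values) {
        v = dist(engine);
    }
    return values;
}

// First pass: mean of all values (returns 0 for an empty input).
double mean_of(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / values.size();
}

// Second pass: population variance (divides by N, as in the question),
// computed against the final mean rather than a running mean.
double variance_two_pass(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    const double mean = mean_of(values);
    double sum_sq = 0.0;
    for (double v : values) {
        sum_sq += (v - mean) * (v - mean);
    }
    return sum_sq / values.size();
}
```

As a sanity check, for large N the result of `variance_two_pass(generate_uniform(N, seed))` should come out close to 1/12 ≈ 0.0833, which is the variance of a uniform distribution on [0, 1).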

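Note that single-pass algorithms for variance are generally not as numerically stable as using two loops (the naive version that accumulates the sum and the sum of squares suffers from cancellation in particular), so if you can afford it memory- and performance-wise, you should prefer the algorithm using two passes.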
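If a single pass is needed anyway, one commonly cited "running variance" method is Welford's update, sketched below. The `RunningStats` struct and its member names are invented for this example, and it returns the population variance (dividing by N) to match the question's code.

```cpp
#include <cstddef>

// Running mean/variance accumulator based on Welford's update.
// Struct and member names are illustrative, not from the original post.
struct RunningStats {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    // Fold one new value into the running statistics.
    void add(double x) {
        ++count;
        const double delta = x - mean;  // deviation from the old mean
        mean += delta / count;          // update the running mean
        m2 += delta * (x - mean);       // combines old-mean and new-mean deviations
    }

    // Population variance (divide by N); returns 0 if nothing was added.
    double variance() const {
        return count > 0 ? m2 / count : 0.0;
    }
};
```

Calling `add` once per generated number inside the existing loop and reading `mean` and `variance()` after the loop avoids storing the numbers at all.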

PS: there are other issues with your code, but I concentrated on your main question.


0

The mean that you use inside the variance computation is only the mean of the first i elements. You should compute the mean of the whole sample first, then do another loop to compute the variance.

Enjoy
