0

I am fairly new to parallel programming; I am developing a program that deals with numbers in different bases and I want to parallelize the addition method.

Firstly, it computes the linear convolution and does the carrying later, thus every iteration of the for-loop is independent (Example):

(Little Endianness btw) [2,7,1] + [8,7,5] = [10,14,6] => [0,5,7]

The question is, can I, if there are threads available, make the addition process faster by having the iterations done at the same time in different threads, and how?

2
  • 2
    Don't confuse "asynchronous" with "parallel". Asynchronous is freeing up a thread while you wait for some external task to complete (network request, etc.). It's about how your code waits. Parallel means running two parts of your code at the same time on multiple threads. It's about how your code runs. You want parallel programming. Commented Jan 24, 2022 at 19:29
  • Don't confuse "parallel" (which is what multiprocessing hardware offers) with "concurrent" (which is what threads offer - in that different logical executors can maintain independent state permitting them to execute at the same time correctly.) Commented Jan 24, 2022 at 19:46

2 Answers 2

1

In case of huge arrays (threading has its own overhead), you can try Parallel, e.g. Parallel.For:

  int[] left  = ...
  int[] right = ...
  int[] result = new int[left.Length];

  ...

  Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);

Let's have a look on the effect:

  int N = 100_000_000;

  int[] left   = new int[N];
  int[] right  = new int[N];
  int[] result = new int[left.Length];

  // To prevent garbage collection while testing
  GC.Collect(2);

  Stopwatch sw = new Stopwatch();

  sw.Start();

  // Parallel version
  //Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);

  // Standard for loop version
  for (int i = left.Length - 1; i >= 0; --i)
    result[i] = left[i] + right[i];

  sw.Stop();

  Console.Write(sw.ElapsedMilliseconds);

Outcome (.Net 6 IA-64, Realease build, Core i9, 6 cores)

  200 - parallel version
  500 - for loop version
Sign up to request clarification or add additional context in comments.

Comments

1

When you have granular work to do, like adding two integers, and you use the Parallel.For method, you'll find that the synchronization overhead, as well as the overhead of invoking a non-inlinable lambda for each index, negates any performance gained by the paralellization. In this case it's a good idea to chunkify the workload by operating on ranges of indices, instead of one index at a time. Here is how you can use the Parallel.ForEach+Partitioner.Create methods for this purpose:

int[] left = new int[1_000_000];
int[] right = new int[1_000_000];
int[] sum = new int[1_000_000];

ParallelOptions parallelOptions = new()
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Partitioner<Tuple<int, int>> partitioner = Partitioner.Create(0, left.Length);
    
Parallel.ForEach(partitioner, parallelOptions, (Tuple<int, int> range) =>
{
    for (int i = range.Item1; i < range.Item2; i++)
    {
        sum[i] = left[i] + right[i];
    }
});

The Partitioner.Create creates by default about three times as many ranges as the Environment.ProcessorCount value, so on a quad core machine it will create about 12 ranges in total. This is a good compromise between too many ranges (overhead) and too few ranges (unbalanced workload). If you want you can finetune the number of the partitions, by configuring the third optional rangeSize argument.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.