5

I have a class of range

public class avl_range
{
    public long start { get; set; }
    public long end { get; set; }
}

If I use a normal FOR works perfect, but have to wait for each command to finish and each query take 8 seconds, so 10 queries take 80 seconds.

In the Parallel version If I only print the ranges works perfect, but if try to execute the command say is already in progress.

{"An operation is already in progress."}

How can I solve this?

var numbers = new List<avl_range>();
using (var conn = new NpgsqlConnection(strConnection))
    {
        conn.Open();

        Action<avl_range> forEachLoop = number => //Begin definition of forLoop
        {
             // only the console write line works ok
            Console.WriteLine(number.start + " - " + number.end);

            using (var cmd = new NpgsqlCommand())
            {
                cmd.Connection = conn;                            
                cmd.CommandText = String.Format( "SELECT * FROM avl_db.process_near_link({0}, {1});"
                                                 , number.start
                                                 , number.end);
                // here cause the error.
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        Console.WriteLine(reader.GetString(0));
                    }
                }
            }
        };

        Parallel.ForEach(numbers, forEachLoop);
    }
 );

FYI: Im trying to solve this issue I post it before

6
  • In Microsoft's SQL server, this would cause an error, because you can't have multiple open record sets on the same connection (without enabling MARS). I don't know if your SQL provider is implemented the same way, but my guess is that it is likely. Commented Jan 27, 2017 at 20:01
  • How about creating a new NpgsqlConnection inside the parallel foreach? Commented Jan 27, 2017 at 20:02
  • @BradleyUffner Can you provide some documentation? I can try, but doesnt seem logic to me create a different connection for each query. Commented Jan 27, 2017 at 20:03
  • That wold work in Microsoft's SQL implementation. I can't say for sure with postressql, because I've never used it, but it's worth a shot. As long as it pools connections, it shouldn't have too much a a performance penalty. Commented Jan 27, 2017 at 20:04
  • This stackoverflow.com/questions/8803733/… seems to indicate that prostgresql has no MARS equivalent functionality. Commented Jan 27, 2017 at 20:05

2 Answers 2

4

An Npgsql connection can't be used concurrently - only one command may be running at any given point in time (in other words, no MARS support).

It may definitely make sense to open multiple connections to perform your queries in parallel. Although establishing a new physical connection is expensive, connection pooling is extremely lightweight, so there's very little overhead in reusing physical connections. The main reason not to do this is if you need your multiple operations to be in the same transaction.

Sign up to request clarification or add additional context in comments.

8 Comments

Are you sure that Npgsql implments connection pooling? It is up to the ADO.NET provider if they want to support it or not. I honestly don't know the answer if they do or not.
Yep, I'm the guy who wrote it :)
Oh, well ok then :)
Hi Shay, Im dropyghost in the Npgsql forum :) . I just move the connection creation inside the forEachLoop, and now is working. Is that Ok or I should do something else, because you mention reuse connections?
@JuanCarlosOropeza if all your code does is write to the console that might be the part slowing you down. The Console is the slowest peice. Try it again with a normal foreach loop without parallel but store the strings in a list instead of writing them to the console and comment out the Console.WriteLine(number.start + " - " + number.end); and then let us know what time you get.
|
4

Even if you could get it to work with MARS, connection objects are almost never thread safe anyway, you need to have a connection per thread. Parallel.ForEach has overloads to make this easy which have functions that run at the start of a thread and at the end.

var numbers = new List<avl_range>();

Func<NpgsqlConnection> localInit => () => 
{
    var conn = new NpgsqlConnection(strConnection);
    conn.Open();
};

Action<NpgsqlConnection> localFinally = (conn) => conn.Dispose();

Func<avl_range, ParallelLoopState, NpgsqlConnection, NpgsqlConnection> forEachLoop = (number, loopState, conn) => //Begin definition of forLoop
{
     // only the console write line works ok
    Console.WriteLine(number.start + " - " + number.end);

    using (var cmd = new NpgsqlCommand())
    {
        cmd.Connection = conn;                            
        cmd.CommandText = String.Format( "SELECT * FROM avl_db.process_near_link({0}, {1});"
                                         , number.start
                                         , number.end);
        // here cause the error.
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                Console.WriteLine(reader.GetString(0));
            }
        }
    }
    return conn;
};

Parallel.ForEach(numbers, localInit, forEachLoop, localFinally);

That being said, most of the time doing concurrent connections to a database is not the right idea, the bottleneck is likely elsewhere and you should use a profiler to see what is really slowing your program down and focus your efforts there.


Sample code for comments:

var numbers = GetDataForNumbers();
List<string> results = new List<string>();

Func<List<string>> localInit => () => new List<string>();

Func<avl_range, ParallelLoopState, List<string>, List<string>> forEachLoop = (number, loopState, localList) => //Begin definition of forLoop
{
    using (var conn = new NpgsqlConnection(strConnection))
    {
        conn.Open();

        //This line is going to slow your program down a lot, so i commented it out.
        //Console.WriteLine(number.start + " - " + number.end);

        using (var cmd = new NpgsqlCommand())
        {
            cmd.Connection = conn;                            
            cmd.CommandText = String.Format( "SELECT * FROM avl_db.process_near_link({0}, {1});"
                                             , number.start
                                             , number.end);
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    //Add a object to the thread local list, we don't need to lock here because we are the only thread with access to it.
                    localList.Add(reader.GetString(0));
                }
            }
        }
    }
    return localList;
};

Action<List<String>> localFinally = localList => 
{
    //Combine the local list to the main results, we need to lock here as more than one thread could be merging at once.
    lock(results)
    {
        results.AddRange(localList);
    }
};

Parallel.ForEach(numbers, localInit, forEachLoop, localFinally);

//results now contains strings from all the threads here.

9 Comments

Thanks Scott, is there any difference between this setup and move the create connection inside?
If NpgsqlConnection did not support connection pooling this way would be faster because it would create less total connections (a single thread may be reused for many iterations of the body). However because Shay confirmed it does connection pooling I would rather put it inside the foreach body because it makes the code a lot more simple and readable and will have the exact same performance. All this code is doing is creating a new "pool" like effect by having a connection per thread, because the provider has its own pool we don't need to make our own.
Ok then I will keep the using .... but glad to know about localInit, localFinally that maybe usefull in other step later.
A common use case is you create a object like a var combinedList = new List<T> outside of the foreach then in the localinit you also do a return new List<T> and write to that local list in the loop body. In your localFinally you do localList => { lock(combinedList) { combinedList.AddRange(localList); } } to merge the indivdual thread lists in to the master list in a thread safe way then the only place you have a lock is in the localFinally and have no locks in the loop body.
Yes, that is exactly what I would need. Can you refer me to a working sample code?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.