I have a list of about 7030 items that I save to a table in SQL Server. I thought I could use multithreading to speed up the process, which it did, but an issue emerged.

The number of items being uploaded does not change, but when I query the number of records in the table after running my code, it's different every time: one run will have uploaded 6925, the next 6831, and so on. I cannot see why this is happening.

In the class where I get the data:

 void DatabaseUploadMultiThreading()
    {
        DateTime dtUpload = Program.UploadDate();

        int numThread = 8;          
        int splitNum = _holdingList.Count / numThread;
        int leftOver = _holdingList.Count - (splitNum * (numThread - 1));

        DatabaseWriter[] dbArray = new DatabaseWriter[numThread];
        List<Holding>[] holdingArray = new List<Holding>[numThread];
        Task[] taskDB = new Task[numThread];

        for (int i = 0; i < holdingArray.Length; i++)
        {
            dbArray[i] = new DatabaseWriter(i + 1, dtUpload);

            if (i == (numThread - 1))
                holdingArray[i] = _holdingList.GetRange(i * splitNum, leftOver);
            else
                holdingArray[i] = _holdingList.GetRange(i * splitNum, splitNum);
        }

        for (int i = 0; i < taskDB.Length; i++)
            taskDB[i] = Task.Factory.StartNew(dbArray[i].UploadHoldings, holdingArray[i]);

        try
        {
            Task.WaitAll(taskDB);                   // wait for all the threads to complete
        }
        catch (AggregateException ex)
        {
            ExceptionDispatchInfo.Capture(ex.InnerException).Throw();
        }

    }

The DatabaseWriter class snippet:

 class DatabaseWriter : IDisposable
{
    #region variable declaration
    private SqlConnection _connection;
    private SqlCommand _command;
    private static readonly string _connectionString = "myConnectionString";

    public void UploadHoldings(object objHoldingList)
    {
        List<Holding> holdingList = (List<Holding>)objHoldingList;

        using (_connection = new SqlConnection(_connectionString))
        {
            _connection.Open();

            DataReImported(_dtUpload);

            for (int i = 0; i < holdingList.Count; i++)
            {
                string cmdText = "INSERT INTO HOLDINGS([FUND_CD], [SEDOLCHK], [NOMINAL], [CURR], [PRICE], [DATEU]) " +
                                    "VALUES(@fundcode, @sedol, @nominal, @curr, @price, @dtUpload)";

                _command = new SqlCommand(cmdText, _connection);
                _command.Parameters.Add("@fundCode", SqlDbType.VarChar).Value = holdingList[i].FundCode;
                _command.Parameters.Add("@sedol", SqlDbType.VarChar).Value = holdingList[i].IdSedol;
                _command.Parameters.Add("@nominal", SqlDbType.Decimal).Value = holdingList[i].Nominal;
                _command.Parameters.Add("@curr", SqlDbType.VarChar).Value = holdingList[i].Currency;
                _command.Parameters.Add("@price", SqlDbType.Decimal).Value = holdingList[i].Price;
                _command.Parameters.Add("@dtUpload", SqlDbType.Date).Value = _dtUpload;
                _command.ExecuteNonQuery();

                Console.WriteLine("Thread Number:" + _threadNum + " Security Number uploaded: " + i + " of " + holdingList.Count);
            }
            _connection.Close();
        }
    }



}
  • 1
    Have you considered doing a Bulk Insert? Just write all the data to a temp file that the server can reach and import. Commented Apr 3, 2014 at 14:16
  • 1
    Since you're trying to optimize, have you considered constructing SqlCommand and its params once, and then simply updating values on each execution, rather than rebuilding entire SqlCommand hierarchy for each record? Commented Apr 3, 2014 at 14:19
  • 5
    If you use SQL Server 2008 or later, you can use table-valued parameters. It will be much faster. Commented Apr 3, 2014 at 14:23
  • 1
    Also, have you considered inserting more than one record per execution? E.g. INSERT INTO tbl(fn1, fn2) VALUES (f11, f12), (f21, f22), .... See Table Value Constructor MSDN article. Commented Apr 3, 2014 at 14:23
  • 1
    Well, if the number of imported rows varies, then some shared resource is being accessed without synchronization, which creates a race condition when writing to the database. My advice is to start smaller. Make UploadHoldings static and pass everything it needs as input parameters, to avoid mixing non-thread-safe variables and object references across threads. Then I'd use Parallel.ForEach instead of Task, and likewise make sure the parallel body does not touch class member variables. From there you can expand. Commented Apr 14, 2014 at 6:02
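
Several of the comments above suggest loading the rows in bulk instead of issuing one INSERT per record. As a rough sketch of that idea, the code below uses `SqlBulkCopy`, which is a different mechanism from the file-based BULK INSERT and the table-valued parameters mentioned in the comments. The `Holding` class and the `HoldingBulkLoader` name are hypothetical stand-ins, and the column types are guesses based on the parameters in the question, so adjust them to the real table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical stand-in for the question's Holding class.
class Holding
{
    public string FundCode { get; set; }
    public string IdSedol { get; set; }
    public decimal Nominal { get; set; }
    public string Currency { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical helper, not part of the original code.
static class HoldingBulkLoader
{
    public static void Upload(string connectionString, IEnumerable<Holding> holdings, DateTime dtUpload)
    {
        // Stage the rows in memory; column names match the HOLDINGS table
        // from the question, column types are assumptions.
        var table = new DataTable();
        table.Columns.Add("FUND_CD", typeof(string));
        table.Columns.Add("SEDOLCHK", typeof(string));
        table.Columns.Add("NOMINAL", typeof(decimal));
        table.Columns.Add("CURR", typeof(string));
        table.Columns.Add("PRICE", typeof(decimal));
        table.Columns.Add("DATEU", typeof(DateTime));

        foreach (Holding h in holdings)
            table.Rows.Add(h.FundCode, h.IdSedol, h.Nominal, h.Currency, h.Price, dtUpload.Date);

        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = "HOLDINGS";

            // Map by name so the copy does not depend on column order in the table.
            foreach (DataColumn column in table.Columns)
                bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);

            // Sends all staged rows to the server in one bulk operation.
            bulkCopy.WriteToServer(table);
        }
    }
}
```

With only a few thousand rows, a single bulk copy on one connection is likely to be fast enough that splitting the list across threads is unnecessary, which also sidesteps any shared-state problem.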

1 Answer

I would suggest that this is not an ideal place to use multiple Tasks. The code as shown is somewhat inefficient, and splitting it into Task objects that run in parallel often just makes them get in one another's way, which slows everything down or causes one of them to fail to execute. It's like the Three Stooges all trying to rush through a door at the same time.

Do the basic, obvious optimizations that the commenters suggested above, i.e. create your SqlCommand only once and do the absolute minimum inside the loop (or try a bulk-loading method). Also check the value returned by ExecuteNonQuery to verify the number of records written.
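
A minimal single-threaded sketch of those two suggestions might look like the following. It builds the `SqlCommand` and its parameters once, changes only the values inside the loop, and sums the return values of `ExecuteNonQuery`. The `Holding` class and the `HoldingUploader` name are hypothetical stand-ins for the question's types, and wrapping the loop in one transaction is my own addition, not something the question does.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical stand-in for the question's Holding class.
class Holding
{
    public string FundCode { get; set; }
    public string IdSedol { get; set; }
    public decimal Nominal { get; set; }
    public string Currency { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical helper, not part of the original code.
static class HoldingUploader
{
    const string InsertSql =
        "INSERT INTO HOLDINGS([FUND_CD], [SEDOLCHK], [NOMINAL], [CURR], [PRICE], [DATEU]) " +
        "VALUES(@fundCode, @sedol, @nominal, @curr, @price, @dtUpload)";

    // Returns the sum of the row counts reported by ExecuteNonQuery.
    public static int Upload(string connectionString, IList<Holding> holdings, DateTime dtUpload)
    {
        int rowsWritten = 0;

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Assumption: one transaction for the whole batch, so a failure
            // part-way through leaves no partial upload behind.
            using (SqlTransaction transaction = connection.BeginTransaction())
            using (var command = new SqlCommand(InsertSql, connection, transaction))
            {
                // Create the parameters once; only their values change per row.
                SqlParameter fundCode = command.Parameters.Add("@fundCode", SqlDbType.VarChar);
                SqlParameter sedol = command.Parameters.Add("@sedol", SqlDbType.VarChar);
                SqlParameter nominal = command.Parameters.Add("@nominal", SqlDbType.Decimal);
                SqlParameter curr = command.Parameters.Add("@curr", SqlDbType.VarChar);
                SqlParameter price = command.Parameters.Add("@price", SqlDbType.Decimal);
                command.Parameters.Add("@dtUpload", SqlDbType.Date).Value = dtUpload;

                foreach (Holding h in holdings)
                {
                    // A null .NET value must be sent as DBNull, otherwise the
                    // parameter is treated as not supplied.
                    fundCode.Value = (object)h.FundCode ?? DBNull.Value;
                    sedol.Value = (object)h.IdSedol ?? DBNull.Value;
                    nominal.Value = h.Nominal;
                    curr.Value = (object)h.Currency ?? DBNull.Value;
                    price.Value = h.Price;

                    rowsWritten += command.ExecuteNonQuery();
                }

                transaction.Commit();
            }
        }

        return rowsWritten;
    }
}
```

Comparing the returned total with `holdings.Count` gives a quick check that every row was written. The comparison can be misleading if `SET NOCOUNT ON` is in effect (where `ExecuteNonQuery` returns -1) or if triggers on the table add to the reported count.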
