
I am working on an enterprise application that crunches a large amount of data each day. To do that it has a Windows Service written in C# .NET 4, which connects to SQL Server 2008 R2, but for some reason it (randomly) throws this error when reading from a synchronization table that stores JSON-serialized data:

Exception of type 'System.OutOfMemoryException' was thrown.
at System.Data.SqlClient.TdsParser.ReadPlpUnicodeChars(Char[]& buff, Int32 offst, Int32 len, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.ReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.ReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.ReadColumnData()
at System.Data.SqlClient.SqlDataReader.ReadColumn(Int32 i, Boolean setTimeout)
at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
at System.Data.SqlClient.SqlDataReader.GetValues(Object[] values)

This table is a fairly general table for keeping LOB data:

CREATE TABLE [dbo].[SyncJobItem](
 [id_job_item] [int] IDENTITY(1,1) NOT NULL,
 [id_job] [int] NOT NULL,
 [id_job_item_type] [int] NOT NULL,
 [id_job_status] [int] NOT NULL,
 [id_c] [int] NULL,
 [id_s] [int] NULL,
 [job_data] [nvarchar](max) NOT NULL,
 [last_update] [datetime] NOT NULL,
CONSTRAINT [PK_SyncJobItem] PRIMARY KEY CLUSTERED)

The LOB record that is failing has 36,231,800 characters of data in the job_data column, which (at 2 bytes per character, since nvarchar stores UTF-16) comes to about 70 MB of data, which is not much.

Please consider that changing how the job data is stored (e.g. moving it to disk) or something similar is not an option for me. I would like to fix this error, so if anyone knows anything, please help!

Also, this error happens randomly on the same data. The system runs on VMware vCloud, which is, I think, some big blade system. We have about 6 GB of RAM dedicated to our VM (the service uses at most about 1-2 GB), the service is compiled as x64, and the OS is x64 Windows 2008 R2 Standard. I have made sure that no single object has more than 2 GB in memory, so that is not it. The error is thrown inside SqlClient, and in my 15 years of development experience I have never seen it, and Google turns up nothing. The error is not on the DB side either, since the DB server has over 32 GB of RAM and uses only 20 GB at peak. The only unusual things about this system are multi-threading and a GC.Collect() call after each job step (there are multiple steps on the data).

EDIT:

Here is the full code that is doing this problem:

    internal static void ExecuteReader(IConnectionProvider conn, IList destination, IObjectFiller objectBuilder, string cmdText, DbParameterCollection parameters, CommandType cmdType, int cmdTimeout)
    {
        IDbCommand cmd = CreateCommand(conn.DBMS, cmdText, parameters, cmdType, cmdTimeout);
        cmd.Connection = conn.Connection;

        bool connIsOpennedLocally = EnsureOpenConnection(conn);
        try
        {
            AssignExistingPendingTransactionToCommand(conn, cmd);
            using (IDataReader reader = cmd.ExecuteReader(CommandBehavior.SingleResult))
            {
                objectBuilder.FillCollection(reader, destination);
                PopulateOutputParameterValues(parameters, cmd);
            }
        }
        finally
        {
            CloseConnectionIfLocal(conn, connIsOpennedLocally);
            cmd.Dispose();
        }
    }

...

    private void FillFromAlignedReader(ICollection<TEntity> collection, IDataReader openedDataReader, IDbTable table)
    {
        // Fastest scenario: the data reader fields match the entity fields completely.
        // It's safe to reuse the same array because GetValues() always overwrites all members, and memory is allocated only once.
        object[] values = new object[openedDataReader.FieldCount];
        while (openedDataReader.Read())
        {
            openedDataReader.GetValues(values);
            TEntity entity = CreateEntity(table, EntityState.Synchronized, values);
            collection.Add(entity);
        }
    }
  • Can you clarify what you mean by "crunches large amount of data"? Are you reading from the table you have shown and doing something with the data, or are you reading from other tables and writing to this table? What are you doing with your data once it has been crunched? Does it stay in memory before a final flush, or is it written and flushed from memory on a row-by-row basis? Commented Aug 27, 2013 at 9:47
  • I am loading the data from SQL Server into a DataTable; that is what breaks. It is a simple select statement. Before and after, I do some calculations with the data. Basically it goes like this: 1) load the data from a NoSQL db (Couchbase), 2) aggregate the data with map-reduce, 3) serialize the aggregate result object to JSON, 4) save it to this table in the db, 5) go to the next step, which loads the JSON back from SQL (here it breaks with OOM). Commented Aug 27, 2013 at 11:41
  • When you say I have made sure that no single object has more than 2GB in memory, does that include the DataTable in step 5? Could you achieve the same thing with a forward only SqlDataReader so that you only have one row at a time in memory? I think it would be useful to see the code block that is throwing the error as well as the stack trace. Commented Aug 27, 2013 at 11:59
  • Updated. It is using SqlDataReader for reading data. I am not sure if it is a forward-only SqlDataReader. Commented Aug 27, 2013 at 13:22
  • I read that :). I found this, which seems similar to my problem: stackoverflow.com/questions/15124034/… Commented Aug 27, 2013 at 14:12

2 Answers


For those who experience this problem: after a lot of testing and reading MSDN (link), I have come to the conclusion that the maximum single field size that SqlDataReader can read in normal reading mode is around 70 MB on an x64 machine. Beyond this you need to switch the SqlCommand to CommandBehavior.SequentialAccess and stream the field contents.

Example code that works this way:

    ...
    behaviour = CommandBehavior.SequentialAccess;
    using (IDataReader reader = cmd.ExecuteReader(behaviour))
    {
       filler.FillData(reader, destination);
    }

When you read the data in a loop, you need to fetch the columns in order, and when you reach the BLOB column you should call something like the following helpers (depending on the data type); a sketch of a full reading loop follows the helpers below:

    ...
    private string GetBlobDataString(IDataReader openedDataReader, int columnIndex)
    {
        StringBuilder data = new StringBuilder(20000);
        char[] buffer = new char[1000];
        long startIndex = 0;

        // Read the LOB column in 1000-character chunks until GetChars returns
        // fewer characters than requested (end of the field).
        long dataReceivedCount = openedDataReader.GetChars(columnIndex, startIndex, buffer, 0, 1000);
        data.Append(buffer, 0, (int)dataReceivedCount);
        while (dataReceivedCount == 1000)
        {
            startIndex += 1000;
            dataReceivedCount = openedDataReader.GetChars(columnIndex, startIndex, buffer, 0, 1000);
            data.Append(buffer, 0, (int)dataReceivedCount);
        }

        return data.ToString();
    }

    private byte[] GetBlobDataBinary(IDataReader openedDataReader, int columnIndex)
    {
        MemoryStream data = new MemoryStream(20000);
        BinaryWriter dataWriter = new BinaryWriter(data);

        byte[] buffer = new byte[1000];
        long startIndex = 0;

        // Same chunked pattern for binary columns, using GetBytes instead of GetChars.
        long dataReceivedCount = openedDataReader.GetBytes(columnIndex, startIndex, buffer, 0, 1000);
        dataWriter.Write(buffer, 0, (int)dataReceivedCount);
        while (dataReceivedCount == 1000)
        {
            startIndex += 1000;
            dataReceivedCount = openedDataReader.GetBytes(columnIndex, startIndex, buffer, 0, 1000);
            dataWriter.Write(buffer, 0, (int)dataReceivedCount);
        }

        data.Position = 0;
        return data.ToArray();
    }
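
To show how these helpers fit together, here is a minimal sketch of what the reading loop could look like for the SyncJobItem table from the question. The SyncJobItem entity class, its property names, and the column ordinals are assumptions for illustration; this is not the actual FillData implementation:

    // Hypothetical reading loop for the SyncJobItem table from the question.
    // With CommandBehavior.SequentialAccess the columns must be accessed in
    // increasing ordinal order, so the scalar columns are read before the
    // job_data LOB column. SyncJobItem here is an assumed entity class.
    private void FillDataSequential(IDataReader openedDataReader, ICollection<SyncJobItem> destination)
    {
        while (openedDataReader.Read())
        {
            SyncJobItem item = new SyncJobItem();
            item.IdJobItem = openedDataReader.GetInt32(0);       // id_job_item
            item.IdJob = openedDataReader.GetInt32(1);           // id_job
            item.IdJobItemType = openedDataReader.GetInt32(2);   // id_job_item_type
            item.IdJobStatus = openedDataReader.GetInt32(3);     // id_job_status
            item.IdC = openedDataReader.IsDBNull(4) ? (int?)null : openedDataReader.GetInt32(4); // id_c
            item.IdS = openedDataReader.IsDBNull(5) ? (int?)null : openedDataReader.GetInt32(5); // id_s
            item.JobData = GetBlobDataString(openedDataReader, 6); // job_data, streamed in chunks
            item.LastUpdate = openedDataReader.GetDateTime(7);   // last_update
            destination.Add(item);
        }
    }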

This should work for data up to around 1 GB-1.5 GB. Beyond that it will break because a single object cannot reserve a contiguous memory block of sufficient size, so at that point either flush from the buffer directly to disk or split the data into multiple smaller objects.
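
If writing to disk is an option (the question says it is not, but it may be for other readers), the same chunked GetChars pattern can stream the field straight to a file instead of building one big string. A minimal sketch, assuming a hypothetical WriteBlobDataToFile helper and filePath parameter (requires using System.IO and System.Text):

    // Hypothetical variant: stream the LOB column directly to a file so the full
    // value is never held in memory as a single object.
    private void WriteBlobDataToFile(IDataReader openedDataReader, int columnIndex, string filePath)
    {
        char[] buffer = new char[1000];
        long startIndex = 0;

        using (StreamWriter writer = new StreamWriter(filePath, false, Encoding.UTF8))
        {
            long dataReceivedCount = openedDataReader.GetChars(columnIndex, startIndex, buffer, 0, 1000);
            writer.Write(buffer, 0, (int)dataReceivedCount);
            while (dataReceivedCount == 1000)
            {
                startIndex += 1000;
                dataReceivedCount = openedDataReader.GetChars(columnIndex, startIndex, buffer, 0, 1000);
                writer.Write(buffer, 0, (int)dataReceivedCount);
            }
        }
    }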


I think for these big amounts of data you should use the db type Text. Only use nvarchar if you need to do searches/LIKE queries on it. Note that this could give strange behaviour when full-text search is enabled.


  • Microsoft advise that the Text, NText, and Image data types will be removed from a future version of SQL Server and that VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) should be used in their place. I doubt this is the solution!
