I am trying to capture spreadsheet data into a 2D array. I am using VSTO.

int rc = 1048576;
int cc = 1638;

string[,] arr = new string[rc, cc];

The last line throws an OutOfMemoryException. I would like to show a message telling the user that only 'X' elements can be captured.

I checked MSDN: a DataTable has a documented row count limit of 16,777,216 and no stated column count limit. I can't find a documented limit for a 2D array either.

My question is not WHY the exception is thrown. If you are doing VSTO development and need to capture a worksheet in a DataTable to perform in-memory joins etc., you need to do something like this:

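Microsoft.Office.Interop.Excel.Range selection = Application.Selection as Microsoft.Office.Interop.Excel.Range;
// Range.Value returns object[,] for a multi-cell range, so the array must be object[,], not string[,]
object[,] arr = selection.Value as object[,];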

and then copy the data from that array to a DataTable. What would be a sensible limit on the number of elements a user can select? I want to set row count/column count limits and display a message when the selection exceeds them.

  • My downvote senses indicate that you shouldn't really expect anything different when you try to allocate a block of memory the size of the entire address space of the machine. Commented Oct 10, 2012 at 16:09
  • The OP shouldn't be surprised at the results, but I'm not sure I agree with all the downvotes. It's a fair question, and a simple explanation would suffice. Commented Oct 10, 2012 at 16:12
  • Perhaps. But this question seems to indicate poor research effort. The exception indicated exactly what was wrong, and it's fairly obvious from a mathematical standpoint that if you're going to be allocating 2 billion of something, it had better be darned small. Commented Oct 10, 2012 at 16:19
  • buffer_overflow, I would recommend creating a new question, as the original version turned out to be semi-useful by itself and your edit completely changed what the question is about (even if it is the same question from your point of view). And give the new one a good title, like "Dealing with large selection ranges in Excel interop". Commented Oct 10, 2012 at 16:36
  • Thanks. I've put the new question under stackoverflow.com/questions/12843173/… Commented Oct 11, 2012 at 15:30

2 Answers

Let's do the math. You are trying to allocate a 2D string array with 1048576 * 1638 = 1717567488 elements. A string reference on x64 is 8 bytes, so that is a total of 1717567488 * 8 = 13740539904 bytes, or about 13 GB of contiguous memory. The maximum size of a single allocation in the CLR is 2 GB by default, so you get an OutOfMemoryException because such a block can't be allocated.

Note also that this many strings, even if each is only 1-2 characters long, would take about 30 GB for the string values themselves, in addition to the references. What else than an OutOfMemoryException did you expect to get?
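The arithmetic above can be reproduced with a few lines of C#. This is only a sketch: the class and method names (ArraySizeEstimate, ElementCount, ReferenceBytes) are made up for illustration, and the 8 bytes per reference assumes a 64-bit process (it would be 4 in a 32-bit one). The product is computed as a long because rows * columns can easily overflow an int for other worksheet sizes.

```csharp
using System;

static class ArraySizeEstimate
{
    // Number of elements in a rows x columns array, computed in 64-bit
    // arithmetic so the product cannot overflow an int.
    public static long ElementCount(int rows, int columns)
    {
        return (long)rows * columns;
    }

    // Bytes needed for the references alone; the caller supplies the
    // reference size (assumed 8 for x64, 4 for x86).
    public static long ReferenceBytes(int rows, int columns, int bytesPerReference)
    {
        return ElementCount(rows, columns) * bytesPerReference;
    }
}
```

Calling ArraySizeEstimate.ReferenceBytes(1048576, 1638, 8) gives the 13740539904 bytes quoted above, which can then be compared against whatever budget you decide on before attempting the allocation.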

10 Comments

Are strings stored by reference? If they are, this would only allocate about 6GB (still too big, but smaller than 28GB)
I agree with Wug: string is a reference type and creating an array of strings initializes each item in the array with null rather than with string.Empty.
An empty string is an 8-byte object header (4-byte SyncBlock and a 4-byte type descriptor) + An int32 field for the length of the string (this is returned by String.Length) + An int32 field for the number of chars in the character buffer + The first character of the string in a System.Char = 18 bytes. So even if the string was null (which is the case here) you would still need the same amount.
@DanielHilgarth: Even if it did initialize them all to string.Empty, it would probably create a 6GB array of references to the same object.
@DarinDimitrov: Still: With new string[rc, cc]; you don't create rc * cc string instances. Each item in the array will be initialized with null. Even if each item would be string.Empty it wouldn't change anything because of string interning.

Aleks, you are asking for a limit on your memory allocation. Let me suggest changing your point of view and limiting your input instead, either

  • 1.1 by using only the user-selected cells
  • 1.2 by using only the cells actually used within your xl.worksheet

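In addition to your question, mind the difference between Range.Formula, Range.Value and Range.Value2: Range.Formula contains the user input, e.g. a formula like =SUM(A1:B10), while Range.Value and Range.Value2 both contain the result of that formula, e.g. 10. Value2 differs from Value only in that it does not use the Currency and Date data types; such cells come back as plain doubles.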

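1.1 Using only the user-selected cells
Use Range userSelectedRange = Application.Selection as Range; (Application.Selection returns an object, and it is not a Range when, for example, a chart is selected, so check the result for null).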

1.2 Using only the cells actually used within your xl.worksheet
Use the Worksheet.Cells.SpecialCells method.
Maybe Worksheet.UsedRange is even better.

A code example:

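// Find the last used cell once and read both indices from it
Xl.Range lastCell = xlWorksheet.Cells.SpecialCells(XlCellType.xlCellTypeLastCell);
int lastUsedRowIndex = lastCell.Row;
int lastUsedColIndex = lastCell.Column;
// Excel rows are 1-based, so the range starts at A1 (there is no row 0)
string rangeString = string.Format("A1:{0}{1}", SomeIndexToExcelColConverterFunc(lastUsedColIndex), lastUsedRowIndex);
Xl.Range range = xlWorksheet.Range[rangeString];
// Value2 is an object[,] for a multi-cell range; for a single cell it is a scalar and the cast yields null
object[,] objectMatrix = range.Value2 as object[,];
Marshal.ReleaseComObject(lastCell);
Marshal.ReleaseComObject(range);
// convert to string here if needed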
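Whichever range you end up reading, the question's goal of warning the user before the read can be sketched as a plain size check. Everything here is an assumption for illustration: the names SelectionGuard, FitsBudget and DescribeLimit are hypothetical, and the cap of 1,000,000 cells is an arbitrary placeholder rather than a recommended limit; a realistic value depends on process bitness, available memory and cell contents. In a VSTO add-in the two counts would presumably come from selection.Rows.Count and selection.Columns.Count.

```csharp
using System;

static class SelectionGuard
{
    // Hypothetical cap on how many cells to copy in one go (placeholder value).
    public const long MaxCells = 1000000;

    // True when a rowCount x columnCount selection stays within maxCells.
    // The counts are longs so the product cannot overflow for any Excel range.
    public static bool FitsBudget(long rowCount, long columnCount, long maxCells)
    {
        if (rowCount <= 0 || columnCount <= 0)
        {
            return false; // nothing selected, or invalid counts
        }
        return rowCount * columnCount <= maxCells;
    }

    // Message text for the case where the selection is too large.
    public static string DescribeLimit(long rowCount, long columnCount, long maxCells)
    {
        return string.Format(
            "The selection has {0} cells, but only {1} can be captured.",
            rowCount * columnCount, maxCells);
    }
}
```

The idea is to call FitsBudget before touching Value or Value2 and, when it returns false, show DescribeLimit(...) in a message box instead of reading the range.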

1 Comment

I downvoted this because even though you cast your range.Value2 to an object[,], by default the array is empty. Populating the array will still cause a System.OutOfMemoryException if, like in my case, you end up with 154 columns in one Workbook
