2

I have an Excel file with around 2,200 rows that I need to read and write to a .txt file. The problem is that it takes far too long. I've been told that reading and writing files is inherently slow, so I tried reading the Excel file only once, using a StringBuilder, and writing line by line (I haven't tried storing all the text and writing it to the .txt file in one go).

Is there any way I can speed this up?

Selecting smaller ranges, such as only one row? Building a gigantic string with \n as line breaks and then writing it all to the .txt file?

Here's a sample of my code:

using Excel = Microsoft.Office.Interop.Excel;
[...]
xlApp = new Excel.Application();
xlWorkBook = xlApp.Workbooks.Open("C:/Users/MyUser/Desktop/SomeFolder/my_excel.xlsx", 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
xlWorkSheet = (Excel.Worksheet)xlWorkBook.Worksheets.get_Item(1);
Excel.Range allRange = xlWorkSheet.UsedRange;
try
{
    System.IO.StreamWriter file = new System.IO.StreamWriter("C:\\test.txt");
    String line = "";
    //StringBuilder line;
    for (int row = 1; row <= allRange.Rows.Count; row++) //These are up to thousand sometimes
    {
        if (allRange.Value2[row, 1] != "")
        {
            //line = new StringBuilder();
            for (int column = 1; column <= 6; column++)
            {
                //Console.WriteLine(allRange.Value2[row, column]);
                line += allRange.Value2[row, column];
                if (column != 6)
                {
                    line += "|";
                    //line.Append("|");
                }
            }
            file.WriteLine(line);
            line = "";
        }
        else
        {
            MessageBox.Show("Should've not reached here.");
            break;
        }
    }
    file.Close();
}
catch (Exception ex)
{
    MessageBox.Show("Couldn't write file: " + ex.ToString());
}

By the way, I'm using .NET v4.0.30319, I think (according to Environment.Version.ToString()),

or .NET v4.5.51209 (according to "Help" > "About Microsoft Visual Studio").

4
  • 2,200 rows doesn't sound like many. Why can't you just read / write the whole file in one go? That is fastest so if speed is the issue do it that way. Commented Mar 17, 2015 at 20:00
  • I thought that was what I was doing, at least when reading the xlsx file, but not when writing. Is it really faster to build a "gigantic" string? (About 45~50 characters per line plus "\n" line breaks, times 2,200 rows, is about 100,000 characters in one string/StringBuilder variable.) Commented Mar 17, 2015 at 20:11
  • As far as writing goes, it is definitely faster. 100,000 characters is really not "gigantic"; it's about 200k, which in modern memory terms is peanuts, about the size of a single banner image on a web page. As for reading, you are being slowed down by the Excel interop, which is probably the main bottleneck. Perhaps you should separate the reading phase from the writing phase: first read all the data into memory and then write it, so you can profile both operations and see which is slower and needs the most attention. Commented Mar 17, 2015 at 21:59
  • Yes, I just did the profiling, and apparently the problem was the reading. Selecting a "big" range with Excel.Range allRange = xlWorkSheet.UsedRange; ([A,1],[AD,2210]) and reading the values cell by cell takes much more time than selecting a smaller range (I only need the first 6 columns), while writing to the file takes a bit less than 1 second. So now I can try the solution @sławomir-rosiek suggested, using the OpenXML SDK. Commented Mar 17, 2015 at 22:31
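To illustrate the "read everything into memory first, then write" idea from the comments above, here is a minimal sketch. It assumes the cell values have already been fetched in a single call, for example with `object[,] values = (object[,])someRange.Value2;` (for a multi-cell range, Value2 is a 1-based two-dimensional array), so the loop below never touches Excel. The class name `PipeExport`, the method `ToPipeDelimitedLines`, the sample data, and the output path are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PipeExport
{
    // Converts an in-memory block of cell values into pipe-delimited lines.
    // Stops at the first row whose first cell is empty, like the loop in the question.
    // Assumes the array has at least columnCount columns.
    public static List<string> ToPipeDelimitedLines(object[,] values, int columnCount)
    {
        var lines = new List<string>();
        int firstRow = values.GetLowerBound(0); // 1 for arrays that come from Excel
        int lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1);
        var fields = new string[columnCount];

        for (int row = firstRow; row <= lastRow; row++)
        {
            object first = values[row, firstCol];
            if (first == null || first.ToString() == "")
            {
                break; // empty first cell: treat as end of data
            }

            for (int c = 0; c < columnCount; c++)
            {
                // Convert.ToString(object) returns "" for null (empty) cells.
                fields[c] = Convert.ToString(values[row, firstCol + c]);
            }
            lines.Add(string.Join("|", fields));
        }
        return lines;
    }

    public static void Main()
    {
        // Stand-in for the array Excel would return: 1-based, 3 rows x 6 columns.
        var values = (object[,])Array.CreateInstance(
            typeof(object), new[] { 3, 6 }, new[] { 1, 1 });
        for (int row = 1; row <= 2; row++)
        {
            for (int col = 1; col <= 6; col++)
            {
                values[row, col] = "r" + row + "c" + col;
            }
        }
        // Row 3 is left empty, so it ends the export.

        List<string> lines = ToPipeDelimitedLines(values, 6);

        // Hypothetical output location; the whole file is written in one call.
        string path = Path.Combine(Path.GetTempPath(), "test.txt");
        File.WriteAllLines(path, lines);
        Console.WriteLine("Wrote " + lines.Count + " lines to " + path);
    }
}
```

With this shape, the only interop call in the hot path is the single Value2 read, and restricting the range to the six needed columns before that read keeps the array small.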

2 Answers

2

I think the main reason this code is slow is the use of Excel Interop, which is very slow. Try the OpenXML SDK instead: it's a library for manipulating Office 2007+ documents (including *.xlsx). It's a lot faster than Excel Interop and doesn't require Excel to be installed on the machine. The main disadvantage is that it can't open XLS files. Here is a sample showing how to read a large document: https://msdn.microsoft.com/EN-US/library/office/gg575571.aspx

Also, use Stopwatch or a profiler to measure which part of the code is slowest.
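As a rough sketch of the measuring step, the snippet below times a phase with System.Diagnostics.Stopwatch. The helper `TimeMs` and the class `PhaseTimer` are hypothetical names, and the timed work is a stand-in (building about 2,200 pipe-delimited lines in memory); in the real program the read phase and the write phase would each be wrapped the same way.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class PhaseTimer
{
    // Runs the given action and returns the elapsed wall-clock time in milliseconds.
    public static long TimeMs(Action phase)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        phase();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var text = new StringBuilder();

        // Stand-in for one phase of the export: build 2,200 lines in memory.
        long buildMs = TimeMs(() =>
        {
            for (int row = 0; row < 2200; row++)
            {
                text.Append("a|b|c|d|e|f").Append('\n');
            }
        });

        Console.WriteLine("Build phase: " + buildMs + " ms, " + text.Length + " characters");
    }
}
```

Timing each phase separately is what shows whether the Excel read or the file write deserves the attention.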


0

I am still pretty new to Excel Interop, but here is some code I recently improved. The performance went from something like 30 seconds, down to under 2 seconds.

// This method is very slow.
// Storing each row and column value to the Excel sheet
//for (int k = 0, k2 = 2; k < table.Rows.Count; k++, k2++)
//{
//    for (int l = 0, l1 = 1; l < table.Columns.Count; l++, l1++)
//    {
//        //ExcelApp.Cells[k2, l1] =
//        //    table.Rows[k].ItemArray[l].ToString();
//        ExcelApp.Cells[k2, l1] =
//            table.Rows[k][l].ToString();
//    }
//}

// Faster method:
// copy the formatted data into an object[,] and assign it to the range in one go
//var excelData = new string[table.Rows.Count, table.Columns.Count];
var excelData = new object[table.Rows.Count, table.Columns.Count];
for (int rowJ = 0; rowJ < table.Rows.Count; rowJ++)
{
    for (int colI = 0; colI < table.Columns.Count; colI++)
    {
        //excelData[rowJ, colI] = table.Rows[rowJ][colI].ToString();
        excelData[rowJ, colI] = table.Rows[rowJ][colI];
    }
}
//<Code to set startLoc and endLoc removed>

Range valRange = ExcelApp.get_Range(startLoc, endLoc);
valRange.Value2 = excelData;
