
I would like to "clean" a CSV file:

  • deleting empty rows
  • deleting empty columns

The rows and columns are not literally empty; they contain only empty quoted cells. For example, an "empty" row looks like this:

"","","","","","","","","","","","","","",

or like this:

"","","","","","","","","","",

and an "empty" column looks like this, one cell per line:

"",
"",
"",
"",
"",
"",
"",

These rows or columns can be anywhere in the CSV file.

What I have so far:

private void button1_Click(object sender, EventArgs e)
        {

            string sourceFile = @"XXXXX.xlsx";
            string worksheetName = "Sample";
            string targetFile = @"C:\Users\xxxx\xls_test\XXXX.csv";

            // Creates the CSV file based on the XLS file
            ExcelToCSVCoversion(sourceFile, worksheetName, targetFile);

            // Manipulate the CSV: Clean empty rows
            DeleteEmptyRoadFromCSV(targetFile);
        }

        static void ExcelToCSVCoversion(string sourceFile, string worksheetName,
            string targetFile)
        {
            string connectionString = @"Provider =Microsoft.ACE.OLEDB.12.0;Data Source=" + sourceFile
                + @";Extended Properties=""Excel 12.0 Xml;HDR=YES""";
            OleDbConnection connection = null;
            StreamWriter writer = null;
            OleDbCommand command = null;
            OleDbDataAdapter dataAdapter = null;

            try
            {
                // Represents an open connection to a data source. 
                connection = new OleDbConnection(connectionString);
                connection.Open();

                // Represents a SQL statement or stored procedure to execute  
                // against a data source. 
                command = new OleDbCommand("SELECT * FROM [" + worksheetName + "$]",
                                            connection);
                // Specifies how a command string is interpreted. 
                command.CommandType = CommandType.Text;
                // Implements a TextWriter for writing characters to the output stream 
                // in a particular encoding. 
                writer = new StreamWriter(targetFile);
                // Represents a set of data commands and a database connection that are  
                // used to fill the DataSet and update the data source. 
                dataAdapter = new OleDbDataAdapter(command);

                DataTable dataTable = new DataTable();
                dataAdapter.Fill(dataTable);

                for (int row = 0; row < dataTable.Rows.Count; row++)
                {
                    string rowString = "";
                    for (int column = 0; column < dataTable.Columns.Count; column++)
                    {
                        rowString += "\"" + dataTable.Rows[row][column].ToString() + "\",";
                    }
                    writer.WriteLine(rowString);
                }

                Console.WriteLine();
                Console.WriteLine("The excel file " + sourceFile + " has been converted " +
                                  "into " + targetFile + " (CSV format).");
                Console.WriteLine();
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception.ToString());
                Console.ReadLine();
            }
            finally
            {
                // Any of these may still be null if an earlier step threw,
                // so guard each call. Dispose() also closes the connection
                // and flushes and closes the writer.
                connection?.Dispose();
                command?.Dispose();
                dataAdapter?.Dispose();
                writer?.Dispose();
            }
        }

        static void DeleteEmptyRoadFromCSV(string fileName)
        {
            //string nonEmptyLines = @"XXXX.csv";
            var nonEmptyLines = File.ReadAllLines(fileName)
                        .Where(x => !x.Split(',')
                                     .Take(2)
                                     .Any(cell => string.IsNullOrWhiteSpace(cell))
                         // use `All` if you want to ignore only if both columns are empty.  
                         ).ToList();

            File.WriteAllLines(fileName, nonEmptyLines);
        }

Finally, I tried to use the ideas from "Remove Blank rows from csv c#", but my output is not changing at all.

Any help is welcome!

Thank you.

  • Why are you reinventing the wheel? That's a lot of work when you could use a text file parser, which would also be much more robust. Commented Jun 20, 2017 at 13:00
  • Also, File.ReadAllLines is probably dangerous unless you know for sure you are dealing with small files. Commented Jun 20, 2017 at 13:00
  • LINQ can probably help here, since it lets you skip empty rows/columns. It may also help clean up your code a bit. Commented Jun 20, 2017 at 13:03
  • Hey @rory.ap, I will take a look at the text file parser! Thanks. Commented Jun 20, 2017 at 13:10
  • Hi @GibralterTop, actually my file has at most 50 lines, which is why I chose ReadAllLines, but I will keep your tip in mind. Thank you. Commented Jun 20, 2017 at 13:11

2 Answers


You could delete the empty rows and columns from the DataTable before saving the CSV. The method is not tested, but you should get the idea.

 static void ExcelToCSVCoversion(string sourceFile, string worksheetName,
       string targetFile)
    {
        string connectionString = @"Provider =Microsoft.ACE.OLEDB.12.0;Data Source=" + sourceFile
            + @";Extended Properties=""Excel 12.0 Xml;HDR=YES""";
        OleDbConnection connection = null;
        StreamWriter writer = null;
        OleDbCommand command = null;
        OleDbDataAdapter dataAdapter = null;

        try
        {
            // Represents an open connection to a data source. 
            connection = new OleDbConnection(connectionString);
            connection.Open();

            // Represents a SQL statement or stored procedure to execute  
            // against a data source. 
            command = new OleDbCommand("SELECT * FROM [" + worksheetName + "$]",
                                        connection);
            // Specifies how a command string is interpreted. 
            command.CommandType = CommandType.Text;
            // Implements a TextWriter for writing characters to the output stream 
            // in a particular encoding. 
            writer = new StreamWriter(targetFile);
            // Represents a set of data commands and a database connection that are  
            // used to fill the DataSet and update the data source. 
            dataAdapter = new OleDbDataAdapter(command);

            DataTable dataTable = new DataTable();
            dataAdapter.Fill(dataTable);
            var emptyRows =
                dataTable.Select()
                    .Where(
                        row =>
                            dataTable.Columns.Cast<DataColumn>()
                                .All(column => string.IsNullOrEmpty(row[column].ToString()))).ToArray();
            Array.ForEach(emptyRows, x => x.Delete());

            var emptyColumns =
                dataTable.Columns.Cast<DataColumn>()
                    .Where(column => dataTable.Select().All(row => string.IsNullOrEmpty(row[column].ToString())))
                    .ToArray();
            Array.ForEach(emptyColumns, column => dataTable.Columns.Remove(column));
            dataTable.AcceptChanges();

            for (int row = 0; row < dataTable.Rows.Count; row++)
            {
                string rowString = "";
                for (int column = 0; column < dataTable.Columns.Count; column++)
                {
                    rowString += "\"" + dataTable.Rows[row][column].ToString() + "\",";
                }
                writer.WriteLine(rowString);
            }

            Console.WriteLine();
            Console.WriteLine("The excel file " + sourceFile + " has been converted " +
                              "into " + targetFile + " (CSV format).");
            Console.WriteLine();
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
            Console.ReadLine();
        }
        finally
        {
            // Any of these may still be null if an earlier step threw,
            // so guard each call. Dispose() also closes the connection
            // and flushes and closes the writer.
            connection?.Dispose();
            command?.Dispose();
            dataAdapter?.Dispose();
            writer?.Dispose();
        }
    }
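
The row and column filtering can also be tried in isolation, without an Excel file. Below is a minimal sketch, assuming an in-memory DataTable. The names DataTableCleaner, IsBlank and RemoveEmptyRowsAndColumns are hypothetical and not part of the answer above. Unlike the code above, it calls Rows.Remove directly instead of Delete followed by AcceptChanges, and it also treats whitespace-only cells as empty.

```csharp
using System;
using System.Data;
using System.Linq;

static class DataTableCleaner
{
    // Treats null, DBNull and whitespace-only values as empty.
    static bool IsBlank(object value) =>
        value == null || value == DBNull.Value
        || string.IsNullOrWhiteSpace(value.ToString());

    public static void RemoveEmptyRowsAndColumns(DataTable table)
    {
        // Snapshot the matches first: removing rows while enumerating
        // table.Rows would invalidate the enumerator.
        var emptyRows = table.Rows.Cast<DataRow>()
            .Where(row => row.ItemArray.All(cell => IsBlank(cell)))
            .ToList();
        foreach (var row in emptyRows)
            table.Rows.Remove(row);

        // A column is empty when every remaining row is blank in it.
        // Caveat: if no rows remain, every column counts as empty.
        var emptyColumns = table.Columns.Cast<DataColumn>()
            .Where(col => table.Rows.Cast<DataRow>().All(row => IsBlank(row[col])))
            .ToList();
        foreach (var col in emptyColumns)
            table.Columns.Remove(col);
    }
}
```

Called as DataTableCleaner.RemoveEmptyRowsAndColumns(dataTable) right after dataAdapter.Fill(dataTable), this would leave only rows and columns that hold at least one non-blank value before the CSV is written.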

3 Comments

Hey, this sounds like a good idea, but I was not able to run it. On the lines emptyRows.ForEach(x => x.Delete()); and emptyColumns.ForEach(column => dataTable.Columns.Remove(column)); I keep getting the error: "There is no argument that corresponds to the required formal parameter 'action' of 'Array.ForEach<T>(T[], Action<T>)'" -> Exceptions: ArgumentNullException. I also tried the following, with no success: foreach (var etRows in emptyRows) { (x => x.Delete()); }
I have updated the ForEach statements. Please check now.
It is working! I had the idea that working with the XLS, instead of the CSV, would be much more complicated, but with your solution it is working! I really appreciate it!

Please check whether the following query is working. I am getting all the rows:

var nonEmptyLines = File.ReadAllLines(FileName)
                        .Where(x => !x.Split(',')
                                     .Take(2)
                                     .Any(cell => string.IsNullOrWhiteSpace(cell))
                         // use `All` if you want to ignore only if both columns are empty.  
                         ).ToList();

I think you can use something like:

  // Note: SkipWhile only drops the leading lines that are null or whitespace;
  // it stops skipping at the first line that is not.
  var nonEmptyLines = File.ReadAllLines(fileName)
                          .SkipWhile(line => string.IsNullOrWhiteSpace(line));
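
One likely reason both queries keep every row: the converter in the question wraps each cell in double quotes, so after Split(',') an "empty" cell is the two-character string `""`, for which string.IsNullOrWhiteSpace returns false. The sketch below strips the quotes before testing, then drops blank rows and blank columns. It is only an illustration under stated assumptions: CsvCleaner, SplitCells and Clean are hypothetical names, and no cell may contain a comma or a quote (real CSV data needs a proper parser).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CsvCleaner
{
    // Splits one line such as "a","b", into bare cell values.
    // Assumption: cells contain neither commas nor quotes.
    static string[] SplitCells(string line)
    {
        var cells = line.Split(',').Select(c => c.Trim().Trim('"')).ToList();
        // The converter ends each line with a comma, which yields one extra cell.
        if (line.TrimEnd().EndsWith(",", StringComparison.Ordinal))
            cells.RemoveAt(cells.Count - 1);
        return cells.ToArray();
    }

    public static List<string> Clean(IEnumerable<string> lines)
    {
        // Keep only rows that have at least one non-blank cell.
        var rows = lines.Select(l => SplitCells(l))
                        .Where(cells => cells.Any(c => !string.IsNullOrWhiteSpace(c)))
                        .ToList();
        if (rows.Count == 0) return new List<string>();

        // Keep only column indexes that are non-blank in at least one row.
        int width = rows.Max(r => r.Length);
        var keep = Enumerable.Range(0, width)
            .Where(i => rows.Any(r => i < r.Length && !string.IsNullOrWhiteSpace(r[i])))
            .ToList();

        // Re-quote the surviving cells; short rows are padded with empty cells.
        return rows
            .Select(r => string.Join(",",
                keep.Select(i => "\"" + (i < r.Length ? r[i] : "") + "\"")))
            .ToList();
    }
}
```

It could be used as File.WriteAllLines(fileName, CsvCleaner.Clean(File.ReadAllLines(fileName))). Note that the rewritten lines no longer end with a trailing comma, unlike the converter's output.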

1 Comment

Hey, I tried both. They execute fine, but the output (CSV) remains the same. The debugging session can be seen at: drive.google.com/drive/folders/… (Sorry, but my image attachments never work here)
