A few considerations
In my example, I'm using a CSV instead of an Excel spreadsheet, just to simplify things. Whichever you use, make sure you use a library to handle the parsing of the file (eg. using string.Split(',') for CSVs would not properly handle all possible variations in a CSV, such as quoted fields that contain commas).
Whichever library you use, my algorithm is easier to implement if you can access any given cell using x,y co-ordinates.
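As a rough sketch of the two points above, here is one way to read a CSV through a parser and then look cells up by x,y. It assumes the TextFieldParser class from Microsoft.VisualBasic.FileIO is available to your project (it ships with the .NET Framework and with .NET Core 3.0 and later); CsvGrid, Parse and Cell are names I made up for illustration, and any other CSV library would do just as well.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, not part of any library.
static class CsvGrid
{
    // Parses CSV text into rows of fields. Quoted fields may contain commas.
    public static List<string[]> Parse(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    // Returns the cell at column x of row y, or null if that cell does not exist.
    public static string Cell(List<string[]> rows, int x, int y)
    {
        if (y < 0 || y >= rows.Count) return null;
        string[] row = rows[y];
        return (x >= 0 && x < row.Length) ? row[x] : null;
    }
}
```

With this, CsvGrid.Cell(rows, 0, 1) gives the first column of the first data row, and a value such as "Widget, large" stays in one cell instead of being split in two.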
I'm no expert on determining column types based on data like this; maybe there's a fool-proof way of doing so. But it occurs to me that this is always going to be pretty shaky and inaccurate. Take, for example, choosing between INT and BIGINT. The value you base the type on might only be 123, but the person who filled out that spreadsheet might expect to be able to input values like 123456784305, and you won't know that if you're inferring the type from 123.
The same goes for specifying maximum lengths or choosing between VARCHAR and TEXT. It's difficult to do so without first iterating over the ENTIRE record set and determining the longest value that actually appears in each column.
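If you did want to scan the entire record set, a minimal sketch might look like the following. ColumnScanner and InferColumnType are hypothetical names, the rows are assumed to be already parsed with row 0 as the header, and the NVARCHAR(255) fallback for a completely empty column is an arbitrary choice of mine. Keep in mind this only reflects the data you have today, so the 123 vs. 123456784305 problem still applies to future input.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of any library.
static class ColumnScanner
{
    // Sign, decimal point and surrounding whitespace are allowed; thousands separators are not.
    static readonly NumberStyles DecimalStyle =
        NumberStyles.Number & ~NumberStyles.AllowThousands;

    // Looks at every data row (row 0 is the header) and returns the narrowest
    // type that fits all non-empty values found in the given column.
    public static string InferColumnType(List<string[]> rows, int column)
    {
        bool sawValue = false, allLong = true, allDecimal = true;
        int maxLength = 0;

        for (int row = 1; row < rows.Count; row++)
        {
            if (column >= rows[row].Length) continue;      // short row, no such cell
            string value = rows[row][column];
            if (string.IsNullOrWhiteSpace(value)) continue; // empty cells tell us nothing

            sawValue = true;
            maxLength = Math.Max(maxLength, value.Length);

            long longValue;
            decimal decimalValue;
            if (!long.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out longValue))
                allLong = false;
            if (!decimal.TryParse(value, DecimalStyle, CultureInfo.InvariantCulture, out decimalValue))
                allDecimal = false;
        }

        if (!sawValue) return "NVARCHAR(255)"; // assumption: arbitrary default for an empty column
        if (allLong) return "BIGINT";
        if (allDecimal) return "DECIMAL";
        return "NVARCHAR(" + maxLength + ")";
    }
}
```

A column containing 1 and 2.5 comes out as DECIMAL because every value parses as a decimal even though not every value parses as an integer, and a single non-numeric value anywhere in the column pushes it to NVARCHAR.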
Based on the last two points, I would say it's going to end up being easier to store everything as VARCHAR in order to keep your table flexible, then convert things at runtime.
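To illustrate the store-everything-as-VARCHAR idea, here is a small sketch that builds the CREATE TABLE statement from the header row alone and converts a stored value when it is read back. FlexibleTable, BuildCreateTable and ToLong are hypothetical names, the default length of 255 is arbitrary, and the square-bracket quoting of identifiers assumes SQL Server; adjust both for your database.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of any library.
static class FlexibleTable
{
    // Builds a CREATE TABLE statement in which every column is a VARCHAR.
    public static string BuildCreateTable(string tableName, IList<string> headers, int length = 255)
    {
        var sql = new StringBuilder("CREATE TABLE " + Quote(tableName) + " (");
        for (int i = 0; i < headers.Count; i++)
        {
            if (i > 0) sql.Append(", ");
            sql.Append(Quote(headers[i]) + " VARCHAR(" + length + ")");
        }
        sql.Append(")");
        return sql.ToString();
    }

    // Converts a stored string to a number at runtime, or returns null if it is not one.
    public static long? ToLong(string stored)
    {
        long value;
        return long.TryParse(stored, NumberStyles.Integer, CultureInfo.InvariantCulture, out value)
            ? value
            : (long?)null;
    }

    // Assumption: SQL Server style identifier quoting.
    static string Quote(string identifier)
    {
        return "[" + identifier.Replace("]", "]]") + "]";
    }
}
```

The trade-off is that the database no longer validates anything for you, so every read has to cope with a value that does not convert.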
A possible solution
Happy to provide further details here, but hopefully the code + comments explain things well enough.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        string fileName = "Products.csv";

        /* First, put all of our lines and columns into a list...
         * My code assumes that any library you use for this would allow you to access a specific cell using x,y co-ordinates.
         * line.Split(',') is only a stand-in here; use a proper CSV library for real data.
         */
        List<List<string>> lines = new List<List<string>>();
        using (StreamReader csvReader = new StreamReader(fileName))
        {
            string line;
            while ((line = csvReader.ReadLine()) != null)
            {
                List<string> columns = new List<string>(line.Split(','));
                lines.Add(columns);
            }
        }

        /* Now, iterate through each line.
         * 1) Break it into a further list of the columns for that line
         * 2) If this is the first line, assume we have headers, which will be the names of the table columns
         * 3) Check the second row of that column and determine its data type. If the value is empty, then go to the next row and check that.
         * 4) Keep checking down the rows for each column until you find a value that can determine the data type.
         * 5) Use this information to write out the appropriate CREATE TABLE command, and execute.
         */

        //Use this dictionary to keep track of the type of each column later on.
        Dictionary<string, string> tableTypes = new Dictionary<string, string>();
        StringBuilder tableQuery = new StringBuilder("CREATE TABLE " + getTableName(fileName) + " (");

        for (int row = 0; row < lines.Count; row++)
        {
            List<string> currentColumns = lines[row];
            for (int column = 0; column < currentColumns.Count; column++)
            {
                //If this is the first row, need to determine the table structure for this column.
                if (row == 0)
                {
                    string columnName = currentColumns[column];
                    string typeValue = null;

                    //Now check the same column in the rows below, and try to determine its type.
                    //Stop at the first non-empty value.
                    for (int checkRow = 1; checkRow < lines.Count && typeValue == null; checkRow++)
                    {
                        //Skip rows that are too short to contain this column.
                        if (column < lines[checkRow].Count)
                        {
                            typeValue = getType(lines[checkRow][column]);
                        }
                    }

                    //Every value was empty, so fall back to a string column rather than leaving it out of the query.
                    if (typeValue == null)
                    {
                        typeValue = "NVARCHAR";
                    }

                    tableQuery.Append(columnName + " " + typeValue);
                    if (column < (currentColumns.Count - 1))
                    {
                        tableQuery.Append(",");
                    }
                    else
                    {
                        tableQuery.Append(")");
                        //We're done building the query... Execute it.
                        Console.WriteLine("Creating new table: " + tableQuery.ToString());
                    }
                    tableTypes.Add(columnName, typeValue);
                }
                //We're not in the first row anymore, use the dictionary created above to determine what column names to put into your INSERT queries.
                else
                {
                    //Insert the rest of the rows here...
                }
            }
        }
        Console.ReadLine();
    }
    /// <summary>
    /// This method will determine the SQL type of a value based on its string representation
    /// </summary>
    /// <param name="Value">The string representation of a value. Eg. "1", "1.0", "foo", etc</param>
    /// <returns>"BIGINT", "DECIMAL" or "NVARCHAR", or null if the value is empty</returns>
    static string getType(string Value)
    {
        //IsNullOrWhiteSpace also covers null and empty strings.
        if (String.IsNullOrWhiteSpace(Value))
        {
            return null;
        }
        long longValue;
        decimal decimalValue;
        //Note: these TryParse overloads use the current culture, so results can differ between machines.
        if (Int64.TryParse(Value, out longValue))
            return "BIGINT";
        if (Decimal.TryParse(Value, out decimalValue))
            return "DECIMAL";
        //If none of the above worked, just return the type of string
        return "NVARCHAR";
    }
    static string getTableName(string Value)
    {
        string[] values = Value.Split('.');
        string name = values[0];
        name = name.Replace(" ", "");
        return name;
    }
}
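The "Insert the rest of the rows here..." part is left open above. One possible way to fill it in, sketched with the provider-neutral System.Data interfaces so it is not tied to a particular database, is shown below. RowInserter, BuildInsert and InsertRows are hypothetical names, the @p0-style parameter names assume a provider that accepts them (SQL Server does), and every value is passed as a string, which relies on the database converting it to the column's type.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;

// Hypothetical helper, not part of any library.
static class RowInserter
{
    // Builds "INSERT INTO T (a, b) VALUES (@p0, @p1)" for the given column names.
    public static string BuildInsert(string tableName, IList<string> columnNames)
    {
        var names = new StringBuilder();
        var placeholders = new StringBuilder();
        for (int i = 0; i < columnNames.Count; i++)
        {
            if (i > 0)
            {
                names.Append(", ");
                placeholders.Append(", ");
            }
            names.Append(columnNames[i]);
            placeholders.Append("@p" + i);
        }
        return "INSERT INTO " + tableName + " (" + names + ") VALUES (" + placeholders + ")";
    }

    // Executes one INSERT per data row (row 0 is the header). Empty or missing cells become NULL.
    public static void InsertRows(IDbConnection connection, string tableName, List<List<string>> lines)
    {
        List<string> headers = lines[0];
        string sql = BuildInsert(tableName, headers);

        for (int row = 1; row < lines.Count; row++)
        {
            using (IDbCommand command = connection.CreateCommand())
            {
                command.CommandText = sql;
                for (int column = 0; column < headers.Count; column++)
                {
                    string cell = column < lines[row].Count ? lines[row][column] : null;
                    IDbDataParameter parameter = command.CreateParameter();
                    parameter.ParameterName = "@p" + column;
                    parameter.Value = string.IsNullOrWhiteSpace(cell) ? (object)DBNull.Value : cell;
                    command.Parameters.Add(parameter);
                }
                command.ExecuteNonQuery();
            }
        }
    }
}
```

Using parameters rather than concatenating the cell values into the SQL text means a value containing a quote or a comma cannot break (or hijack) the statement. The column names still come straight from the header row, so they should be validated or quoted before being used like this.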