I am working on some functionality in my application that queries our database and pulls the data into one DataTable, then opens an Excel file and populates another DataTable.
Because the Excel file contains no usable ID, I cannot sort the data, and probably cannot use DataTable.Merge().
Here is the code for the matching algorithm I have created.
private void RunMatchingAlgorithm()
{
    // Initialize variables
    string partNumber = "";
    DateTime expiration_date = DateTime.Now;
    decimal contract_cost = 0;
    string contract_no = "";
    string partNumber2 = "";
    DateTime expiration_date2 = DateTime.Now;
    decimal contract_cost2 = 0;
    string contract_no2 = "";
    // Get values from the database table
    for (int i = 0; i < dtFromTableContracts.Rows.Count; i++)
    {
        partNumber2 = dtFromTableContracts.Rows[i]["supplier_part_no"].ToString();
        contract_no2 = dtFromTableContracts.Rows[i]["contract_no"].ToString();
        expiration_date2 = Convert.ToDateTime(dtFromTableContracts.Rows[i]["con_end_date"]).Date;
        // Get values from the converted Excel table
        for (int j = 0; j < dtConversion.Rows.Count; j++)
        {
            contract_no = dtConversion.Rows[j]["vend_contract_no"].ToString();
            // If we have even a partial match, check for a part number match
            if (contract_no2.StartsWith(contract_no))
            {
                partNumber = dtConversion.Rows[j]["vend_item_id"].ToString();
                // If the values match, populate from both tables
                if (partNumber == partNumber2)
                {
                    dtConversion.Rows[j]["wpd_expiration_date"] = expiration_date2.Date;
                    dtConversion.Rows[j]["wpd_cont_cost"] = dtFromTableContracts.Rows[i]["contract_cost"];
                    dtConversion.Rows[j]["wpd_contract_no"] = dtFromTableContracts.Rows[i]["contract_no"];
                    dtConversion.Rows[j]["wpd_item_id"] = dtFromTableContracts.Rows[i]["supplier_part_no"];
                    dtConversion.Rows[j]["wpd_item_no"] = dtFromTableContracts.Rows[i]["item_id"];
                    dtConversion.Rows[j]["discontinued"] = dtFromTableContracts.Rows[i]["discontinued"];
                    dtConversion.Rows[j]["job_no"] = dtFromTableContracts.Rows[i]["job_no"];
                }
            }
        }
    }
}
If you're curious, a later method removes any unmatched lines and we display only the matched records in a DGV.
This currently works as expected, but if my Big O analysis is correct, I'm dealing with O(m*n), which gets quite slow with larger data sets and is extremely processor-intensive.
I am looking for a more efficient way to accomplish this than looping over every single row, as some of the Excel spreadsheets we work with are close to 40,000 rows. This algorithm takes about 6 minutes to complete with a set of that size.
I'm open to anything that is faster than this approach. Note: Sorting is O(n log n)
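For reference, one possible direction that avoids the nested scan without sorting is a hash lookup. This is only a minimal sketch, not a tested replacement: the class name ContractMatcher and the method Match are hypothetical, and the two tables are passed in as parameters rather than read from fields. It assumes the part-number comparison stays an exact, case-sensitive match (as with == above), so the database rows can be indexed by supplier_part_no in a Dictionary and each Excel row only has to check the few database rows sharing its part number.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper class, not part of the original code.
public static class ContractMatcher
{
    public static void Match(DataTable dtFromTableContracts, DataTable dtConversion)
    {
        // Build the index once: part number -> database rows with that part number.
        var byPartNumber = new Dictionary<string, List<DataRow>>();
        foreach (DataRow dbRow in dtFromTableContracts.Rows)
        {
            string key = dbRow["supplier_part_no"].ToString();
            if (!byPartNumber.TryGetValue(key, out List<DataRow> bucket))
            {
                bucket = new List<DataRow>();
                byPartNumber[key] = bucket;
            }
            bucket.Add(dbRow);
        }

        // Single pass over the Excel rows; each row only visits its own bucket.
        foreach (DataRow excelRow in dtConversion.Rows)
        {
            string partNumber = excelRow["vend_item_id"].ToString();
            if (!byPartNumber.TryGetValue(partNumber, out List<DataRow> candidates))
                continue; // no database row has this part number

            string contractNo = excelRow["vend_contract_no"].ToString();
            foreach (DataRow dbRow in candidates)
            {
                // Same partial-match rule as the nested loops.
                if (!dbRow["contract_no"].ToString().StartsWith(contractNo))
                    continue;

                // Candidates are kept in table order, so the last match wins,
                // as it does in the nested-loop version.
                excelRow["wpd_expiration_date"] = Convert.ToDateTime(dbRow["con_end_date"]).Date;
                excelRow["wpd_cont_cost"] = dbRow["contract_cost"];
                excelRow["wpd_contract_no"] = dbRow["contract_no"];
                excelRow["wpd_item_id"] = dbRow["supplier_part_no"];
                excelRow["wpd_item_no"] = dbRow["item_id"];
                excelRow["discontinued"] = dbRow["discontinued"];
                excelRow["job_no"] = dbRow["job_no"];
            }
        }
    }
}
```

Building the index is one pass over the database rows and the matching is one pass over the Excel rows, so the expected cost is roughly O(m + n) as long as only a handful of database rows share any given part number. If many rows share a part number, the inner loop over candidates grows accordingly.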