
I have created a PowerShell function that bulk copies data from a .csv file (first row is the header) and inserts it into a SQL Server database table.

See my code:

function BulkCsvImport($sqlserver, $database, $table, $csvfile, $csvdelimiter, $firstrowcolumnnames) {
    Write-Host "Bulk Import Started." 
    $elapsed = [System.Diagnostics.Stopwatch]::StartNew()  
    [void][Reflection.Assembly]::LoadWithPartialName("System.Data") 
    [void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient") 

    # 50k worked fastest and kept memory usage to a minimum 
    $batchsize = 50000 

    # Build the sqlbulkcopy connection, and set the timeout to infinite 
    $connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;" 

    # Wipe the bulk insert table first
    Invoke-Sqlcmd -Query "TRUNCATE TABLE $table" -ServerInstance $sqlserver -Database $database

    $bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock) 
    $bulkcopy.DestinationTableName = $table 
    $bulkcopy.bulkcopyTimeout = 0 
    $bulkcopy.batchsize = $batchsize 

    # Create the datatable, and autogenerate the columns. 
    $datatable = New-Object System.Data.DataTable 

    # Open the text file from disk 
    $reader = New-Object System.IO.StreamReader($csvfile) 
    $columns = (Get-Content $csvfile -First 1).Split($csvdelimiter) 

    if ($firstrowcolumnnames -eq $true) { $null = $reader.readLine() } 

    foreach ($column in $columns) {  
        $null = $datatable.Columns.Add() 
    } 

    # Read in the data, line by line 
    while (($line = $reader.ReadLine()) -ne $null)  { 
        $null = $datatable.Rows.Add($line.Split($csvdelimiter)) 

        $i++; 
        if (($i % $batchsize) -eq 0) {  
            $bulkcopy.WriteToServer($datatable)  
            Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())." 
            $datatable.Clear()  
        }  
    }  

    # Add in all the remaining rows since the last clear 
    if($datatable.Rows.Count -gt 0) { 
        $bulkcopy.WriteToServer($datatable) 
        $datatable.Clear() 
    } 

    # Clean Up 
    $reader.Close(); 
    $reader.Dispose() 

    $bulkcopy.Close(); 
    $bulkcopy.Dispose() 

    $datatable.Dispose() 

    Write-Host "Bulk Import Completed. $i rows have been inserted into the database." 
    # Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())" 
    # Sometimes the Garbage Collector takes too long to clear the huge datatable. 
    $i = 0
    [System.GC]::Collect()
}

I am looking to modify the above so that the columns in the .csv file are matched to the columns of the same name in the SQL Server database table (the names are identical). At the moment the data is being imported into the incorrect database columns.

Could I get some assistance as to what I need to do to modify the above function to achieve this?
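
From what I have read, SqlBulkCopy can map columns by name through its ColumnMappings collection, so I suspect the change is roughly along these lines (an untested sketch; only the column setup and mapping would differ from my function above):

    # Name the DataTable columns after the CSV header (stripping any quotes)
    foreach ($column in $columns) {
        $null = $datatable.Columns.Add($column.Trim('"'))
    }

    # Map each source column to the destination column with the same name
    foreach ($column in $datatable.Columns) {
        $null = $bulkcopy.ColumnMappings.Add($column.ColumnName, $column.ColumnName)
    }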

  • So much code for what could be easily done using BULK INSERT. sqlservercentral.com/articles/… Commented May 1, 2019 at 18:00
  • Or just import the flat file in SSMS. However, the problem is likely that the ordinals of the table columns do not match those in the CSV file. Also, you're not specifying a format file for the bulk insert, so you're asking for trouble. Commented May 1, 2019 at 18:19
  • @user1443098 SSMS/SSIS may not always be an option. What if the author wants to automate it using PS? Commented May 1, 2019 at 18:23
  • Forgot to install the SqlServer module via Install-Module SqlServer -Scope CurrentUser and was getting "The term 'Invoke-Sqlcmd' is not recognized as the name of a cmdlet" when trying to use Invoke-Sqlcmd. Commented Dec 4, 2019 at 14:47

2 Answers


I would use an existing open-source solution:

Import-DbaCsv - dbatools.io

Import-DbaCsv.ps1

Efficiently imports very large (and small) CSV files into SQL Server.

Import-DbaCsv takes advantage of .NET's super fast SqlBulkCopy class to import CSV files into SQL Server.


Parameters:

-ColumnMap

By default, the bulk copy tries to automap columns. When it doesn't work as desired, this parameter will help.

PS C:\> $columns = @{
>>     Text   = 'FirstName'
>>     Number = 'PhoneNumber'
>> }
PS C:\> Import-DbaCsv -Path c:\temp\supersmall.csv `
            -SqlInstance sql2016 -Database tempdb -ColumnMap $columns `
            -BatchSize 50000 -Table table_name -Truncate

The CSV column 'Text' is inserted into the SQL column 'FirstName' and the CSV column 'Number' is inserted into the SQL column 'PhoneNumber'. All other columns are ignored and therefore receive null or default values.
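
In your case the CSV headers are said to be identical to the table column names, so the default automapping should line everything up without needing -ColumnMap at all. Roughly (the path is illustrative; the other values come from your function's parameters):

PS C:\> Import-DbaCsv -Path C:\temp\myfile.csv -SqlInstance $sqlserver -Database $database `
            -Table $table -Delimiter $csvdelimiter -Truncate -BatchSize 50000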


I've used the code below and it is very fast:

param (
    [Parameter(Mandatory=$true)][string]$location,
    [Parameter(Mandatory=$true)][string]$server,
    [Parameter(Mandatory=$true)][string]$database,
    [string]$schemaname
)

Function BulkInsertCSVFiles($location, $server, $database, $schemaname)
{
    ### Add a header to the log file
    $message = "################################Starting the CSV file load from " + $location + "################################"
    Set-Content -Path '.\csvInjestionLog.log' -Value $message
    $message = "Lastupdated                 Message"
    Add-Content -Path '.\csvInjestionLog.log' -Value $message

    ### Loop through each file
    $Files = Get-ChildItem -Path $location -Filter *.csv* -File
    ForEach ($file in $Files)
    {
        ### Get the full path and file name
        $full = $location + "\" + $file.name

        ### Get the header row of the csv file
        $all = Get-Content $full
        $columns = $all[0]
        $columns = $columns.Replace(" ","") -replace '"',''
        $colname = $columns
        $shortfile = $file.name
        $shortfile = $shortfile.Substring(0, $shortfile.Length - 4)

        ### Detect the line endings (Windows CRLF vs. Unix LF)
        $isWindows = "0"
        $contents = [System.IO.File]::ReadAllText($full)
        if ($contents -cmatch '\r\n') {
            $isWindows = "1"
        }

        $shortfile = $shortfile -replace " ",""
        $tblname = "tbl_" + $shortfile
        $columns = $columns.Replace(",","] VARCHAR(max), [")

        ### Create a table for the csv file
        $CreateTable = "CREATE TABLE " + $schemaname + "." + $tblname + "([" + $columns + "] VARCHAR(max));"
        try {
            Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $CreateTable
            $message = $(Get-Date -format 'u') + "        The table " + $schemaname + "." + $tblname + " has been created"
            Add-Content -Path '.\csvInjestionLog.log' -Value $message
        }
        catch {
            $message = $(Get-Date -format 'u') + "        The table " + $schemaname + "." + $tblname + " creation failed"
            Add-Content -Path '.\csvInjestionLog.log' -Value $message
        }

        ### Bulk insert the csv into the table
        if ($isWindows -eq "0")
        {
            $bulkinsert = "BULK INSERT " + $schemaname + "." + $tblname + "
                FROM '" + $full + "'
                WITH (
                FORMAT ='CSV'
                ,FIELDQUOTE = '`"'
                ,FIELDTERMINATOR = ','
                ,ROWTERMINATOR = '0x0a'
                ,FIRSTROW=2
                );"
        }
        if ($isWindows -eq "1")
        {
            $bulkinsert = "BULK INSERT " + $schemaname + "." + $tblname + "
                FROM '" + $full + "'
                WITH (
                FORMAT ='CSV'
                ,FIELDQUOTE = '`"'
                ,FIELDTERMINATOR = ','
                ,ROWTERMINATOR = '\n'
                ,FIRSTROW=2
                );"
        }
        try {
            $message = $(Get-Date -format 'u') + "        The file " + $full + " started to load to the table " + $schemaname + "." + $tblname
            Add-Content -Path '.\csvInjestionLog.log' -Value $message
            Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $bulkinsert
            $message = $(Get-Date -format 'u') + "        The file " + $full + " has been loaded to the table " + $schemaname + "." + $tblname
            Add-Content -Path '.\csvInjestionLog.log' -Value $message
        }
        catch {
            $message = $(Get-Date -format 'u') + "        The file " + $full + " failed to load to the table " + $schemaname + "." + $tblname
            Add-Content -Path '.\csvInjestionLog.log' -Value $message
        }

        ### Alter the columns to fit the max length of each column
        $message = $(Get-Date -format 'u') + "        The column widths in the table " + $schemaname + "." + $tblname + " started to be corrected"
        Add-Content -Path '.\csvInjestionLog.log' -Value $message

        $columnnames = $colname.Split(",")
        Foreach ($columnname in $columnnames)
        {
            $Query = "SELECT isnull(max(LEN(" + $columnname + ")),49) as columnlength FROM " + $schemaname + "." + $tblname
            $maxlen = Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $Query
            $num = $maxlen.columnlength + 1
            $altercolumn = "Alter table " + $schemaname + "." + $tblname + " ALTER Column " + $columnname + " varchar(" + $num + ")"
            Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $altercolumn
        }

        $message = $(Get-Date -format 'u') + "        The column widths in the table " + $schemaname + "." + $tblname + " have been corrected"
        Add-Content -Path '.\csvInjestionLog.log' -Value $message
    }
}

BulkInsertCSVFiles -location $location -server $server -database $database -schemaname $schemaname
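
Each table is created with VARCHAR(max) columns first and the widths are then shrunk to the longest value found in each column, so no format file is needed. If the script is saved as, say, BulkInsertCsvFiles.ps1 (the file name and parameter values below are only examples), it can be run like this:

PS C:\> .\BulkInsertCsvFiles.ps1 -location 'C:\csvdrop' -server 'SQL01' -database 'Staging' -schemaname 'dbo'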

