I receive Excel spreadsheets in a fixed format that includes two columns with the same header name. I do not need either of those columns, but the format of the spreadsheet will not change, so I have to work around them.
I am trying to use PowerShell and the ImportExcel module (the Import-Excel and Export-Excel cmdlets) to select only the columns I need and then export them to a new spreadsheet so that an SSIS process can pick it up.
I feel like I am almost there, but something in my script is still importing the columns that should be excluded. Judging by the screenshots below, it looks like the script is bringing in the header row from the spreadsheet, and I am not sure how to get around that. I tried adding the -NoHeader switch to the export, but that did not work.
The PS script:
Clear-Host
$SourceFileDirectory = "C:\STLFeeReport\Test\"
$CurrentDate = Get-Date -Format "yyyyMMdd"
$TestFile = "Test2"
$ExcelExt = ".xlsx"
$ExcelFiles = Get-ChildItem $SourceFileDirectory -Filter *.xlsx
foreach ($file in $ExcelFiles)
{
    # Build the source and destination paths
    $ImportFile = -JOIN($SourceFileDirectory,$file)
    $DestinationFile = -JOIN($SourceFileDirectory,$TestFile,"_",$CurrentDate,$ExcelExt)
    Write-Host $ImportFile
    Write-Host $DestinationFile
    # Import with custom header names and preview the result
    $data = Import-Excel -Path $ImportFile -HeaderName "CompanyID", "CompanyName", "CreateDate", "FileName", "ReferenceNumber" |
        Select-Object "CompanyID", "CompanyName", "CreateDate", "FileName", "ReferenceNumber"
    Write-Host $data
    # Import again, keep the five columns, and export them to the new workbook
    Import-Excel -Path $ImportFile -HeaderName "CompanyID", "CompanyName", "CreateDate", "FileName", "ReferenceNumber" |
        Select-Object "CompanyID", "CompanyName", "CreateDate", "FileName", "ReferenceNumber" |
        Export-Excel -Path $DestinationFile |
        Select-Object "CompanyID", "CompanyName", "CreateDate", "FileName", "ReferenceNumber"
}
This is what the data in the original spreadsheet looks like:
And this is what the data looks like after the script has been run:
Below is what my desired output should be:


