
I have a log file that is formatted as a CSV with no headers. The first column is basically the unique identifier for the issues being recorded. There may be multiple lines with different details for the same issue identifier. I would like to remove lines where the first column is duplicated because I don't need the other data at this time.

I have fairly basic knowledge of PowerShell at this point, so I'm sure there's something simple I'm missing.

I'm sorry if this is a duplicate; I found questions that answer parts of this, but not the question as a whole.

So far, my best guess is:

Import-Csv $outFile | % { Select-Object -Index 1 -Unique } | Out-File $outFile -Append

But this gives me the error:

Import-Csv : The member "LB" is already present.
At C:\Users\jnurczyk\Desktop\Scratch\POImport\getPOImport.ps1:6 char:1
+ Import-Csv $outFile | % { Select-Object -InputObject $_ -Index 1 -Unique } | Out ...
+ ~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Import-Csv], ExtendedTypeSystemException
    + FullyQualifiedErrorId : AlreadyPresentPSMemberInfoInternalCollectionAdd,Microsoft.PowerShell.Commands.ImportCsvCommand

2 Comments
  • Post what you've got. Commented Dec 11, 2013 at 17:59
  • Using the foreach loop means you are selecting each row out of one row. Try removing the %{ }. Commented Dec 11, 2013 at 19:08

3 Answers


Because your data has no headers, you need to specify the headers in your Import-Csv cmdlet. And then to select only unique records using the first column, you need to specify that in the Select-Object cmdlet. See code below:

Import-Csv $outFile -Header A,B,C | Select-Object -Unique A

To clarify, the headers in my example are A, B, and C. This works if you know how many columns there are. If you have too few headers, then columns are dropped. If you have too many headers, then they become empty fields.
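The first-match-wins logic behind `Select-Object -Unique` on one column isn't PowerShell-specific. As a minimal sketch, here is the same idea in Python (the sample rows are hypothetical, standing in for a headerless issue log):

```python
import csv
import io

def dedupe_first_column(rows):
    """Keep only the first row seen for each value in column 0."""
    seen = set()
    out = []
    for row in rows:
        if row and row[0] not in seen:
            seen.add(row[0])
            out.append(row)
    return out

# Hypothetical headerless log, parsed with the stdlib csv module
raw = "PO1,created,10:00\nPO1,updated,10:05\nPO2,created,10:10\n"
rows = list(csv.reader(io.StringIO(raw)))
print(dedupe_first_column(rows))
```

Like the PowerShell one-liner, this keeps the first line for each identifier and discards later duplicates.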


4 Comments

I tried something similar, but couldn't figure out where to put the header name (A in your example). The only issue I have is that it appends a bunch of spaces to the end of each line when I output to a file. While annoying, I don't mind this much.
Try using trim() to remove spaces.
trim() doesn't work on non-string objects. I might need to figure out how to make PowerShell interpret the output of Select-Object as a string.
Not sure where your whitespace is coming from, but check this out: stackoverflow.com/questions/17180955/…

Every time I look for a solution to this issue I run across this thread. However, the accepted solution here is more generic than I would like. The function below increments each time it sees the same header name: A, B, C, A1, D, A2, C1, etc.

Function Import-CSVCustom ($csvTemp) {
    # Read only the first line to get the raw header names
    $StreamReader = New-Object System.IO.StreamReader -Arg $csvTemp
    [array]$Headers = $StreamReader.ReadLine() -Split "," | % { "$_".Trim() } | ? { $_ }
    $StreamReader.Close()

    # Track how many times each name has been seen; suffix repeats with a count
    $seen = @{}
    $Headers = $Headers | % {
        if ($seen.$_.Count) { "$_$($seen.$_.Count)" } else { $_ }
        $seen.$_ += @($_)
    }

    Import-Csv $csvTemp -Header $Headers
}
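The renaming scheme in the function above (first occurrence keeps its name, each repeat gets a running count appended) can be sketched language-agnostically. A minimal Python rendering of the same counting logic, for comparison:

```python
def dedupe_headers(headers):
    """Suffix repeated header names with a running count: A, B, A -> A, B, A1."""
    counts = {}
    out = []
    for name in headers:
        n = counts.get(name, 0)
        # First occurrence keeps the bare name; later ones get the count suffix
        out.append(f"{name}{n}" if n else name)
        counts[name] = n + 1
    return out

print(dedupe_headers(["A", "B", "A", "C", "A", "C"]))
# -> ['A', 'B', 'A1', 'C', 'A2', 'C1']
```

Note the same caveat as the PowerShell original applies: renaming by position is not RFC 4180-aware, so quoted headers containing commas would still need a real CSV parser.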

2 Comments

Just mentioning that System.IO.StreamReader requires .NET.
Just a heads up, this solution isn't compliant with RFC-4180. It will break if the headers are escaped and contain commas. tools.ietf.org/html/rfc4180

To expand upon Benjamin Hubbard's post, here is a little SQL script (assuming, of course, that you will be inserting this data into a table in a database) that I use to generate the -Header argument in my script:

SELECT
        '-Header '
            + STUFF((SELECT
                    ',' + QUOTENAME(COLUMN_NAME, '"')
                    + CASE WHEN C.ORDINAL_POSITION % 5 = 0 THEN ' `' + CHAR(13) + CHAR(10) ELSE '' END
                FROM 
                    INFORMATION_SCHEMA.COLUMNS C
                WHERE
                    TABLE_NAME = '<Staging Table Name>'
            FOR XML PATH (''), type).value('.', 'nvarchar(max)'), 1, 1, '')
