
I'd like to use the following script to reduce a huge CSV file to a useful state, but it's eliminating the header row of the CSV. I understand from reading past questions/solutions here that I can use Select -Skip 1 to preserve the header, but I'm not sure how to integrate Select into my script gracefully. Do I need to start over with this, or does someone have a simple solution?

$SourceFile = 'C:\Temp\Monthly_Report.CSV'
$Pattern = '.GBL|.aspx'

(Get-Content $SourceFile) | Where-Object {
    $_ -match $Pattern
} | Set-Content $SourceFile

This is the content of "Monthly_Report.CSV" before I run the script:

[Image: Monthly_Report.CSV]

  • Why not use Import-Csv? Commented May 16, 2018 at 19:36
  • I might do that, Jimbo if I don't discover a simple addition. I'm hoping to use this script for a variety of things so wanted to cast a net first. Thanks! Commented May 16, 2018 at 19:48
  • @Carlos Jimbo is correct. This will treat your file like a text file and not a true CSV. Commented May 16, 2018 at 20:41
  • Just a hint: the dots in $Pattern will match any character in a RegEx. I'd manually escape them with a backslash. Commented May 16, 2018 at 20:47
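
To illustrate the hint about the dots, here is a small sketch. The sample line is made up for the demonstration and is not taken from the original file.

```powershell
# Hypothetical sample line: it contains "GBL" but no literal ".GBL"
$sample = 'u9,xGBLy'

$Unescaped = '.GBL|.aspx'     # pattern from the question
$Escaped   = '\.GBL|\.aspx'   # dots escaped so they match a literal "."

# The unescaped dot matches any character, so "xGBL" counts as a hit
$looseMatch  = $sample -match $Unescaped   # True
# With escaped dots a literal ".GBL" or ".aspx" is required
$strictMatch = $sample -match $Escaped     # False

# [regex]::Escape can produce the escaped form of a literal string
$auto = [regex]::Escape('.GBL')            # \.GBL
```

Whether the looser pattern causes trouble depends on the data, but escaping the dots keeps rows like the one above from slipping through.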

3 Answers


You do not need the -Skip parameter to preserve the header at all; I think you are misunderstanding it. Your header is missing because it does not match your $Pattern variable, so it gets filtered out along with the other non-matching lines.

You need to do something like so:

$header = (Get-Content $SourceFile) | Select-Object -First 1
Write-Output $header 

$content = (Get-Content $SourceFile) | Where-Object { $_ -match $Pattern } 
$header + "`n" + $content | Set-Content $SourceFile
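
A note on why the rows can still end up on a single line with the snippet above: when a string is added to an array with +, PowerShell converts the array to one string (elements joined with spaces by default), so only the header gets its own line. The following is a minimal sketch of one way around that, using array concatenation instead. The temp file and its sample rows are hypothetical and exist only to make the example self-contained.

```powershell
# Hypothetical demo file; the rows below are invented for illustration
$SourceFile = Join-Path ([System.IO.Path]::GetTempPath()) 'Monthly_Report_demo.csv'
$Pattern = '\.GBL|\.aspx'
Set-Content $SourceFile -Value 'UserID,Resource', 'u1,home.aspx', 'u2,notes.txt', 'u3,data.GBL'

# Read the file once, then split it into header and filtered body
$lines   = Get-Content $SourceFile
$header  = $lines | Select-Object -First 1
$content = $lines | Select-Object -Skip 1 | Where-Object { $_ -match $Pattern }

# Array + array stays an array, so Set-Content writes one element per line
@($header) + @($content) | Set-Content $SourceFile
```

With the sample rows above, the file should end up with the header followed by the two matching rows, each on its own line.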

9 Comments

G'day AussieJoe. Thanks for the response. I tried that solution earlier today, without success. The script gets the data, but the headers are gone. Just in case I'd missed something, I tried using your script, to no avail. Headers are wiped out. I guess I could just inject them in there with an import/export script, but I'm still hoping there's a simple add here.
@Carlos $header = (Get-Content $SourceFile) | Select-Object -First 1 will get your header row. I have tested this code and it works fine.
Thanks, AussieJoe, though it's sticking all the content in the header, or first row as well. Any fine tuning you can suggest would be appreciated. I'm trying to get it tweaked now.
@Carlos sorry, I thought I had fixed that... look at the updated answer now. Thanks!
Aw, so close, but still getting everything in the first row. I thought the `n was a great thought, though.

There are several solutions to your problem.

  • Use Import-Csv and Export-Csv, which will convert the input CSV to a list of objects and back.

    (Import-Csv $SourceFile) | Where-Object {
        $_.SomeProperty -match $Pattern
    } | Export-Csv $SourceFile -NoTypeInformation
    

    This is arguably the cleanest approach, although not the most efficient one. The conversions make this slower than plain text processing. Still, this is the most readable code, so I'd recommend using this unless you encounter serious performance issues.

  • Since you're reading the entire file into memory anyway (due to Get-Content being in parentheses) you could just as well store the content in a variable and selectively write it back:

    $data = Get-Content $SourceFile
    
    $data | Select-Object -First 1 | Set-Content $SourceFile
    $data | Where-Object {
        $_ -match $Pattern
    } | Add-Content $SourceFile
    
  • The Where-Object scriptblock can contain not just conditions, but also other statements like assignment operations, so you could use a "first line" indicator like this:

    $script:firstline = $true
    (Get-Content $SourceFile) | Where-Object {
        $script:firstline -or $_ -match $Pattern
        $script:firstline = $false
    } | Set-Content $SourceFile
    
  • You could include a header match in your regular expression:

    $Pattern = '^UserID|\.GBL|\.aspx'
    
    (Get-Content $SourceFile) | Where-Object {
        $_ -match $Pattern
    } | Set-Content $SourceFile
    

    This feels rather hack-ish to me, though, so I wouldn't recommend actually doing this.
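
As a rough sketch of the Import-Csv variant from the first bullet for cases where the column to test is not known in advance, the filter below joins all column values of a row and matches against the result. The demo file and the Resource column are assumptions made up for this example.

```powershell
# Hypothetical demo file; column names and rows are invented for illustration
$SourceFile = Join-Path ([System.IO.Path]::GetTempPath()) 'Import_Csv_demo.csv'
$Pattern = '\.GBL|\.aspx'
Set-Content $SourceFile -Value 'UserID,Resource', 'u1,home.aspx', 'u2,notes.txt', 'u3,data.GBL'

# The parentheses make sure the file is read completely before it is rewritten
(Import-Csv $SourceFile) | Where-Object {
    # Join every column value of the row and test the pattern against that string
    ($_.PSObject.Properties.Value -join ',') -match $Pattern
} | Export-Csv $SourceFile -NoTypeInformation
```

Note that Export-Csv quotes the fields by default, so the rewritten file will not be character-for-character identical to what the plain-text approaches produce.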

1 Comment

Thanks for the helpful advice, Ansgar Wiechers.

Here is a simple solution that uses multiple assignment to split the header from the body and relies on the fact that -match works on collections:

$SourceFile = 'C:\Temp\Monthly_Report.CSV'
$Pattern = '\.GBL|\.aspx'

$header, $body = Get-Content $SourceFile
$body =  @($body) -match $Pattern
$header, $body | Set-Content $SourceFile
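
For anyone unfamiliar with the two features this relies on, here is a small sketch that works on in-memory strings instead of a file. The sample lines are hypothetical.

```powershell
# Hypothetical sample lines standing in for the file contents
$lines = 'UserID,Resource', 'u1,home.aspx', 'u2,notes.txt', 'u3,data.GBL'

# Multiple assignment: the first element goes to $header, the remainder to $body
$header, $body = $lines

# With a collection on the left-hand side, -match returns the matching elements
$kept = @($body) -match '\.GBL|\.aspx'
```

The @() wrapper matters when there is only one data row: $body is then a single string, and -match on a single string returns True or False rather than the matching line.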

1 Comment

Very nice solution! And quick too! Thanks, Bruce Paylette.
