I need to split a large file upload into many parallel processes and want to use a single CSV file as input. Is it possible to access blocks of rows from an Import-Csv object, something like this:

$SODAData = Import-Csv $CSVPath -Delimiter "|" |
            Where $_.Rownum == 20,000..29,999 | 
            Foreach-Object { ... }

What is the syntax for such an extraction? I'm using PowerShell 5.

1 Answer

Import-Csv returns one object per data row, and assigning the result to a variable collects those objects into an array, so you can index it with the range operator:

$csv = Import-Csv $CSVPath -Delimiter '|'
$SODAData = $csv[20000..29999] | ForEach-Object { ... }

An alternative would be to use Select-Object:

$offset = 20000
$count  = 10000
$csv = Import-Csv $CSVPath -Delimiter '|'
$SODAData = $csv |
            Select-Object -Skip $offset -First $count |
            ForEach-Object { ... }

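If you want to avoid holding the entire file in memory, you can change the above to a single pipeline. Rows are then streamed instead of being collected in $csv first, and in PowerShell 3.0 and later Select-Object -First stops the pipeline once $count rows have been emitted: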

$offset = 20000
$count  = 10000
$SODAData = Import-Csv $CSVPath -Delimiter '|' |
            Select-Object -Skip $offset -First $count |
            ForEach-Object { ... }

Beware, though, that with this approach the file has to be read again from the beginning for every chunk you process, because -Skip still parses the rows it discards.
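
If re-reading the file for every chunk is too slow and the data fits in memory, one option is to read it once and slice the array into fixed-size chunks. The following is only a sketch under assumptions: the sample file, the Rownum and Value columns, the chunk size of 10, and the $chunks list are made up for illustration and are not part of the original question.

```powershell
# Hypothetical sample data so the sketch runs on its own; point $CSVPath at your real file instead.
$CSVPath = Join-Path ([System.IO.Path]::GetTempPath()) 'chunk-demo.csv'
'Rownum|Value' | Set-Content -Path $CSVPath
1..25 | ForEach-Object { "$_|item$_" } | Add-Content -Path $CSVPath

$chunkSize = 10                                  # assumed size; the question uses 10000
$csv = @(Import-Csv $CSVPath -Delimiter '|')     # read the file once; @() guarantees an array

# Collect each slice as one list entry so a chunk is never unrolled into single rows.
$chunks = New-Object 'System.Collections.Generic.List[object[]]'
for ($offset = 0; $offset -lt $csv.Count; $offset += $chunkSize) {
    $last = [Math]::Min($offset + $chunkSize, $csv.Count) - 1
    $chunks.Add($csv[$offset..$last])
}

# Each chunk can now be handed to its own worker; here we only report its bounds.
foreach ($chunk in $chunks) {
    '{0} rows, Rownum {1}..{2}' -f $chunk.Count, $chunk[0].Rownum, $chunk[-1].Rownum
}
```

With the 25-row sample this produces three chunks of 10, 10, and 5 rows. The trade-off is the opposite of the single pipeline above: the file is read only once, but all rows are held in memory at the same time.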

4 Comments

@Barry Remember that indexes are zero-based, so row 2000 in Excel is 1999 in the array. :-)
Excellent! Works perfectly! Thx @Ansgar
Strangely, though, @Frode, the first row that was extracted from $csv[20..29] was row 22? There was a header in the csv file...
That's correct. The header line becomes the names of the object properties when using Import-Csv, so index 20 is data row 21 or line 22.
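
The offset described in the comment above can be checked with a small throwaway file. This is a sketch only: the file name and the Rownum and Value columns are hypothetical and chosen so that each record carries its own data row number.

```powershell
# Hypothetical sample file: one header line followed by 30 data rows.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'index-demo.csv'
'Rownum|Value' | Set-Content -Path $path
1..30 | ForEach-Object { "$_|item$_" } | Add-Content -Path $path

$csv   = @(Import-Csv $path -Delimiter '|')   # header becomes property names, not an element
$lines = @(Get-Content $path)                 # raw lines, header included at index 0

$index = 20
$csv[$index].Rownum      # zero-based index 20 is the 21st data row
$lines[$index + 1]       # the same record as raw text, which is line 22 of the file (1-based)
```

In other words, array index n corresponds to data row n + 1 and to file line n + 2 when the file has a single header line.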
