1

So I’ve had a a request to edit a csv file by replacing column values with a a set of unique numbers. Below is a sample of the original input file with a a header line followed by a couple of rows. Note that the rows have NO column headers.

HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|IE|USD|20200605|EUR200717||
DD|GZFD|IE|USD|20200605|EUR200717||

What I’m looking to do is change say the values in column 3 with a unique number.

So far I have the following …

$i=0

$txtin = Get-Content "C:\Temp\InFile.txt" | ForEach {"$($_.split('|'))"-replace $_[2],$i++} |Out-File C:\Temp\csvout.txt

… but this isn’t working as it removes the delimiter and adds numbers in the wrong places …

HH0###0000000SLH30400110000100000002000000202006060202006050011100
1D1D1 1G1Z1F1D1 1I1E1 1U1S1D1 12101210101610151 1E1U1R1210101711171 1 1
2D2D2 2G2Z2F2D2 2I2E2 2U2S2D2 22202220202620252 2E2U2R2220202721272 2 2

Ideally I want it to look like this, whereby the values of 'IE' have been replaced by '01' and '02' in each row ...

HH ### SLH304 01100001 2 20200606 20200605 011100

DD|GZFD|01|USD|20200605|EUR200717||

DD|GZFD|02|USD|20200605|EUR200717||

Any ideas on how to resolve would be much appreciated.

1
  • Theo's answer is an effective solution. Your primary problem is that you're stringifying the array that the .Split() method call creates, which results in a single string in which the elements are space-separated (by default); therefore, use just $_.split('|') instead of "$($_.split('|'))", but note that this still won't work as intended if there is another column with the same value as the 3rd one. Commented Jun 11, 2020 at 12:54

1 Answer 1

2

I think by spreading this out to multiline code will make it easier:

$txtin = Get-Content 'C:\Temp\InFile.txt'
# loop through the lines, skipping the first line
for ($i = 1; $i -lt $txtin.Count; $i++){
    $parts = $txtin[$i].Split('|')      # or use regex -split '\|'
    if ($parts.Count -ge 3) {
        $parts[2] = '{0:00}' -f $i      # update the 3rd column
        $txtin[$i] = $parts -join '|'   # rejoin the parts with '|'
    }
}

$txtin | Out-File -FilePath 'C:\Temp\csvout.txt'

Output will be:

HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||

Updated to use the more robust check suggested by mklement0. This avoids errors when the line does not have at least three parts in it after the split

Sign up to request clarification or add additional context in comments.

4 Comments

@BobbyS To update the 3rd column, you need to use $parts[2], as indices count from zero. We start the loop at row index 1, because the first line in the file apparently is extra info about the content, not a data row as such. Check the larger file you have to see if no other lines are present before the data rows. If more lines without delimniter character are present, you should skip those too.
@BobbyS I have added an extra check to see if the line has the delimiter character, before trying to split it.
Perhaps the most robust way to prevent potential index-out-of-bounds errors is to test for if ($parts.Count -ge 3)
Thanks very much for your help guys, very helpful info!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.