
I'm working on a script that will need to process the same type of file, but with different content at different times. I have a CSV file that looks something like the example below. Not every field may contain a value.

record,title,creator,date,subject,location
0,Title1,Creator1,2018-08-17,Subject1,Location1
1,Title2,Creator2,2018-08-17,,Location1
2,Title3,Creator3,,Subject2,Location2

I need to convert this CSV from a data table to a list of key-value pairs, per record, ONLY if there is a value present. The header will be kind of generic, with field,value repeating for each key-value pair in the rows. For example:

record,field,value,field,value,field,value,field,value,field,value
0,title,Title1,creator,Creator1,date,2018-08-17,subject,Subject1,location,Location1
1,title,Title2,creator,Creator2,date,2018-08-17,location,Location1,,,
2,title,Title3,creator,Creator3,subject,Subject2,location,Location2,,,

I can read the CSV in with Import-Csv, but I'm having a hard time changing the structure. Every path I've tried leads nowhere, and searching for solutions hasn't helped either. At this point it seems like it might be easiest to build the new CSV manually, but that didn't feel right, so I thought I'd ask here. Can anyone point me in the right direction?

I can find a lot of CSV, hashtable, and key-value pair questions on Stack Overflow, but nothing quite like this.

  • You have an inconsistency in your second CSV example. You show key/value for everything except 'record'. Is that on purpose? Commented Aug 17, 2018 at 22:35
  • Yes, that's intentional. The record column is an identifier for a digital object and the key/values are metadata properties. I have to convert a data table exported from one system into a set of key/value pairs but still associated with one record. (And in CSV.) Commented Aug 18, 2018 at 2:48

1 Answer


I think you misunderstand how Import-Csv works. It does not create a hashtable; it creates an array of objects, each with a set of properties defined by the header. Because the data came from a CSV, every object is guaranteed to have the same properties (some may be empty, but the properties themselves exist and are identical). That lets us take the property names of the first object as a baseline, then loop through each record and build a string for it from that baseline. As you suggested, we end up building the CSV manually.

$DataIn = Import-Csv C:\Path\To\File.csv

# Use the property names of the first record as the baseline column list
$Props = $DataIn[0].psobject.properties.name

# Build the generic header (record, then field,value repeated once per non-record column),
# followed by one manually assembled line per record
$DataOut = ('record,' + ((2..$Props.Count | ForEach-Object { 'field,value' }) -join ',')),
    $(For($i = 0; $i -lt $DataIn.Count; $i++){
        [array]$tmpRecord = Switch($Props){
            # The record column passes through as-is, not as a key/value pair
            'Record' { $DataIn[$i].record; continue }
            # Skip any property that has no value
            { [string]::IsNullOrEmpty($DataIn[$i].$_) } { continue }
            # Everything else becomes a field,value pair
            default { '{0},{1}' -f $_, $DataIn[$i].$_ }
        }
        # Pad with trailing commas so every row has the same number of columns
        If(($tmpDiff = $Props.Count - $tmpRecord.Count) -gt 0){ $tmpRecord += ',' * ($tmpDiff * 2 - 1) }
        $tmpRecord -join ','
    })
$DataOut | Set-Content C:\Path\To\Output.csv

So that does exactly what I suggested, while retaining your example output of not doing key/value for the record column. The switch checks each potential property: if it is the 'record' property, it just outputs the record value and continues to the next property. For anything else it checks whether that property is blank and, if so, moves on to the next property. If it isn't blank, it outputs field,value. All of those outputs (record, plus any field/value combos) are then joined by commas into a single line per record, with extra trailing commas added for fields that were null. Each record's line is collected in $DataOut, along with a calculated header line.

Mind you, PowerShell will not want to read that file back in with Import-Csv, because the header row is mostly 'field,value' repeated over and over and Import-Csv rejects duplicate column names. I assume you are saving in this format for some external program that needs it as input.
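If the file does have to come back into PowerShell later, one workaround (a sketch only, assuming five field/value pairs as in the example output; adjust the 1..5 range to match your column count) is to skip the duplicate header line and supply unique column names via ConvertFrom-Csv -Header:

```powershell
# Sample output lines standing in for Get-Content C:\Path\To\Output.csv
$lines = @(
    'record,field,value,field,value,field,value,field,value,field,value'
    '0,title,Title1,creator,Creator1,date,2018-08-17,subject,Subject1,location,Location1'
)

# Build unique column names: record, field1, value1, ... field5, value5
$header = @('record') + (1..5 | ForEach-Object { "field$_"; "value$_" })

# Drop the duplicate-name header row and re-parse with the unique names
$reimported = $lines | Select-Object -Skip 1 | ConvertFrom-Csv -Header $header
$reimported.field1   # title
$reimported.value1   # Title1
```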


4 Comments

Thank you. I misspoke when I used hashtable before and should have said data table. I wasn't referring to how import-csv parses the data, but how the data is modeled in the original CSV itself. This solution sounds like it will work, thank you. I look forward to testing it. The CSV will be used for another program and the repeated header may be problematic, but I'm hoping it will just get skipped over. I'll report back.
Huzzah, it works as described! I didn't need to add 'record,' + into $DataOut since that column came in with $Props.
I do have a secondary problem though. Some of the values are double-quoted and contain spaces and commas, so the splitting into key/value pairs is incorrect. For example, the key/value of Title/"Chapter 4, Page 1" is coming out as Title/Chapter 4 and Page 1/Format. (Format/book is the next key/value.)
Got it, [char]34 +!
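For later readers, a sketch of the quoting fix that last comment alludes to (the $field and $value names here are hypothetical, just to make the snippet self-contained): wrap each value in literal double quotes via [char]34 so an embedded comma stays inside one CSV field.

```powershell
$field = 'title'
$value = 'Chapter 4, Page 1'   # a value containing an embedded comma

# [char]34 is the double-quote character; quoting the value keeps the comma
# inside a single CSV field when the line is later parsed
$pair = '{0},{1}{2}{1}' -f $field, [char]34, $value
$pair   # title,"Chapter 4, Page 1"
```

In the answer's code, the same pattern would go in the switch's default case in place of '{0},{1}'.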
