0

I have some array with lot of rows (thousands and more). It have columns like 'ColumnXXXXX'

Row example:

Group_name = "proxy_users"
Column1   = "domain\Igor"
Column2   = null
.......
Column989 = 'domain\Andrew'
Column999 = 'domain\Mike'

What is the right and FAST way to create new variable that will be the sum of 'ColumnXXX' ignoring 'null' values?

Like "domain\igor, domain\Andrew, domain\mike"

I can use smth like $group | select -Property "Column*"...but how to sum and how to ignore null ?

2
  • Is it actually $null or a "null" string-value? Commented Feb 18, 2018 at 12:34
  • "null" string-value Commented Feb 18, 2018 at 12:51

2 Answers 2

2

You can list all properties using ex. $_.psobject.properties, filter out the ones you want and use -join to combine the values. Ex

$o = [pscustomobject]@{
    Group_Name = "Test"
    Column1 = "Foo"
    Column2 = $null
    Column3 = "1"
}, [pscustomobject]@{
    Group_Name = "Test2"
    Column1 = $null
    Column2 = "Bar"
    Column3 = "2"
}

#Add property as constant
$o | ForEach-Object {
    $_ | Add-Member -NotePropertyName Summed -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' )
}
$o | ft

Or you can use a ScriptProperty to calculate the value on every call

#Remember to exclude itself to avoid infinite recursion
$o | Add-Member -MemberType ScriptProperty -Name Summed -Value {($this.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Name -ne 'Summed' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' }
$o | ft

Result:

Group_Name Column1 Column2 Column3 Summed
---------- ------- ------- ------- ------
Test       Foo             1       Foo1
Test2              Bar     2       Bar2

As an alternative, you can use $_.Name -like 'Column*' -and $_.Value -ne 'null' as the property-filter if they are actually called ColumnXXX.

Sign up to request clarification or add additional context in comments.

Comments

0

To complement Frode F.'s elegant ScriptProperty-based solution with a more concise PSv4+ version that also performs better due to avoiding the use of cmdlets (a pipeline) in the property-definition script block in favor of operators (with a large number of input objects, this may matter):

$o | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { 
 @($this.psobject.Properties.Where({$_.Name -match 'Column*'}).Value) -ne 'null' -join ', ' 
}

Note the use of:

  • The .Where() collection operator (PSv4+), a faster alternative to the Where-Object cmdlet.

  • Member-access enumeration (PSv3+), where accessing a property at the collection level yields an array of the elements' property values;
    e.g., $this.psobject.Properties.Name yields the .Name property values of all elements of the $this.psobject.Properties collection.

  • Applying comparison operator -ne to an array-valued LHS, in which case the operator acts as a filter: the operator is applied to each element, and the matching elements are returned as an array; note the @(...) around the LHS, which ensures that it is treated as an array even if it happens to return only a single value.

With the sample data in the question, the above yields (look for property Joined):

Group_name : proxy_users
Column1    : domain\Igor
Column2    : null
Column989  : domain\Andrew
Column999  : domain\Mike
Joined     : domain\Igor, domain\Andrew, domain\Mike

With the above optimizations, you could even consider a simpler Select-Object solution that constructs new custom-object instances that contain all of the input object's properties (*) plus a new calculated property that contains the combined column values of interest:

$o | Select-Object *, @{ n='Joined'; e={ 
  @(($_.psobject.Properties.Where({$_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' } 
}

This solution is slower than the one above, but not by much.
A caveat is that if you collect all output in memory, the newly constructed objects take up space in addition to the originals.


Optional reading: Performance comparison; the impact of pipeline use:

The code at the bottom times the various approaches from Frode's and my answer.

Here's a sample timing from my machine - the absolute numbers aren't important, but their ratios are (though I don't know how factors such as CPU core count and disk speed would come into play) - input set size is 1000 objects:

Approach                                 TotalSeconds
--------                                 ------------
add-member w/ scriptproperty - operators 0.35
select-object w/ noteproperty            0.40
add-member w/ scriptproperty - pipeline  0.72
add-member w/ noteproperty - pipeline    0.98

Conclusions:

  • The solutions that use pipelines in the property definition are noticeably slower, by a factor of about 2, which doesn't seem to vary even with larger input counts.

  • The optimized Select-Object-based solution that creates new objects (rather than adding properties to the input objects) is only slightly slower than the optimized Add-Member solution.

Source code for the tests:

# The number of input objects to test with.
# Adjust to experiment.
$objCount = 1000

Write-Verbose -Verbose "# of input objects: $objCount"

$ndx = 0; $approaches = 'select-object w/ noteproperty', 
                        'add-member w/ scriptproperty - operators', 
                        'add-member w/ scriptproperty - pipeline', 
                        'add-member w/ noteproperty - pipeline'
$tempFile = New-TemporaryFile # requires PSv5+
$(foreach($approach in $approaches) {

  # Create test collection (recreate it in every iteration to support repeated Add-Member operations)
  $al = [System.Collections.ArrayList]::new()
  for ($i = 0; $i -lt $objCount; ++$i) {
    $null = $al.add([pscustomobject] @{ one = 1; Column1 = 'col1'; Column2 = 'col2'; Column3 = 'null'; Column4 = 'col4' })
  }

  Measure-Command {

    Write-Verbose -Verbose "Running: $approach..."
    switch ($ndx) {
      0 { # select-object w/ noteproperty
        $al | Select-Object *, @{ n='Joined'; e={ @(($_.psobject.Properties.Where({ $_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' } } | 
          Export-Csv $tempFile  
        break
      }
      
      1 { # add-member w/ scriptproperty - operators
        $al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { @($this.psobject.Properties.Where({ $_.Name -match 'Column*'}).Value) -ne 'null' -join ', ' } |
          Export-Csv $tempFile          
        break
      }
      
      2 { # add-member w/ scriptproperty - pipeline
        $al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { ($this.psobject.Properties | Where-Object { $_.Name -match 'Column*'  -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' }  | 
         Export-Csv $tempFile
         break
        }
        
        3 { # add-member w/ noteproperty - pipeline; requires an intermediate ForEach-Object call
          $al | ForEach-Object {
            $_ | Add-Member -PassThru -NotePropertyName Joined -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -match 'Column*' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' )
          } |
            Export-Csv $tempFile
        break
      }
      default { Throw "What are you doing here?" }
    }

    # Import-Csv $tempFile | Select-Object -First 1 Joined | Write-Verbose -Verbose

    ++$ndx
        
  } | Select-Object @{ n='Approach'; e={ $approach }}, @{ n='TotalSeconds'; e={ '{0:N2}' -f $_.TotalSeconds } }

}) | Sort-Object { [double] $_.TotalSeconds }

Remove-Item $tempFile

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.