
Curious if there is a construct in PowerShell that does this?

I know you can do this:

$arr = @(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
$arr = $arr | Get-Unique

But it seems like, performance-wise, it would be better to skip duplicate values as you enter them into the array instead of filtering them out after the fact.
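A minimal sketch of that insert-time approach, using `-notcontains` to skip values that are already present (note the containment check is linear, so this only pays off for small arrays):

```powershell
$arr = @()
foreach ($item in 1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4) {
    if ($arr -notcontains $item) {
        $arr += $item   # only add values not already present
    }
}
$arr   # 1 2 3 4
```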

  • So what's the question? If you want a data structure that guarantees no duplicates, then use a hashtable rather than an array. (A hashtable, by definition, cannot have duplicate keys.) Commented Oct 30, 2017 at 18:52
  • I'd go for the hash table as well: easy to use, and nice performance. That said, it might be cheaper to insert the duplicates and then remove them when you need to read the data. Things done in bulk by an optimized algorithm (like sort -unique) can be much cheaper for the system than your own code in the interpreter. Generally: go for readability, and only focus on performance when you can measure a bottleneck. Commented Oct 30, 2017 at 19:06
  • $arr | Sort-Object -Unique -CaseSensitive Commented Oct 30, 2017 at 20:54

3 Answers


If you are inserting a large number of items into an array (thousands), performance does drop, because the array needs to be reallocated every time you add to it. So in your case it may be better, performance-wise, to use something else.

A Dictionary or Hashtable could be a way. Your single-dimensional unique array can then be retrieved with $hash.Keys. For example:

$hash = @{}
$hash.Set_Item(1,1)
$hash.Set_Item(2,1)
$hash.Set_Item(1,1)
$hash.Keys
1
2

If you use Set_Item, the key will be created or updated, but never duplicated. Put anything for the value if you're not using it, but maybe you'll have a need for a value in your problem too.
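For instance, if you do end up needing per-key values, the same hashtable can track something useful, such as how many times each item was seen. A sketch (in PowerShell, reading a missing key returns $null, which behaves as 0 in addition):

```powershell
$hash = @{}
foreach ($item in 1,1,2,3,3,3) {
    $hash[$item] = $hash[$item] + 1   # $null + 1 evaluates to 1 on first sight
}
$hash.Keys   # the unique items
$hash[3]     # 3 (seen three times)
```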




You could also use an ArrayList (note it must actually be populated as an ArrayList, e.g. via a cast, rather than reassigned to a plain array):

    Measure-Command -Expression {
        $bigarray = [System.Collections.ArrayList]@(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
        $bigarray | Select-Object -Unique
    }

Time passed:

  • TotalSeconds : 0,0006581
  • TotalMilliseconds : 0,6581

    Measure-Command -Expression {
        $array = @(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
        $array | Select-Object -Unique
    }

Time passed:

  • TotalSeconds : 0,0009261
  • TotalMilliseconds : 0,9261



I'm not sure how it compares performance-wise, but the HashSet collection only stores unique values, so duplicates are silently ignored as they are added.

$arr = [Collections.Generic.HashSet[int]]::new()
3,1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4 | &{process{
  [void]$arr.Add($_)
}}
"$arr" #returns: 1 2 3 4
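HashSet's Add method also returns $true or $false depending on whether the value was actually new, which lets you filter to first occurrences in a single pass — a sketch:

```powershell
$seen = [Collections.Generic.HashSet[int]]::new()
$firstOccurrences = 3,1,1,2,3 | Where-Object { $seen.Add($_) }
$firstOccurrences   # 3 1 2 — first occurrences, in original order
```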

If you want it to be automatically sorted as well, replace "HashSet" with "SortedSet".
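A minimal sketch of the sorted variant:

```powershell
$set = [Collections.Generic.SortedSet[int]]::new()
4,2,1,3,2,1 | ForEach-Object { [void]$set.Add($_) }
"$set"   # 1 2 3 4 — unique values, kept in sorted order
```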

