I'm trying to remove duplicate objects from an array, keeping only the object with the highest TasteCode for each Name. The example below is highly simplified, but it shows the problem.

Example:

$Fruits
Name   | Color  | TasteCode
-----    ------   ---------
Apple  | Red    | 2
Apple  | Red    | 3
Peer   | Green  | 0
Banana | Yellow | 1
Banana | Yellow | 0
Banana | Yellow | 3

Desired solution:

Name   | Color  | TasteCode
-----    ------   ---------
Apple  | Red    | 3
Peer   | Green  | 0
Banana | Yellow | 3

I've already managed to gather the objects that need to be removed, but the removal step itself doesn't work properly:

$DuplicateMembers = $Fruits | Group-Object Name | Where Count -GE 2
$DuplicateMembers | ForEach-Object {

    $Remove = $_.Group | Sort-Object TasteCode | Select -First ($_.Group.Count -1)

    $Fruits = Foreach ($F in $Fruits) {
        Foreach ($R in $Remove) {
            if (($F.Name -ne $R.Name) -and ($F.TasteCode -ne $R.TasteCode)) {
                $F
            }
        }
    }
}

Thank you for your help.

  • What do you mean that it doesn't work out quite well? You always take the highest tastecode in this data? Commented Jun 15, 2015 at 13:08
  • Yes, I always take the object with the highest TasteCode for that specific Fruit.Name. The problem is that in the end there are duplicate rows in the Array because of the Foreach on the remove part I think. Commented Jun 15, 2015 at 13:10
  • Only one Color for one Name or multiple allowed? Commented Jun 15, 2015 at 13:22

2 Answers

How about not removing anything at all, and instead sorting each group by TasteCode descending and taking only the first result?

$DuplicateMembers = $Fruits | Group-Object Name 
$DuplicateMembers | ForEach-Object {
    $Outcome = $_.Group | Sort-Object TasteCode -descending | Select -First 1
    $Outcome
}

This way you don't need to remove anything from the result of this query.
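
For anyone who wants to try this end to end, here is a minimal sketch. The sample data is rebuilt from the table in the question using `[pscustomobject]`; storing TasteCode as an integer is an assumption, and `$Best` is just an illustrative name.

```powershell
# Sample data rebuilt from the question's table (TasteCode assumed to be an int).
$Fruits = @(
    [pscustomobject]@{ Name = 'Apple';  Color = 'Red';    TasteCode = 2 }
    [pscustomobject]@{ Name = 'Apple';  Color = 'Red';    TasteCode = 3 }
    [pscustomobject]@{ Name = 'Peer';   Color = 'Green';  TasteCode = 0 }
    [pscustomobject]@{ Name = 'Banana'; Color = 'Yellow'; TasteCode = 1 }
    [pscustomobject]@{ Name = 'Banana'; Color = 'Yellow'; TasteCode = 0 }
    [pscustomobject]@{ Name = 'Banana'; Color = 'Yellow'; TasteCode = 3 }
)

# Group by Name, then keep the object with the highest TasteCode in each group.
$Best = $Fruits | Group-Object Name | ForEach-Object {
    $_.Group | Sort-Object TasteCode -Descending | Select-Object -First 1
}

$Best | Format-Table Name, Color, TasteCode
```

The order in which the groups come back can differ between PowerShell versions, so it is safer to look rows up by Name than by position.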

3 Comments

Thank you Vesper, great tip! Same as jisaak so a +1 for the good thinking, as I totally didn't see that.
There's a subtle difference between our approaches: I followed your code, which groups first and then sorts, whereas he sorts first and then groups. The difference is in performance, and it only becomes noticeable when there's a really large amount of data in your $Fruits equivalent.
You're right Vesper, but it's not containing so much data so performance is not the issue now. But I'll keep it in mind for the future, where it can come in handy.

Sort the fruits by TasteCode in descending order, group them by Name, and select the first item of each group:

$result = $Fruits | sort TasteCode -Descending | group Name | % { $_.Group | select -first 1}
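
One caveat worth sketching: if TasteCode arrives as text (for example from a CSV import, which is an assumption here and not something the question states), `Sort-Object` compares strings, so '10' would sort below '9'. The variant below casts to `[int]` while sorting. The CSV sample is made up for illustration, and the approach relies on each group keeping its members in the order they came in.

```powershell
# Hypothetical CSV input: every property, including TasteCode, is a string.
$Fruits = @'
Name,Color,TasteCode
Apple,Red,9
Apple,Red,10
Peer,Green,0
'@ | ConvertFrom-Csv

# Sort numerically (cast to int), then group and keep the first item per group.
$result = $Fruits |
    Sort-Object { [int]$_.TasteCode } -Descending |
    Group-Object Name |
    ForEach-Object { $_.Group | Select-Object -First 1 }

$result | Format-Table Name, Color, TasteCode
```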

6 Comments

Yes, you're right, this is a lot simpler than I initially thought. Thank you very much for this great tip!
@DarkLite1 I had just assumed you were looking at something more complicated and was going to recommend an older question of yours where you created a true clone of an object. Glad you found your answer.
Hmm, I wonder if this is slower than multiple sorts on a large dataset. Say there are 10k groups with 10k items in each, totalling 100m: I think sorting 100m objects first is a lot slower than grouping first and then sorting each group. If you need the output sorted by taste code, just sort it once it has been collected.
No problem Matt, been in the code a bit too long today I think. Didn't see the trees through the wood anymore. And you're right, I did a question about fruits already, but it's easy to demonstrate like this. You know I like pears right? ;)
@DarkLite1 I remembered because of "peers"
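
To check the performance point from the comments on your own machine, a rough timing sketch along these lines may help. The data is synthetic, the sizes (200 names with 50 rows each) are arbitrary, and `$Data`, `$sortFirst`, and `$groupFirst` are illustrative names; no particular outcome is assumed.

```powershell
# Synthetic data: 200 names x 50 rows each (sizes chosen arbitrarily).
$Data = foreach ($g in 1..200) {
    foreach ($i in 1..50) {
        [pscustomobject]@{ Name = "Fruit$g"; TasteCode = Get-Random -Maximum 1000 }
    }
}

# Approach A: sort the whole collection once, then group.
$sortFirst = Measure-Command {
    $null = $Data | Sort-Object TasteCode -Descending | Group-Object Name |
        ForEach-Object { $_.Group | Select-Object -First 1 }
}

# Approach B: group first, sort inside each group, then sort the winners once.
$groupFirst = Measure-Command {
    $null = $Data | Group-Object Name | ForEach-Object {
        $_.Group | Sort-Object TasteCode -Descending | Select-Object -First 1
    } | Sort-Object TasteCode -Descending
}

"Sort first:  {0:N0} ms" -f $sortFirst.TotalMilliseconds
"Group first: {0:N0} ms" -f $groupFirst.TotalMilliseconds
```

Timings from a single run are noisy, so it is worth repeating the measurement a few times and with data sizes closer to the real workload.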