0

I am trying to get items in an array object(has all the data and way more rows) (array object 1) which also exists in a different array (Array object 2)(has some of the fields, and fewer rows)

How would I get the data from array object 1 using multiple fields since many fields repeat?

Important point: No single Entry in a field is unique, but combinations of 4 fields are

Specifically how to do this in a few steps/lines of code and without much time and/or resource consumption (I can figure out how to do this using loops but that's inefficient)

Example

ARRAY OBJECT 1

Number : 10
Place  : UT
Color  : yellow
Zone   : FAHIVT
Group  : 20


Number : 29
Place  : NY
Color  : white
Zone   : HWOUKK
Group  : 8


Number : 66
Place  : TX
Color  : black
Zone   : KZZGKI
Group  : 2


number : 127
Place  : AL
Color  : white
Zone   : DMXDZR
Group  : 14

ARRAY OBJECT 2

Place : NY
Color : white
Group : 2
Zone  : TVQJPN

Place : PA
Color : blue
Group : 4
Zone  : AAYYSN

Place : NJ
Color : red
Group : 17
Zone  : DXKSVE

The result I seek after matching or filtering is like

filtered Array Object 1

number : 1730
Place  : NY
Color  : white
Zone   : TVQJPN
Group  : 2



number : 2199
Place  : PA
Color  : blue
Zone   : AAYYSN
Group  : 4



number : 2746
Place  : NJ
Color  : red
Zone   : DXKSVE
Group  : 17

What I ended up doing was some "Feature Engineering" and combining the 4 fields that are unique as a combo into one string and added that as a property to each of the Array Objects. Combo = Place + Color + Zone + Group

Example

number : 1730
Place  : NY
Color  : white
Zone   : TVQJPN
Group  : 2
Combo  : NYwhiteTVQJPN2

Assume $wholearray is the bigger more fuller array and $partialobject is the one smaller one we need to filter by.

The Following is my CURRENT CODE

    $wholearray.ForEach({
        $thisline = $_ ;
        $combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
        $wholearray.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
    }

    $partialobject.ForEach({
        $thisline = $_ ;
        $combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
        $partialobject.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
    }

    $filtereddata = $wholearray.Where({$_.Combo -in $($partialobject.Combo)}) 

It works with an average processing time of 0.08secs per instance in the loop But its still Slow. It will take 5 mins just for a couple of thousand rows/instances. When I have to do this with 20k or more it would be a disaster. All The while it could be done in 15 secs with even more rows in Excel using vlookup. And POSH or CLI should be faster than Excel.

How can I make this more efficient?

How can I make this faster?

0

2 Answers 2

1

A single Compare-Object could accomplish your task.

As Compare-Object isn't known for speed you've got at least an alternative to measure time.

$Filtered = Compare-Object -Ref $WholeArray -Diff $PartialArray `
                           -Property Place,Color,Zone,Group `
                           -IncludeEqual -ExcludeDifferent -PassThru

To remove the inserted property SideIndicator you might do:

$Data = $Filtered | Select-Object -Property * -ExcludeProperty SideIndicator
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks this worked EXTREMELY well and was fast too. I tested it without the Passthru and it didnt work. Do you mind explaining why that is?
Sorry missd your comment, Without -PassThru you get only the named properties in the response, here with -IncludeEqual you get all properties from the -ReferenceObject plus the -SideIndicator.
1

Perhaps using this code and a lookup Hashtable will speed things up for you:

Add the Combo property to the $wholearray objects:

$wholearray | ForEach-Object {
    $combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
    $_ | Add-Member -MemberType NoteProperty -Name Combo -Value $combo
}

For the objects in the partial array, you create a Hashtable object:

$lookup = @{}

$partialarray | ForEach-Object {
    $combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
    $lookup[$combo] = $true   # the key is the important thing here, the value doesn't matter
}

Then get your filtered data using:

$filtereddata = $wholearray | Where-Object { $lookup.ContainsKey($_.Combo)} 

Hope that helps

1 Comment

Thank you very much. This is amazingly faster than my method (10 seconds compared to 20 mins). And it keeps the feature engineering front too for when I need to do so in the future. However, for when I don't need the feature engineering I will be using the Compare-Object with a -passthru (4 secs).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.