I'm trying to parse a site to collect price and product details. The script works in a loop, but it's very slow, so I'm trying to run it as a multi-threaded PowerShell script using jobs.

I've tried a lot of suggestions, but I'm struggling to get the results out even though I can see it's working (the web-request screen flashes up).

I'm only selecting the last 10 for now, but I'll put in a throttle later. I just can't get it to output. Essentially, I'd like all results to flow back into $arr.


#Import Danmurphy Sitelist
[xml] $XmlDocument = (New-Object System.Net.WebClient).DownloadString("http://www.example.com/sites.xml")

#get websites listed
$ImportedProducts = $XmlDocument.DocumentElement.url | select -Last 10

"Killing existing jobs . . ."
Get-Job | Remove-Job -Force
"Done."

#loop through the products

#Create Array
$arr = @()

#$argumentlist 

#ScriptBlock
$ScriptBlock = {
Param($product,$arr)

if ($product.loc -like "http://www.example.com/product/*"){

$uri = $product.loc
$WebResponse = Invoke-WebRequest -Uri $uri -SessionVariable WS 


#mainpricetest
$mainprice = $WebResponse.AllElements | ? { $_.Class -eq 'price-main' } | select innerText

$MainPriceArray = $mainprice.innerText.Split(' ')

$MainUnitArry = $MainPriceArray[1..10]

$MainDollar = $MainPriceArray[0]

$MainUnit = $MainUnitArry -join ' '


$item = New-Object PSObject
$item | Add-Member -type NoteProperty -Name 'Product Site' -Value $($product.loc)
$item | Add-Member -type NoteProperty -Name 'Main Price' -Value $($MainDollar)
$item | Add-Member -type NoteProperty -Name 'Main Unit' -Value $($MainUnit)



$arr += $item

}
}

foreach ($product in $ImportedProducts){
Start-Job -InputObject $ImportedProducts -ScriptBlock $ScriptBlock -ArgumentList $product,$arr
}

$data = Get-Job * | Receive-Job 

#Show Array
$arr
  • Why not just remove the $arr += $item from the scriptblock and capture the output with $data? Commented Oct 7, 2016 at 11:29

2 Answers


You would want to use runspaces for that. Runspaces are fairly complicated to manage by hand; luckily, the PoshRSJob module handles that for you: https://github.com/proxb/PoshRSJob

You can pass in the script block as it is, so you would need very few adjustments. Probably something like this:

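# $ScriptBlock should emit $item as output instead of appending it to $arr
foreach ($product in $ImportedProducts){
    # Pass the current product to the script block's Param() list
    Start-RSJob -ScriptBlock $ScriptBlock -ArgumentList $product
}
# Wait for the jobs to finish before collecting their output
Get-RSJob | Wait-RSJob | Receive-RSJob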

2 Comments

I've had a good go at making this work, but I can't seem to get the output. HasMoreData is always False. I tried adding Write-Output $item as Bill Hurt suggested, together with Start-RSJob -InputObject $ImportedProducts -ScriptBlock $ScriptBlock -ArgumentList $product -Throtle and do { $arr += Get-RSJob -State Completed | Receive-RSJob } while (Get-RSJob -State Running)
Scrap that. It looks like RSJobs have to be started from a pipeline. Thank you, this worked: $ImportedProducts | foreach { Start-RSJob -ScriptBlock $ScriptBlock -ArgumentList $_ -Throttle 10 }; do { $arr += Get-RSJob -State Completed | Receive-RSJob; Get-RSJob -State Completed | Remove-RSJob } while (Get-RSJob -State Running)

If you want to get the results into $arr, you can't do it from within the script block as you are attempting to do. Script blocks running in parallel do not share a single copy of a variable unless you take additional synchronization steps that are beyond the scope of this answer.
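
A minimal sketch of the effect, using a trivial script block in place of the web request (the value 21 and the variable names $job and $received are illustrative, not part of the original script). Each job started with Start-Job runs in its own process and receives a serialized copy of its arguments, so appending to $arr inside the job should never reach the caller's array, while anything the job emits as output can be received afterwards:

```powershell
# Illustrative only: a trivial job stands in for the real web request.
$arr = @()

$job = Start-Job -ScriptBlock {
    Param($n, $arr)
    $arr += $n * 2      # modifies the job's private copy of $arr only
    $n * 2              # emitted as job output, which the caller can receive
} -ArgumentList 21, $arr

# -Wait blocks until the job has finished; -AutoRemoveJob then deletes it
$received = $job | Receive-Job -Wait -AutoRemoveJob

"Items in the caller's array: $($arr.Count)"
"Output received from the job: $received"
```

The caller's $arr stays empty here; only the value written to the job's output stream comes back.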

The answer to your problem is to write the result of each script block as regular output. That output is buffered until you call Receive-Job to get the results out of the job, at which point you capture it into the $arr variable in a single-threaded manner. Below is code that should get you most of the way there.

#Import Danmurphy Sitelist
[xml] $XmlDocument = (New-Object System.Net.WebClient).DownloadString("http://www.example.com/sites.xml")

#get websites listed
$ImportedProducts = $XmlDocument.DocumentElement.url | select -Last 10

"Killing existing jobs . . ."
Get-Job | Remove-Job -Force
"Done."

#loop through the products

#Create Array
$arr = @()

#$argumentlist 

#ScriptBlock
$ScriptBlock = {
    Param($product)

    if ($product.loc -like "http://www.example.com/product/*"){

    $uri = $product.loc
    $WebResponse = Invoke-WebRequest -Uri $uri -SessionVariable WS 


    #mainpricetest
    $mainprice = $WebResponse.AllElements | ? { $_.Class -eq 'price-main' } | select innerText

    $MainPriceArray = $mainprice.innerText.Split(' ')

    $MainUnitArry = $MainPriceArray[1..10]

    $MainDollar = $MainPriceArray[0]

    $MainUnit = $MainUnitArry -join ' '


    $item = New-Object PSObject
    $item | Add-Member -type NoteProperty -Name 'Product Site' -Value $($product.loc)
    $item | Add-Member -type NoteProperty -Name 'Main Price' -Value $($MainDollar)
    $item | Add-Member -type NoteProperty -Name 'Main Unit' -Value $($MainUnit)



    Write-Output $item

    }
}

foreach ($product in $ImportedProducts){
    # Each job receives its product through -ArgumentList; -InputObject is not needed
    Start-Job -ScriptBlock $ScriptBlock -ArgumentList $product
}

# -Wait blocks until every job has finished; -AutoRemoveJob (valid only with -Wait) then deletes the jobs
$arr = @(Get-Job | Receive-Job -Wait -AutoRemoveJob)

#Show Array
$arr
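
For the throttle mentioned in the question, one possible approach with the built-in job cmdlets is to hold back new jobs while a fixed number are still active. This is only a sketch under assumptions: $inputs, $maxJobs, and $work are hypothetical stand-ins for $ImportedProducts, the desired limit, and the real script block, and the 100 ms polling interval is arbitrary.

```powershell
# Hypothetical stand-ins: replace with $ImportedProducts and the real $ScriptBlock.
$inputs  = 1..6
$maxJobs = 3
$work    = { Param($n) Start-Sleep -Milliseconds 200; $n * $n }

$jobs = New-Object 'System.Collections.Generic.List[object]'

foreach ($n in $inputs) {
    # Hold back while $maxJobs of our jobs are still pending or running
    while (@($jobs | Where-Object { $_.State -in 'NotStarted', 'Running' }).Count -ge $maxJobs) {
        Start-Sleep -Milliseconds 100
    }
    $jobs.Add((Start-Job -ScriptBlock $work -ArgumentList $n))
}

# Collect everything once all jobs have finished, then remove the jobs
$results = @($jobs | Receive-Job -Wait -AutoRemoveJob)
$results
```

Tracking the jobs in a list of their own, rather than calling Get-Job, keeps the throttle from counting unrelated jobs in the session. The order of the results is not guaranteed to match the order of the inputs.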
