I want to process a large number of URLs and extract the locations of the *.jpg files. The problem is that $entry in the inner foreach does not appear to be thread-safe: the script throws hundreds of errors, apparently because $entry keeps getting overwritten. When I move the inner foreach outside of the ForEach-Object, it works fine but very slowly. How can I process the split output properly within ForEach-Object without getting these errors?

  • $array just contains a huge number of URLs
  • $clean_img_array is the output array of the operation
  • $tmpArray is the reference to $clean_img_array in order to use it within a parallel ForEach

Errors:

InvalidOperation:
Line |
  14 |                  [void]$tmpArray:clean_img_array.Add($entry);
     |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     | You cannot call a method on a null-valued expression.

Snippet:

   $clean_img_array = [System.Collections.ArrayList]@();

   $array | ForEach-Object -Parallel {
        
        $web = Invoke-RestMethod $_;
        
        $i=1;
        foreach($entry in $web.Split("`"")){
            echo $entry;
            if($entry.IndexOf(".jpg") -ne -1 -And $entry.IndexOf("http") -ne -1){
                if($entry.IndexOf("?") -ne -1){
                    $tmpArray = $using:clean_img_array;
                    [void]$tmpArray.Add($entry.Substring(0, $entry.IndexOf('?')));
                }else{
                    $tmpArray = $using:Clean_img_array;
                    [void]$tmpArray:clean_img_array.Add($entry);
                }
                
            }
        }
        
    } -ThrottleLimit 20
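A note on the snippet above: the error message points at the line that calls `$tmpArray:clean_img_array.Add(...)`, whereas the other branch calls `$tmpArray.Add(...)`. Separately, ArrayList instance methods are not guaranteed to be thread-safe. A minimal sketch of sharing one collection across parallel iterations, assuming PowerShell 7+ and swapping in a `ConcurrentBag[string]` (the names `$bag`, `$items` and `$localBag` are made up for illustration, and no network calls are made):

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel).
# Assumption: ConcurrentBag replaces the ArrayList because it is thread-safe.
$bag = [System.Collections.Concurrent.ConcurrentBag[string]]::new()

# Hypothetical sample input standing in for $array.
$items = 1..50

$items | ForEach-Object -Parallel {
    # Fetch the shared object via $using: into a local variable first,
    # then call methods on that local variable.
    $localBag = $using:bag
    foreach ($n in 1..3) {
        # $_ is still the current pipeline item inside the foreach statement.
        $localBag.Add(('item-{0}-{1}' -f $_, $n))
    }
} -ThrottleLimit 5

# Number of entries added by all iterations together.
$bag.Count
```

Each parallel iteration runs in its own runspace, so `$n` and `$localBag` here are local to that iteration; only the bag itself is shared.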
  • Thanks for the input. I edited the question and added the information you asked for. I just removed the threadSafeImgArray because it is only about the progress bar and is negligible for now. If you have further questions, feel free to ask. Commented Mar 17, 2021 at 17:36
  • How about something like $clean_img_array = $array | foreach-object -parallel { ... }? Commented Mar 17, 2021 at 17:49
  • @js2010 I'm a bit confused about the variable at the start. What does it do? Is it like a return value? Can you provide a source link? I don't get it from the Microsoft site learn.microsoft.com/en-us/powershell/module/… Commented Mar 17, 2021 at 17:52

2 Answers

Here's a simple example. Both $a and $b are arrays. $b is the result of the parallel loop. It's like example 12 in the docs.

$a = 1..10
$b = $a | foreach-object -parallel { $_ + 1 }

$b

2
3
4
5
6
7
8
9
10
11
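
The same pattern should also work when the script block contains an inner loop, since everything written to the output stream is collected. A small sketch under that assumption (PowerShell 7+; the `'x', 'y'` suffixes and `$suffix` are made up for illustration):

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel).
$a = 1..3
$b = $a | ForEach-Object -Parallel {
    # Every value emitted here ends up in $b, so an inner loop can
    # produce several results per input item.
    foreach ($suffix in 'x', 'y') {
        '{0}{1}' -f $_, $suffix
    }
}

# Order across parallel iterations is not guaranteed, so sort for display.
$b | Sort-Object
```

Here $b ends up with six strings, two for each input number.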

2 Comments

Thanks for the example. I'm a bit confused about how to combine this with the inner foreach, which I need in order to process the split result quickly.
For example, just return $entry.Substring(0, $entry.IndexOf('?')) by itself.
Thanks for the support! I combined the answer from @js2010 with something of my own. The return alone did not solve the thread-unsafe behavior of $entry, but it clearly untangled the workflow. I then used the inline .foreach() method with the pipeline variable $_, which appears to be thread-safe. It now runs like a charm and is very fast, even for more than 2 million entries.

  • $array holds the URLs to be processed

  • $clean_img_array returns the grabbed image URLs

     $clean_img_array = $array | ForEach-Object -Parallel {
    
         $web = Invoke-RestMethod $_;
         $web.Split("`"").foreach({  
             if($_ -ne $null){
                 if($_.IndexOf(".jpg") -ne -1 -And $_.IndexOf("http") -ne -1){
                     if($_.IndexOf("?") -ne -1){
                         return $_.Substring(0, $_.IndexOf('?'));
                     }else{
                         return $_;
                     }           
                 }
             }
         });
    
     } -ThrottleLimit 25
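
To check the parsing logic without hitting the network, here is a small offline sketch of the same split-and-filter step. The `$web` string and the example.com URLs are made-up sample data, `$q` and `$urls` are names introduced for illustration, and the values are emitted implicitly instead of with return:

```powershell
# Hypothetical stand-in for the text returned by Invoke-RestMethod.
$web = '<img src="http://example.com/a.jpg?w=100"> <img src="https://example.com/b.jpg"> <a href="http://example.com/page.html">'

$urls = $web.Split('"').ForEach({
    # Keep only fragments that look like an http(s) link to a .jpg file.
    if ($_.IndexOf('.jpg') -ne -1 -and $_.IndexOf('http') -ne -1) {
        $q = $_.IndexOf('?')
        # Strip the query string if there is one; otherwise emit as-is.
        if ($q -ne -1) { $_.Substring(0, $q) } else { $_ }
    }
})

$urls
```

With this sample input, only the two .jpg links survive, and the first one loses its ?w=100 query string.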
    

2 Comments

You don't need the word 'return', or the semicolons.
I freaking love the semicolons, even though I know they are obsolete :) I will also try it without the return statement someday; I keep it mostly for my own readability.
