
I'm trying to download a 10MB file and store it as an array for further processing.

Everything seems fine when I call (New-Object System.Net.WebClient).DownloadData("<url>") directly. But if I wrap the call in a function and return the result of DownloadData, the memory footprint increases to around 500 MB.

The function that I use:

function My-Download {
    param (
        [Parameter(Mandatory = $True, Position = 1)] [String] $UrlCode
    )
    (New-Object System.Net.WebClient).DownloadData($UrlCode)
}
$x = My-Download("https://file-examples.com/wp-content/uploads/2017/04/file_example_MP4_1280_10MG.mp4")

The reason I wrapped it in a function is that I also do additional processing on the data before returning it, but even this small example illustrates the problem.

Calling $x = (New-Object System.Net.WebClient).DownloadData("https://file-examples.com/wp-content/uploads/2017/04/file_example_MP4_1280_10MG.mp4") results in 83MB:

[Screenshot: direct call memory consumption]

Calling the above function results in 500MB:

[Screenshot: wrapper function memory consumption]

What is the reason for such a high memory usage and what can I do to optimize it?

PowerShell version:

Major  Minor  Build  Revision
-----  -----  -----  --------
5      1      17134  407
  • Don't use the brackets when you call the function. Instead, use a single space character: $x = My-Download "https://...". Also, the first index for parameter position is 0, not 1. Commented Jul 7, 2019 at 19:10
  • That doesn't seem to solve the problem. Commented Jul 7, 2019 at 19:27
  • (New-Object System.Net.WebClient).DownloadData($UrlCode) -> ,(New-Object System.Net.WebClient).DownloadData($UrlCode) Commented Jul 7, 2019 at 19:49
  • Thanks a lot, that worked! What does the comma mean and what happens without the comma? Commented Jul 7, 2019 at 19:57
  • @hurlenko - The comma operator in that location wraps an array around the item to its right. That causes the "unroll collections" feature of PowerShell to unroll the outer collection while NOT unrolling the inner one. Apparently, breaking the collection into parts to pass along is where the memory use came from. I would not have even thought to look into that ... [grin] Commented Jul 7, 2019 at 20:06

1 Answer


The [System.Net.WebClient] type's .DownloadData() method returns a byte array ([byte[]]).

  • If you assign the output from a call to that method to a variable directly, the variable receives that byte array as-is.

  • By contrast, if a call to that method is used to produce implicit output from a function, the [byte[]] array's elements are sent to the pipeline, one by one (byte by byte).
    The design intent behind the pipeline is to enable streaming, object-by-object processing rather than collect-all-results-first behavior. This trades execution speed for memory-throttling, one-by-one processing of output as it becomes available.

Assigning the function's output to a variable then causes PowerShell to implicitly collect the individual output objects (bytes in this case) in a regular [object[]] array.

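In other words: the original [byte[]] array was first enumerated, only to be collected later in another array, albeit an [object[]]-typed one - that is obviously unnecessary and inefficient in your scenario. Because each of the roughly 10 million bytes becomes a separate object referenced from the [object[]] array, the collected result takes many times the memory of the original [byte[]] array.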
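The difference can be observed without any download. The following is a minimal sketch, assuming a small hard-coded byte array in place of the downloaded data; Get-SampleBytes is a hypothetical helper that is not part of the question:

```powershell
# Hypothetical helper (not from the question): emits a byte array as implicit output.
function Get-SampleBytes {
    # Implicit output: PowerShell enumerates the array and sends it byte by byte.
    [byte[]] (1, 2, 3, 4)
}

# Assigned directly: the variable receives the [byte[]] array as-is.
$direct = [byte[]] (1, 2, 3, 4)

# Assigned from function output: the individual bytes are collected in a new array.
$viaFunction = Get-SampleBytes

$direct.GetType().FullName          # System.Byte[]
$viaFunction.GetType().FullName     # System.Object[]
$viaFunction[0].GetType().FullName  # System.Byte
```

The element values are identical in both cases; only the container type differs, which is where the extra memory goes for large arrays.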

There are two ways to opt out of this implicit enumeration:

  • Instead of implicit output, you can use a conceptually explicit Write-Output -NoEnumerate call in order to suppress the enumeration of an output array (collection):

    Write-Output -NoEnumerate (New-Object System.Net.WebClient).DownloadData($UrlCode)
    
  • A more obscure, but more concise and faster alternative is to combine implicit output with an auxiliary single-element wrapper array, which causes PowerShell to enumerate the wrapper array only, passing the wrapped array through, as PetSerAl suggests in a comment on the question:

    , (New-Object System.Net.WebClient).DownloadData($UrlCode)
    
    • , is PowerShell's array-construction operator (the "comma operator"), and in its unary form it wraps the RHS in a single-element array (of type [object[]]).
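
Both approaches can be compared side by side. This is a rough sketch, assuming a small hard-coded byte array as a stand-in for the DownloadData() result so that it runs without network access; the function names Get-BytesNoEnumerate and Get-BytesWrapped are hypothetical:

```powershell
# Hypothetical stand-ins for the download function; $data replaces the
# result of (New-Object System.Net.WebClient).DownloadData($UrlCode).

function Get-BytesNoEnumerate {
    $data = [byte[]] (1, 2, 3, 4)
    # Explicit output with enumeration of the array suppressed.
    Write-Output -NoEnumerate $data
}

function Get-BytesWrapped {
    $data = [byte[]] (1, 2, 3, 4)
    # Unary comma: wraps $data in a single-element array. PowerShell
    # enumerates only that wrapper and passes the inner array through.
    , $data
}

$a = Get-BytesNoEnumerate
$b = Get-BytesWrapped

$a.GetType().FullName   # System.Byte[]
$b.GetType().FullName   # System.Byte[]
```

In the function from the question, the same change would apply to the last statement of My-Download, so any additional processing can still happen on the [byte[]] array before it is returned.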