Comments (13)

mklement0 commented on June 13, 2024

There is no magic, only an initially perhaps surprising behavior that is fundamental to PowerShell:

In the PowerShell pipeline - which is invariably involved when producing output from a command (function, script, cmdlet, script block) - (most) enumerables (such as arrays, ArrayLists, ...) are auto-enumerated; that is, an enumerable's elements are sent one by one to the success output stream.

In other words: Unless you take extra steps (see below), the original enumerable is - predictably - lost, and in the success output stream you cannot tell the difference between outputting a single-element enumerable and the one element it contains, given that in both cases it is only the latter that is sent to the output stream.

An empty enumerable - such as in your case - sends "nothing" to the success output stream (pipeline), which is technically the [System.Management.Automation.Internal.AutomationNull]::Value singleton, which in effect behaves like $null in expression contexts and argument-based parameter binding (such as your case).
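To make the above concrete, here is a minimal sketch; the three function names are hypothetical and exist only for illustration. It shows that a single-element array and its lone element are indistinguishable once captured, and that an empty array leaves nothing behind:

```powershell
# Hypothetical functions, for illustration only.
function Get-SingleElementArray { @(42) }   # outputs a one-element array
function Get-Scalar             { 42 }      # outputs the element directly
function Get-EmptyArray         { @() }     # outputs an empty array

$fromArray  = Get-SingleElementArray
$fromScalar = Get-Scalar
$fromEmpty  = Get-EmptyArray

$fromArray.GetType().Name    # Int32: the array wrapper is gone
$fromScalar.GetType().Name   # Int32: indistinguishable from the above
$null -eq $fromEmpty         # True: "nothing" behaves like $null here
```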

The success output stream is an open-ended stream of objects that itself has no notion of an array or a similar data structure: the objects in it can be - and often are - processed one by one, as they are received, in which case the question of how to collect them for later processing doesn't arise.

Collecting stream output of necessity comes into play when you assign to a variable (e.g., $arr = test-function), or make command output participate in a larger expression (e.g., 'foo' + (test-function)), including use of $(...) and @(...) (except with array literals). Collecting a single object in the stream causes it to be collected as-is. It is only if there are two or more objects in the stream that a list-like data type is needed for collection, in which case PowerShell invariably creates an [object[]]-typed array. For the reasons explained above, this array is unrelated to any originating enumerable type, which never participated as itself in the pipeline.
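As a rough illustration of that last point (Get-ListContent is a hypothetical function, not something from this thread), a function that outputs an ArrayList with two elements ends up being collected as an [object[]] array:

```powershell
# Hypothetical function that builds an ArrayList and outputs it.
function Get-ListContent {
    $list = New-Object System.Collections.ArrayList
    $null = $list.Add('a')   # Add() returns the new index; discard it
    $null = $list.Add('b')
    $list                    # auto-enumerated: 'a' and 'b' are sent one by one
}

$collected = Get-ListContent

$collected.GetType().Name   # Object[]: not ArrayList
$collected.Count            # 2
```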

To send an instance of an enumerable type itself, as a whole to the success output stream, you must prevent auto-enumeration:

  • New-Object itself uses this technique, which is why @rhubarb-geek-nz's workaround is effective; New-Object's behavior is unusual among cmdlets (see below), but necessary in order to preserve the constructed instance as-is.
    • It is worth noting that the alternative, v5+ expression syntax for calling .NET type constructors does not exhibit this unusual behavior; that is, while New-Object object[] 0 sends the resulting array as itself to the success output stream, the otherwise equivalent expression [object[]]::new(0) is subject to auto-enumeration.
  • The conceptually clearest expression of the intent to suppress auto-enumeration is to use Write-Output -NoEnumerate,
    but an often-seen shortcut is to use the unary form of , the array-constructor operator to create a transient helper array that wraps the output enumerable in a single-element array whose auto-enumeration then sends the enumerable itself to the success output stream.

In other words: The following techniques all work to output an empty array as a whole from your function:

# Using New-Object
function test-function { New-Object object[] 0 }

# Using Write-Output -NoEnumerate
function test-function { Write-Output -NoEnumerate @() }

# Using a transitory single-element helper array wrapper
function test-function { , @() }

It is worth noting that auto-enumeration is a core PowerShell feature that you should generally not deviate from, especially in public-facing functions / cmdlets / scripts.

On a higher level of abstraction, one of PowerShell's core strengths is its consistency, of which consistent behavior in output streams / in the pipeline is one aspect.

To put it in concrete terms: Users justifiably expect commands to output objects one by one rather than outputting list-like containers as a whole, especially given that the latter behavior will not behave as expected in the pipeline; e.g.:

# Expected, auto-enumerating streaming behavior (element-by-element streaming).
# Where-Object's script block is invoked once for each element.
# -> 2, 3 
& { @(1, 2, 3) } | Where-Object { $_ -ge 2 }

# Unusual, array-as-a-whole output behavior.
# !! -> @(1, 2, 3) 
# !! Where-Object only receives *one* input object, which is the *array* as a whole, in which
# !! case -ge acts as an array filter that returns the subarray @(2, 3), which Where-Object interprets
# !! as $true, and therefore *passes the input object (array) through*.
& { Write-Output -NoEnumerate @(1, 2, 3) } | Where-Object { $_ -ge 2 }

So as not to confound user expectations, deviation from this behavior should make the target command require user opt-in, such as via the -NoEnumerate and -AsArray switches some built-in cmdlets (now) offer.

The legacy PowerShell edition, Windows PowerShell, neglects to exhibit this pattern in a few cases (i.e. outputs arrays-as-a-whole by default or invariably), which have since been corrected in PowerShell 7.

A prominent example is ConvertFrom-Json, which only in PowerShell 7 exhibits the expected behavior - see #3424 for the backstory.

Note that while PowerShell 7's built-in cmdlets now work consistently, from what I can tell, third-party code and even modules that ship with Windows may still exhibit the unexpected behavior; e.g., Get-WinUserLanguageList.

If you encounter such a command and want to force enumeration, simply enclose it in (...) (which collects all output in memory first) or pipe to Write-Output (which preserves the streaming behavior).
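A small sketch of both workarounds, using a hypothetical Get-WholeArray function as a stand-in for such a command (it deliberately outputs its array as a single object):

```powershell
# Hypothetical stand-in for a command that outputs an array as a whole.
function Get-WholeArray { , @(1, 2, 3) }

# As-is: the pipeline sees only one object (the array itself).
$asIs = @(Get-WholeArray | ForEach-Object { 'item' }).Count

# (...) collects the output first; the collected array is then enumerated.
$viaParens = @((Get-WholeArray) | ForEach-Object { 'item' }).Count

# Write-Output enumerates its input while preserving streaming.
$viaWriteOutput = @(Get-WholeArray | Write-Output | ForEach-Object { 'item' }).Count

"$asIs, $viaParens, $viaWriteOutput"   # 1, 3, 3
```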

from powershell.

rhubarb-geek-nz commented on June 13, 2024
function test-function {
        return New-Object System.Collections.ArrayList
}

$arr = test-function

$arr.GetType()

gives

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     ArrayList                                System.Object


abgox commented on June 13, 2024
  • Thanks, it can solve the problem.
  • But, why @() be a problem?


rhubarb-geek-nz commented on June 13, 2024
  • But, why @() be a problem?

You have my sympathy, PowerShell can be infuriating with lists of zero or one items where the list can magically evaporate.

My best explanation is PowerShell always tries to simplify, often turning lists of one item into just one item. This can be problematic in that you can't just write code dealing with lists, you have to test for the cases of none, one and some. Also in key places classic PowerShell differs from the behaviour of PowerShell core. Hence parameters like -AsArray, -NoEnumerate etc trying to undo what PowerShell insists on doing.


rhubarb-geek-nz commented on June 13, 2024

Another trick to avoid PowerShell's delisting of single elements is to capture the OutVariable itself which will contain all the elements of the output pipeline, so you can see it is the assignment doing collection.

$date = Get-Date -OutVariable datevar
$date.GetType()
$datevar.GetType()
$datevar[0].GetType()

gives

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     DateTime                                 System.ValueType
True     True     ArrayList                                System.Object
True     True     DateTime                                 System.ValueType


mklement0 commented on June 13, 2024

@rhubarb-geek-nz, while that is technically true, I consider this asymmetry between direct variable assignment and -OutVariable a bug, not a feature, as discussed many years ago in the following issue (nowadays, I would use slightly different framing and language, but the gist of the issue still applies):

Consider the following pitfall:

$null = Get-Item -OutVariable v $PROFILE

# !! ->  "The property 'LastWriteTime' cannot be found on this object. Verify that the property exists and can be set."
$v.LastWriteTime = [datetime]::now

Clearly, the intent of Get-Item $PROFILE is to retrieve a single object; yet, $v is now an ArrayList instance, so that $v.LastWriteTime applies member-access enumeration, which is unsupported for setting properties.


rhubarb-geek-nz commented on June 13, 2024

As a general rule, avoid assignment in PowerShell when you are dealing with multiple items. It is problematic managing code paths which sometimes return a single item or a list of items. Compare with an SQL query: a result set can return zero items, one item or multiple items with no drama, whereas PowerShell can give you a null, a single item or a collection.

It is all water under the bridge, but my recommendation remains, avoid assignment operator when dealing with multiple items where the count may be 0, 1 or many. Use the output either in a pipeline or the OutVariable for consistent results.


rhubarb-geek-nz commented on June 13, 2024

Clearly, the intent of Get-Item $PROFILE is to retrieve a single object

However

PS> get-command get-item -syntax

Get-Item [-Path] <string[]> [-Filter <string>] [-Include <string[]>] [-Exclude <string[]>] [-Force] [-Credential <pscredential>] [-Stream <string[]>] [<CommonParameters>]

Get-Item -LiteralPath <string[]> [-Filter <string>] [-Include <string[]>] [-Exclude <string[]>] [-Force] [-Credential <pscredential>] [-Stream <string[]>] [<CommonParameters>]

Your argument binds to the -Path parameter, which can both take an array and expand wildcards:

PS> $FOO='*.ps1'
PS> get-item $FOO

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---          29/04/2024    00:20            156 array.ps1
-a---          28/04/2024    20:31            120 empty.ps1
-a---          28/04/2024    21:57             91 error.ps1
-a---          28/04/2024    21:55            101 outvar.ps1

So your clearly is clearly not quite as clear as you suggest.


mklement0 commented on June 13, 2024

The "clearly" applied to the specific command, where a literal, single path was provided as the input.

The point is that any cmdlet is free to situationally "return" - i.e., emit to the success output stream - zero, one, or more output objects.

Earlier we've discussed the stream collection behavior that applies in direct variable assignment, notably that a single output object is collected as-is.

The point of my previous comment was:

  • There is NO good reason for $v = ... to collect the output objects differently than ... -OutVariable v.

  • This difference can lead to bugs / unexpected behavior that may be hard to understand.

Also, note that your framing wasn't correct:

avoid PowerShell's delisting of single elements is to capture the OutVariable itself

-OutVariable has no impact on auto-enumeration, which happens regardless, unless explicitly suppressed.
It's simply that the -OutVariable feature unconditionally creates an ArrayList for the collected output, irrespective of the number of output objects. (Case in point: if you use New-Object System.Collections.ArrayList in combination with -OutVariable, you get a nested single-element ArrayList instance, whose first and only element contains the empty instance created by New-Object).

In concrete terms:

  • With zero output objects, direct variable assignment stores [System.Management.Automation.Internal.AutomationNull]::Value (the "enumerable null", which behaves like $null in an expression context, and like an empty enumerable in the pipeline), whereas -OutVariable creates an empty ArrayList instance.
  • With one output object, direct variable assignment collects that object as-is, whereas -OutVariable creates a single-element ArrayList.
  • With two or more output objects, direct variable assignment creates a - fixed size - [object[]] array, whereas -OutVariable creates a (multi-element, resizable) ArrayList instance.
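
The three cases can be compared side by side with a sketch along these lines. Get-Numbers is a hypothetical advanced function (the [CmdletBinding()] attribute is what makes -OutVariable available on it), and the report format is my own:

```powershell
# Hypothetical advanced function that outputs $Count integers (or nothing).
function Get-Numbers {
    [CmdletBinding()]
    param([int] $Count)
    if ($Count -gt 0) { 1..$Count }
}

$report = foreach ($n in 0, 1, 3) {
    # Capture the same output both ways.
    $direct = Get-Numbers -Count $n -OutVariable viaOutVar
    $directType = if ($null -eq $direct) { '(null-like)' } else { $direct.GetType().Name }
    '{0} object(s): direct = {1}, -OutVariable = {2} with {3} element(s)' -f
        $n, $directType, $viaOutVar.GetType().Name, $viaOutVar.Count
}
$report
```

Under the behavior described above, the direct column should read (null-like), Int32 and Object[], while the -OutVariable column should show an ArrayList every time.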

While you may choose to rely on this awkward inconsistency (which the documentation only hints at, without spelling out the ramifications) in order to always get an array-like result, I personally recommend avoiding it, both for the awkwardness of then having to suppress the success output ($null = ... -OutVariable) and the confusing discrepancy.

The short of it:

  • In order to emit enumerables as a whole from a PowerShell command, auto-enumeration must be suppressed (as an aside: in the Cmdlet.WriteObject() SDK function, the logic is reversed), using the techniques previously discussed.

  • If you want to ensure that at most one object is captured in a variable, pipe to Select-Object -First 1 or - if you don't mind collecting all output first - use (...)[0] (assuming Set-StrictMode is at most at -Version 2).

  • If you want to ensure that output is always captured in an array, use @(...), the array-subexpression operator ($v = @(...)), or (with subtly different behavior) [array] $v = ...

  • However, thanks to PowerShell's unified handling of scalars and lists, provided via intrinsic members for scalars and member-access enumeration for enumerables, it is often not necessary to force creation of an array or, conversely, to explicitly enumerate the elements of an array for member access (read-only property access and method access).

  • And, yes, if you don't actually need to collect a(n intermediate) command's output, processing it in a streaming fashion in a pipeline is the best approach.
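
For what it's worth, the capture techniques from this list can be sketched as follows. Get-Items is a hypothetical function that outputs a caller-specified number of integers, and the snippet assumes that Set-StrictMode is off:

```powershell
# Hypothetical function: outputs the integers 1..$Count, or nothing for 0.
function Get-Items { param([int] $Count) if ($Count -gt 0) { 1..$Count } }

# At most one object: Select-Object -First 1
$first = Get-Items 3 | Select-Object -First 1
$first                       # 1

# Always an array: @(...), regardless of the number of output objects
$counts = foreach ($n in 0, 1, 3) { @(Get-Items $n).Count }
$counts -join ', '           # 0, 1, 3

# Always an array, via a type constraint on the variable
[array] $typed = Get-Items 1
$typed.GetType().Name        # Object[]

# Often unnecessary: even a scalar has an intrinsic .Count member
$scalar = Get-Items 1
$scalar.Count                # 1
```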


rhubarb-geek-nz commented on June 13, 2024

A common pattern that I have, which is why I have so much frustration with PowerShell's Schroedinger's OO model, is that being able to round trip JSON data is of vital importance. If the original JSON was an array it needs to stay as an array even if it only has one contained object. Likewise if an object contains a property that was an array of one object, that needs to stay as an array after our processing. Yes, ConvertFrom-JSON now has the -NoEnumerate switch, and that adds to the complexity when writing scripts that have to work on both Desktop and Core. In order to do that we have to have test cases where every array anywhere within an object can have zero, one or some items so we know we are using the right flags at each processing step and work with all combinations of data.


mklement0 commented on June 13, 2024

ConvertFrom-JSON now has -NoEnumerate, and that adds to the complexity when writing scripts that have to work on both Desktop and Core.

That is unfortunate, but an unavoidable consequence of things getting improved / fixed in PS Core.

An array stored in a property value should never pose any problem, however; e.g., the following round-trips properly, in both editions:

[pscustomobject] @{ ArrayProp = @(1) } | ConvertTo-Json | ConvertFrom-Json

Also note that a simple way to avoid auto-enumeration is to pass an enumerable as an argument to ConvertTo-Json:

ConvertTo-Json @(1) -Compress # -> '[1]'
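
Put together as a quick check (the variable names are mine, purely for illustration), the difference between piping and passing an argument, plus the property round trip, looks roughly like this:

```powershell
# Piping: the array is auto-enumerated, so ConvertTo-Json sees only the element.
$piped = @(1) | ConvertTo-Json -Compress
$piped                                   # 1

# Argument: the array is passed as a whole and stays a JSON array.
$argument = ConvertTo-Json @(1) -Compress
$argument                                # [1]

# An array inside a property survives the round trip.
$roundTrip = [pscustomobject] @{ ArrayProp = @(1) } |
    ConvertTo-Json -Compress |
    ConvertFrom-Json
$roundTrip.ArrayProp.GetType().Name      # Object[]
$roundTrip.ArrayProp.Count               # 1
```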


microsoft-github-policy-service commented on June 13, 2024

This issue has been marked as answered and has not had any activity for 1 day. It has been closed for housekeeping purposes.


microsoft-github-policy-service commented on June 13, 2024

📣 Hey @abgox, how did we do? We would love to hear your feedback with the link below! 🗣️

🔗 https://aka.ms/PSRepoFeedback
