How does Ruby's Array#count handle multiple block arguments?

Question

When I execute the following:

[[1,1], [2,2], [3,4]].count {|a,b| a != b} # => 1

the block arguments a, b are assigned to the first and the second values of each inner array respectively. I don't understand how this is accomplished.

The only example given in the documentation for Array#count and Enumerable#count with a block uses a single block argument:

ary.count {|x| x % 2 == 0} # => 3

Note that this is exactly the same thing as doing hsh.each {|key, value|} where hsh is some Hash, since Hash#each, like all implementations of each, only ever yields a single value per iteration, in this case a two-element Array whose first element is the key and second element is the value.
– Jörg W Mittag, Commented Jan 7, 2019 at 8:12
It's called "array decomposition". That entire doc is well worth a read. In this case, when, say, [3, 4] is passed to the block, Ruby computes a, b = [3, 4]. Try that in IRB or Pry and you'll see a is set to 3 and b is set to 4. If the block variables were written |a, (b, c)| and [3, [4, 5]] were passed to the block, the calculation would be a, (b, c) = [3, [4, 5]], which results in a #=> 3, b #=> 4 and c #=> 5.
– Cary Swoveland, Commented Jan 7, 2019 at 17:46

Amadan · Accepted Answer · 2019-01-07 01:29:07Z

5

Assignments have a (not-so-)secret shortcut. If the right-hand side is an array and the left-hand side has multiple variables, the array is splatted, so the following two lines are equivalent:

a, b, c = [1, 2, 3]
a, b, c = *[1, 2, 3]

Blocks have something in the same vein, although it is not the same mechanism: it applies when the yielded value is an array and the block has multiple parameters. Thus, these two blocks will act the same when you yield [1, 2, 3]:

do |a, b, c|
  ...
end

do |(a, b, c)|
  ...
end
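A minimal runnable sketch of that equivalence. The helper name yield_triple is made up for illustration; it simply yields one Array as a single block argument:

```ruby
# Hypothetical helper (not from the answer): yields ONE argument, an Array.
def yield_triple
  yield [1, 2, 3]
end

# Multiple parameters: the yielded Array is deconstructed automatically.
implicit = yield_triple { |a, b, c| "a=#{a} b=#{b} c=#{c}" }

# Parenthesized parameters: the same deconstruction, written explicitly.
explicit = yield_triple { |(a, b, c)| "a=#{a} b=#{b} c=#{c}" }

# A single plain parameter receives the whole Array untouched.
whole = yield_triple { |arr| "arr=#{arr.inspect}" }

puts implicit  # a=1 b=2 c=3
puts explicit  # a=1 b=2 c=3
puts whole     # arr=[1, 2, 3]
```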

So, in your case, the value gets deconstructed, as if you wrote this:

[[1,1], [2,2], [3,4]].count {|(a,b)| a != b} # => 1

If you had another value that you are passing along with the array, you would have to specify the structure explicitly, as the deconstruction of the array would not be automatic in the way we want:

[[1,1], [2,2], [3,4]].each.with_index.count {|e,i| i + 1 == e[1] }
# automatic deconstruction of [[1,1],0]:
# e=[1,1]; i=0

[[1,1], [2,2], [3,4]].each.with_index.count {|(a,b),i| i + 1 == b }
# automatic deconstruction of [[1,1],0], explicit deconstruction of [1,1]:
# a=1; b=1; i=0

[[1,1], [2,2], [3,4]].each.with_index.count {|a,b,i| i + 1 == b }
# automatic deconstruction of [[1,1],0]
# a=[1,1]; b=0; i=nil
# NOT what we want
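The three variants above can be checked with a short script. This is only a sketch: the variable names are made up, and the last case uses map instead of count so that the actual parameter bindings can be inspected rather than used in arithmetic:

```ruby
pairs = [[1, 1], [2, 2], [3, 4]]

# e is bound to the inner array, i to the index.
by_element = pairs.each.with_index.count { |e, i| i + 1 == e[1] }

# (a, b) explicitly deconstructs the inner array; i is still the index.
by_pattern = pairs.each.with_index.count { |(a, b), i| i + 1 == b }

# Three flat parameters: only the outer pair [element, index] is deconstructed,
# so nothing is left over for i.
first_binding = pairs.each.with_index.map { |a, b, i| [a, b, i] }.first

p by_element     # 2
p by_pattern     # 2
p first_binding  # [[1, 1], 0, nil]
```

Because i ends up as nil in the last form, evaluating i + 1 inside that block would fail, which is why the explicit (a, b) pattern is needed.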

edited Jan 7, 2019 at 1:29

answered Jan 7, 2019 at 1:13

Amadan


5 Comments

Stefan Over a year ago

"If the right-hand-side is an array and the left-hand-side has multiple variables, the array is splatted" – this isn't documented, is it?

Amadan Over a year ago

@Stefan It probably is somewhere. I don't remember where I read about it first. Maybe not in those words; but there is a clear difference in semantics of a = [1, 2] and a, b = [1, 2] - the first assigns the array, the second assigns its elements.

Stefan Over a year ago

(a, b) = [1, 2] is documented, but not the implicit form without parentheses. I've opened a ticket.

Holger Just Over a year ago

@Stefan I might be missing something, but isn't at least the deconstruction on assignment documented just above the section you linked to, i.e. here?

Amadan Over a year ago

@HolgerJust: No - a, b = 1, 2 is documented there (with multiple lvalues being assigned multiple rvalues); but not a, b = [1, 2] (multiple lvalues being assigned a single array).

Jörg W Mittag · Accepted Answer · 2019-01-07 08:10:05Z

4

"I have looked at the documentation for Array.count and Enumerable.count and the only example given with a block uses a single block argument ..."

Ruby, like almost all mainstream programming languages, does not allow user code to change the fundamental semantics of the language. In other words, you won't find anything about block formal parameter binding semantics in the documentation of Array#count, because block formal parameter binding semantics are specified by the Ruby Language Specification and Array#count cannot possibly change that.

"What I don't understand is how this is accomplished."

This has nothing to do with Array#count. This is just the standard binding semantics for block formal parameters.

Formal parameter binding semantics for block formal parameters are different from formal parameter binding semantics for method formal parameters. In particular, they are much more flexible in how they handle mismatches between the number of formal parameters and actual arguments.

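If there is exactly one block formal parameter and you yield more than one block actual argument, the block formal parameter gets bound to the first actual argument and the rest are ignored. (This is the behavior of Ruby 1.9 and later; Ruby 1.8 bound the parameter to an Array containing all the block actual arguments and printed a warning. A splat parameter such as |*args| collects them all.)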
If there is more than one block formal parameter and you yield exactly one block actual argument, and that one actual argument is an Array, then the block formal parameters get bound to the individual elements of the Array. (This is what you are seeing in your example.)
If you yield more block actual arguments than the block has formal parameters, the extra actual arguments get ignored.
If you yield fewer actual arguments than the block has formal parameters, then those extra formal parameters are defined but not bound, and evaluate to nil (just like defined but uninitialized local variables).
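The cases above can be tried out with a few throwaway methods. This is a sketch for Ruby 1.9 and later; the yield_* method names are invented for illustration. On these versions a lone block parameter receives only the first yielded value, and a splat parameter collects all of them:

```ruby
# Hypothetical helpers that yield different numbers of arguments.
def yield_one;   yield 1;        end
def yield_two;   yield 1, 2;     end
def yield_three; yield 1, 2, 3;  end
def yield_array; yield [10, 20]; end

# One parameter, several arguments: only the first argument is bound (Ruby 1.9+).
single    = yield_two { |a| a }
# A splat parameter collects every argument into an Array.
collected = yield_two { |*args| args }

# Several parameters, one Array argument: the Array is spread over the parameters.
spread = yield_array { |a, b| [a, b] }

# More arguments than parameters: the extras are ignored.
extra = yield_three { |a, b| [a, b] }

# Fewer arguments than parameters: the unbound parameters are nil.
missing = yield_one { |a, b| [a, b] }

p single     # 1
p collected  # [1, 2]
p spread     # [10, 20]
p extra      # [1, 2]
p missing    # [1, nil]
```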

If you look closely, you can see that the formal parameter binding semantics for block formal parameters are much closer to assignment semantics, i.e. you can imagine an assignment with the block formal parameters on the left-hand side of the assignment operator and the block actual arguments on the right-hand side.

If you have a block defined like this:

{|a, b, c|}

and are yielding to it like this:

yield 1, 2, 3, 4

you can almost imagine the block formal parameter binding to work like this:

a, b, c = 1, 2, 3, 4

And if, as is the case in your question, you have a block defined like this:

{|a, b|}

and are yielding to it like this:

yield [1, 2]

you can almost imagine the block formal parameter binding to work like this:

a, b = [1, 2]

Which of course, as you well know, will have this result:

a #=> 1
b #=> 2
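To make the analogy concrete, here is a small side-by-side sketch. It uses Proc#call in place of yield (non-lambda procs bind their parameters the same way blocks do); the variable names are arbitrary:

```ruby
# Assignment: the surplus right-hand value is dropped.
a, b, c = 1, 2, 3, 4
# Block binding: the surplus argument is dropped in the same way.
from_block = proc { |x, y, z| [x, y, z] }.call(1, 2, 3, 4)

# Assignment: a single Array on the right is spread over the left-hand side.
d, e = [1, 2]
# Block binding: a single Array argument is spread over the parameters.
from_array = proc { |x, y| [x, y] }.call([1, 2])

# Nested patterns work on both sides as well.
f, (g, h) = [3, [4, 5]]
nested = proc { |x, (y, z)| [x, y, z] }.call([3, [4, 5]])

p [a, b, c]   # [1, 2, 3]
p from_block  # [1, 2, 3]
p [d, e]      # [1, 2]
p from_array  # [1, 2]
p [f, g, h]   # [3, 4, 5]
p nested      # [3, 4, 5]
```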

Fun fact: up to Ruby 1.8, block formal parameter binding used actual assignment! You could, for example, use a constant, an instance variable, a class variable, a global variable, and even an attribute writer(!!!) as a formal parameter, and when you yielded to that block, Ruby would literally perform the assignment:

class Foo
  def bar=(value)
    puts "`#{__method__}` called with `#{value.inspect}`"
    @bar = value
  end

  attr_reader :bar
end

def set_foo
  yield 42
end

foo = Foo.new

# Ruby 1.8 only: Ruby 1.9 and later reject |foo.bar| with a SyntaxError.
set_foo {|foo.bar|}
# `bar=` called with `42`

foo.bar
#=> 42

Pretty crazy, huh?

The most widely-used application of these block formal parameter binding semantics is when using Hash#each (or any of the Enumerable methods with a Hash instance as the receiver). The Hash#each method yields a single two-element Array containing the key and the value as an actual argument to the block, but we almost always treat it as if it were yielding the key and value as separate actual arguments. Usually, we prefer writing

hsh.each do |k, v|
  puts "The key is #{k} and the value is #{v}"
end

over

hsh.each do |key_value_pair|
  k, v = key_value_pair
  puts "The key is #{k} and the value is #{v}"
end

And that is exactly equivalent to what you are seeing in your question. I bet you have never asked yourself why you can pass a block with two block formal parameters to Hash#each even though it only yields a single Array? Well, this case is exactly the same. You are passing a block with two block formal parameters to a method that yields a single Array per iteration.
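A short sketch of the Hash case, with a made-up hash, showing that the same block-parameter rules apply whether the block takes one parameter or two:

```ruby
hsh = { "a" => 1, "b" => 2 }

# One parameter: each iteration hands the block a two-element [key, value] Array.
whole_pairs = []
hsh.each { |pair| whole_pairs << pair }

# Two parameters: the pair is spread into key and value.
split_pairs = []
hsh.each { |k, v| split_pairs << "#{k}=#{v}" }

# The same works for Enumerable methods called on a Hash, such as count.
big_values = hsh.count { |k, v| v > 1 }

p whole_pairs  # [["a", 1], ["b", 2]]
p split_pairs  # ["a=1", "b=2"]
p big_values   # 1
```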

answered Jan 7, 2019 at 8:10

Jörg W Mittag


4 Comments

Stefan Over a year ago

Fun fact: YARV's Hash#each actually yields two values when the block takes multiple arguments (and an array otherwise). Probably for performance reasons to avoid the to-array/from-array conversion.

Jörg W Mittag Over a year ago

Ah, you're right, I remember that there was some discussion about checking the arity of the block. This is one of those cases where YARV makes some "unfair" optimizations that are not available to user code. If you wanted to do the same thing in user code, you would need to first convert the block to a proc so you can call Proc#arity or Proc#parameters on it, which would completely negate any performance advantage you would get from not constructing the array. Note, however, that there will also be another difference when you pass a lambda with &, I think.

Jörg W Mittag Over a year ago

The whole parameter binding thing with regard to blocks, procs converted to blocks, and lambdas converted to blocks is a bit of a mess.

Jörg W Mittag Over a year ago

Note also that this is really a YARV-specific optimization. E.g. JRuby already optimizes away the array in all of the more general cases, i.e. destructuring assignment, multiple value return, block parameter binding, etc.
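For readers who want to see what an arity check looks like in user code, here is a rough sketch. The method each_pair_sketch is hypothetical and is not how YARV implements Hash#each; it only illustrates capturing the block as a Proc and inspecting Proc#arity and Proc#parameters, and it shows that a lambda called directly does not deconstruct an Array argument:

```ruby
# Hypothetical method (not part of Ruby): chooses how to call the block
# based on the arity of the captured Proc.
def each_pair_sketch(pairs, &block)
  pairs.each do |pair|
    if block.arity == 2
      block.call(pair[0], pair[1])  # pass key and value as two arguments
    else
      block.call(pair)              # pass the pair as one Array argument
    end
  end
end

pairs = [[:a, 1], [:b, 2]]

loose  = proc   { |k, v| [k, v] }
strict = lambda { |k, v| [k, v] }

loose_params  = loose.parameters   # optional parameters: flexible binding
strict_params = strict.parameters  # required parameters: method-like binding

# A non-lambda proc spreads a single Array argument over its parameters...
loose_result = loose.call([:a, 1])

# ...while a lambda called directly treats it as one argument and raises.
strict_error =
  begin
    strict.call([:a, 1])
    nil
  rescue ArgumentError => e
    e.class
  end

# With the arity check, both a two-parameter lambda and a one-parameter block work.
via_lambda = []
each_pair_sketch(pairs, &->(k, v) { via_lambda << "#{k}=#{v}" })

via_single = []
each_pair_sketch(pairs) { |pair| via_single << pair }

p loose_params   # [[:opt, :k], [:opt, :v]]
p strict_params  # [[:req, :k], [:req, :v]]
p loose_result   # [:a, 1]
p strict_error   # ArgumentError
p via_lambda     # ["a=1", "b=2"]
p via_single     # [[:a, 1], [:b, 2]]
```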
