1

I'm learning Scala multi-stage metaprogramming. I'm currently working on an exercise that asks me to implement the dot-product (sum-of-product) between two vectors of the same length.

Writing the code, I need to match at compile-time Arrays and check if their length equals, before proceeding with the dot product. Otherwise, an exception is thrown. I'm using a FromExpr implicit object to specify the extraction by implementing its unapply method. And, this is what I came up with

given ArrayFromExpr: FromExpr[Array[Int]] with
  def unapply(x: Expr[Array[Int]])(using Quotes): Option[Array[Int]] =
    x match
      case '{ Array[Int]()(using $ct1) } => Some(Array[Int]())
      case '{ Array[Int]((${Varargs(Exprs(elements))})*) } => Some(Array(elements*))
      case _ => None

The problem is, I get a None as a result in the enclosing match-case statement. Which is,

def dotImpl(v1: Expr[Array[Int]], v2: Expr[Array[Int]])(using q: Quotes): Expr[Int] = {
  (v1.value, v2.value) match {
    case (Some(arr1), Some(arr2)) if arr1.length != arr2.length => throw RuntimeException("Cannot compute dot product of arrays having different lengths.")
    case (Some(arr1), Some(arr2)) if arr1.length == 0 => '{ 0 }
    case (Some(arr1), Some(arr2)) => computeDotProduct(arr1, arr2)
    case (None, None) => '{ 99 }
    case _ => '{ 999 }
  }
}

This last snippet is not completed yet, but I'm trying to get past this wrong behaviour before continuing. I do get a correct behaviour for the case of empty Array (defined as val arr = Array[Int]()).

The test that allows me to asses the misbehaviour of the FromExpr implementation is

"dot product for different length arrays" should "throw" in {
  val v1 = Array[Int](2, 1)
  val v2 = Array[Int](1, 2, 3)

  dot(v1, v2) shouldBe 1 // this is for a better test output
  a [RuntimeException] should be thrownBy {
    dot(v1, v2)
  }
}

which results in

99 was not equal to 1

While a RuntimeException is expected.

Any help or advice?

1 Answer 1

1

Your Expr[Array[Int]] values are not what you would get in runtime - it's AST. In this case - dot(v1, v2) - for the first argument you wouldn't get

Array[Int](1, 2, 3)
// actually, it would be this tree:
Apply(
  Apply(
    TypeApply(Select(Ident("Array"), "apply"), List(TypeIdent("Int"))),
    List(
      Typed(Repeated(
        List(
          Literal(IntConstant(1)), 
          Literal(IntConstant(2)),
          Literal(IntConstant(3))
        ),
        Inferred()
      ), Inferred() )
    )
  ),
  List(
    Apply(
      TypeApply(Select(Ident("ClassTag"), "apply"), List(Inferred())),
      List(
        Literal(
          ClassOfConstant(
            TypeRef(
              TermRef(ThisType(TypeRef(NoPrefix(), "<root>")), "scala"),
              "Int"
            )
          )
        )
      )
    )
  )
)

but

v1
// actually, it would be this tree:
Ident("v1")

So the information how the runtime value is constructed was already discarded.

And for a good reason. Your code could be modified to contain:

v1(0) = 10

which would change the value if v1 in runtime (IArray would not have this issue) - what then should be matched against in macro? Original value? Somehow computed new value?

The answer is none of that - you are parsing exactly the expression that was passed into the macro, and if it isn't a tree containing all the data that you need, you are not supporting it.

If you are brave, you can experiment with -Yretain-trees and obtaining the Tree associated with the Symbol, to see how some identifier was defined... but that's going to be error prone at best.

My suggestion would be:

  • support Array only for values directly constructed as arguments
  • support IArray for both values directly constructed as arguments, as well as with source accessible to the macro via their Symbol (IArray is immutable, so there shouldn't be a risk of getting different values from the val in compile time than in the runtime)
  • support everything else with a runtime computation, as there is no way to compute dot product in compile time of something that cannot be guaranteed to have the known shape
Sign up to request clarification or add additional context in comments.

5 Comments

I understand most of what you explained. Thank you for coming back to me, by the way. I'm trying to get my head around the second point you made at the end of your comment. After understanding, you need to be sure the data structure is known at compile time and it won't change during execution, as you pointed out, IArray is what should be needed. Unfortunately, I still get the same failure from the tests, the case expression (now is case '{ IArray[Int]((${Varargs(Exprs(elements))}: Seq[Int])*) } => Some(IArray.apply(elements*))) doesn't match. Did I miss anything?
At call site, I have dot(IArray[Int](2, 1), IArray[Int](1, 2, 3)) now. Even if it doesn't make a difference since IArray is immutable and it would be defined as a value (instead of a variable)
Unfortunately, I can only recommend a grind: print with TreeAnsiCode and TreeStructure to see what is the shape of the Expr and handle all possible cases. In pattern matching, I would also add a lot of smaller expr match { case pattern => println(...) } } to see which things match against which nested value and where it stops. While case '{ Array[Int]((${Varargs(Exprs(elements))})*) } is nice to read, it's virtually undebuggable - match could have failed at IArray(${...}*), VarArgs(...) or Exprs and we have no idea.
During development it is much easier to have nested pattern matches which print intermediate results. Just from reading the code me as well cannot guess where the issue is, and I would have to rewrite the code (to print intermediate results) and run it several times against various inputs, before I'd jump into writing these one-liner cases. To handle expression containing references, I would suggest trying something like expr.asTerm.underlying - but it's going to be a lot of work anyway, so I'd suggest to do it later, once you'll get more comfortable with Quotes API.
Thanks for coming back to me. I actually realised matching over the IArray which actually contains data at compile time is too convoluted. The tree, as shown by TreeStructure is just too deeply nested.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.