
I have followed this guide and this question, trying to implement a decryption function to use in a SQL view.

I have compiled the Scala code from the example into a jar file and uploaded it to the Databricks File System (DBFS):

import com.macasaet.fernet.{Key, StringValidator, Token};
import org.apache.hadoop.hive.ql.exec.UDF;
import java.time.{Duration, Instant};

class Validator extends StringValidator {
    override def getTimeToLive() : java.time.temporal.TemporalAmount = {
      Duration.ofSeconds(Instant.MAX.getEpochSecond());
    }
  }

class udfDecrypt extends UDF {

    def evaluate(inputVal: String, sparkKey : String): String = {

      if( inputVal != null && inputVal!="" ) {
        val keys: Key = new Key(sparkKey)
        val token = Token.fromString(inputVal)
        val validator = new Validator() {}
        val payload = token.validateAndDecrypt(keys, validator)
        payload
      } else return inputVal
    }
  }

I can declare the function as demonstrated:

%sql
CREATE OR REPLACE FUNCTION default.udfDecrypt AS 'com.nm.udf.udfDecrypt'
USING jar 'dbfs:/FileStore/jars/decryptUDF.jar';

But if I try to call it, an error is thrown:

%sql
SELECT default.udfDecrypt(field, '{key}') FROM default.encrypted_test;

Error in SQL statement: AnalysisException: Can not load class 'com.nm.udf.udfDecrypt' when registering the function 'default.udfDecrypt', please make sure it is on the classpath; line 1 pos 7

I have noticed that the function can be declared with any jar file path (even one that doesn't exist) and it will still return 'OK'.

I am using Azure Databricks.

  • Can you validate the jar exists at the dbfs path? Commented Dec 13, 2021 at 14:02
  • Yes, just rechecked. Commented Dec 13, 2021 at 14:11

3 Answers


It seems like your UDF code is missing:

package com.nm.udf;

at the top.
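
With that fix, the top of the UDF source file would look like the sketch below (a minimal reconstruction based on the question's code; the class body is unchanged and elided here):

```scala
// The package must match the fully qualified name used when registering
// the function: CREATE FUNCTION ... AS 'com.nm.udf.udfDecrypt'
package com.nm.udf

import com.macasaet.fernet.{Key, StringValidator, Token}
import org.apache.hadoop.hive.ql.exec.UDF
import java.time.{Duration, Instant}

class udfDecrypt extends UDF {
  // ... evaluate(...) exactly as in the question ...
}
```

Without the package declaration, the class is compiled into the default package as plain `udfDecrypt`, so Spark cannot resolve `com.nm.udf.udfDecrypt` on the classpath.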




Update as of October 2022, because the accepted solution did not work for me.

First off, the given Scala code is incorrect; you need to add

import java.time.Duration
import java.time.Instant;

to the top of the code.

Secondly, after packaging the .scala file into a jar (using sbt package, for example), create the function:

CREATE OR REPLACE FUNCTION udfDecryptor AS 'udfDecrypt'
USING jar 'dbfs:/FileStore/jars/decryptUDF.jar';

Then the name given in the AS clause must match the name of the class (here udfDecrypt, with no package prefix).

Pay attention to having a build.sbt file with the correct versions and dependencies, for example:

ThisBuild / version := "0.2.0-properscala"

ThisBuild / scalaVersion := "2.12.14"

lazy val root = (project in file("."))
  .settings(
    name := "the_name_of_project_here"
  )

// https://mvnrepository.com/artifact/com.macasaet.fernet/fernet-java8
libraryDependencies += "com.macasaet.fernet" % "fernet-java8" % "1.5.0"
// https://mvnrepository.com/artifact/org.apache.hive/hive-exec
libraryDependencies += "org.apache.hive" % "hive-exec" % "3.1.2"
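
Putting the packaging and upload steps together, the sequence might look like this (the jar name follows sbt's default naming for the build.sbt above, and the DBFS target path is assumed from the question; a configured Databricks CLI is required):

```shell
# Build the jar with sbt; output lands under target/scala-2.12/
sbt package

# Copy it to DBFS with the Databricks CLI (destination path assumed
# from the question's CREATE FUNCTION statement)
databricks fs cp \
    target/scala-2.12/the_name_of_project_here_2.12-0.2.0-properscala.jar \
    dbfs:/FileStore/jars/decryptUDF.jar --overwrite
```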

This is the only way I managed to solve it and get the code running successfully. Hope it helps.



There could be some hidden parts in the guidelines. Here is a complete repo that works through the details:

https://github.com/dungruoc/databricks-fernet-udf

2 Comments

Link-only answers risk the link going dead. Please consider providing supporting source code/summary/steps from the link to solve the problem.
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
