How can I create a function like the one described at https://docs.databricks.com/spark/latest/spark-sql/language-manual/create-function.html#create-function, but define the function in Python?

I already did something like this:

from pyspark.sql.types import IntegerType
def relative_month(input_date):
  if input_date is not None:
    return ((input_date.month + 2) % 6)+1
  else:
    return None
_ = spark.udf.register("relative_month", relative_month, IntegerType())

But this UDF only works in the notebook that runs this piece of code.

I want to do the same thing using SQL syntax to register the function, because some of my users will access Databricks through SQL clients and they will need the functions too.

The Databricks docs say that I can define a resource:

: (JAR|FILE|ARCHIVE) file_uri

Do I need to create a .py file and put it somewhere in my Databricks cluster?

1 Answer

To share a UDF across notebooks, set spark.databricks.session.share to true in the cluster's configuration. Normally, UDFs in Spark are temporary and specific to the application that registered them, so any other application has to register the UDF again before it can use it. But as I said, if you set spark.databricks.session.share to true, a UDF registered in one notebook can be used from the other notebooks on that cluster.
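
If you go the re-registration route instead, one way to avoid copying the code around is to keep the function in a shared module and register it from each notebook. This is only a minimal sketch: register_shared_udfs is a hypothetical helper name (not a Databricks API), and it assumes each notebook has a SparkSession available as spark. The return type is passed as the DDL string "int", which spark.udf.register accepts in place of IntegerType().

```python
import datetime


def relative_month(input_date):
    """Same logic as the UDF in the question: maps a date's month to 1..6."""
    if input_date is None:
        return None
    return ((input_date.month + 2) % 6) + 1


def register_shared_udfs(spark):
    """Hypothetical helper: register the UDFs on the given SparkSession."""
    # "int" is the DDL type string equivalent to IntegerType().
    spark.udf.register("relative_month", relative_month, "int")


# Plain-Python sanity check of the logic (no Spark needed): April maps to 1.
assert relative_month(datetime.date(2021, 4, 15)) == 1
```

Each notebook would then call register_shared_udfs(spark) once before running SQL that uses relative_month. This still registers a temporary, per-session function; it does not make the UDF visible to external SQL clients.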

If it is for Hive, you can register the UDF permanently, and it will then be accessible to multiple users.

Here is a similar thread on the same topic; see if it helps:

Databricks - Creating permanent User Defined Functions (UDFs)
