I am trying to run a PySpark UDF, but it fails with an error:

from pyspark.sql.functions import col, udf

mytab = spark.read.jdbc(url=jdbcUrl, table="mytab",properties=connectionProperties)

def buscarx(Alm_r, Pro, Data_mat):
    data_s = mytab.where(col("doc")==Data_mat).where(col("alm")!=Alm_r).limit(1)
    if(data_s.count()==0):
        return Pro
    else:
        temp = "0"
        for item in data_s.collect():
            temp = item.Alm
        return temp

buscarx_udf = udf(buscarx)
df_temp = mytab.withColumn("alm_origen", buscarx_udf(mytab.Alm,mytab.Proveedor,mytab.Doc_mat))

Error:

Traceback (most recent call last):
  File "/databricks/spark/python/pyspark/serializers.py", line 473, in dumps
    return cloudpickle.dumps(obj, pickle_protocol)
  File "/databricks/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "/databricks/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 563, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle '_thread.RLock' object
PicklingError: Could not serialize object: TypeError: cannot pickle '_thread.RLock' object

I ran some tests and found that the problem is caused by:

    data_s = mytab.where(col("doc")==Data_mat).where(col("alm")!=Alm_r).limit(1)

Any suggestions to fix this? I need to perform a query within the function.

  • Does this answer your question? How to reference a dataframe when in an UDF on another dataframe? Commented Sep 2, 2022 at 6:19
  • UDFs must be written in pure Python, meaning you can't use SQL functions or reference Spark DataFrames or RDDs directly inside them. Commented Sep 2, 2022 at 6:48
  • Is there an alternative? Commented Sep 2, 2022 at 6:57

1 Answer

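A user-defined function (UDF) operates on the values of individual rows in a DataFrame, not on the DataFrame as a whole the way Spark SQL functions do. Hence you cannot call PySpark DataFrame methods such as where or filter inside it. The function and everything it references are pickled and shipped to the executors; mytab holds a reference to the driver-side Spark session, which cannot be pickled, and that is where the "cannot pickle '_thread.RLock' object" error comes from.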

1 Comment

Is there an alternative that would let me achieve this?
