I have written the UDF below and it throws an error. Please help.

This is my dataset:

from pyspark.sql import functions as func
from pyspark.sql.functions import randn, when

df1 = sqlContext.range(0, 1000)\
 .withColumn('normal1',func.abs(10*func.round(randn(seed=1),2)))\
 .withColumn('normal2',func.abs(100*func.round(randn(seed=2),2)))\
 .withColumn('normal3',func.abs(func.round(randn(seed=3),2)))

df1 = df1.withColumn('Y',when(df1.normal1*df1.normal2*df1.normal3>750, 1)\
       .otherwise(0))

The UDF:

from pyspark.sql import types as T
from pyspark.sql.functions import udf
balancingRatio=0.8
calculateWeights = udf(lambda d:(1 * balancingRatio) if d==0 else (1 * (1.0 - balancingRatio)),T.IntegerType())
weightedDataset = df1.withColumn('classWeightCol', calculateWeights('Y'))
weightedDataset.show() 
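
A side note that is separate from the error reported below: the lambda returns floats (0.8 or 0.2), while the declared return type is T.IntegerType(). In PySpark a mismatch like this typically yields null values rather than an exception. The sketch below only checks the Python side of that; `class_weight` and `balancing_ratio` are illustrative names that are not in the original code.

```python
balancing_ratio = 0.8  # same value as balancingRatio above


def class_weight(label, ratio=balancing_ratio):
    """Plain-Python version of the lambda passed to udf() (illustrative name)."""
    return (1 * ratio) if label == 0 else (1 * (1.0 - ratio))


# Both branches produce Python floats, not ints.
weights = {label: class_weight(label) for label in (0, 1)}
print({label: type(w).__name__ for label, w in weights.items()})
# → {0: 'float', 1: 'float'}

# A declared type that matches floats would be T.DoubleType(), e.g.
#   calculateWeights = udf(class_weight, T.DoubleType())
```

Switching the declared type would not by itself address a "Python worker failed to connect back" failure; it only keeps the weights from coming back as null once the UDF runs.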

It runs for a while and then throws this error:

Py4JJavaError: An error occurred while calling o670.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 25.0 failed 1 times, most recent failure: Lost task 0.0 in stage 
25.0 (TID 427, localhost, executor driver): org.apache.spark.SparkException: 
Python worker failed to connect back.

What might be the problem? Thank you.

A simple example that I found on the internet does not work either:

maturity_udf = udf(lambda age: "adult" if age >=18 else "child", 
 T.StringType())
df = sqlContext.createDataFrame([{'name': 'Alice', 'age': 1}])
df.withColumn("maturity", maturity_udf(df.age)).show()

Note: I am using Python 3.7.1 and Spark 2.4.

  • What is T.IntegerType() exactly? Shouldn't it be just IntegerType()? Commented Nov 24, 2018 at 12:52
  • from pyspark.sql import types as T Commented Nov 24, 2018 at 12:53
  • What is your pyspark version? Commented Nov 24, 2018 at 12:56
  • My Spark version is 2.4. Commented Nov 24, 2018 at 13:01
  • This looks like a version issue. Try installing version 2.3 and try again. Commented Nov 24, 2018 at 13:03

1 Answer

You need to disable fork safety by setting the OBJC_DISABLE_INITIALIZE_FORK_SAFETY environment variable to YES. This solved the issue for me.

import os
os.environ['OBJC_DISABLE_INITIALIZE_FORK_SAFETY'] = 'YES'
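
For context, here is a minimal sketch of where those two lines might go in a script. It assumes macOS, since OBJC_DISABLE_INITIALIZE_FORK_SAFETY is read by the macOS Objective-C runtime (the question does not state the operating system). It also assumes the variable has to be in the environment before Spark starts, so that the processes Spark launches inherit it. The function name `run_maturity_example`, the app name, and the `local[1]` master are illustrative choices, not part of the original answer.

```python
import os

# Set this before any SparkSession / SparkContext is created, so that the
# processes Spark launches inherit it (assumption: macOS, local mode).
os.environ["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"


def run_maturity_example():
    """Re-run the small UDF example from the question (requires pyspark)."""
    # Imported lazily so the environment variable is already set.
    from pyspark.sql import SparkSession, types as T
    from pyspark.sql.functions import udf

    spark = (
        SparkSession.builder
        .master("local[1]")      # illustrative: single local worker thread
        .appName("udf-check")    # illustrative app name
        .getOrCreate()
    )
    maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", T.StringType())
    df = spark.createDataFrame([("Alice", 1)], ["name", "age"])
    df.withColumn("maturity", maturity_udf(df.age)).show()
    spark.stop()
```

Calling `run_maturity_example()` in a fresh Python process is one way to check whether the workaround applies on a given machine. If a session is already running (for example in a notebook), it would likely need to be restarted for the variable to take effect.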