How can I add an empty array when using df.withColumn with when() and otherwise(***empty_array***)?
The new column's type is T.ArrayType(T.StringType()), and it comes from a UDF.

I want to avoid ending up with NaN values.

  • Maybe lit([])? Commented Aug 4, 2020 at 10:07
  • Already tried that, but I get an error: Unsupported literal type class java.util.ArrayList [] Commented Aug 4, 2020 at 10:26

2 Answers

Simply use array(lit(None)). Note that this produces a one-element array containing null, not a zero-length array.

from pyspark.sql.functions import array, col, lit, when
df.select(when(col('target_bool') == 'true', array(lit(1))).otherwise(array(lit(None)))).show()

2 Comments

Is there a difference between F.array([]) and F.array(F.lit(None))?
Using lit for literals is recommended. Both are correct otherwise.

Try the following: create a column with a None value and cast it to an array type. The resulting column holds null rather than an empty array.

from pyspark.sql import functions as F, types as T

df_b = df_b.withColumn("empty_array", F.when(F.col("rn") == F.lit("1"), None))
df_b = df_b.withColumn("empty_array", F.col("empty_array").cast(T.ArrayType(T.StringType())))
df_b.show()



 root
 |-- col1: string (nullable = true)
 |-- col2: string (nullable = true)
 |-- rn: integer (nullable = true)
 |-- case_condition: integer (nullable = true)
 |-- empty_array: array (nullable = true)
 |    |-- element: string (containsNull = true)
