How can I add an empty array when using df.withColumn with when() and otherwise(***empty_array***)?
The new column has type T.ArrayType(T.StringType()) and comes from a UDF.
I want to avoid ending up with NaN values.
Simply use array(lit(None)). Note that this produces a one-element array containing null, not a zero-length array.
df.select(when(col('target_bool')=='true',array(lit(1))).otherwise(array(lit(None)))).show()
F.array([]) and F.array(F.lit(None))?
lit is recommended for literals. Otherwise both are correct.
Try the approach below: create a column with a None value and cast it to an array type.
df_b = df_b.withColumn("empty_array", F.when(F.col("rn") == F.lit("1"), (None))).withColumn("empty_array", F.col("empty_array").cast(T.ArrayType(T.StringType())))
df_b.printSchema()
root
|-- col1: string (nullable = true)
|-- col2: string (nullable = true)
|-- rn: integer (nullable = true)
|-- case_condition: integer (nullable = true)
|-- empty_array: array (nullable = true)
|    |-- element: string (containsNull = true)
lit([])?
That fails with: Unsupported literal type class java.util.ArrayList []