I am trying to replace all strings in a column that start with 'DEL_' with a NULL value.
I have tried this:
customer_details = customer_details.withColumn("phone_number", F.regexp_replace("phone_number", "DEL_.*", ""))
This works as expected, and the column now looks like this:
+--------------+
|  phone_number|
+--------------+
|00971585059437|
|00971559274811|
|00971559274811|
|              |
|00918472847271|
|              |
+--------------+
However, if I change the code to:
customer_details = customer_details.withColumn("phone_number", F.regexp_replace("phone_number", "DEL_.*", None))
This now replaces every value in the column with null:
+------------+
|phone_number|
+------------+
|        null|
|        null|
|        null|
|        null|
|        null|
|        null|
+------------+
null is not a string type.