I need to update a column in a PySpark DataFrame when its value contains a certain substring.
For example, df looks like this:
id address
1 spring-field_garden
2 spring-field_lane
3 new_berry place
If the address column contains spring-field_, the whole value should be replaced with spring-field.
Expected result:
id address
1 spring-field
2 spring-field
3 new_berry place
I tried:
df = df.withColumn('address',F.regexp_replace(F.col('address'), 'spring-field_*', 'spring-field'))
This does not give the expected result.
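A likely cause is the pattern itself: in a regular expression, `_*` means "zero or more underscores", so only the underscore is consumed and the text after it stays in place. The sketch below uses Python's `re` module on the sample values, not Spark, to compare the attempted pattern with an adjusted one, `spring-field_.*`, which also matches everything after the underscore. The `addresses` list and the variable names are made up for illustration.

```python
import re

# Sample values from the question (hypothetical stand-in for the address column).
addresses = ["spring-field_garden", "spring-field_lane", "new_berry place"]

# Attempted pattern: `_*` matches zero or more underscores,
# so the suffix after the underscore is left untouched.
attempted = [re.sub(r"spring-field_*", "spring-field", a) for a in addresses]

# Adjusted pattern: `_.*` matches the underscore and the rest of the value.
adjusted = [re.sub(r"spring-field_.*", "spring-field", a) for a in addresses]

print(attempted)  # ['spring-fieldgarden', 'spring-fieldlane', 'new_berry place']
print(adjusted)   # ['spring-field', 'spring-field', 'new_berry place']
```

Spark's regexp_replace uses Java regular expressions, which treat `_*` and `.*` the same way, so passing 'spring-field_.*' as the pattern in the withColumn call above should give the expected result. If the substring can appear in the middle of a value and the whole value still has to be replaced, an alternative worth trying is F.when(F.col('address').contains('spring-field_'), F.lit('spring-field')).otherwise(F.col('address')).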