I have a DataFrame with some attributes, and it looks like this:
+-------+-------+
| Atr1  | Atr2  |
+-------+-------+
| 3,06  | 4,08  |
| 3,03  | 4,08  |
| 3,06  | 4,08  |
| 3,06  | 4,08  |
| 3,06  | 4,08  |
| ...   | ...   |
+-------+-------+
As you can see, the values of Atr1 and Atr2 are numbers that contain a ',' character. This is because I loaded the data from a CSV file in which ',' is used as the decimal separator.
When I load the data into a DataFrame, the values are read as strings, so I cast those attributes from StringType to DoubleType like this:
from pyspark.sql.types import DoubleType
df = df.withColumn("Atr1", df["Atr1"].cast(DoubleType()))
df = df.withColumn("Atr2", df["Atr2"].cast(DoubleType()))
But when I do that, all the values become null:
+-------+-------+
| Atr1  | Atr2  |
+-------+-------+
| null  | null  |
| null  | null  |
| null  | null  |
| null  | null  |
| null  | null  |
| ...   | ...   |
+-------+-------+
I guess the reason is that the cast to DoubleType expects '.' as the decimal separator rather than ','. I can't edit the CSV file, so I want to replace the ',' characters in the DataFrame with '.' and then apply the cast to DoubleType.
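To make the goal concrete, here is a rough sketch of the replace-then-cast idea. I have not run it on my real data, and I am not sure it is the right or most efficient way. The function names `comma_to_dot` and `cast_comma_decimals` are hypothetical helpers I made up for illustration. It assumes the strings contain no thousands separators (a value like "1.234,56" would still not parse).

```python
def comma_to_dot(value):
    """Plain-Python version of the per-value fix: '3,06' -> 3.06."""
    return float(value.replace(",", "."))


def cast_comma_decimals(df, columns):
    """Replace ',' with '.' in each named string column, then cast to double.

    Hypothetical helper: assumes `df` is a PySpark DataFrame and that every
    name in `columns` refers to a string column with ',' as decimal separator.
    """
    # Imported here so comma_to_dot can be tried without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    for name in columns:
        df = df.withColumn(
            name,
            # Swap the decimal separator first, then cast the cleaned string.
            F.regexp_replace(F.col(name), ",", ".").cast(DoubleType()),
        )
    return df
```

With my DataFrame this would be called as `df = cast_comma_decimals(df, ["Atr1", "Atr2"])`.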
How can I do this?