I used the following function to replace the None values in a DataFrame row with an empty string:
    def replaceNone(row):
        row_len = len(row)
        for i in range(0, row_len):
            if row[i] is None:
                row[i] = ""
        return row
and called it in my PySpark code:
    data_out = df.rdd.map(lambda row: replaceNone(row)).map(
        lambda row: "\t".join([x.encode("utf-8") if isinstance(x, basestring) else str(x).encode("utf-8") for x in row])
    )
Then I got the following errors:
File "<ipython-input-10-8e5d8b2c3a7f>", line 1, in <lambda>
File "<ipython-input-2-d1153a537442>", line 6, in replaceNone
TypeError: 'Row' object does not support item assignment
Does anyone have any idea what causes this error? How do I replace a None value in a row with an empty string? Thanks!
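If the column holds the literal string 'None', df.replace('None', '') swaps it for an empty string (the original suggestion, df.replace('None', ' '), substitutes a single space instead). That call does not match real null values, though. For those, df.na.fill('') (equivalently df.fillna('')) fills the nulls in string columns and leaves columns of other types untouched.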
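As for the TypeError itself: a Row behaves like a tuple, so its fields can be read by index but not assigned to. If you want to stay with the RDD approach, one option is to build a new list from the row instead of modifying it in place. The snippet below is only a minimal sketch, assuming Python 3 (so plain str rather than basestring and .encode); the helper names replace_none and to_tsv_line are made up for illustration, and it is exercised here on a plain tuple rather than a real Row.

```python
def replace_none(row, default=""):
    """Return a new list with every None in `row` replaced by `default`.

    The input is never modified, so this works for immutable,
    tuple-like rows as well as for lists.
    """
    return [default if value is None else value for value in row]


def to_tsv_line(row):
    """Join the fields of `row` with tabs, turning None into an empty string."""
    return "\t".join(str(value) for value in replace_none(row))


# A plain tuple stands in for a Row here.
sample = ("a", None, 3)
print(replace_none(sample))
print(repr(to_tsv_line(sample)))

# Hypothetical usage with the DataFrame from the question:
# data_out = df.rdd.map(to_tsv_line)
```

Because each helper returns a new object, nothing ever assigns into the Row, which avoids the "does not support item assignment" error.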