
I have data in Row tuple format:

Row(Sentence=u'When, for the first time I realized the meaning of death.')

I want to convert it into String format like this:

(u'When, for the first time I realized the meaning of death.')

I tried this (suppose 'a' holds the data in the Row tuple):

b = sc.parallelize(a)
b = b.map(lambda line: tuple([str(x) for x in line]))
print(b.take(4))

But I am getting a result like this:

[('W', 'h', 'e', 'n', ',', ' ', 'f', 'o', 'r', ' ', 't', 'h', 'e', ' ', 'f', 'i', 'r', 's', 't', ' ', 't', 'i', 'm', 'e', ' ', 'I', ' ', 'r', 'e', 'a', 'l', 'i', 'z', 'e', 'd', ' ', 't', 'h', 'e', ' ', 'm', 'e', 'a', 'n', 'i', 'n', 'g', ' ', 'o', 'f', ' ', 'd', 'e', 'a', 't', 'h', '.')]

Does anybody know what I am doing wrong here?
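What is happening: a `pyspark.sql.Row` behaves like a named tuple, so `sc.parallelize(a)` iterates over the Row's field values, putting the single string into the RDD. The `map` then iterates over that string with `for x in line`, yielding one character at a time. A minimal plain-Python sketch, with `collections.namedtuple` standing in for `pyspark.sql.Row`:

```python
from collections import namedtuple

# stand-in for pyspark.sql.Row, which is also a tuple with named fields
Row = namedtuple("Row", ["Sentence"])
a = Row(Sentence=u'When, for the first time I realized the meaning of death.')

# sc.parallelize(a) iterates over the Row itself, so the RDD's
# elements are the Row's field values -- here, one string:
elements = list(a)
print(elements)  # ['When, for the first time I realized the meaning of death.']

# the map's `for x in line` then iterates over that string,
# producing one character per item:
line = elements[0]
print(tuple(str(x) for x in line)[:4])  # ('W', 'h', 'e', 'n')
```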

3 Answers


Below is the code:

col = 'your_column_name'
val = df.select(col).collect()                 # list of Row objects
val2 = [ele.__getattr__(col) for ele in val]   # extract the value from each Row

1 Comment

This worked for me with the following adjustment (cleaner): val2 = [ ele[col] for ele in val]
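A runnable sketch of this pattern, with a `namedtuple` standing in for `pyspark.sql.Row` and a plain list standing in for the `collect()` result (the column name 'Sentence' is assumed from the question):

```python
from collections import namedtuple

# stand-in for pyspark.sql.Row, which is also a tuple with named fields
Row = namedtuple("Row", ["Sentence"])

# stand-in for df.select(col).collect(), which returns a list of Rows
val = [
    Row(Sentence=u'When, for the first time I realized the meaning of death.'),
    Row(Sentence=u'Another sentence.'),
]

col = 'Sentence'
# getattr(ele, col) mirrors ele.__getattr__(col) from the answer;
# on real pyspark Rows, the indexing form ele[col] works as well
val2 = [getattr(ele, col) for ele in val]
print(val2)
```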

With a single Row (why would you even...) it should be:

a = Row(Sentence=u'When, for the first time I realized the meaning of death.')

b = sc.parallelize([a])

and flattened with

b.map(lambda x: x.Sentence)

or

b.flatMap(lambda x: x)

although sc.parallelize(a) is already in the format you need: because you pass an Iterable, Spark iterates over all fields in the Row to create the RDD.
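The difference between the two calls, sketched with plain lists standing in for RDDs (`map` keeps one output per Row; `flatMap` unpacks each Row into its field values):

```python
from collections import namedtuple

Row = namedtuple("Row", ["Sentence"])  # stand-in for pyspark.sql.Row
rows = [Row(Sentence=u'When, for the first time I realized the meaning of death.')]

# b.map(lambda x: x.Sentence): one string per Row
mapped = [x.Sentence for x in rows]

# b.flatMap(lambda x: x): iterate each Row, emitting its field values
flattened = [field for x in rows for field in x]

# with a single string field, both yield the same flat list of strings
print(mapped == flattened)  # True
```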



Below worked for me.

1:

list_val = df.selectExpr("max(Location) as loc").collect()
str_val = [e['loc'] for e in list_val][0]

2:

row = df.first()
string_value = row['columnName']
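Both patterns sketched with plain Python stand-ins (dicts stand in for the collected Rows, since real pyspark Rows also support indexing by column name; the placeholder values are assumptions, not real data):

```python
# stand-in for df.selectExpr("max(Location) as loc").collect(),
# which returns a list of Row objects indexable by column name
list_val = [{'loc': 'Warsaw'}]
str_val = [e['loc'] for e in list_val][0]

# stand-in for df.first(), which returns a single Row
row = {'columnName': 'some value'}
string_value = row['columnName']

print(str_val, string_value)
```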
