2

I know that it is possible to convert a dataframe column into a list using something like:

dataFrame.select("ColumnName").rdd.map(r => r(0)).collect()

Let's say I already know the schema of the dataframe and have created a matching case class:

case class Synonym(URI: String, similarity: Double, FURI: String)

Is there an efficient way to get a list of Synonym objects from the data in the dataframe?

In other words, I am trying to create a mapper that converts each row of the dataframe into an instance of my case class, so that I end up with a list of these objects. Is there a clean, efficient way to do this?

2 Answers

9

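Use as[Synonym] to get a Dataset[Synonym], which you can then collect to get an Array[Synonym]. The encoder for the case class comes from import spark.implicits._, so that import must be in scope: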

val result = dataframe.as[Synonym].collect()
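
For context, here is a minimal self-contained sketch of the same approach, assuming a local SparkSession and made-up sample rows (the object name SynonymExample and the data are hypothetical, not from the question). It also converts the resulting array to a List, since the question asks for a list.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Declared at top level so Spark can derive an Encoder for it.
case class Synonym(URI: String, similarity: Double, FURI: String)

object SynonymExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .appName("synonym-example")
      .master("local[*]")
      .getOrCreate()

    // Brings the implicit Encoder[Synonym] (and toDF) into scope.
    import spark.implicits._

    // Hypothetical sample data whose column names match the case class fields.
    val dataframe: DataFrame = Seq(
      ("uri:a", 0.9, "uri:b"),
      ("uri:c", 0.4, "uri:d")
    ).toDF("URI", "similarity", "FURI")

    // as[Synonym] maps columns to fields by name; collect() pulls every row to the driver.
    val result: List[Synonym] = dataframe.as[Synonym].collect().toList

    result.foreach(println)
    spark.stop()
  }
}
```

Note that collect() materializes the whole dataset on the driver, so this is only suitable when the data fits in driver memory.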

3

Use a typed Dataset, selecting the columns that match the case class fields first:

df.select("URI", "similarity", "FURI").as[Synonym].collect
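
If the column types do not already line up with the case class, the select step is a natural place to fix that. The following is a sketch under assumptions: the extra column and the string-typed similarity column are hypothetical, chosen only to show a cast before as[Synonym].

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

case class Synonym(URI: String, similarity: Double, FURI: String)

object TypedSelectExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .appName("typed-select-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: similarity arrives as a string and there is an unrelated extra column.
    val df = Seq(
      ("uri:a", "0.9", "uri:b", 1),
      ("uri:c", "0.4", "uri:d", 2)
    ).toDF("URI", "similarity", "FURI", "extra")

    // Keep only the fields of Synonym and cast similarity to double before converting.
    val synonyms: Array[Synonym] = df
      .select(col("URI"), col("similarity").cast("double"), col("FURI"))
      .as[Synonym]
      .collect()

    synonyms.foreach(println)
    spark.stop()
  }
}
```

Selecting explicitly makes the mapping from columns to case class fields visible at the call site.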
