wordsDF = sqlContext.createDataFrame([('cat',), ('elephant',), ('rat',), ('rat',), ('cat', )], ['word'])

This is a way of creating a DataFrame from a list of tuples in Python. How can I do the same in Scala? I'm new to Scala and am having trouble figuring it out.

Any help will be appreciated!


2 Answers


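One simple way is to call toDF on an RDD of tuples (this needs import sqlContext.implicits._, which spark-shell already does for you):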

val df = sc.parallelize(List( (1,"a"), (2,"b") )).toDF("key","value")

and then df.show prints:

+---+-----+
|key|value|
+---+-----+
|  1|    a|
|  2|    b|
+---+-----+

Refer to the worked example in Programmatically Specifying the Schema for constructing a DataFrame with createDataFrame.
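For the word list in the question, a minimal sketch of that createDataFrame route might look like the following. It assumes an existing SparkContext named sc and the Spark 1.x SQLContext API used elsewhere on this page; the names rows and schema are only illustrative.

```scala
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Assumption: sc is an existing SparkContext (spark-shell provides one).
val sqlContext = new SQLContext(sc)

// Wrap each word in a Row, since createDataFrame(rowRDD, schema) expects an RDD[Row].
val rows = sc.parallelize(Seq("cat", "elephant", "rat", "rat", "cat")).map(w => Row(w))

// One nullable string column named "word", mirroring the Python example.
val schema = StructType(Seq(StructField("word", StringType, nullable = true)))

val wordsDF = sqlContext.createDataFrame(rows, schema)
wordsDF.show()
```

This is more verbose than toDF, but it lets you name and type the columns explicitly instead of relying on inference from tuples.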


To create a DataFrame, you first need to create an SQLContext.

val sc: SparkContext // An existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// This import enables the implicit conversion from an RDD to a DataFrame; once it is in scope you can use the .toDF method.
import sqlContext.implicits._

Now you can create DataFrames:

val df1 = sc.makeRDD(1 to 5).map(i => (i, i * 2)).toDF("single", "double")
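Applied to the single-column word list from the question, the same toDF approach could be sketched as below. It assumes sc and sqlContext exist as set up above. The one Scala-specific catch is that ("cat") is just a parenthesized String rather than a tuple, so each word is wrapped in Tuple1 explicitly.

```scala
// Assumption: sqlContext is the SQLContext created above.
import sqlContext.implicits._

// Tuple1(...) builds a one-element tuple; ("cat") alone would be a plain String.
val wordsDF = Seq("cat", "elephant", "rat", "rat", "cat").map(Tuple1(_)).toDF("word")
wordsDF.show()
```

Here toDF is called on a local Seq of tuples; calling sc.parallelize on the Seq first, as in the other snippets, should work the same way.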

Learn more about creating DataFrames here.
