
The following code reads a CSV file into a DataFrame in Scala:

 val mDF: DataFrame = spark.read.csv("src/test/resources/knimeMerged.csv")

However, it treats the first row of the imported data as a data row, even though that row actually contains the headers, and it assigns the default DataFrame column names instead (e.g., _c0, _c1).

I assume there is an option to read the headers from a CSV file, but I cannot find it in the Scala API docs (I'm new to Scala and its documentation).

Any hints would be appreciated, both on what the option is and how to use it.

2 Answers


The option to handle it is header; setting header to true will work:

val mDF: DataFrame = spark.read.option("header", true).csv("src/test/resources/knimeMerged.csv")
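To double-check that the headers were picked up, you can print the schema or the column names; with header set to true they should be the names from the CSV's first row rather than _c0, _c1. A quick sketch, reusing mDF from above:

// Column names should now come from the file's first row
mDF.printSchema()
println(mDF.columns.mkString(", "))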



You can add the header option with the value true before calling the csv method, like this:

val df = spark.read.option("header","true").option("inferSchema","true").csv("src/test/resources/knimeMerged.csv")

I have also added another option, inferSchema.

With inferSchema set to true, Spark tries to work out each column's data type, i.e., if a column contains only Int values, that type is recorded in the CSV's schema instead of everything being read as String.

Using both options gives you better metadata about the CSV file.
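If you already know the column types, an alternative to inferSchema (which requires Spark to scan the data an extra time to guess the types) is to supply an explicit schema. A minimal sketch; the column names and types below are placeholders, not the actual columns of knimeMerged.csv:

import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Placeholder schema: replace the fields with the real columns of your CSV
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = true),
  StructField("name", StringType, nullable = true)
))

val dfWithSchema = spark.read
  .option("header", "true")
  .schema(schema)
  .csv("src/test/resources/knimeMerged.csv")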

1 Comment

Which is just what I have been trying to figure out. Thanks a million.
