
I need to read in specific parquet files with spark, I know this can be done like so:

sqlContext
    .read
    .parquet("s3://bucket/key", "s3://bucket/key")

Right now I have a List[String] object containing all of these S3 paths, but I don't know how to pass it to the parquet function programmatically in Scala. There are far too many files to list them manually; any ideas how to get the files into the parquet function programmatically?

Comments:

  • You are looking for the splat operator: .parquet(listOfStrings: _*) Commented Jul 8, 2016 at 2:33
  • Indeed I am, thank you much! Commented Jul 8, 2016 at 2:38

1 Answer


I've answered a similar question concerning repeated parameters here.

As @Dima mentioned, you are looking for the splat operator, because .parquet expects repeated arguments:

sqlContext.read.parquet(listOfStrings:_*)

More on repeated arguments in section 4.6.2 of the Scala Language Specification.

Although that is the specification for Scala 2.9, this part hasn't changed since.
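To see why this works, note that the `: _*` type ascription expands any `Seq` into the repeated (`String*`) parameter of a variadic method. Here is a minimal sketch: the bucket paths and the `joinAll` helper are made up for illustration, and the Spark call is commented out because it assumes a live `sqlContext`:

```scala
// A List[String] of S3 paths, built programmatically (hypothetical values).
val paths: List[String] = List(
  "s3://bucket/key1",
  "s3://bucket/key2",
  "s3://bucket/key3"
)

// `paths: _*` expands the list into the repeated String* parameter
// that DataFrameReader.parquet declares, so this is equivalent to
// calling .parquet("s3://bucket/key1", "s3://bucket/key2", ...):
// val df = sqlContext.read.parquet(paths: _*)

// The same ascription works for any variadic method, e.g. this toy helper:
def joinAll(parts: String*): String = parts.mkString(",")

val joined = joinAll(paths: _*)
// joined == "s3://bucket/key1,s3://bucket/key2,s3://bucket/key3"
```

Without `: _*`, the compiler would reject `joinAll(paths)`, since a `List[String]` is not itself a `String`.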
