I am new to Spark and Scala and to this kind of programming in general.

What I want to accomplish is the following:

I have an RDD of type org.apache.spark.rdd.RDD[(Double, Iterable[String])].

So the possible content could be:

<1 , (A,B,C)>
<42, (A)    >
<0 , (C,D)  >

I need to transform this into a new RDD so that I get output like:

<1, A>
<1, B>
<1, C>
<42, A>
<0, C>
<0, D>

This has to be very simple, but I tried so many different ways and couldn't get it right.

2 Answers

You can use flatMapValues:

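import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on Spark versions before 1.3
import org.apache.spark.rdd.RDD

val r: RDD[(Double, Iterable[String])] = ...
r.flatMapValues(x => x) // RDD[(Double, String)]: one pair per element of each Iterable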
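To see what this does without a cluster, here is a minimal sketch of the same idea on plain Scala collections. The helper flatMapValuesLocal is hypothetical (it is not part of Spark) and only approximates what flatMapValues does to each key/value pair: the key is kept and repeated once for every element the function returns.

```scala
// Hypothetical local analogue of RDD.flatMapValues, for illustration only.
// For each (key, value) pair it applies f to the value and pairs the key
// with every element of the result.
def flatMapValuesLocal[K, V, U](pairs: Seq[(K, V)])(f: V => Iterable[U]): Seq[(K, U)] =
  pairs.flatMap { case (key, value) => f(value).map(u => (key, u)) }

// The sample data from the question.
val grouped: Seq[(Double, Iterable[String])] = Seq(
  (1.0, List("A", "B", "C")),
  (42.0, List("A")),
  (0.0, List("C", "D"))
)

// Identity as the function: each Iterable is simply unpacked.
val flattened: Seq[(Double, String)] = flatMapValuesLocal(grouped)(x => x)
// flattened: (1.0,A), (1.0,B), (1.0,C), (42.0,A), (0.0,C), (0.0,D)
```

This can be pasted into a Scala REPL. On a real RDD the call is just r.flatMapValues(x => x), as above.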

2 Comments

I got an error: "missing parameter type for expanded function". However, simply replacing (_) with (x => x) did the job, so: val A = B.flatMapValues(x => x)
@zsxwing It's not a type inference problem - x => x still requires type inference to work. It's just that _ in that particular context doesn't mean the identity function.
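The point about _ can be reproduced on an ordinary List, with no Spark involved. This is a small sketch with made-up names (nested, viaLambda, viaIdentity): a lone placeholder as the whole argument makes the compiler expand the entire call into a new function, rather than passing an identity function.

```scala
val nested: List[List[Int]] = List(List(1, 2), List(3))

// Explicit lambda: flattens one level.
val viaLambda: List[Int] = nested.flatMap(x => x)

// Predef.identity is the named identity function and gives the same result.
val viaIdentity: List[Int] = nested.flatMap(identity)

// nested.flatMap(_) does not compile in this position: the placeholder
// expands to `x => nested.flatMap(x)`, a function whose parameter type
// the compiler cannot infer, hence "missing parameter type for expanded function".
```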

Let's take input of the form

(Name, List[Interest]):

"Chandru",("Java","Scala","Python")
"Sriram", ("Science","Maths","Hadoop","C2","c3")
"Jai",("Flink","Scala","Haskell")

Create a case class for the person:

 case class Person(name:String, interest:List[String])

Create the input:

 val input = Seq(
   Person("Chandru", List("Java", "Scala", "Python")),
   Person("Sriram", List("Science", "Maths", "Hadoop", "C2", "c3")),
   Person("Jai", List("Flink", "Scala", "Haskell")))

 val rdd = sc.parallelize(input)

 // Turn each Person into a (name, interests) pair
 val mv = rdd.map(p => (p.name, p.interest))

 // Emit one (name, interest) pair per interest
 val fmv = mv.flatMapValues(v => v.toStream)

 fmv.collect

Result is:

  Array[(String, String)] = Array(
  (Chandru,Java), 
  (Chandru,Scala), 
  (Chandru,Python), 
  (Sriram,Science), 
  (Sriram,Maths), 
  (Sriram,Hadoop), 
  (Sriram,C2), 
  (Sriram,c3), 
  (Jai,Flink), 
  (Jai,Scala), 
  (Jai,Haskell))
