
My class is defined as:

class VertexAttributes(val m: Boolean, n: Any){

        val rootParentCustNumber: String = if(n == null) "Was Null" else n.toString
        val firstMsgFlg = m

}

I have an RDD of this object's type:

scala> myGraph.vertices
res92: org.apache.spark.graphx.VertexRDD[VertexAttributes] = VertexRDDImpl[2280] at RDD at VertexRDD.scala:57

Filtering on the RDD, I get the following:

scala> res92.filter{case(k,m) => k == 964088677}.collect
res94: Array[(org.apache.spark.graphx.VertexId, VertexAttributes)] = Array((964088677,VertexAttributes@2612b83f))

How can I access VertexAttributes@2612b83f in Array((964088677,VertexAttributes@2612b83f))?

I have tried res92.filter{case(k,m) => k == 964088677}.map{case Array(k,m)=> m.rootParentCustNumber}

But I get the following error:

<console>:243: error: pattern type is incompatible with expected type;
 found   : Array[T]
 required: (org.apache.spark.graphx.VertexId, VertexAttributes)
    (which expands to)  (Long, VertexAttributes)
       res92.filter{case(k,m) => k == 964088677}.map{case Array(k,m)=> m.rootParentCustNumber}
                                                               ^
  • Same as the filter stage: .map{ case (k, m) => m.rootParentCustNumber } Commented Mar 18, 2018 at 19:15
  • But I need to filter first to get the object I want. Commented Mar 18, 2018 at 19:16
  • You can still filter first. The filtering stage doesn't change the type of the RDD, so you can pipe the RDD returned by the filter stage into a map stage: res92.filter{ case(k, m) => k == 964088677 }.map{ case (k, m) => m.rootParentCustNumber } Commented Mar 18, 2018 at 19:18
  • Thank you. Could you please post that answer, so I can select? Commented Mar 18, 2018 at 19:20

1 Answer


The filtering stage doesn't change the type of the RDD (which is RDD[(Long, VertexAttributes)]).

So you can pipe the returned RDD of the filter stage with a map stage and work with each record the same way you did in the filtering stage:

res92
  .filter{ case (k, m) => k == 964088677 }
  .map{ case (k, m) => m.rootParentCustNumber }

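I think you've been misled by the collect stage, which turns the RDD into an Array. The elements of that Array are still (Long, VertexAttributes) tuples, so a pattern like case Array(k, m) never matches them.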
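
To make the point concrete without a Spark cluster, here is a minimal sketch that uses a local Seq as a stand-in for myGraph.vertices. The sample values ("C-100", vertex id 42) and the object name TuplePatternSketch are made up for illustration; only the class definition comes from the question. With a real RDD the filter and map calls would look the same, and collect would play the role that toArray plays here.

```scala
// Same class as in the question.
class VertexAttributes(val m: Boolean, n: Any) {
  val rootParentCustNumber: String = if (n == null) "Was Null" else n.toString
  val firstMsgFlg: Boolean = m
}

object TuplePatternSketch {
  // Hypothetical sample data; a local Seq stands in for myGraph.vertices.
  val vertices: Seq[(Long, VertexAttributes)] = Seq(
    (964088677L, new VertexAttributes(true, "C-100")),
    (42L, new VertexAttributes(false, null))
  )

  // Every element is a (Long, VertexAttributes) tuple, so both the
  // filter stage and the map stage match on a tuple pattern.
  val custNumbers: Seq[String] = vertices
    .filter { case (k, _) => k == 964088677L }
    .map { case (_, m) => m.rootParentCustNumber }

  // Materialising to an Array (toArray here, collect on an RDD) changes
  // only the container; the elements inside are still tuples.
  val collected: Array[(Long, VertexAttributes)] =
    vertices.filter { case (k, _) => k == 964088677L }.toArray
  val fromArray: Array[String] =
    collected.map { case (_, m) => m.rootParentCustNumber }

  def main(args: Array[String]): Unit = {
    println(custNumbers.mkString(","))
    println(fromArray.mkString(","))
  }
}
```

If you only need the single matching vertex after collecting, something like collected.head._2.rootParentCustNumber would also work, assuming the filter is known to return at least one element.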
