I'm trying to read some data from Hadoop into an RDD in Spark using the interactive Scala shell, but I'm having trouble accessing some of the classes I need to deserialise the data.
I start by importing the necessary class:
import com.example.ClassA
This works fine; ClassA is located in a jar on the 'jars' path and has ClassB as a public static nested class.
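For context, this is roughly how the shell is started and the import is done (the jar path below is a placeholder, not my actual path):

// Shell launched with the example jar on the classpath, roughly:
//   spark-shell --jars /path/to/example.jar   (placeholder path)
import com.example.ClassA   // resolves without complaint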
I'm then trying to use ClassB like so:
val rawData = sc.newAPIHadoopFile(
  dataPath,
  classOf[com.example.mapreduce.input.Format[com.example.ClassA$ClassB]],
  classOf[org.apache.hadoop.io.LongWritable],
  classOf[com.example.ClassA$ClassB])
This is slightly complicated by one of the other classes taking ClassB as a type parameter, but I think that should be fine.
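For reference, the signature of newAPIHadoopFile is roughly this (note the bound on F, which matters later):

def newAPIHadoopFile[K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K, V]](
    path: String,
    fClass: Class[F],
    kClass: Class[K],
    vClass: Class[V],
    conf: org.apache.hadoop.conf.Configuration = hadoopConfiguration): RDD[(K, V)]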
When I execute the call above, I get the following error:
<console>:17: error: type ClassA$ClassB is not a member of package com.example
I have also tried using the import statement:
import com.example.ClassA$ClassB
and the shell also accepts that without complaint.
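My understanding (possibly wrong) is that the '$' form is the JVM binary name of the nested class, which is why string-based reflection accepts it, while Scala source refers to a Java static nested class with a dot:

// The '$' form is the JVM binary name, so reflection resolves it:
val c: Class[_] = Class.forName("com.example.ClassA$ClassB")

// In Scala (and Java) source, the nested class is written with '.':
val k = classOf[com.example.ClassA.ClassB]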
Any advice on how to debug this would be appreciated.
Thanks for reading.
Update:
Changing the '$' to a '.' to reference the nested class seems to get past this problem, although I then get the following type error:
<console>:17: error: inferred type arguments [org.apache.hadoop.io.LongWritable,com.example.ClassA.ClassB,com.example.mapreduce.input.Format[com.example.ClassA.ClassB]] do not conform to method newAPIHadoopFile's type parameter bounds [K,V,F <: org.apache.hadoop.mapreduce.InputFormat[K,V]]
Is com.example.ClassA.ClassB the right way to reference the nested class here?
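For reference, this is the shape of call I'd expect to typecheck, with the type parameters spelled out explicitly. It assumes Format[T] extends org.apache.hadoop.mapreduce.InputFormat[LongWritable, T], which I haven't verified against the jar; if Format's key type is actually something other than LongWritable, the key class and K parameter would need to change to match.

import org.apache.hadoop.io.LongWritable
import com.example.ClassA
import com.example.mapreduce.input.Format

// Assumes Format[T] extends InputFormat[LongWritable, T] (unverified).
val rawData = sc.newAPIHadoopFile[LongWritable, ClassA.ClassB, Format[ClassA.ClassB]](
  dataPath,
  classOf[Format[ClassA.ClassB]],
  classOf[LongWritable],
  classOf[ClassA.ClassB])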