
I'm trying to read some data from Hadoop into an RDD in Spark using the interactive Scala shell, but I'm having trouble accessing some of the classes I need to deserialise the data.

I start by importing the necessary class:

import com.example.ClassA

Which works fine. ClassA is located in a jar on the 'jars' path and has ClassB as a public static nested class.

I'm then trying to use ClassB like so:

val rawData = sc.newAPIHadoopFile(dataPath, classOf[com.example.mapreduce.input.Format[com.example.ClassA$ClassB]], classOf[org.apache.hadoop.io.LongWritable], classOf[com.example.ClassA$ClassB])

This is slightly complicated by one of the other classes taking ClassB as a type parameter, but I think that should be fine.

When I execute this line, I get the following error:

<console>:17: error: type ClassA$ClassB is not a member of package com.example

I have also tried the import statement

import com.example.ClassA$ClassB 

and the shell seems fine with that too.

Any advice on how I could proceed to debug this would be appreciated.

Thanks for reading.

Update:

Changing the '$' to a '.' to reference the nested class seems to get past this problem, although I then got the following type error:

<console>:17: error: inferred type arguments [org.apache.hadoop.io.LongWritable,com.example.ClassA.ClassB,com.example.mapreduce.input.Format[com.example.ClassA.ClassB]] do not conform to method newAPIHadoopFile's type parameter bounds [K,V,F <: org.apache.hadoop.mapreduce.InputFormat[K,V]]
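As a side note on the '$' vs '.' issue, here is a minimal self-contained sketch (Outer and Inner are made-up stand-ins, not the asker's classes): in Scala source you refer to a nested class with '.', while the '$' separator only appears in the JVM binary name of the class.

```scala
// Stand-in for a class with a nested class (not the real ClassA/ClassB).
object Outer {
  class Inner
}

// In Scala source, the nested type is written with '.':
val c = classOf[Outer.Inner]

// The '$' only shows up in the JVM binary name:
val jvmName = c.getName // ends with "Outer$Inner"
```

This is why `import com.example.ClassA$ClassB` is not the right spelling in Scala source, even though `ClassA$ClassB` is what you see in the jar's class files.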
    I think you mean nested class, not subclass. Commented Feb 25, 2015 at 15:01
  • I should mention, I'm using Spark 1.2.0 Commented Feb 25, 2015 at 15:01
  • Sorry, yes it's a nested class not a subclass - my bad Commented Feb 25, 2015 at 15:02
    have you tried using com.example.ClassA.ClassB? Commented Feb 25, 2015 at 15:04
  • Just had a go with that and got '<console>:17: error: inferred type arguments [org.apache.hadoop.io.LongWritable,com.example.ClassA.ClassB,com.example.mapreduce.input.Format[com.example.ClassA.ClassB]] do not conform to method newAPIHadoopFile's type parameter bounds [K,V,F <: org.apache.hadoop.mapreduce.InputFormat[K,V]]'. Would that imply it's gotten past any class issues it had? Commented Feb 25, 2015 at 15:34

1 Answer


Notice the type parameters that newAPIHadoopFile expects:

K,V,F <: org.apache.hadoop.mapreduce.InputFormat[K,V]

The important part here is that the bound on F reuses the same K and V, i.e. the format's key and value types must match the first two type parameters of the method exactly.

In your case, the third parameter should be of type

F <: org.apache.hadoop.mapreduce.InputFormat[LongWritable, ClassA.ClassB]

Does your Format class extend FileInputFormat<LongWritable, V>?
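To make the bound concrete, here is a small self-contained sketch using stub classes in place of the real Hadoop and Spark types (the Format stub mirrors the signature given in the comments, which fixes the key type to NullWritable):

```scala
// Stubs standing in for the Hadoop types, purely to illustrate the bound.
abstract class InputFormat[K, V]
class LongWritable
class NullWritable
class Message
class ClassB extends Message

// Shaped like the asker's Format<V extends Message>: key fixed to NullWritable.
class Format[V <: Message] extends InputFormat[NullWritable, V]

// Same bound shape as newAPIHadoopFile's [K, V, F <: InputFormat[K, V]].
def check[K, V, F <: InputFormat[K, V]](
    kc: Class[K], vc: Class[V], fc: Class[F]): String =
  s"${kc.getSimpleName} / ${vc.getSimpleName}"

// Compiles, because Format[ClassB] <: InputFormat[NullWritable, ClassB]:
val ok = check(classOf[NullWritable], classOf[ClassB], classOf[Format[ClassB]])

// With K = LongWritable the bound fails to hold, which is exactly the
// "do not conform to ... type parameter bounds" error reported above:
// check(classOf[LongWritable], classOf[ClassB], classOf[Format[ClassB]])
```

The fix, then, is to pass a key class that matches what the format actually declares, rather than LongWritable.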


5 Comments

Yes, it extends 'org.apache.hadoop.mapreduce.lib.input.FileInputFormat' which in turn extends 'org.apache.hadoop.mapreduce.InputFormat'
Sorry, I should have been clearer here; my 'Format' class is declared as 'Format<V extends Message> extends FileInputFormat<NullWritable, V>' so it only takes one type parameter directly.
@user1111284 please post the signature of your Format class and the hierarchy of its type parameters
Does LongWritable extend NullWritable and does ClassA.ClassB extend Message?
Right, of course - NullWritable does not extend LongWritable; swapping that out in the command has fixed it. Thanks for your help
