How to read data from sequence files using Spark Streaming

Question

I have a sequence file whose keys are Text and whose values are a custom data type.

However, Spark Streaming is unable to read data from the sequence file with the following code:

JavaPairInputDStream<Text, CustomDataType> myRDD =
    jssc.fileStream(path, Text.class, CustomDataType.class, SequenceFileInputFormat.class,
        new Function<Path, Boolean>() {
            @Override
            public Boolean call(Path v1) throws Exception {
                return Boolean.TRUE;
            }
        }, false);

The IDE reports the following compile error:

Bound mismatch: The generic method fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path,Boolean>, boolean) of type JavaStreamingContext is not applicable for the arguments (String, Class<Text>, Class<DeltaCounter>, Class<SequenceFileInputFormat>, new Function<Path,Boolean>(){}, boolean). The inferred type SequenceFileInputFormat is not a valid substitute for the bounded parameter <F extends InputFormat<K,V>>

How can I read a sequence file in Spark Streaming?

vanekjar · Accepted Answer · 2015-06-02 13:00:35Z


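You need to import the class from the correct package. You are probably importing the old-API SequenceFileInputFormat from org.apache.hadoop.mapred, while fileStream requires an InputFormat from the new API in org.apache.hadoop.mapreduce. Use this import instead: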

import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
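
To illustrate why the import matters, here is a minimal, Spark-free sketch of the same kind of generic bound. Every type name in it (NewApiInputFormat, OldApiInputFormat, NewApiSequenceFormat, OldApiSequenceFormat) is a hypothetical stand-in, not a real Hadoop or Spark class, and the fileStream method only mimics the shape of the bound <F extends InputFormat<K,V>> from the error message.

```java
// Spark-free model of a bounded generic method like fileStream.
// All types below are hypothetical stand-ins, not Hadoop or Spark classes.
public class BoundMismatchSketch {

    // Stand-in for the new-API InputFormat (org.apache.hadoop.mapreduce).
    interface NewApiInputFormat<K, V> {}

    // Stand-in for the old-API InputFormat (org.apache.hadoop.mapred).
    interface OldApiInputFormat<K, V> {}

    // Two unrelated classes; in real Hadoop they share one simple name
    // and differ only by package, which is why the import decides everything.
    static class NewApiSequenceFormat implements NewApiInputFormat<String, Integer> {}
    static class OldApiSequenceFormat implements OldApiInputFormat<String, Integer> {}

    // Same shape of bound as in the error: F must extend the new-API interface.
    static <K, V, F extends NewApiInputFormat<K, V>> String fileStream(
            Class<K> keyClass, Class<V> valueClass, Class<F> formatClass) {
        return keyClass.getSimpleName() + "/" + valueClass.getSimpleName()
                + " via " + formatClass.getSimpleName();
    }

    public static void main(String[] args) {
        // Compiles: NewApiSequenceFormat satisfies the bound on F.
        System.out.println(
                fileStream(String.class, Integer.class, NewApiSequenceFormat.class));

        // Would not compile if uncommented: OldApiSequenceFormat is outside
        // the bound, so the compiler reports a bound mismatch for F.
        // fileStream(String.class, Integer.class, OldApiSequenceFormat.class);
    }
}
```

In the sketch the two format classes have different names, so the mistake is visible at the call site. In the question both classes are called SequenceFileInputFormat, so the call looks identical either way and only the import line differs.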

