I am learning Apache Spark and I am using Java 8 and Spark Core 2.3.2.
When I call the map function on an RDD, it only works if I pass a lambda expression.
So this works:
JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4));
JavaRDD<Integer> result = rdd.map(x -> x*x );
But this does not; it throws org.apache.spark.SparkException: Task not serializable:
// Function here is org.apache.spark.api.java.function.Function
JavaRDD<Integer> result = rdd.map(new Function<Integer, Integer>() {
    @Override
    public Integer call(Integer x) { return x * x; }
});
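To narrow this down, here is a minimal sketch without Spark that seems to show the same difference using plain Java serialization. It assumes my map call sits inside an instance method of a class that is not Serializable, which may or may not be the real cause. SerFunction, Driver and canSerialize are hypothetical names made up for this sketch; SerFunction only stands in for Spark's Function interface, which also extends Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationSketch {

    // Hypothetical stand-in for Spark's Function; like it, this extends Serializable.
    interface SerFunction<T, R> extends Serializable {
        R call(T t) throws Exception;
    }

    // Hypothetical driver class. Assumption: it is deliberately NOT Serializable.
    static class Driver {

        // Anonymous inner class created in an instance method: it keeps a hidden
        // reference to the enclosing Driver, which serialization then tries to write.
        SerFunction<Integer, Integer> anonymous() {
            return new SerFunction<Integer, Integer>() {
                @Override
                public Integer call(Integer x) { return x * x; }
            };
        }

        // Lambda that uses nothing from the enclosing instance, so nothing
        // from Driver is captured.
        SerFunction<Integer, Integer> lambda() {
            return x -> x * x;
        }
    }

    // Returns true if plain Java serialization can write the object.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Driver d = new Driver();
        System.out.println("anonymous class serializable: " + canSerialize(d.anonymous()));
        System.out.println("lambda serializable: " + canSerialize(d.lambda()));
    }
}
```

If that assumption holds, the anonymous class fails only because it drags the enclosing object along, while the lambda does not.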
Can someone please explain why? Thanks