I am trying to learn Spark GraphX on Windows 10 by replicating the code here. That code was written for an older version of Spark, and I can't work out how to create the vertices with my version. This is the code:
import scala.util.MurmurHash
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val path = "F:/Soft/spark/2008.csv"
val df_1 = spark.read.option("header", true).csv(path)
val flightsFromTo = df_1.select($"Origin",$"Dest")
val airportCodes = df_1.select($"Origin", $"Dest").flatMap(x => Iterable(x(0).toString, x(1).toString))
// error caused by the following line
val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash.stringHash(x), x))
The following is the error message:
<console>:57: error: missing parameter type
val airportVertices: RDD[(VertexId, String)] = airportCodes.distinct().map(x => (MurmurHash.stringHash(x), x))
^
I think the syntax is obsolete. I looked for the current syntax in the official documentation, but could not find anything that helped. The data set can be downloaded from here.
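A possible explanation, sketched under the assumption that this runs on Spark 2.x or later: there, flatMap on a DataFrame yields a Dataset rather than an RDD, so the map on the failing line cannot produce the declared RDD[(VertexId, String)]. In addition, stringHash returns an Int, whereas VertexId is a Long. The sketch below goes through .rdd first and widens the hash. The in-memory rows are a made-up stand-in for 2008.csv, and scala.util.hashing.MurmurHash3 is swapped in for the deprecated scala.util.MurmurHash.

```scala
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import scala.util.hashing.MurmurHash3

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().master("local[*]").appName("vertices-sketch").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the Origin/Dest columns of 2008.csv.
val flightsFromTo = Seq(("SFO", "JFK"), ("JFK", "ORD"), ("ORD", "SFO")).toDF("Origin", "Dest")

// Drop to the RDD API first, so flatMap, distinct and map all return RDDs.
val airportCodes: RDD[String] =
  flightsFromTo.rdd.flatMap(row => Seq(row.getString(0), row.getString(1)))

// stringHash returns an Int; widen it to Long so the pair matches (VertexId, String).
val airportVertices: RDD[(VertexId, String)] =
  airportCodes.distinct().map(code => (MurmurHash3.stringHash(code).toLong, code))
```

With the real file, flightsFromTo would be the df_1.select($"Origin", $"Dest") from above. A 32-bit hash can in principle collide for two different codes, so this is only a convenient way to get ids, not a guaranteed-unique one.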
UPDATE:
Basically, I'm trying to create the vertices and edges so that I can finally build a graph as shown in the tutorial. I'm also new to the MapReduce paradigm.
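For reference, the overall shape of that goal (vertices, then edges, then a graph) might look roughly like the sketch below. It is not the tutorial's code: the route pairs are invented sample data, the edge attribute (a flight count per route) is just one possible choice, and MurmurHash3 is used here only as an assumed way to derive ids.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import scala.util.hashing.MurmurHash3

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().master("local[*]").appName("graph-sketch").getOrCreate()

// Hypothetical (Origin, Dest) pairs standing in for the rows of 2008.csv.
val routes: RDD[(String, String)] =
  spark.sparkContext.parallelize(Seq(("SFO", "JFK"), ("JFK", "ORD"), ("SFO", "JFK")))

// One vertex per distinct airport code, keyed by its hash widened to Long.
val vertices: RDD[(VertexId, String)] =
  routes.flatMap { case (origin, dest) => Seq(origin, dest) }
    .distinct()
    .map(code => (MurmurHash3.stringHash(code).toLong, code))

// One edge per distinct route; the attribute counts the flights on that route.
val edges: RDD[Edge[Int]] =
  routes
    .map { case (origin, dest) =>
      ((MurmurHash3.stringHash(origin).toLong, MurmurHash3.stringHash(dest).toLong), 1)
    }
    .reduceByKey(_ + _)
    .map { case ((src, dst), flights) => Edge(src, dst, flights) }

// Vertex attribute: airport code. Edge attribute: number of flights.
val graph: Graph[String, Int] = Graph(vertices, edges)
```

The map followed by reduceByKey is the MapReduce-style step: each row is mapped to a (route, 1) pair, and the pairs are then reduced per route by summing.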