I am new to Spark and Scala.
I have an RDD that is of type org.apache.spark.rdd.RDD[Array[String]].
Here is the output of myRdd.take(3):
Array(
  Array(1, 2524474, CBSGPRS, 1, 2015-09-09 10:42:03, 0, 47880, 302001131103734, NAT, "", 502161081073570, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 475078439, 41131;0;0, "", 102651;0;0, 3|3),
  Array(2, 2524516, CBSGPRS, 1, 2015-09-09 23:42:14, 0, 1260, 302001131104272, NAT, "", 502161081074085, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 2044745984, 3652;0;0, "", 8636;0;0, 3|3),
  Array(3, 2524545, CBSGPRS, 1, 2015-09-09 14:56:55, 0, 32886, 302001131101629, NAT, "", 502161081071599, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 1956194307, 14164657;0;0, "", 18231194;0;0, 3|3)
)
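For context, an RDD of this type would typically be produced by reading a delimited text file and splitting each line. This is just a sketch with placeholder names (the file name and the comma delimiter are not my actual input), not my real loading code:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Rough sketch only: "records.csv" and the "," delimiter are placeholders.
val conf = new SparkConf().setAppName("gprs-question").setMaster("local[*]")
val sc = new SparkContext(conf)

// split(..., -1) keeps the trailing empty fields that show up as "" above
val myRdd: RDD[Array[String]] = sc.textFile("records.csv").map(_.split(",", -1))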
I am trying to map it as follows:
var gprsMap = myRdd.collect().map { tuple =>
  // bind variables to the tuple
  var (recKey, origRecKey, recTypeId, durSpanId, timestamp, prevConvDur, convDur,
       msisdn, callType, aPtyCellId, aPtyImsi, aPtyMsrn, bPtyNbr, bPtyNbrTypeId,
       bPtyCellId, bPtyImsi, bPtyMsrn, inTrgId, outTrgId, callStatusId, suppSvcId, provChgAmt,
       genFld1, genFld2, genFld3, genFld4, genFld5) = tuple
  var dtm = timestamp.split(" ")
  var idx = timestamp.indexOf(' ')
  var dt = timestamp.slice(0, idx)
  var tm = timestamp.slice(idx + 1, timestamp.length)
  // return the results tuple
  ((dtm(0), msisdn, callType, recTypeId, provChgAmt), (convDur))
}
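To make the intent clearer, this is the element I expect the map above to produce for the first sample row. The values are copied from the listing, and the grouping just reflects what the closure is meant to return:

// Hypothetical expected result for the first sample row:
// key = (date, msisdn, callType, recTypeId, provChgAmt), value = convDur
val expectedFirst: ((String, String, String, String, String), String) =
  (("2015-09-09", "302001131103734", "NAT", "CBSGPRS", "0"), "47880")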
I keep getting this error:
error: object Tuple27 is not a member of package scala.
I am not sure what is causing this error. Can someone help?