I'm working with Apache Spark's ALS model, and the recommendForAllUsers method returns a DataFrame with this schema:

    root
     |-- user_id: integer (nullable = false)
     |-- recommendations: array (nullable = true)
     |    |-- element: struct (containsNull = true)
     |    |    |-- item_id: integer (nullable = true)
     |    |    |-- rating: float (nullable = true)

In practice, the recommendations are a WrappedArray like:
WrappedArray([636958,0.32910484], [995322,0.31974298], [1102140,0.30444127], [1160820,0.27908015], [1208899,0.26943958])
I'm trying to extract just the item_ids and return them as a 1D array, so the example above would become [636958,995322,1102140,1160820,1208899].
This is what's giving me trouble. So far I have:

    val numberOfRecs = 20
    val userRecs = model.recommendForAllUsers(numberOfRecs).cache()
    val strippedScores = userRecs.rdd.map(row => {
      val user_id = row.getInt(0)
      val recs = row.getAs[Seq[Row]](1)
      val item_ids = new Array[Int](numberOfRecs)
      recs.toArray.foreach(x => {
        item_ids :+ x.get(0)
      })
      item_ids
    })

But this just prints [I@2f318251, and if I convert the result to a string via mkString(","), I get 0,0,0,0,0,0
Any thoughts on how I can extract the item_ids and return them as a separate, 1D array?
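
For reference, here is a minimal Spark-free sketch of what seems to be going on, under some assumptions. `Rec`, `brokenItemIds`, and `itemIds` are hypothetical names introduced only for illustration, and `Rec` stands in for one struct in the recommendations array. On a Scala array, `:+` returns a new array rather than modifying the existing one, so the loop's result is thrown away and the preallocated zeros come back. The `[I@...` text is just the JVM's default string form of an `Array[Int]`.

```scala
object ExtractItemIds {
  // Hypothetical stand-in for one element of the `recommendations` array.
  final case class Rec(itemId: Int, rating: Float)

  // Mirrors the loop in the question: `:+` builds a NEW array each time and
  // never touches `ids`, so the zero-filled array is what gets returned.
  def brokenItemIds(recs: Seq[Rec], n: Int): Array[Int] = {
    val ids = new Array[Int](n)
    recs.foreach(r => ids :+ r.itemId) // result is discarded
    ids
  }

  // Map each element to its id instead; no preallocation or mutation needed.
  def itemIds(recs: Seq[Rec]): Array[Int] = recs.map(_.itemId).toArray

  def main(args: Array[String]): Unit = {
    val recs = Seq(
      Rec(636958, 0.32910484f),
      Rec(995322, 0.31974298f),
      Rec(1102140, 0.30444127f)
    )
    println(brokenItemIds(recs, 3).mkString(",")) // 0,0,0
    println(itemIds(recs).mkString(","))          // 636958,995322,1102140
  }
}
```

If that reading is right, the same idea should carry over to the Spark code by replacing the loop with something like row.getAs[Seq[Row]](1).map(_.getInt(0)).toArray, returned together with user_id. Staying in the DataFrame API, userRecs.select($"user_id", $"recommendations.item_id") should also yield an array of just the ids, since selecting a field through an array of structs gives an array of that field.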