I have an RDD[(BreezeMatrix[Double], Array[Int])] and I want to delete the columns of the matrix whose indices are in the array.

E.g. if the array is Array(1, 3, 4), I want to delete columns 1, 3 and 4 from the matrix.

My code is:

val viol = rdd.map(x => for (p <- x._2) { val c = x._1.delete(p, Axis._0) })

To start with, I get Unit as the return type, even if I return the matrix. Additionally, I was wondering whether there is a more functional way to do this in Scala.

  • Probably you are getting Unit because x._1.delete(...) returns it. In any case, let me remind you that it may not be a good idea to update an object while you are iterating over it. Commented Sep 10, 2018 at 11:03
  • Off the top of my head, since I don't have much time at the moment: I think you could duplicate the matrix and use 'filter' to exclude the rows in your array. It might also be necessary to use 'zipWithIndex' to remember which row has which index, and finally a map function to remove the indices again before returning the final matrix. Commented Sep 10, 2018 at 14:42

1 Answer

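A more functional way is to fold over the array of indices to delete, carrying the matrix as the accumulated state between steps. foldLeft does this: it starts from the original matrix and applies one deletion per index. Two details matter. In Breeze, Axis._0 refers to rows and Axis._1 to columns, so deleting columns needs Axis._1. Each delete also returns a new, narrower matrix, so the indices are processed in descending order to keep the remaining ones valid:

rdd.map {
  case (matrix, toDelete) =>
    // Delete the highest index first so the remaining indices stay valid.
    toDelete.distinct.sorted.reverse.foldLeft(matrix) {
      case (acc, index) => acc.delete(index, Axis._1) // Axis._1 selects columns
    }
}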
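
As for the Unit in the question: a for loop without yield is evaluated only for its side effects, so its value is Unit whatever the body computes, and delete returns a new matrix rather than changing x._1. Below is a minimal, self-contained sketch of both points. It assumes the matrix is a Breeze DenseMatrix[Double] (the question's BreezeMatrix alias is not shown) and uses made-up 2 x 5 data instead of an RDD; the names DeleteColumnsSketch and deleteColumns are hypothetical. It uses Axis._1 because that is the column axis in Breeze. The last call relies on a delete overload that takes a Seq of column indices, which I believe DenseMatrix provides, so please check it against your Breeze version.

```scala
import breeze.linalg.{Axis, DenseMatrix}

object DeleteColumnsSketch {
  // Removes the given columns one at a time, highest index first, so that an
  // earlier deletion never shifts a column that is still waiting to be removed.
  def deleteColumns(matrix: DenseMatrix[Double], cols: Array[Int]): DenseMatrix[Double] =
    cols.distinct.sorted.reverse.foldLeft(matrix) { (acc, index) =>
      acc.delete(index, Axis._1) // Axis._1 is the column axis
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical 2 x 5 input: column j holds the values j and 10 + j.
    val matrix = DenseMatrix(
      (0.0, 1.0, 2.0, 3.0, 4.0),
      (10.0, 11.0, 12.0, 13.0, 14.0)
    )
    val toDelete = Array(1, 3, 4)

    // A for loop without `yield` runs only for its side effects: its value is
    // Unit, and each matrix returned by delete is simply thrown away.
    val loopResult: Unit = for (p <- toDelete) { matrix.delete(p, Axis._1) }

    val viaFold = deleteColumns(matrix, toDelete)

    // Assumed overload that takes all column indices in one call.
    val viaSeq = matrix.delete(toDelete.toSeq, Axis._1)

    println(viaFold) // columns 0 and 2 of the original remain
    println(viaSeq)
  }
}
```

On the RDD from the question, the same helper could be applied per element with rdd.map { case (m, cols) => DeleteColumnsSketch.deleteColumns(m, cols) }, provided the element type really is DenseMatrix[Double].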