I am using Spark with Scala. I created a String array from a CSV column, and I want to use the array values to filter another CSV file. I am trying this code:

val ID_a = dTestSample1.select("ID").dropDuplicates("ID")
val x =ID_a.select("ID").collect().map { row => row.toString() }
var i = 0 
for (newID <- x){
  val sbinPercent5 = dTestSample1.filter("ID".equals(newID )) ...................................
  i+=1 }

I am getting an "overloaded method value filter" error. Any suggestions?

1 Answer

ID_a.select("ID").collect()

This piece of code returns an Array[Row], which you then turn into an Array[String] by mapping the function row => row.toString() over it. At this point, however, you no longer have a DataFrame.
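One detail of this conversion is worth checking: Row.toString renders the whole row, square brackets included, so the resulting strings may not equal the raw ID values. The sketch below illustrates this with a hand-built single-column Row; the object name RowToStringDemo and the value "42" are made up for illustration, and getString(0) assumes the column really holds strings.

```scala
import org.apache.spark.sql.Row

object RowToStringDemo {
  // A single-column row built by hand; no SparkSession is needed for this.
  val row: Row = Row("42")

  // toString renders the whole row, brackets included.
  val viaToString: String = row.toString

  // getString(0) returns the value stored in the first column.
  val viaGetString: String = row.getString(0)

  def main(args: Array[String]): Unit = {
    println(viaToString)  // [42]
    println(viaGetString) // 42
  }
}
```

If the collected values are later compared against the ID column, the second form is probably the one you want.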

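filter is a higher-order function that takes a predicate. That holds for Scala collections such as Array, which gain filter through an implicit conversion and expect a String => Boolean in this case, and it holds for a DataFrame, whose typed filter overload expects a Row => Boolean (the other overloads take a Column or a SQL expression String). However, instead of passing a function, you are directly calling the method equals on the string "ID", which compares the literal text "ID" with newID and evaluates to a Boolean. You are therefore passing filter a Boolean instead of a predicate, and since no overload accepts a Boolean, the compiler reports "overloaded method value filter".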
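To make the difference between a Boolean and a predicate concrete, here is a small plain-Scala sketch that needs no Spark. The object name PredicateDemo and the sample values are hypothetical and only serve the illustration.

```scala
object PredicateDemo {
  // Made-up values standing in for the collected IDs.
  val ids: Array[String] = Array("a1", "b2", "a1", "c3")
  val newID: String = "a1"

  // This is evaluated once, before filter is involved at all.
  // It compares the literal text "ID" with newID and yields a plain Boolean.
  val notAPredicate: Boolean = "ID".equals(newID)

  // ids.filter(notAPredicate) // does not compile: filter expects String => Boolean

  // A predicate is a function that filter applies to every element in turn.
  val isNewID: String => Boolean = s => s == newID
  val matches: Array[String] = ids.filter(isNewID)

  // The placeholder syntax builds the same kind of function.
  val sameMatches: Array[String] = ids.filter(_ == newID)

  def main(args: Array[String]): Unit = {
    println(notAPredicate)          // false
    println(matches.mkString(", ")) // a1, a1
  }
}
```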

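Note that dTestSample1 is still a DataFrame; only x is an Array. To solve the problem, pass a predicate to the filter method. The predicate receives a Row, so compare the ID field rather than the whole row (this assumes ID is a string column and newID holds the bare ID value):

 dTestSample1.filter(row => row.getAs[String]("ID") == newID)

or, more concisely, with a Column expression (this needs import org.apache.spark.sql.functions.col):

 dTestSample1.filter(col("ID") === newID)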
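If the end goal is to keep only the rows whose ID appears in the collected array, the loop may not be needed at all. The following is a minimal sketch, assuming a local SparkSession and made-up sample data standing in for the two CSV files; the names FilterByIds, distinctIds, keepMatching, first, and second are hypothetical and not taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilterByIds {
  // Collects the distinct ID values of `source` as plain strings.
  // Assumes ID is a string column, as it is when a CSV is read without schema inference.
  def distinctIds(source: DataFrame): Array[String] =
    source.select("ID").distinct().collect().map(_.getString(0))

  // Keeps the rows of `target` whose ID is one of `ids`.
  def keepMatching(target: DataFrame, ids: Array[String]): DataFrame =
    target.filter(col("ID").isin(ids: _*))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-by-ids")
      .getOrCreate()
    import spark.implicits._

    // Made-up data standing in for the two CSV files.
    val first = Seq("a1", "b2", "a1").toDF("ID")
    val second = Seq(("a1", 10), ("c3", 20), ("b2", 30)).toDF("ID", "value")

    // Shows the rows of `second` whose ID also occurs in `first`.
    keepMatching(second, distinctIds(first)).show()

    spark.stop()
  }
}
```

For a very long list of IDs, joining the two DataFrames on ID would likely scale better than collecting the values to the driver and passing them to isin.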

My suggestion, however, is to fully leverage the DataFrame API for the query you are running. It appears to count the number of occurrences of each value in the ID column of your initial DataFrame, which can be expressed quite simply as follows:

val df = dTestSample1.select("ID").groupBy("ID").count()

You can now collect, show or perform any sort of action with a DataFrame that holds the count of occurrences of each value of the ID column.
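As a rough end-to-end illustration of the grouped count, the sketch below runs it on a local SparkSession with made-up data and then collects the small result into a Map. The names CountPerId, countPerId, and sample are hypothetical, and ID is again assumed to be a string column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CountPerId {
  // One row per distinct ID, with a `count` column holding its number of occurrences.
  def countPerId(df: DataFrame): DataFrame =
    df.groupBy("ID").count()

  // Brings the (small) result to the driver as a Map from ID to count.
  def asMap(counts: DataFrame): Map[String, Long] =
    counts.collect().map(r => r.getString(0) -> r.getLong(1)).toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("count-per-id")
      .getOrCreate()
    import spark.implicits._

    // Made-up data standing in for dTestSample1.
    val sample = Seq("a1", "b2", "a1", "a1").toDF("ID")

    val counts = countPerId(sample)
    counts.show()
    println(asMap(counts))

    spark.stop()
  }
}
```

Collecting is only reasonable while the number of distinct IDs is small; otherwise keep working with the DataFrame itself.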
