1

I have to filter a column in spark dataframe using a Array[String]

I have an parameter file like below,

variable1=100,200

I read the parameter file and split each row by "=" and load in to a Map[String,String] In order to get the value, I pass the key "varaible1" and split the value by ","

val value1:Array[String] = parameterValues("varaible1").split(",")

now I need to use this value1 while filtering a dataframe.

val finalDf = testDf.filter("column1 in ($value1) and column2 in ($value1)")

I'm getting the below error,

org.apache.spark.sql.catalyst.parser.ParseException: 
mismatched input '(' expecting <EOF>(line 1, pos 12)

== SQL ==
column1 in ([Ljava.lang.String;@760b9444) and column2 in ([Ljava.lang.String;@5397d41e)
------------^^^

Any suggestions?

2 Answers 2

1

To filter a column by an array, you can use the isin column method:

import org.apache.spark.sql.functions.col

val finalDf = testDf.filter(col("column1").isin(value1: _*) && col("column2").isin(value1: _*))
Sign up to request clarification or add additional context in comments.

Comments

1

You can filter a column using an array as you've done. To correct your SQL expression, you need to do two things.

First, you forgot to put the 's' string interpolator at the start of your string representing your SQL expression, as below:

s"column1 in ($value1) and column2 in ($value1)"

And then, you need to convert your Array[String] to a well formatted String that will be understood as an SQL array. To do so, you can use mkString method on your value1 array:

value1.mkString("'", "','","'")

On your array Array("100", "200"), this method would return the string "'100','200'"

If we wrap up everything, we get the following expression:

val finalDf = testDf.filter(s"column1 in (${value1.mkString("'", "','","'")}) and column2 in (${value1.mkString("'", "','","'")})")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.