
I have a DataFrame with two columns: listA, stored as Seq[String], and valB, stored as String. I want to create a third column, valC, of Int type, whose value is 1 if valB is present in listA and 0 otherwise.

I tried doing the following:

val dfWithAdditionalColumn = df.withColumn("valC", when($"listA".contains($"valB"), 1).otherwise(0))

But Spark failed to execute this and gave the following error:

cannot resolve 'contains('listA', 'valB')' due to data type mismatch: argument 1 requires string type, however, 'listA' is of array type.;

How do I use an array-type column in a CASE expression?

Thanks, Devj


2 Answers


You should use array_contains:

import org.apache.spark.sql.functions.{expr, when}

df.withColumn("valC", when(expr("array_contains(listA, valB)"), 1).otherwise(0))
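If you prefer the typed Column API over an SQL expression string, the same check can be written directly (a minimal sketch; this assumes a Spark version where passing a Column as the value argument works, which holds because lit() passes an existing Column through unchanged):

```scala
import org.apache.spark.sql.functions.{array_contains, when}

// Same logic with Column arguments instead of an expression string.
df.withColumn("valC", when(array_contains($"listA", $"valB"), 1).otherwise(0))

// Or, since a Boolean casts directly to 0/1:
df.withColumn("valC", array_contains($"listA", $"valB").cast("int"))
```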



You can write a simple UDF that checks whether the element is present in the array:

val arrayContains = udf( (col1: Int, col2: Seq[Int]) => if(col2.contains(col1) ) 1 else 0 )

Then call it, passing the columns in the correct order (element first, array second):

df.withColumn("hasAInB", arrayContains($"a", $"b")).show

+---+---------+-------+
|  a|        b|hasAInB|
+---+---------+-------+
|  1|   [1, 2]|      1|
|  2|[2, 3, 4]|      1|
|  3|   [1, 4]|      0|
+---+---------+-------+
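Since the question's columns are Seq[String] and String rather than Int, the UDF body adapts directly. The per-row logic is plain Scala and can be checked on its own (a sketch; containsAsInt is a hypothetical name):

```scala
// Per-row logic of the UDF, adapted to the question's types:
// valB is a String and listA is a Seq[String].
val containsAsInt: (String, Seq[String]) => Int =
  (value, list) => if (list.contains(value)) 1 else 0

// In Spark this would be wrapped as:
//   val arrayContainsUdf = udf(containsAsInt)
//   df.withColumn("valC", arrayContainsUdf($"valB", $"listA"))

println(containsAsInt("a", Seq("a", "b")))  // 1
println(containsAsInt("c", Seq("a", "b")))  // 0
```

Note that a UDF is opaque to the Catalyst optimizer, so the built-in array_contains from the other answer is generally preferable.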

