I can create a new DataFrame with one column of Map type.

val inputDF2 = Seq(
(1, "Visa", 1, Map[String, Int]()), 
(2, "MC", 2, Map[String, Int]())).toDF("id", "card_type", "number_of_cards", "card_type_details")
scala> inputDF2.show(false)
+---+---------+---------------+-----------------+
|id |card_type|number_of_cards|card_type_details|
+---+---------+---------------+-----------------+
|1  |Visa     |1              |[]               |
|2  |MC       |2              |[]               |
+---+---------+---------------+-----------------+

Now I want to add a new column of the same type as card_type_details. I'm trying to use Spark's withColumn method to add it.

inputDF2.withColumn("tmp", lit(null) cast "map<String, Int>").show(false)

+---+---------+---------------+-----------------+----+
|id |card_type|number_of_cards|card_type_details|tmp |
+---+---------+---------------+-----------------+----+
|1  |Visa     |1              |[]               |null|
|2  |MC       |2              |[]               |null|
+---+---------+---------------+-----------------+----+

When I check the schema, both columns have the same map type (only valueContainsNull differs), but their values differ: card_type_details holds an empty map, while tmp holds null.

scala> inputDF2.withColumn("tmp", lit(null) cast "map<String, Int>").printSchema
root
 |-- id: integer (nullable = false)
 |-- card_type: string (nullable = true)
 |-- number_of_cards: integer (nullable = false)
 |-- card_type_details: map (nullable = true)
 |    |-- key: string
 |    |-- value: integer (valueContainsNull = false)
 |-- tmp: map (nullable = true)
 |    |-- key: string
 |    |-- value: integer (valueContainsNull = true)

I'm not sure whether I'm adding the new column correctly. The problem appears when I call .isEmpty on the tmp column inside a UDF: I get a NullPointerException.

scala> def checkValue = udf((card_type_details: Map[String, Int]) => {
     | var output_map = Map[String, Int]()
     | if (card_type_details.isEmpty) { output_map += 0.toString -> 1 }
     | else {output_map = card_type_details }
     | output_map
     | })
checkValue: org.apache.spark.sql.expressions.UserDefinedFunction

scala> inputDF2.withColumn("value", checkValue(col("card_type_details"))).show(false)
+---+---------+---------------+-----------------+--------+
|id |card_type|number_of_cards|card_type_details|value   |
+---+---------+---------------+-----------------+--------+
|1  |Visa     |1              |[]               |[0 -> 1]|
|2  |MC       |2              |[]               |[0 -> 1]|
+---+---------+---------------+-----------------+--------+

scala> inputDF2.withColumn("tmp", lit(null) cast "map<String, Int>")
.withColumn("value", checkValue(col("tmp"))).show(false)

org.apache.spark.SparkException: Failed to execute user defined function($anonfun$checkValue$1: (map<string,int>) => map<string,int>)

Caused by: java.lang.NullPointerException
  at $anonfun$checkValue$1.apply(<console>:28)
  at $anonfun$checkValue$1.apply(<console>:26)
  at org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2.apply(ScalaUDF.scala:108)
  at org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2.apply(ScalaUDF.scala:107)
  at org.apache.spark.sql.catalyst.expressions.ScalaUDF.eval(ScalaUDF.scala:1063)

How can I add a new column that has the same values as the card_type_details column?

1 Answer

To add the tmp column with the same value as card_type_details, you just do:

inputDF2.withColumn("tmp", col("card_type_details"))

If you aim to add a column with an empty map and avoid the NullPointerException, the solution is:

inputDF2.withColumn("tmp", typedLit(Map.empty[String, Int]))
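
If the column can still contain null (for example, one built with lit(null) and a cast, as in the question), another option is to make the function itself null-safe. The following is only a minimal sketch in plain Scala, assuming the same logic as checkValue; the name fillEmpty is hypothetical and does not come from the question.

```scala
// Null-safe version of the checkValue logic, as a plain Scala function.
// Option(null) is None, so a null map takes the same path as an empty one.
def fillEmpty(details: Map[String, Int]): Map[String, Int] =
  Option(details).filter(_.nonEmpty).getOrElse(Map("0" -> 1))

// In spark-shell it could be wrapped as a UDF (assumes Spark is on the classpath):
//   import org.apache.spark.sql.functions.udf
//   val checkValueSafe = udf(fillEmpty _)
```

With this guard, a null map is treated like an empty one instead of throwing when .isEmpty is called on it.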

1 Comment

typedLit worked for me. Thanks for the help!

scala> inputDF2.withColumn("tmp", typedLit(Map.empty[String,Int])).withColumn("value", checkValue(col("tmp"))).show(false)
+---+---------+---------------+-----------------+---+--------+
|id |card_type|number_of_cards|card_type_details|tmp|value   |
+---+---------+---------------+-----------------+---+--------+
|1  |Visa     |1              |[]               |[] |[0 -> 1]|
|2  |MC       |2              |[]               |[] |[0 -> 1]|
+---+---------+---------------+-----------------+---+--------+
