I am getting the error below in Spark (1.6) SQL while creating a table with a column whose value is NULL. For example:

create table test as select column_a, NULL as column_b from test_temp;
The same statement works in Hive and creates the column with data type "void".
To avoid the exception, I am using an empty string instead of NULL, so the new column gets the string data type.
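That workaround looks something like this (same table and column names as above):

create table test as select column_a, '' as column_b from test_temp;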
Is there a better way to insert null values into a Hive table using Spark SQL?
2017-12-26 07:27:59 ERROR StandardImsLogger$:177 - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Unknown field type: void
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:789)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:746)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply$mcV$sp(ClientWrapper.scala:428)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:293)
at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:239)
at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:238)
at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:281)
at org.apache.spark.sql.hive.client.ClientWrapper.createTable(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation$lzycompute$1(CreateTableAsSelect.scala:72)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation$1(CreateTableAsSelect.scala:47)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.run(CreateTableAsSelect.scala:89)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:56)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:153)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:829)
Try casting the NULL to an explicit type, e.g. select cast(NULL as string). Spark then infers string instead of void for the column, so the Hive metastore can create the table.
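A minimal sketch applied to the CTAS from the question (same table and column names):

create table test as
select column_a,
       cast(NULL as string) as column_b
from test_temp;

column_b is created as string and still holds NULL for every row, so the empty-string placeholder is no longer needed.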