I want to write the prediction output to a MySQL database, but the following error occurs. I don't know how to convert the array column to the string type the database expects. Can anyone help?

    val Array(trainingData, testData) = msgDF.randomSplit(Array(0.9, 0.1))
    val pipeline = new Pipeline().setStages(Array(labelIndexer, word2Vec, mlpc, labelConverter))
    val model = pipeline.fit(trainingData)
    val predictionResultDF = model.transform(testData)
    val rows = predictionResultDF.select("song", "label", "predictedLabel")
    val df = rows.registerTempTable("song_classify")
    val sqlcommand = "select * from song_classify"
    val prop = new java.util.Properties
    prop.setProperty("user", "root")
    prop.setProperty("password", "123")
    sqlContext.sql(sqlcommand)
      .write.mode(SaveMode.Append).jdbc("jdbc:mysql://localhost:3306/yuncun", "song_classify", prop)
    sc.stop

Here is the console output:

Exception in thread "main" java.lang.IllegalArgumentException: Can't get JDBC type for array<string>
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getJdbcType$2.apply(JdbcUtils.scala:148)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getJdbcType$2.apply(JdbcUtils.scala:148)
    at scala.Option.getOrElse(Option.scala:121)

I want to store the following data in the MySQL database:

+---------+-----+--------------+
|     song|label|predictedLabel|
+---------+-----+--------------+
|   [一吻天荒]|    1|             2|
|  [有一点动心]|    1|             2|
|   [有你真好]|    1|             2|
|  [永远不分开]|    1|             2|
|[我要我们在一起]|    2|             2|
|  [后来的我们]|    2|             2|
|     [喜欢]|    2|             2|
|     [夜车]|    2|             2|
|   [寂寞疯了]|    2|             2|
|     [拥抱]|    2|             2|
|   [方圆几里]|    2|             2|
|   [时间煮雨]|    2|             2|
|    [爱上你]|    2|             2|
|     [献世]|    2|             2|
|   [说散就散]|    2|             2|
+---------+-----+--------------+

But the first column is an array, so the program fails with the error above.

Can you suggest how to change the code? Thank you.

  • The error says clearly that you can't store an array column in the database. You need to explode the array column or convert it to a string before writing. Commented May 7, 2018 at 5:27
  • @ShankarKoirala I'm a newbie and I don't know how to convert an array to a string. This is where I'm stuck. Can you help me? Thank you. Commented May 7, 2018 at 9:41
  • Yes, thank you very much! Commented May 7, 2018 at 13:44

1 Answer

You need to convert (or drop) the array-typed columns before writing to the database, because the JDBC writer cannot map array<string> to a JDBC type.

You can turn the array column into a single comma-separated string like this:

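import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.{col, concat_ws}

// `rows` is the DataFrame from the question (song, label, predictedLabel)
rows
    // join the elements of the array<string> column into one string
    .withColumn("song", concat_ws(",", col("song")))
    // then write to the database
    .write.mode(SaveMode.Append).jdbc("url", "song_classify", prop)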

concat_ws builds a single string by joining the values in the array with the given delimiter.
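To check the conversion before touching the database, here is a minimal local sketch. It assumes Spark 2.x or later (`SparkSession`); the object name `ConcatWsSketch`, the helper `flattenSong`, and the sample rows are made up for illustration and are not part of the question's code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}

object ConcatWsSketch {
  // Hypothetical helper: replace the array<string> column "song"
  // with a string column whose elements are joined by commas.
  def flattenSong(df: DataFrame): DataFrame =
    df.withColumn("song", concat_ws(",", col("song")))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: no cluster needed).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("concat-ws-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows with the same column names as in the question.
    val sample = Seq(
      (Seq("一吻天荒"), 1, 2),
      (Seq("part1", "part2"), 2, 2)
    ).toDF("song", "label", "predictedLabel")

    sample.printSchema()            // "song" is array<string> here
    val flat = flattenSong(sample)
    flat.printSchema()              // "song" is now a plain string
    flat.show(false)

    spark.stop()
  }
}
```

Once the schema shows `song` as a string, the same `withColumn` call can go in front of `.write ... .jdbc(...)` as shown above.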

Hope this helps!

1 Comment

Works like a charm. One note worth mentioning: even though the array holds string values, the final output is just one plain string. I expected the conversion to produce "\"string1\",\"string2\"", but instead it produced "string1,string2" (and I believe this is completely OK).
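If the quoted form is what you actually need, one possible approach is a small UDF that quotes each element before joining. This is only a sketch: `QuotedJoinSketch`, `quoteAndJoin`, and `quoteAndJoinUdf` are hypothetical names, and the helper does not escape double quotes that already appear inside an element.

```scala
import org.apache.spark.sql.functions.udf

object QuotedJoinSketch {
  // Wrap each element in double quotes, then join with commas.
  // A null array is passed through as null.
  def quoteAndJoin(xs: Seq[String]): String =
    if (xs == null) null
    else xs.map(s => "\"" + s + "\"").mkString(",")

  // Spark UDF wrapper around the pure helper above.
  val quoteAndJoinUdf = udf(quoteAndJoin _)
}
```

It would be applied like `concat_ws` in the answer, for example `df.withColumn("song", QuotedJoinSketch.quoteAndJoinUdf(col("song")))`, with `col` imported from `org.apache.spark.sql.functions`.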
