1

I have a Dataset<Row> as below:

+-----+--------------+
|val  |  history     |
+-----+--------------+
|500  |[a=456, b=500]|
|800  |[a=456, b=500]|
|784  |[a=456, b=500]|
+-----+--------------+

Here val is a String and history is an array of strings. I'm trying to append the content of the val column to the history column, so that my dataset looks like this:

+-----+---------------------+
|val  |  history            |
+-----+---------------------+
|500  |[a=456, b=500, c=500]|
|800  |[a=456, b=500, c=800]|
|784  |[a=456, b=500, c=784]|
+-----+---------------------+

A similar question is discussed here: https://stackoverflow.com/a/49685271/2316771 , but I don't know Scala and couldn't write an equivalent Java solution.

Please help me achieve this in Java.

  • Are you sure that is a string array? Why does it have keys? Commented Jun 7, 2019 at 1:18
  • It is just a format for convenience. But it is a string. Commented Jun 7, 2019 at 2:05

2 Answers

4

Starting with Spark 2.4 (not in earlier versions), the concat function can also concatenate arrays. In your case, you could do something like:

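    // Requires: import static org.apache.spark.sql.functions.*;
    df.withColumn("val2", concat(lit("c="), col("val")))
      // keep val and name the concatenated array "history" again
      .select(col("val"), concat(col("history"), array(col("val2"))).as("history"));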

NB: concat is used twice here: the first call concatenates strings, the second concatenates arrays. array(col("val2")) creates a one-element array.

0

I coded a solution, but I'm not sure whether it can be optimized further:

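    // The MapFunction cast avoids an ambiguous overload with the Scala Function1 variant of map.
    // schema must be an Encoder<Row> for the output rows.
    dataset.map((MapFunction<Row, Row>) row -> {
        // getList returns a java.util.List; copy it so that it can be modified
        List<String> list = new ArrayList<>(row.<String>getList(row.fieldIndex("history")));
        list.add("c=" + row.<String>getAs("val"));

        return RowFactory.create(row.<String>getAs("val"), list.toArray(new String[0]));
    }, schema);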
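
The per-row logic in the lambda above can also be pulled out into a plain Java method, which makes it easy to check without a Spark session. This is only a minimal sketch: HistoryAppender and appendVal are hypothetical names and not part of Spark, and the "c=" prefix is hard-coded as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Spark API): the append step from the map() lambda.
final class HistoryAppender {
    private HistoryAppender() {}

    // Returns a new list with "c=" + val appended; the input list is not modified.
    static List<String> appendVal(List<String> history, String val) {
        List<String> result = new ArrayList<>(history);
        result.add("c=" + val);
        return result;
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList("a=456", "b=500");
        System.out.println(appendVal(history, "500")); // prints [a=456, b=500, c=500]
    }
}
```

Inside the map call, the body would then reduce to building the row from row.getAs("val") and the array form of the list returned by appendVal.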
