I am trying to implement the following code:

    StructType dataStruct = new StructType()
            .add("items", DataTypes.createArrayType(DataTypes.StringType, false), false);
    ExpressionEncoder<Row> encoder = RowEncoder.apply(dataStruct);

    Dataset<Row> arrayItems = transactions.map((MapFunction<Row, Row>) row -> {
        List<String> items = new LinkedList<>();
        for (int i = 1; i <= 12; i++) {
            if (row.getString(i) != null)
                items.add(row.getString(i));
        }
        System.out.println(items);
        return RowFactory.create(items.toArray());
    }, encoder);

to convert a dataset with this schema:

|user<String>|item1<String>|item2<String>|item3<String>|...|item12<String>|

into a dataset with this schema:

|items<String[]>|

but I get the following exception: java.lang.RuntimeException: java.lang.String is not a valid external type for schema of array

I don't understand why RowFactory receives a String rather than a String[]. Can somebody help me with what I should do in this situation?

Data example:

user|item1|item2|item3|item4|item5|item6|item7|item8|item9|item10|item11|item12
Bob|01W|01J|01W|01J|01W|01J|01W|01J|01W|01J|null|null
John|03T|018T|003H|A44I|03T|null|003H|A44I|03T|018T|003H|null
Bill|CMZI|UDAG|01W|null|null|01J|018T|003H|A44I|018T|003H|A44I
  • please share data, otherwise your code is not going to be reproducible. Commented Jan 25, 2018 at 10:45
  • Which version of Spark do you use, and on which line does the exception occur? Commented Jan 25, 2018 at 10:51
  • I use Spark 2.2.1. Exception is on line "return RowFactory.create(items.toArray());" Commented Jan 25, 2018 at 11:06
  • @mtoto It is reproducible as is. It is just a tricky one. Commented Jan 25, 2018 at 11:38

1 Answer

This happens because varargs in Java are just syntactic sugar and

Object ... values

is equivalent to

Object[] values

so

return RowFactory.create(items.toArray());

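uses the array returned by toArray() as the varargs array itself, so each string becomes a separate column value instead of one array column. You'll need a nested structure: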

// Wrap the items array in an outer array so it is passed as a single column value.
Object[] rowItems = {items.toArray()};
return RowFactory.create(rowItems);
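
The difference can be checked without Spark. The following is a minimal sketch, assuming a hypothetical helper `countColumns(Object... values)` that stands in for `RowFactory.create` and only reports how many values it received; the class name and sample items are likewise made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class VarargsDemo {
    // Hypothetical stand-in for RowFactory.create(Object... values):
    // returns how many column values the call received.
    static int countColumns(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        List<String> items = new LinkedList<>(Arrays.asList("01W", "01J", "03T"));

        // The Object[] from toArray() is used directly as the varargs array,
        // so every string is treated as its own column.
        System.out.println(countColumns(items.toArray()));          // prints 3

        // Wrapping it in an outer array passes the items as one column value.
        Object[] rowItems = {items.toArray()};
        System.out.println(countColumns(rowItems));                 // prints 1

        // Casting to Object has the same effect: the compiler wraps the
        // single argument in a new one-element varargs array.
        System.out.println(countColumns((Object) items.toArray())); // prints 1
    }
}
```

With the schema from the question, which declares a single array column, only the last two call shapes give the encoder one value of array type.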

Further reading: Can I pass an array as arguments to a method with variable arguments in Java?
