Copying CSV data to a JSON array object in Azure Data Factory

Question

I've been going round in circles trying to get what I thought would be a relatively trivial pipeline working in Azure Data Factory. I have a CSV file with a schema like this:

Id, Name, Color
1, Apple, Green
2, Lemon, Yellow

I need to transform the CSV into a JSON file that looks like this:

{"fruits":[{"Id":"1","Name":"Apple","Color":"Green"},{"Id":"2","Name":"Lemon","Color":"Yellow"}]}

I can't find a simple example that helps me understand how to do this in ADF. I've tried a Copy activity and a data flow, but the furthest I've got is one JSON object per row, like this:

{"fruits":{"Id":"1","Name":"Apple","Color":"Green"}}
{"fruits":{"Id":"2","Name":"Lemon","Color":"Yellow"}}

Surely this is simple to achieve. I'd be very grateful if anyone has any suggestions. Thanks!

It seems simple, but in my experience it can't be achieved. Others have posted the same question and there are still no good solutions.
– Leon Yue, Commented Jul 13, 2020 at 5:48
Hi Simon, would you mind implementing this requirement in another service and calling it from ADF?
– Hury Shen, Commented Jul 15, 2020 at 6:15

GRT · Accepted Answer · 2020-07-20 02:38:42Z

https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#tabularhierarchical-source-to-hierarchical-sink

"When copying data from tabular source to hierarchical sink, writing to array inside object is not supported"

However, if you set the file pattern under the sink properties to 'Array of objects', you can get as far as this:

    [{"Id":"1","Name":" Apple","Color":" Green"}
     ,{"Id":"2","Name":" Lemon","Color":" Yellow"}
    ]
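
If the wrapper object is still needed, one option in the spirit of Hury Shen's comment is a small post-processing step outside the Copy activity. The sketch below is only an illustration in Python: the function name `wrap_array` is hypothetical, where it runs (for example a function called from the pipeline) is an assumption, and it also strips the leading spaces that the ", " delimiters leave in the values.

```python
import json


def wrap_array(array_json: str, key: str = "fruits") -> str:
    """Wrap a JSON array of objects in a single object under `key`.

    Hypothetical helper for illustration; not part of ADF.
    """
    records = json.loads(array_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array at the top level")
    # Strip the stray spaces left over from the ", " delimiters in the CSV.
    cleaned = [
        {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for row in records
    ]
    # Compact separators match the single-line format asked for in the question.
    return json.dumps({key: cleaned}, separators=(",", ":"))


# The 'Array of objects' output shown above, written on one line.
copy_output = (
    '[{"Id":"1","Name":" Apple","Color":" Green"}'
    ',{"Id":"2","Name":" Lemon","Color":" Yellow"}]'
)
print(wrap_array(copy_output))
# {"fruits":[{"Id":"1","Name":"Apple","Color":"Green"},{"Id":"2","Name":"Lemon","Color":"Yellow"}]}
```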

Faitus Joseph · Accepted Answer · 2024-01-23 08:45:50Z

Below are the steps to be followed to generate the desired output JSON file.

In ADF, create a data flow with the following transformations:

Source
Derived Column
Aggregate
Sink

In the Source transformation, select the source dataset that points to the source file.
In the Derived Column transformation, add a column named 'fruits' with three subcolumns, Id, Name and Color, and map each one to the column of the same name from the input schema.

In the Aggregate transformation, leave the 'Group by' tab blank. In the 'Aggregates' tab, select the column 'fruits' and set the expression to collect(fruits). With no grouping, all rows are collected into a single array.

In the Sink transformation, select the destination dataset.

In the sink settings, set 'File name option' to 'Output to single file' and specify the output file name. In the Mapping tab, uncheck 'Auto mapping'.

Create a pipeline, drag and drop the data flow into it, and run the pipeline. You will get the desired output.

Hope this helps.
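
For anyone who finds the Derived Column and Aggregate steps hard to picture, here is a rough Python equivalent of the same logic. It is not ADF code and does not use any ADF API; the function name `csv_to_collected` is made up for illustration, and `skipinitialspace=True` is an assumption added to drop the spaces after the commas in the sample CSV.

```python
import csv
import io
import json


def csv_to_collected(csv_text: str, key: str = "fruits") -> dict:
    """Mimic Derived Column + Aggregate(collect) on a small CSV string.

    Hypothetical helper for illustration; not part of ADF.
    """
    # skipinitialspace drops the space that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    # Derived Column step: each row becomes one nested object with three subcolumns.
    nested = [
        {"Id": row["Id"], "Name": row["Name"], "Color": row["Color"]}
        for row in reader
    ]
    # Aggregate step with an empty 'Group by': every row lands in one group,
    # so collecting the nested objects yields a single array.
    return {key: nested}


sample = "Id, Name, Color\n1, Apple, Green\n2, Lemon, Yellow\n"
print(json.dumps(csv_to_collected(sample), separators=(",", ":")))
# {"fruits":[{"Id":"1","Name":"Apple","Color":"Green"},{"Id":"2","Name":"Lemon","Color":"Yellow"}]}
```

The point of the sketch is the shape of the result: one object per row first, then a single collection over all rows, which is why the 'Group by' tab is left empty.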
