
I'm new to Data Factory. I have a JSON file that I need to manipulate, but I can't figure out how to go about it. Each field object has a generic "Name" property, but I want the value of "Name" to become the key and the value of "Value" to become its value. How can I do that? So far I have been getting "Complex JSON" errors. The JSON comes from a file store.

[
    {
        "Version": "1.1",
        "Documents": [
            {
                "DocumentState": "Correct",
                "DocumentData": {
                    "Name": "Name1",
                    "$type": "Document",
                    "Fields": [
                        {
                            "Name": "Form",
                            "$type": "Text",
                            "Value": "Birthday Form"
                        },
                        {
                            "Name": "Date",
                            "$type": "Text",
                            "Value": "12/1/1999"
                        },
                        {
                            "Name": "FirstName",
                            "$type": "Text",
                            "Value": "John"
                        },
                        {
                            "Name": "LastName",
                            "$type": "Text",
                            "Value": "Smith"
                        }
                    ]
                }
            }
        ]
    },
    {
        "Version": "1.1",
        "Documents": [
            {
                "DocumentState": "Correct",
                "DocumentData": {
                    "Name": "Name2",
                    "$type": "Document",
                    "Fields": [
                        {
                            "Name": "Form",
                            "$type": "Text",
                            "Value": "Entry Form"
                        },
                        {
                            "Name": "Date",
                            "$type": "Text",
                            "Value": "4/3/2010"
                        },
                        {
                            "Name": "FirstName",
                            "$type": "Text",
                            "Value": "Jane"
                        },
                        {
                            "Name": "LastName",
                            "$type": "Text",
                            "Value": "Doe"
                        }
                    ]
                }
            }
        ]
    }
]

Expected output

DocumentData: [
{
  "Form":"Birthday Form",
  "Date": "12/1/1999",
  "FirstName": "John",
  "LastName": "Smith"
},
{
  "Form":"Entry Form",
  "Date": "4/3/2010",
  "FirstName": "Jane",
  "LastName": "Doe"
}
]
  • Please share expected output. Commented Sep 17, 2021 at 12:20
  • @AbhishekKhandave-MT please see the updated output. Commented Sep 17, 2021 at 14:21

2 Answers


@jaimers,

I was able to achieve this with the Data Flow activity.

The complete data flow has four steps:

1) Source1

This step reads the data from the source, so you will have to configure the source dataset.

The only change I made in the source was to convert Fields.Name, Fields.Type and Fields.Value from string to string[].

This is required to build the key/value pairs from the fields in the subsequent steps.

2) Flatten1

I used Flatten at the Documents level and kept the values of DocumentData.DocumentName and DocumentData.Fields.

Note: If you don't need DocumentData.DocumentName, you can safely ignore it.
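
As a rough illustration outside Data Factory, the effect of this Flatten step can be sketched in Python. The function name flatten_documents and the output column names FieldNames and FieldValues are hypothetical, and the sample is trimmed from the question's JSON. It only shows the shape of the rows that reach the next step, not how Data Flow works internally.

```python
import json

# Trimmed sample based on the question's JSON (one document, two fields).
raw = """[
  {"Version": "1.1", "Documents": [
    {"DocumentState": "Correct",
     "DocumentData": {"Name": "Name1", "Fields": [
        {"Name": "Form", "Value": "Birthday Form"},
        {"Name": "Date", "Value": "12/1/1999"}]}}]}
]"""


def flatten_documents(payload):
    """Return one row per entry of each top-level object's Documents array."""
    rows = []
    for item in payload:
        for doc in item.get("Documents", []):
            data = doc["DocumentData"]
            rows.append({
                "DocumentName": data["Name"],
                # Parallel lists, mirroring the string[] columns set up in the source step.
                "FieldNames": [f["Name"] for f in data["Fields"]],
                "FieldValues": [f["Value"] for f in data["Fields"]],
            })
    return rows


rows = flatten_documents(json.loads(raw))
print(rows)
```

Each row now carries two parallel arrays, which is the input shape the derived column step needs.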


3) DerivedColumn1

This is the actual step where I convert Name: key1, Value: value1 into key1: value1.

To do that I used the expression below:

keyValues(Fields.Name,Fields.Value)

Note: The keyValues() function expects two array arguments, which is why the types of Fields.Name and Fields.Value were changed to arrays in the first step.
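
To make the pairing concrete, here is a minimal Python sketch of the same idea: two equal-length arrays zipped into one key/value map. The helper key_values is a hypothetical stand-in, not the Data Flow function itself, and the sample arrays assume the last field is named LastName, as in the expected output.

```python
def key_values(names, values):
    """Pair two equal-length lists into a mapping (an analogue of the keyValues() idea)."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    return dict(zip(names, values))


# Assumed sample arrays for one document.
names = ["Form", "Date", "FirstName", "LastName"]
values = ["Birthday Form", "12/1/1999", "John", "Smith"]

record = key_values(names, values)
print(record)
```

In this Python sketch a repeated name would silently keep only the last value, so it is worth checking that every field name within a document is unique before relying on the result.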


4) Select

This step just selects the columns that need to be sent as output.



You mentioned SQL in your title, so if you have access to a SQL database (e.g. Azure SQL DB), it is quite capable of manipulating JSON, e.g. using OPENJSON and FOR JSON PATH. A simple example:

DECLARE @json VARCHAR(MAX) = '[
    {
        "Version": "1.1",
        "Documents": [
            {
                "DocumentState": "Correct",
                "DocumentData": {
                    "Name": "Name1",
                    "$type": "Document",
                    "Fields": [
                        {
                            "Name": "Form",
                            "$type": "Text",
                            "Value": "Birthday Form"
                        },
                        {
                            "Name": "Date",
                            "$type": "Text",
                            "Value": "12/1/1999"
                        },
                        {
                            "Name": "FirstName",
                            "$type": "Text",
                            "Value": "John"
                        },
                        {
                            "Name": "LastName",
                            "$type": "Text",
                            "Value": "Smith"
                        }
                    ]
                }
            }
        ]
    },
    {
        "Version": "1.1",
        "Documents": [
            {
                "DocumentState": "Correct",
                "DocumentData": {
                    "Name": "Name2",
                    "$type": "Document",
                    "Fields": [
                        {
                            "Name": "Form",
                            "$type": "Text",
                            "Value": "Entry Form"
                        },
                        {
                            "Name": "Date",
                            "$type": "Text",
                            "Value": "4/3/2010"
                        },
                        {
                            "Name": "FirstName",
                            "$type": "Text",
                            "Value": "Jane"
                        },
                        {
                            "Name": "LastName",
                            "$type": "Text",
                            "Value": "Doe"
                        }
                    ]
                }
            }
        ]
    }
]'

-- Read each field value by its position in the Fields array,
-- then re-emit the rows as JSON with a root element
SELECT *
FROM OPENJSON ( @json )
WITH
(
    Form        VARCHAR(50) '$.Documents[0].DocumentData.Fields[0].Value',
    [Date]      DATE '$.Documents[0].DocumentData.Fields[1].Value',
    FirstName   VARCHAR(50) '$.Documents[0].DocumentData.Fields[2].Value',
    LastName    VARCHAR(50) '$.Documents[0].DocumentData.Fields[3].Value'
)
FOR JSON PATH, ROOT('DocumentData');


NB: I've used the ROOT clause to add a root to the JSON document. You could make @json a stored procedure parameter and call the procedure from the pipeline with a Stored Procedure activity.
