1

My json structure is the following:

{
"First":"xxxx",
"Country":"XX",
"Loop": {
    "Links": [
        {
            "Url":"xxxx",
            "Time":123
        }, {
            "Url":"xxxx",
            "Time":123
        }],
    "TotalTime":123,
    "Date":"2018-04-09T10:29:39.0233082+00:00"
}

I'd like to extract the properties

First
Country
Url & Time foreach object in the array
TotalTime
Date

This is my query

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

@extration = 
EXTRACT 
    jsonString string 
FROM @"/storage-api/input.json" 
USING Extractors.Tsv(quoting:false);

@cleanUp = SELECT jsonString FROM @extration WHERE (!jsonString.Contains("Part: h" ) AND jsonString!= "465}");

@jsonify = SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(jsonString) AS obj FROM @cleanUp;

@columnized = SELECT 
        obj["First"] AS first,
        obj["Country"] AS country
FROM @jsonify;

OUTPUT @columnized
TO @"/storage-api/outputs/tpe1-output.csv"
USING Outputters.Csv();

But this query only extract the first 2 properties, I don't know how to query nested data inside "Loop"

1 Answer 1

1

You can do that using the MultiLevelJsonExtractor (notes here) and a JSON path eg Loop.Links[*]. MultiLevelJsonExtractor has a nice feature where if your node isn't found at your base path it will check recursively for it, although I'm not sure how the performance would scale on a large JSON documents or a large number of JSON documents.

Try this:

DECLARE @input string = "/input/input65.json";

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

USING Microsoft.Analytics.Samples.Formats.Json;

@result =
    EXTRACT First string,
            Country string,
            Date DateTime,
            Url string,
            Time string,
            TotalTime int
    FROM @input
    USING new MultiLevelJsonExtractor("Loop.Links[*]",

          false,
          "First",
          "Country",
          "Date",
          "Url",
          "Time",
          "TotalTime"
          );


OUTPUT @result
TO "/output/output.csv"
USING Outputters.Csv();

My results:

Results

HTH

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you, I didn't know MultiLevelJsonExtractor class

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.