I'm trying to flatten and filter my JSON data stored in Cosmos DB. The data looks like the sample below. I would like to flatten everything in the Variables array and then filter by a specific _id and by the Timestamp inside the array:

{
"_id": 21032,
"FirstConnected": {
    "$date": 1522835868346
},
"LastUpdated": {
    "$date": 1523360279908
},
"Variables": [
    {
        "_id": 99999,
        "Values": [
            {
                "Timestamp": {
                    "$date": 1522835868347
                },
                "Value": 1
            }
        ]
    },
    {
        "_id": 99998,
        "Values": [
            {
                "Timestamp": {
                    "$date": 1523270312001
                },
                "Value": 8888
            }
        ]
    }
]
}   

2 Answers

If you want to flatten data from the Variables array with properties from the root object you can query your collection like this:

SELECT root._id, root.FirstConnected, root.LastUpdated, var.Values
FROM root 
JOIN var IN root.Variables
WHERE var._id = 99998

This results in:

[
  {
    "_id": 21032,
    "FirstConnected": {
      "$date": 1522835868346
    },
    "LastUpdated": {
      "$date": 1523360279908
    },
    "Values": [
      {
        "Timestamp": {
          "$date": 1523270312001
        },
        "Value": 8888
      }
    ]
  }
]

If you also want to flatten the Values array, you will need to write something like this:

SELECT root._id, root.FirstConnected, root.LastUpdated, 
       var.Values[0].Timestamp, var.Values[0]["Value"]
FROM root 
JOIN var IN root.Variables
WHERE var._id = 99998

Note that Cosmos DB treats "Value" as a reserved keyword, so you need to use the bracket escape syntax (["Value"]) to access that property. The result of this query is:

[
  {
    "_id": 21032,
    "FirstConnected": {
      "$date": 1522835868346
    },
    "LastUpdated": {
      "$date": 1523360279908
    },
    "Timestamp": {
      "$date": 1523270312001
    },
    "Value": 8888
  }
]
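
The query above only reads the first element of Values. If a variable can hold several readings, a second JOIN over var.Values is one way to get a row per reading and to filter on the timestamp inside the array, as the question asks. This is a minimal sketch against the sample document; the aliases val, varid, TimeStamp and Val are illustrative names, and the threshold is simply the timestamp from the sample data:

```sql
SELECT root._id AS hpid,
       var._id AS varid,
       val.Timestamp["$date"] AS TimeStamp,
       val["Value"] AS Val
FROM root
JOIN var IN root.Variables
JOIN val IN var.Values
WHERE var._id = 99998
  AND val.Timestamp["$date"] >= 1523270312001
```

Each JOIN pairs the parent with every element of the nested array, so this should return one flat row per matching reading rather than one row per variable.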

For more details, see https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sql-query#Advanced

2 Comments

Thanks everyone for helping out. My final result is:
SELECT root._id as hpid, root.FirstConnected["$date"] as FirstConnected, root.LastUpdated["$date"] as LastUpdated, var._id as varid, var.Values[0].Timestamp["$date"] as TimeStamp, var.Values[0]["Value"] as Val
FROM root
JOIN var IN root.Variables
WHERE var._id IN (99998,99999) AND var.Values[0].Timestamp["$date"] >= 1523270312001
A final question: is there a good way to dynamically filter on the TimeStamp value, which is a Unix timestamp? Let's say I want to filter on the last 5 days from today.
@baatchen Please mark this answer as the accepted one. Then post another question for the dynamic filter on the TimeStamp.
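
For the follow-up about the last 5 days, one possible sketch uses the built-in GetCurrentTimestamp() function, which returns the current time in milliseconds since the Unix epoch. It assumes that the "$date" values are also epoch milliseconds in UTC, as the sample data suggests; the rest of the query is taken from the comment above:

```sql
SELECT root._id AS hpid,
       var._id AS varid,
       var.Values[0].Timestamp["$date"] AS TimeStamp,
       var.Values[0]["Value"] AS Val
FROM root
JOIN var IN root.Variables
WHERE var._id IN (99998, 99999)
  AND var.Values[0].Timestamp["$date"] >= GetCurrentTimestamp() - 5 * 24 * 60 * 60 * 1000
```

The expression 5 * 24 * 60 * 60 * 1000 is five days in milliseconds. Computing the cutoff in the client and passing it as a query parameter would be an equally reasonable design.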

If you only want to filter by the nested _id property, you could use ARRAY_CONTAINS with the partial-match argument (the third parameter) set to true. The query would look something like this:

SELECT VALUE c
FROM c
WHERE ARRAY_CONTAINS(c.Variables, {_id: 99998}, true)

If you also want to flatten the array, you could iterate over it directly in the FROM clause (or use JOIN):

SELECT VALUE v
FROM v IN c.Variables
WHERE v._id = 99998
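
When the FROM clause iterates over c.Variables directly, as in the query above, only the array elements are in scope, so the parent document's properties cannot be projected. As a hedged sketch, the JOIN form keeps both c and v available; the aliases parentId and variableId are illustrative names, not part of the original data:

```sql
SELECT c._id AS parentId,
       c.FirstConnected,
       c.LastUpdated,
       v._id AS variableId,
       v.Values
FROM c
JOIN v IN c.Variables
WHERE v._id IN (99998, 99999)
```

Aliasing the two _id properties avoids a name clash in the projected result.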

3 Comments

Thanks! If I also want { "_id": 21032, "FirstConnected": { "$date": 1522835868346 }, "LastUpdated": { "$date": 1523360279908 } } for each Variables._id, what would the query look like then?
@baatchen What's the final format of the result you want? Could you please show it in your question?
You could do this: SELECT VALUE c FROM c JOIN v IN c.Variables WHERE ...
