4

Is there an expression-based tool for querying python complex objects the same way one can do using XPath for XML or jsonpath for JSON?

I thought about serializing my object to JSON and then using jsonpath on it, but it seems to be a clumsy way of doing that.

2
  • What do you mean by "complex objects"? Nested dictionaries? Commented Oct 5, 2019 at 12:20
  • I mean something as complex as a JSON object: a traversable tree of elements that can be bool, str, int, float, and also dict (as for an object) and list (as for an array). But the root element would be a dictionary, indeed. Commented Oct 5, 2019 at 12:24

5 Answers 5

2

You can use built-in library json to import json as a nested dictionary and traverse it using dictionary notation - root['level1_object']['level2_object']. JSON-compatible object types are of course loaded as corresponding Python types.

For other types of data there are other libraries, which mostly behave in similar fashion.

My new favourite is Box, which allows you to traverse nested dictionaries using a dot notation.

Sign up to request clarification or add additional context in comments.

1 Comment

Box seems promising, I'll take a look on it.
1

You might want to take a look at AST module: https://docs.python.org/2/library/ast.html

2 Comments

How would you query 'complex objects' using AST ?
I might have misunderstood it, but it seems I would need to have my root dictionary serialized to string in order to use that. Did I miss something?
1

@vBobCat I'm currently on the search for a similar solution. Agreed that serializing and deserializing with json is not ideal. What did you end up going with?

I found http://objectpath.org/ to be close to the right solution for my use case though it lacks features in making arbitrary updates to fields that I need. Its syntax, though slightly different than JSONPath, expresses many of the things that JSONPath does.

3 Comments

Hello, sorry for the late answer. My use case was simple enough so I used the Box library (in accepted answer) combined with exec(), so I can interpolate stored paths in a string and execute it.
Hello again, I don't know wether it remains useful for you after all this time, actually jsonpath-rw seems to be the library we were looking for.
Hey @VBobCat, oops I didn't think to update this, but yes I arrived at the same conclusion as you. I assumed initially that the library required a json string as input, but that turned out to be incorrect and that you can pass it dictionaries and lists.
1

I'm adding this answer for the sake of future researchers:

It seems jsonpath-rw is the library I was looking for since the beginning, since it does exactly what I originally requested.

Comments

0

Adding a non-library option here. A method that finds a nested element based on a dot notation string (can traverse nested dicts and lists), see below of Gist here:

from functools import reduce
import re
from typing import Any, Optional

def find_key(dot_notation_path: str, payload: dict) -> Any:
    """Try to get a deep value from a dict based on a dot-notation"""

    def get_despite_none(payload: Optional[dict], key: str) -> Any:
        """Try to get value from dict, even if dict is None"""
        if not payload or not isinstance(payload, (dict, list)):
            return None
        # can also access lists if needed, e.g., if key is '[1]'
        if (num_key := re.match(r"^\[(\d+)\]$", key)) is not None:
            try:
                return payload[int(num_key.group(1))]
            except IndexError:
                return None
        else:
            return payload.get(key, None)

    found = reduce(get_despite_none, dot_notation_path.split("."), payload)
   
    # compare to None, as the key could exist and be empty
    if found is None:
        raise KeyError()
    return found

# Test cases:

payload = {
    "haystack1": {
        "haystack2": {
            "haystack3": None, 
            "haystack4": "needle"
        }
    },
    "haystack5": [
        {"haystack6": None}, 
        {"haystack7": "needle"}
    ],
    "haystack8": {},
}

find_key("haystack1.haystack2.haystack4", payload)
# "needle"
find_key("haystack5.[1].haystack7", payload)
# "needle"
find_key("[0].haystack5.[1].haystack7", [payload, None])
# "needle"
find_key("haystack8", payload)
# {}
find_key("haystack1.haystack2.haystack4.haystack99", payload)
# KeyError

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.