22

I have a working pydantic model that receives a JSON data set. The data looks like this:

data = {'thing_number': 123, 
        'thing_description': 'duck',
        'thing_amount': 4.56}

What I would like to do is have a list of JSON objects as the data set and be able to validate them. Ultimately the list will be converted to records in pandas for further processing. My goal is to validate an arbitrarily long list of JSON entries that looks something like this:

bigger_data = [{'thing_number': 123, 
                'thing_description': 'duck',
                'thing_amount': 4.56}, 
               {'thing_number': 456, 
                'thing_description': 'cow',
                'thing_amount': 7.89}]

The basic setup I have now is as follows. Note that adding the class ItemList is part of the attempt to get the arbitrary length to work.

from typing import List
from pydantic import BaseModel
from pydantic.schema import schema
import json
                                                          
class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]                                                                           

The basic code then produces what I think I'm looking for: an array that takes Item objects.

item_schema = schema([ItemList])
print(json.dumps(item_schema, indent=2)) 
                                
    {
      "definitions": {
        "Item": {
          "title": "Item",
          "type": "object",
          "properties": {
            "thing_number": {
              "title": "Thing_Number",
              "type": "integer"
            },
            "thing_description": {
              "title": "Thing_Description",
              "type": "string"
            },
            "thing_amount": {
              "title": "Thing_Amount",
              "type": "number"
            }
          },
          "required": [
            "thing_number",
            "thing_description",
            "thing_amount"
          ]
        },
        "ItemList": {
          "title": "ItemList",
          "type": "object",
          "properties": {
            "each_item": {
              "title": "Each_Item",
              "type": "array",
              "items": {
                "$ref": "#/definitions/Item"
              }
            }
          },
          "required": [
            "each_item"
          ]
        }
      }
    }

The setup works on a single json item being passed:

item = Item(**data)                                                      
    
print(item)
                                                             
Item thing_number=123 thing_description='duck' thing_amount=4.56

But when I try to pass the single item into the ItemList model, it returns an error:

item_list = ItemList(**data)

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-94-48efd56e7b6c> in <module>
----> 1 item_list = ItemList(**data)

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()

ValidationError: 1 validation error for ItemList
each_item
  field required (type=value_error.missing)

I've also tried passing bigger_data into the model, thinking that it would need to start as a list. That also returns an error. Although I at least have a better understanding of the dictionary error, I can't figure out how to resolve it.

item_list2 = ItemList(**data_big)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-100-8fe9a5414bd6> in <module>
----> 1 item_list2 = ItemList(**data_big)

TypeError: MetaModel object argument after ** must be a mapping, not list

Thanks.

Other Things I've Tried

I've tried passing the data into the specific key with a little more luck (maybe?).

item_list2 = ItemList(each_item=data_big)

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-111-07e5c12bf8b4> in <module>
----> 1 item_list2 = ItemList(each_item=data_big)

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()

ValidationError: 6 validation errors for ItemList
each_item -> 0 -> thing_number
  field required (type=value_error.missing)
each_item -> 0 -> thing_description
  field required (type=value_error.missing)
each_item -> 0 -> thing_amount
  field required (type=value_error.missing)
each_item -> 1 -> thing_number
  field required (type=value_error.missing)
each_item -> 1 -> thing_description
  field required (type=value_error.missing)
each_item -> 1 -> thing_amount
  field required (type=value_error.missing)

6 Answers

37

The following also works and does not require a root type. Note that parse_obj_as and parse_raw_as are Pydantic v1 helpers.

To convert from a List[dict] to a List[Item]:

from typing import List
from pydantic import parse_obj_as

items = parse_obj_as(List[Item], bigger_data)

To convert from JSON str to a List[Item]:

from pydantic import parse_raw_as

items = parse_raw_as(List[Item], bigger_data_json)

To convert from a List[Item] to a JSON str:

from pydantic.json import pydantic_encoder

bigger_data_json = json.dumps(items, default=pydantic_encoder)

or with a custom encoder:

import json
from pydantic import BaseModel
from pydantic.json import pydantic_encoder

def custom_encoder(**kwargs):
    def base_encoder(obj):
        if isinstance(obj, BaseModel):
            return obj.dict(**kwargs)
        else:
            return pydantic_encoder(obj)
    return base_encoder


bigger_data_json = json.dumps(items, default=custom_encoder(by_alias=True))
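
To make the effect of the keyword arguments visible, here is a minimal sketch. It assumes Pydantic v1 (on v2 these helpers are deprecated and emit warnings). The AliasedItem model and its camelCase aliases are hypothetical, introduced only so that by_alias has something to change:

```python
import json
from typing import List

from pydantic import BaseModel, Field, parse_obj_as
from pydantic.json import pydantic_encoder


class AliasedItem(BaseModel):
    # Hypothetical model: the aliases are an assumption for illustration.
    thing_number: int = Field(alias="thingNumber")
    thing_amount: float = Field(alias="thingAmount")


def custom_encoder(**kwargs):
    # Forward the keyword arguments to .dict() for every model encountered.
    def base_encoder(obj):
        if isinstance(obj, BaseModel):
            return obj.dict(**kwargs)
        return pydantic_encoder(obj)
    return base_encoder


# Input uses the aliases, as it would when coming from an external JSON source.
raw = [{"thingNumber": 123, "thingAmount": 4.56}]
items = parse_obj_as(List[AliasedItem], raw)

# Default encoder: keys are the Python field names.
by_name = json.loads(json.dumps(items, default=pydantic_encoder))
# Custom encoder with by_alias=True: keys are the aliases again.
by_alias = json.loads(json.dumps(items, default=custom_encoder(by_alias=True)))

print(by_name)
print(by_alias)
```

Any other keyword accepted by .dict() (for example exclude_none) could be passed through custom_encoder in the same way.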

5 Comments

I found this really useful. For others, the import for pydantic_encoder is: from pydantic.json import pydantic_encoder
This saved the day for me. It was exactly what I needed, without doing this, I received a non-serializable datamodel error.
Thanks! This deserves to be documented more clearly. The case of returning a simple list of pydantic types is pretty common …
pydantic:parse_raw_as has been removed in V2.
Looks like you can use TypeAdapter: items = TypeAdapter(list[Item]).validate_json(bigger_data_json)
30

To avoid having "each_item" in the ItemList, you can use the __root__ Pydantic keyword:

from typing import List
from pydantic import BaseModel

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    __root__: List[Item]    # ⯇-- __root__

To build the item_list:

just_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(__root__=just_data)

a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
item_list.__root__.append(a_json_duck)

Web frameworks that support Pydantic often serialize such an ItemList as a JSON array, without an intermediate __root__ key.

4 Comments

For my own understanding, does __root__ effectively make the list of Item objects the 'root' of ItemList, whereas using each_item creates a field inside ItemList? Thanks.
The docs list this as a use case, so I prefer this, although it feels slightly un-pythonic to ask users to use the __root__ keyword. The advantage of this method over the other answer is that ItemList.json() returns the expected JSON structure.
Unfortunately, when appending items to the root (as done at the end of this answer), there is no validation of these items. They are simply appended. If (like in your case), the items in the list are pydantic models that might require validation, you need to trigger this yourself (e.g. using Item.validate(...)).
For information, if you want to iterate over the __root__ list or access items by an index -- you have to implement __iter__ and __getitem__ methods in the class.
14
from typing import List
from pydantic import BaseModel
import json


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


class ItemList(BaseModel):
    each_item: List[Item]

Based on your code, with each_item as a List of Item:

a_duck = Item(thing_number=123, thing_description="duck", thing_amount=4.56)
print(a_duck.json())

a_list = ItemList(each_item=[a_duck])

print(a_list.json())

This generates the following output:

{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
{"each_item": [{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}]}

Using these as the input JSON:

a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
a_json_list = {
    "each_item": [
        {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
    ]
}

print(Item(**a_json_duck))
print(ItemList(**a_json_list))

This works fine and generates:

Item thing_number=123 thing_description='duck' thing_amount=4.56
ItemList each_item=[<Item thing_number=123 thing_description='duck' thing_amount=4.56>]

That leaves the case of a bare list of data:

just_datas = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(each_item=just_datas)
print(item_list)
print(type(item_list.each_item[1]))
print(item_list.each_item[1])

This works as expected:

ItemList each_item=[<Item thing_number=123 thing_description='duck'thing_amount=4.56>,<Item thin…
<class '__main__.Item'>
Item thing_number=456 thing_description='cow' thing_amount=7.89

So unless I'm missing something, the pydantic library works as expected.

My pydantic version: 0.30, Python 3.7.4

Reading from a file-like object:

json_data_file = """[
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
{"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89}]"""

from io import StringIO
item_list2 = ItemList(each_item=json.load(StringIO(json_data_file)))

This also works fine.

1 Comment

I spent hours thinking the issue was the class/object structure, not how I was loading the information. Works perfectly. Thanks.
2

You can use Pydantic's RootModel (available in Pydantic v2):
https://docs.pydantic.dev/latest/concepts/models/#rootmodel-and-custom-root-types

from typing import List
from pydantic import BaseModel, RootModel

class OneItem(BaseModel):
    a: int

class ListItems(RootModel):
    root: List[OneItem]

    def __iter__(self):
        return iter(self.root)

    def __getitem__(self, item):
        return self.root[item]


src = [{'a': 1}, {'a': 2}]
model = ListItems.model_validate(src)

# Iteration works because ListItems defines __iter__
for one_item in model:
    print(one_item)

dst = model.model_dump()
print(dst)
assert src == dst
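
To illustrate what validating through the root model gives you, here is a minimal sketch, assuming Pydantic v2. It feeds ListItems a list with one bad element and collects the reported errors. The bad_src data is made up for illustration:

```python
from typing import List

from pydantic import BaseModel, RootModel, ValidationError


class OneItem(BaseModel):
    a: int


class ListItems(RootModel):
    root: List[OneItem]


# Hypothetical input: the second element cannot be parsed as an int.
bad_src = [{"a": 1}, {"a": "not a number"}]

try:
    ListItems.model_validate(bad_src)
    errors = []
except ValidationError as exc:
    # Each entry carries a "loc" tuple that locates the offending value.
    errors = exc.errors()

for err in errors:
    print(err["loc"], err["msg"])
```

The whole list is rejected if any element fails, so a caller that wants to keep the valid rows would need to validate the elements one by one instead.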

1

What did the trick for me was fastapi.encoders.jsonable_encoder (take a look at https://fastapi.tiangolo.com/tutorial/encoder/).

So in your case, I appended the single items to a list named result, i.e. result.append(Item(thing_number=123, thing_description="duck", thing_amount=4.56))

and finally returned fastapi.responses.JSONResponse(content=fastapi.encoders.jsonable_encoder(result))

1 Comment

The question is asked in a Pydantic context. Not everyone that uses pydantic uses it in a FastAPI context...
1

Use TypeAdapter (Pydantic v2; import it with from pydantic import TypeAdapter).

To convert from JSON str to a list[Item]:

items = TypeAdapter(list[Item]).validate_json(bigger_data_json)

To convert from list[dict] to list[Item]:

items = TypeAdapter(list[Item]).validate_python(bigger_data)

To convert from a list[Item] to a JSON str (dump_json returns bytes, so decode the result):

bigger_data_json = TypeAdapter(list[Item]).dump_json(items).decode()
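
Putting the three conversions together, here is a self-contained sketch, assuming Pydantic v2 and Python 3.9+ (for list[Item]). It reuses the Item model and sample data from the question. The final dump_python step is an addition for illustration: it produces plain dicts, which is the shape a records-style pandas constructor would typically accept.

```python
from pydantic import BaseModel, TypeAdapter


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


bigger_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

# Build the adapter once and reuse it for every conversion.
adapter = TypeAdapter(list[Item])

# list[dict] -> list[Item]
items = adapter.validate_python(bigger_data)

# list[Item] -> JSON; dump_json returns bytes, so decode to get a str.
json_bytes = adapter.dump_json(items)
bigger_data_json = json_bytes.decode()

# JSON str -> list[Item]
round_tripped = adapter.validate_json(bigger_data_json)

# list[Item] -> list[dict], e.g. as input records for further processing.
records = adapter.dump_python(items)

print(items[1])
print(bigger_data_json)
```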
