How to delete empty field? - pydantic

I have a BaseModel like this
from typing import Optional
from pydantic import BaseModel
class TestModel(BaseModel):
    id: int
    name: Optional[str] = None
When I validate some data like this:
TestModel(id=123).dict()
I get this result:
{'id': 123, 'name': None}
But what I expect is:
{'id': 123}
Question:
Is there a way to drop empty fields from the output? Thanks!

The correct way to do this is with
TestModel(id=123).dict(exclude_none=True)
If you need this everywhere, you can override dict() and change the default.
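As a rough sketch of the "override dict()" suggestion, assuming pydantic 1.x, where dict() is the export method: the base class name CompactModel is made up for illustration, and the only change is that the default for exclude_none is flipped.

```python
from typing import Optional

from pydantic import BaseModel


class CompactModel(BaseModel):
    """Hypothetical base class whose dict() drops None-valued fields by default."""

    def dict(self, *, exclude_none: bool = True, **kwargs):
        # Delegate to BaseModel.dict(); only the default of exclude_none differs.
        return super().dict(exclude_none=exclude_none, **kwargs)


class TestModel(CompactModel):
    id: int
    name: Optional[str] = None


print(TestModel(id=123).dict())                    # {'id': 123}
print(TestModel(id=123).dict(exclude_none=False))  # {'id': 123, 'name': None}
```

Callers can still pass exclude_none=False explicitly when they want the None values back. This only changes dict(); json() takes its own exclude_none argument.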

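You can also exclude the field in the model's Config so you don't have to pass it on every call. Note that this drops the field from the output unconditionally, not only when it is None:
class TestModel(BaseModel):
    id: int
    name: str = None
    class Config:
        fields = {'name': {'exclude': True}}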

If you want to delete the keys whose value is None from a dictionary:
result = {k: v for k, v in result.items() if v is not None}
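The comprehension above only looks at the top level. If the exported dict contains nested models or lists, a recursive variant may be needed. The following is a minimal sketch; drop_none is a hypothetical helper name, and it deliberately keeps None items inside lists, removing only None-valued dict keys.

```python
from typing import Any


def drop_none(value: Any) -> Any:
    """Recursively remove None-valued keys from dicts (hypothetical helper)."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # List items are kept as they are; only dicts inside them are cleaned.
        return [drop_none(item) for item in value]
    return value


result = {"id": 123, "name": None, "tags": [{"label": "a", "note": None}]}
print(drop_none(result))  # {'id': 123, 'tags': [{'label': 'a'}]}
```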

You can perhaps add a root_validator:
@pydantic.root_validator(pre=False)
def check(cls, values):
    # Drop the key from the validated values when it is None
    if values.get('some_key') is None:
        values.pop('some_key', None)
    return values

Related

Pydantic: how to make model with some mandatory and arbitrary number of other optional fields, which names are unknown and can be any?

I'd like to represent the following json by Pydantic model:
{
    "sip": {
        "param1": 1
    },
    "param2": 2
    ...
}
That is, the JSON may contain a sip field plus any number of other fields with arbitrary names. I'd like a model that has a sip: Optional[dict] field and some kind of "rest" that is correctly parsed from and serialized to JSON. Is this possible?
Maybe you are looking for the extra model config:
extra
whether to ignore, allow, or forbid extra attributes during model initialization. Accepts the string values of 'ignore', 'allow', or 'forbid', or values of the Extra enum (default: Extra.ignore). 'forbid' will cause validation to fail if extra attributes are included, 'ignore' will silently ignore any extra attributes, and 'allow' will assign the attributes to the model.
Example:
from typing import Any, Dict, Optional
import pydantic
class Foo(pydantic.BaseModel):
    sip: Optional[Dict[Any, Any]]
    class Config:
        extra = pydantic.Extra.allow
foo = Foo.parse_raw(
"""
{
"sip": {
"param1": 1
},
"param2": 2
}
"""
)
print(repr(foo))
print(foo.json())
Output:
Foo(sip={'param1': 1}, param2=2)
{"sip": {"param1": 1}, "param2": 2}

How To Get Pydantic To Discriminate On A Field Within List[Union[TypeA, TypeB]]?

I am trying to use Pydantic to validate a POST request payload for a Rest API. A list of applicants can contain a primary and optional other applicant. So far, I have written the following Pydantic models listed below, to try and reflect this. The Rest API json payload is using a boolean field isPrimary to discriminate between a primary and other applicant.
from datetime import date
from pydantic import BaseModel, validator
from typing import List, Literal, Optional, Union
class PrimaryApplicant(BaseModel):
    isPrimary: Literal[True]
    dateOfBirth: Optional[date]
class OtherApplicant(BaseModel):
    isPrimary: Literal[False]
    dateOfBirth: date
    relationshipStatus: Literal["family", "friend", "other", "partner"]
class Application(BaseModel):
    applicants: List[Union[PrimaryApplicant, OtherApplicant]]
    @validator("applicants")
    def validate(
        cls,
        v: List[Union[PrimaryApplicant, OtherApplicant]]
    ) -> List[Union[PrimaryApplicant, OtherApplicant]]:
        list_count = len(v)
        primary_count = len(
            list(
                filter(lambda item: item.isPrimary, v)
            )
        )
        secondary_count = list_count - primary_count
        if primary_count > 1:
            raise ValueError("Only one primary applicant required")
        if secondary_count > 1:
            raise ValueError("Only one secondary applicant allowed")
        return v
def main() -> None:
    data_dict = {
        "applicants": [
            {
                "isPrimary": True
            },
            {
                "isPrimary": False,
                "dateOfBirth": date(1990, 1, 15),
                "relationshipStatus": "family"
            },
        ]
    }
    _ = Application(**data_dict)
if __name__ == "__main__":
    main()
With the example payload above, when I remove some of the required fields from the OtherApplicant payload, a ValidationError is correctly raised. For example, removing the relationshipStatus or dateOfBirth field raises an error. However, Pydantic also reports the isPrimary field as invalid: it seems to think isPrimary should be True. Example validation output is listed below.
Why is Pydantic expecting that the isPrimary field should be True for an OtherApplicant list item in the json payload? Is it somehow associating the payload with PrimaryApplicant because of the use of Union? If so, how do I get Pydantic to use the isPrimary field to distinguish between primary and other applicants in the list payload?
Missing relationshipStatus field in list payload for OtherApplicant:
pydantic.error_wrappers.ValidationError: 2 validation errors for Application
applicants -> 1 -> isPrimary
  unexpected value; permitted: True (type=value_error.const; given=False; permitted=(True,))
applicants -> 1 -> relationshipStatus
  field required (type=value_error.missing)
Missing dateOfBirth field in list payload for OtherApplicant:
pydantic.error_wrappers.ValidationError: 2 validation errors for Application
applicants -> 1 -> isPrimary
  unexpected value; permitted: True (type=value_error.const; given=False; permitted=(True,))
applicants -> 1 -> dateOfBirth
  field required (type=value_error.missing)
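I found the answer after also asking on the Pydantic GitHub repository.
Without a discriminator, Pydantic 1.x tries each member of the Union in turn and, when none of them validates, reports the errors from every member. That is why the isPrimary error from PrimaryApplicant appears next to the genuine OtherApplicant error.
Pydantic 1.9 introduces the notion of discriminated unions.
After upgrading to Pydantic 1.9 and adding:
Applicant = Annotated[
    Union[PrimaryApplicant, OtherApplicant],
    Field(discriminator="isPrimary")]
It is now possible to have an applicants: List[Applicant] field in my Application model. The isPrimary field is marked as the one used to distinguish between a primary and an other applicant.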
The full code listing is therefore:
from datetime import date
from pydantic import BaseModel, Field, validator
from typing import List, Literal, Optional, Union
from typing_extensions import Annotated
class PrimaryApplicant(BaseModel):
    isPrimary: Literal[True]
    dateOfBirth: Optional[date]
class OtherApplicant(BaseModel):
    isPrimary: Literal[False]
    dateOfBirth: date
    relationshipStatus: Literal["family", "friend", "other", "partner"]
Applicant = Annotated[
    Union[PrimaryApplicant, OtherApplicant],
    Field(discriminator="isPrimary")]
class Application(BaseModel):
    applicants: List[Applicant]
    @validator("applicants")
    def validate(cls, v: List[Applicant]) -> List[Applicant]:
        list_count = len(v)
        primary_count = len(
            list(
                filter(lambda item: item.isPrimary, v)
            )
        )
        secondary_count = list_count - primary_count
        if primary_count > 1:
            raise ValueError("Only one primary applicant required")
        if secondary_count > 1:
            raise ValueError("Only one secondary applicant allowed")
        return v
def main() -> None:
    data_dict = {
        "applicants": [
            {
                "isPrimary": True
            },
            {
                "isPrimary": False,
                "relationshipStatus": "family"
            },
        ]
    }
    _ = Application(**data_dict)
if __name__ == "__main__":
    main()
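To illustrate what a discriminator changes conceptually, here is a plain-Python sketch. It is not how pydantic is implemented; REQUIRED_FIELDS and missing_fields are made-up names that mirror the two applicant models. The point is that the schema is chosen from isPrimary first, so only errors for that one schema are reported.

```python
from typing import Any, Dict, List

# Required fields per applicant kind, keyed by the discriminator value.
# This mirrors the two models in plain Python; it is not pydantic internals.
REQUIRED_FIELDS: Dict[bool, List[str]] = {
    True: ["isPrimary"],
    False: ["isPrimary", "dateOfBirth", "relationshipStatus"],
}


def missing_fields(applicant: Dict[str, Any]) -> List[str]:
    """Pick the schema from isPrimary first, then check only that schema."""
    required = REQUIRED_FIELDS[applicant["isPrimary"]]
    return [name for name in required if name not in applicant]


print(missing_fields({"isPrimary": True}))                                   # []
print(missing_fields({"isPrimary": False, "relationshipStatus": "family"}))  # ['dateOfBirth']
```

The second call reports only the missing dateOfBirth, with no spurious complaint about isPrimary.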

Pymongo: Best way to remove $oid in Response

I started using PyMongo recently and want to find the best way to remove $oid from the response.
When I use find_one:
result = db.nodes.find_one({"name": "Archer"})
and serialize the response:
json.loads(dumps(result))
the result is:
{
"_id": {
"$oid": "5e7511c45cb29ef48b8cfcff"
},
"about": "A jazz pianist falls for an aspiring actress in Los Angeles."
}
My expected:
{
"_id": "5e7511c45cb29ef48b8cfcff",
"about": "A jazz pianist falls for an aspiring actress in Los Angeles."
}
As you can see, we can use:
resp = json.loads(dumps(result))
resp['_id'] = resp['_id']['$oid']
But I don't think this is the best way. I hope there is a better solution.
You can take advantage of aggregation:
result = db.nodes.aggregate([
    {'$match': {"name": "Archer"}},
    {'$addFields': {"Id": {'$toString': '$_id'}}},
    {'$project': {'_id': 0}},
])
data = json.dumps(list(result))
Here, $addFields adds a new field Id holding the string form of _id ($toString requires MongoDB 4.0 or later). The projection then removes the _id field from the result. Finally, since aggregate returns a cursor, I turn it into a list.
It may not work exactly as you hope, but the general idea is there.
First of all, there's no $oid in the response. What you are seeing is the Python driver representing the _id field as an ObjectId instance, and the dumps() method then rendering that ObjectId in a string format. The $oid bit just tells you the field is an ObjectId, should you need that later.
The next part of the answer depends on what exactly you are trying to achieve. Almost certainly you can achieve it using the result object without converting it to JSON.
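For instance, if the goal is simply a plain string in _id, the ObjectId can be converted directly on the returned document, since str() of an ObjectId is its 24-character hex form. A minimal sketch follows; with_string_id is a hypothetical helper name, and the usage line assumes db is the same database handle as in the question.

```python
from typing import Any, Dict


def with_string_id(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc whose _id is a plain string (hypothetical helper)."""
    converted = dict(doc)  # shallow copy so the original document is untouched
    if "_id" in converted:
        converted["_id"] = str(converted["_id"])  # str(ObjectId) gives the hex id
    return converted


# Usage, assuming `db` is a pymongo Database as in the question:
# result = with_string_id(db.nodes.find_one({"name": "Archer"}))
```

find_one returns None when nothing matches, so real code would want to check for that before calling the helper.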
If you just want to get rid of it altogether, you can do:
result = db.nodes.find_one({ "name": "Archer" }, {'_id': 0})
print(result)
which gives:
{"name": "Archer"}
Another option is to strip the $oid wrapper from the JSON string with a regular expression:
import re
# Matches {"$oid": "<hex>"} and captures the quoted id
OID_PATTERN = re.compile(r'{\s*"\$oid":\s*("[a-z0-9]{1,}")\s*}')
def remove_oid(string):
    # Replace every {"$oid": "..."} wrapper with the bare quoted id
    return OID_PATTERN.sub(r'\1', string)
string = json_dumps(mongo_query_result)
string = remove_oid(string)
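The same unwrapping can also be done on the parsed data instead of the raw string, which avoids depending on the exact whitespace of the serializer. This is only a sketch: collapse_oid is a made-up name, and the sample text is hand-written to look like the output shown in the question.

```python
import json
from typing import Any


def collapse_oid(value: Any) -> Any:
    """Replace every {"$oid": "<hex>"} wrapper with the bare hex string."""
    if isinstance(value, dict):
        if set(value) == {"$oid"}:
            return value["$oid"]
        return {k: collapse_oid(v) for k, v in value.items()}
    if isinstance(value, list):
        return [collapse_oid(item) for item in value]
    return value


# Sample Extended JSON text, shaped like the serialized result in the question.
raw = '{"_id": {"$oid": "5e7511c45cb29ef48b8cfcff"}, "about": "A jazz pianist"}'
print(collapse_oid(json.loads(raw)))
# {'_id': '5e7511c45cb29ef48b8cfcff', 'about': 'A jazz pianist'}
```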
I am using a custom handler. With it I managed to remove $oid and replace it with just the id string:
import datetime
import json
import bson.objectid
# Custom handler
def my_handler(x):
    if isinstance(x, datetime.datetime):
        return x.isoformat()
    elif isinstance(x, bson.objectid.ObjectId):
        return str(x)
    else:
        raise TypeError(x)
# Parsing
def parse_json(data):
    return json.loads(json.dumps(data, default=my_handler))
result = db.nodes.aggregate([
    {'$match': {"name": "Archer"}},
    {'$addFields': {"_id": '$_id'}},
    {'$project': {'_id': 0}},
])
# aggregate() returns a cursor, so materialize it before serializing
data = parse_json(list(result))
In the second argument of find_one, you can define which fields to exclude, in the following way:
site_information = mongo.db.sites.find_one({'username': username}, {'_id': False})
This excludes the '_id' field from the returned document.

Django rest framework: Is there a way to clean data before validating it with a serializer?

I've got an API endpoint POST /data.
The received data is formatted in a certain way which is different from the way I store it in the db.
I'll use geometry type from postgis as an example.
class MyPostgisModel(models.Model):
    ...
    position = models.PointField(null=True)
    my_charfield = models.CharField(max_length=10)
    ...
    errors = JSONField()  # Used to save the cleaning and validation errors
class MyPostgisSerializer(serializers.ModelSerializer):
    class Meta:
        model = MyPostgisModel
        fields = [
            ...
            "position",
            ...
            "my_charfield",
            "errors",
        ]
    def to_internal_value(self, data):
        ...
        # Here the data is coming in the field geometry but in the db, it's called
        # position. Moreover I need to apply the `GEOSGeometry(json.dumps(...))`
        # method as well.
        data["position"] = GEOSGeometry(json.dumps(data["geometry"]))
        return data
The problem is that there is not just one field like position but many. I would like (maybe wrongly) to follow the validate_*field_name* scheme, but for cleaning (clean_*field_name*).
There is another problem. In this scheme, I would still like to save the rest of the data in the database even if some fields have raised a ValidationError (e.g. a CharField that is too long), as long as they are not part of the primary key or a unique_together constraint. The related errors should be saved into a JSONField like this:
{
    "cleaning_errors": {
        ...
        "position": 'Invalid format: {
            "type": "NotAValidType", # Should be "Point"
            "coordinates": [
                4.22,
                50.67
            ]
        }'
        ...
    },
    "validating_errors": {
        ...
        "my_charfield": "data was too long: 'this data is way too long for 10 characters'",
        ...
    }
}
For the first problem, I thought of doing something like this:
class BaseSerializerCleanerMixin:
    """Abstract Mixin that clean fields."""
    def __init__(self, *args, **kwargs):
        """Initialize the cleaner strategy."""
        # This is the error_dict to be filled by the `clean_*field_name*`
        self.cleaning_error_dict = {}
        super().__init__(*args, **kwargs)
    def clean_fields(self, data):
        """Clean the fields listed in Meta.fields before validating them."""
        cleaned_data = {}
        for field_name in getattr(self.Meta, "fields", []):
            cleaned_field = (
                getattr(self, "clean_" + field_name)(data)
                if hasattr(self, "clean_" + field_name)
                else data.get(field_name)
            )
            if cleaned_field is not None:
                cleaned_data[field_name] = cleaned_field
        return cleaned_data
    def to_internal_value(self, data):
        """Reformat data to put it in the database."""
        cleaned_data = self.clean_fields(data)
        return super().to_internal_value(cleaned_data)
I'm not sure that's a good idea and maybe there is an easy way to deal with such things.
For the second problem, collecting the validation errors while still having is_valid() return True as long as no primary-key field is wrongly formatted, I'm not sure how to proceed.
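One possible shape for the error-collecting part, shown in plain Python so it does not depend on DRF: each cleaner either returns a value or raises, and failures are recorded per field instead of aborting. All names here (clean_position, clean_all) are hypothetical, and how this would be wired into to_internal_value and the errors JSONField is left open.

```python
from typing import Any, Callable, Dict, Tuple


def clean_position(data: Dict[str, Any]) -> Any:
    """Hypothetical cleaner: accept only GeoJSON-like points from `geometry`."""
    geometry = data.get("geometry")
    if not isinstance(geometry, dict) or geometry.get("type") != "Point":
        raise ValueError(f"Invalid format: {geometry!r}")
    return geometry


def clean_all(
    data: Dict[str, Any],
    cleaners: Dict[str, Callable[[Dict[str, Any]], Any]],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run each cleaner, keeping good values and recording failures per field."""
    cleaned: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for field_name, cleaner in cleaners.items():
        try:
            cleaned[field_name] = cleaner(data)
        except ValueError as exc:
            errors[field_name] = str(exc)
    return cleaned, errors


payload = {
    "geometry": {"type": "NotAValidType", "coordinates": [4.22, 50.67]},
    "my_charfield": "ok",
}
cleaned, errors = clean_all(
    payload,
    {"position": clean_position, "my_charfield": lambda d: d.get("my_charfield")},
)
print(cleaned)  # {'my_charfield': 'ok'}
print(errors)   # one entry, keyed by 'position'
```

The cleaned dict could then be passed on to the normal validation step, with the errors dict stored under a "cleaning_errors" key.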

Django Rest Framework Displaying Serialized data through Views.py

class International(object):
    """ International Class that stores versions and lists
        countries
    """
    def __init__(self, version, countrylist):
        self.version = version
        self.country_list = countrylist
class InternationalSerializer(serializers.Serializer):
    """ Serializer for International page
        Lists International countries and current version
    """
    version = serializers.IntegerField(read_only=True)
    country_list = CountrySerializer(many=True, read_only=True)
I have a serializer set up this way, and I wish to display the serialized data (which will be a dictionary like this: { "version": xx, "country_list": [ ] }) using views.py.
I have my views.py setup this way:
class CountryListView(generics.ListAPIView):
    """ Endpoint : somedomain/international/
    """
    ## want to display a dictionary like the one below
    {
        "version": 5,
        "country_list" : [ { xxx } , { xxx } , { xxx } ]
    }
What do I code in this CountryListView to render a dictionary like the one above? I'm really unsure.
Try this:
class CountryListView(generics.ListAPIView):
    """ Endpoint : somedomain/international/
    """
    def get(self, request):
        # Get your version and country_list data and
        # init your object
        international_object = International(version, country_list)
        serializer = InternationalSerializer(instance=international_object)
        your_data = serializer.data
        # A view must return a Response (rest_framework.response.Response), not a bare dict
        return Response(your_data)
You can build on the idea from here:
http://www.django-rest-framework.org/api-guide/pagination/#example
Suppose we want to replace the default pagination output style with a modified format that includes the next and previous links under a nested 'links' key. We could specify a custom pagination class like so:
class CustomPagination(pagination.PageNumberPagination):
    def get_paginated_response(self, data):
        return Response({
            'links': {
                'next': self.get_next_link(),
                'previous': self.get_previous_link()
            },
            'count': self.page.paginator.count,
            'results': data
        })
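As long as you don't need actual pagination, you can set up a custom pagination class that packs the response in whatever layout you need:
class CountryListPagination(BasePagination):
    def paginate_queryset(self, queryset, request, view=None):
        # No real paging: hand back every item as a single "page"
        return list(queryset)
    def get_paginated_response(self, data):
        return Response({
            'version': 5,
            'country_list': data
        })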
Then all you need to do is set this pagination class on your class-based view:
class CountryListView(generics.ListAPIView):
    # Endpoint : somedomain/international/
    pagination_class = CountryListPagination
Let me know how this works for you.