multiple models required for serialisation, django - api

I am having a very specific issue here, and am not sure how to navigate through it. So, I have a couple of serializers here, the code to which is attached below. Starting with three models, Cluster, Family and Individual, where a Cluster is a group of families, a Family is a group of individuals and an Individual is the lowest information holder.
With the help of through models(foreign keys between two models), I need to serialize and output families with their individuals containing integer fields representing a certain order defined by the through relations. As you can see, I have to use bots, Individual and Family as arguments in the returns_order function, and that's creating problems. Is there a better way that I am not aware of?
class IndividualSerializer(serializers.ModelSerializer):
order = serializers.SerializerMethodField('returns_order')
def returns_order(self, Individual, Family):
through = IndividualThroughFamily.objects.get(family=Family.id, topics=Topic.id)
order = through.order
return order
class Meta:
model = Individual
fields = ['order', 'individualName']
class FamilySerializer(serializers.ModelSerializer):
order = serializers.SerializerMethodField('returns_order')
def returns_order(self, Family):
through = FamilyThroughCluster.objects.get(family=Family.id)
order = through.order
return order
relatedIndividuals = IndividualSerializer(many=True)
class Meta:
model = Family
fields = ['order', 'familyName', 'relatedIndividuals']

Related

How do I construct complex django query statements?

I am not very familiar with SQL and so trying to make more complex calls via Django ORM is stumping me. I have a Printer model that spawns Jobs and the jobs receive statuses via a State model with a foreign key relationship to it. The jobs status is determined by the most recent state object associated with it. This is so I can track the history of states of jobs throughout its life cycle. I want to be able to determine which Printers have successful jobs associated with them.
from django.db import models
class Printer(models.Model):
label = models.CharField(max_length=120)
class Job(models.Model):
label = models.CharField(max_length=120)
printer = models.ForeignKey(
Printer,
related_name='jobs',
related_query_name='job'
)
def set_state(self, state):
State.objects.create(state=state, job=self)
#property
def current_state(self):
return self.states.latest('created_at').state
class State(models.Model):
created_at = models.DateTimeField(auto_now_add=True)
state = models.SmallIntegerField()
job = models.ForeignKey(
Job,
related_name='states',
related_query_name='state'
)
I need a QuerySet of Printer objects that have at least one related job with its most recent (latest) state object which has State.state == '200'. Is there a way to construct a compound call which will achieve this using the database and not having to pull in all Job objects to run python iterations on? Perhaps a custom manager? I've been reading posts about Subquery and Annotation and OuterRef, but these ideas are just not sinking in in a way that is showing me a path. I need them explained like I'm 5. They are very unpythonic statements..
The naive python way to describe what I want:
printers = []
for printer in Printer.objects.all():
for job in printer.jobs.objects.all():
if job.states.latest().state == '200':
printers.append(printer)
printers = list(set(printers))
But with the least number of DB round trips possible. Help!
edit: further question, what's the best way to filter Jobs based on the current state. Since Job.current_state is a calculated property it cannot be used in a QuerySet filter. But, again, I don't want to have to pull in all Job objects.
Took about two days to sink in, but I think I have an answer using annotation and Subqueries:
state_sq = State.objects.filter(job=OuterRef('pk')).order_by('-created_at')
successful_jobs = Job.objects.annotate(
latest_state=Subquery(state_sq.values('state')[:1])
).filter(printer=OuterRef('pk'), latest_state='200')
printers_with_successful_jobs = Printer.objects.annotate(
has_success_jobs=Exists(successful_jobs)
).filter(has_success_jobs=True)
And further, I constructed a custom manager to return latest_state by default.
class JobManager(models.Manager):
def get_queryset(self):
state_sq = State.objects.filter(
object_id=OuterRef('pk')
).order_by('-created_at')
return super().get_queryset().annotate(
latest_state=Subquery(state_sq.values('state')[:1])
)
class Job(models.Model):
objects = JobManager()
...

What is the best way to get all linked instances of a models in Django?

I am trying to create a messaging system in Django, and I came across an issue: How could I efficiently find all messages linked in a thread?
Let's imagine I have two models:
class Conversation(models.Model):
sender = models.ForeignKey(User)
receiver = models.ForeignKey(User)
first_message = models.OneToOneField(Message)
last_message = models.OneToOneField(Message)
class Message(models.Model):
previous = models.OneToOneField(Message)
content = models.TextField()
(code not tested, I'm sure it wouldn't work as is)
Since it is designed as a simple linked list, is it the only way to traverse it recursively?
Should I try to just get the previous of the previous until I find the first, or is there a way to query all of them more efficiently?
I use Rest Framework serializer with depth. So If you have serializer with Depth value to 3. I will fetch the full model of whatever the foreign key available until three parents.
https://www.django-rest-framework.org/api-guide/serializers/#specifying-nested-serialization
class AppliedSerializer(serializers.ModelSerializer):
class Meta:
model = Applied
fields = ("__all__")
depth = 3

How to Annotate Specific Related Field Value

I am attempting to optimize some code. I have model with many related models, and I want to annotate and filter by the value of a field of a specific type of these related models, as they are designed to be generic. I can find all instances of the type of related model I want, or all of the models related to the parent, but not the related model of the specific type related to the parent. Can anyone advise?
I initially tried
parents = parent.objects.all()
parents.annotate(field_value=Subquery(related_model.objects.get(
field__type='specific',
parent_id=OuterRef('id'),
).value)))
But get the error This queryset contains a reference to an outer query and may only be used in a subquery. When I tried
parents = parent.objects.all()
parents.annotate(field_value=Q(related_model.objects.get(
field__type='specific',
parent_id=F('id'),
).value)))
I get DoesNotExist: related_field matching query does not exist. which seems closer but still does not work.
Model structure:
class parent(models.Model):
id = models.IntegerField(null=False, primary_key=True)
class field(models.Model):
id = models.IntegerField(null=False, primary_key=True)
type = models.CharField(max_length=60)
class related_model(models.Model):
parent = models.ForeignKey(parent, on_delete=models.CASCADE, related_name='related_models')
field = models.ForeignKey(field, on_delete=models.CASCADE, related_name='fields')
Is what I want to do even possible?
Never mind I decided to do a reverse lookup, kinda like
parent_ids = related_model.objects.filter(field__type='specific', parent_id__in=list_of_parents).values_list('parent_id')
parents.objects.filter(id__in=parents_id)

django query multiple tables with foreign key pointing on self

lets say I have a model named User, and other models representing actions done by the a user, like "User_clicked", "User_left", "User_ate_cookie" etc. etc.
the User_* models have different fields and inherit from a single abstract class (User_do_something)
my question is this:
what's the django-way of querying ALL the models, from all the relevant tables, that point on a specific user?
eg. I want to do User.get_all_actions() and get a list with mixed type of models in them, containing all the objects that inherit from User_do_something and point on the specific user.
Note: performance is critical. I don't want to make multiple db queries and combine the list, if possible, I want to do it with a single sql select.
some code to be clear:
class User(models.Model):
uuid = models.UUIDField(unique=True, default=uuid.uuid4)
creation_time = models.DateTimeField(auto_now_add=True)
def get_all_actions(self):
'''
return a list with ALL the actions this player did.
'''
??????? how do I do this query ???????
class User_do_action(models.Model):
user = models.ForeignKey(User)
creation_time = models.DateTimeField(auto_now_add=True)
class Meta:
abstract = True
class User_click(User_do_action):
... some fields
class User_left(User_do_action):
... some fields
class User_ate_cookie(User_do_action):
... some fields
etc. etc.
thanks!

Django aggregate query

I have a model Page, which can have Posts on it. What I want to do is get every Page, plus the most recent Post on that page. If the Page has no Posts, I still want the page. (Sound familiar? This is a LEFT JOIN in SQL).
Here is what I currently have:
Page.objects.annotate(most_recent_post=Max('post__post_time'))
This only gets Pages, but it doesn't get Posts. How can I get the Posts as well?
Models:
class Page(models.Model):
name = models.CharField(max_length=50)
created = models.DateTimeField(auto_now_add = True)
enabled = models.BooleanField(default = True)
class Post(models.Model):
user = models.ForeignKey(User)
page = models.ForeignKey(Page)
post_time = models.DateTimeField(auto_now_add = True)
Depending on the relationship between the two, you should be able to follow the relationships quite easily, and increase performance by using select_related
Taking this:
class Page(models.Model):
...
class Post(models.Model):
page = ForeignKey(Page, ...)
You can follow the forward relationship (i.e. get all the posts and their associated pages) efficiently using select_related:
Post.objects.select_related('page').all()
This will result in only one (larger) query where all the page objects are prefetched.
In the reverse situation (like you have) where you want to get all pages and their associated posts, select_related won't work. See this,this and this question for more information about what you can do.
Probably your best bet is to use the techniques described in the django docs here: Following Links Backward.
After you do:
pages = Page.objects.annotate(most_recent_post=Max('post__post_time'))
posts = [page.post_set.filter(post_time=page.most_recent_post) for page in pages]
And then posts[0] should have the most recent post for pages[0] etc. I don't know if this is the most efficient solution, but this was the solution mentioned in another post about the lack of left joins in django.
You can create a database view that will contain all Page columns alongside with with necessary latest Post columns:
CREATE VIEW `testapp_pagewithrecentpost` AS
SELECT testapp_page.*, testapp_post.* -- I suggest as few post columns as possible here
FROM `testapp_page` LEFT JOIN `testapp_page`
ON test_page.id = test_post.page_id
AND test_post.post_time =
( SELECT MAX(test_post.post_time)
FROM test_post WHERE test_page.id = test_post.page_id );
Then you need to create a model with flag managed = False (so that manage.py sync won't break). You can also use inheritance from abstract Model to avoid column duplication:
class PageWithRecentPost(models.Model): # Or extend abstract BasePost ?
# Page columns goes here
# Post columns goes here
# We use LEFT JOIN, so all columns from the
# 'post' model will need blank=True, null=True
class Meta:
managed = False # Django will not handle creation/reset automatically
By doing that you can do what you initially wanted, so fetch from both tables in just one query:
pages_with_recent_post = PageWithRecentPost.objects.filter(...)
for page in pages_with_recent_post:
print page.name # Page column
print page.post_time # Post column
However this approach is not drawback free:
It's very DB engine-specific
You'll need to add VIEW creation SQL to your project
If your models are complex it's very likely that you'll need to resolve table column name clashes.
Model based on a database view will very likely be read-only (INSERT/UPDATE will fail).
It adds complexity to your project. Allowing for multiple queries is a definitely simpler solution.
Changes in Page/Post will require re-creating the view.