I have these models:
class Person(models.Model):
    user = models.OneToOneField(User)

class Post(models.Model):
    author = models.ForeignKey(Person, null=False)
    text = models.TextField(null=False, blank=False)
and I want a QuerySet with at least the following fields:
author.user.username
text
I have read about select_related(), but when I use it in my view I can't get the username field:
posts = Post.objects.select_related('person__user')[:10]
Can I do this with a Django query, or do I have to use raw SQL?
Thanks for any help.
You can serialize like this:
import json
from django.core.serializers.json import DjangoJSONEncoder
json_data = json.dumps(list(Post.objects.values('author__user__username', 'text')[:10]), cls=DjangoJSONEncoder)
select_related should be called with the field names, not the model (type) names:
posts = Post.objects.select_related('author__user')[:10]
for post in posts:
    print(post.author.user.username)
    print(post.text)
All select_related does is ensure that the related objects can be accessed without extra queries (it constructs joins to the relevant tables).
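As a rough illustration of the kind of join that select_related('author__user') asks the database for, here is a sketch that uses the standard-library sqlite3 module instead of Django. The table and column names (auth_user, app_person, app_post) are assumptions modelled on Django's default naming with a hypothetical app label, and the SQL is simplified compared with what the ORM really emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user  (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE app_person (id INTEGER PRIMARY KEY,
                             user_id INTEGER NOT NULL UNIQUE REFERENCES auth_user(id));
    CREATE TABLE app_post   (id INTEGER PRIMARY KEY,
                             author_id INTEGER NOT NULL REFERENCES app_person(id),
                             text TEXT NOT NULL);
    INSERT INTO auth_user  VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO app_person VALUES (10, 1), (20, 2);
    INSERT INTO app_post   VALUES (100, 10, 'first post'), (101, 20, 'second post');
""")

# One query follows post -> person -> user, so reading the username
# afterwards needs no further round trips to the database.
rows = conn.execute("""
    SELECT auth_user.username, app_post.text
    FROM app_post
    JOIN app_person ON app_person.id = app_post.author_id
    JOIN auth_user  ON auth_user.id  = app_person.user_id
    ORDER BY app_post.id
    LIMIT 10
""").fetchall()

for username, text in rows:
    print(username, text)
```

Each row already carries both the username and the post text, which is what the loop over the select_related queryset relies on.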
Suppose we have the following model (I'm using a Django example, but I suppose my question extends to any framework):
from uuid import uuid4

from django.db import models

class Card(models.Model):
    title = models.CharField(max_length=255)
    description = models.CharField(max_length=1024)
    tags = models.ManyToManyField(
        "Tag",  # string reference, because Tag is defined further down
        related_name="cards",
    )

    class Meta:
        indexes = [
            models.Index(fields=["title", "description"]),
        ]

class Tag(models.Model):
    uuid = models.UUIDField(default=uuid4, primary_key=True)
    title = models.CharField(max_length=255, unique=True)
I later use those two fields in a custom search:
from django.db.models import Q

def custom_filter_queryset(queryset, text):
    return queryset.filter(
        Q(title__contains=text) | Q(description__contains=text)
    ).distinct()

custom_filter_queryset(Card.objects.all().prefetch_related("tags"), "...")
Questions:
Am I correctly taking advantage of the index built on those fields?
Can I somehow create an index to optimize many-to-many fetching?
UPD: I'm using Postgres, but I suppose this extends to any relational SQL DB.
Not exactly.
The order of the indexed fields matters: queries like .filter(title=..., description=...) will benefit from the index, but .filter(description=...) will not. If your queries filter on title OR description as well as on title AND description, use three indexes:
...
class Meta:
    indexes = [
        models.Index(fields=["title", "description"]),
        models.Index(fields=["title"]),
        models.Index(fields=["description"]),
    ]
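The effect of column order can be checked by asking the database for its query plan. The sketch below does this with the standard-library sqlite3 module as a stand-in for Postgres, so the planner output and heuristics will differ from yours. The table name card, the extra body column and the index name card_title_description are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE card (id INTEGER PRIMARY KEY, title TEXT, description TEXT, body TEXT);
    CREATE INDEX card_title_description ON card (title, description);
""")

def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# The leading column (title) is constrained: the composite index is usable.
plan_title = plan("SELECT * FROM card WHERE title = 'a'")
plan_both = plan("SELECT * FROM card WHERE title = 'a' AND description = 'b'")

# Only the second column is constrained: the planner scans the table instead.
plan_description = plan("SELECT * FROM card WHERE description = 'b'")

for label, text in [("title", plan_title), ("both", plan_both),
                    ("description", plan_description)]:
    print(label, "->", text)
```

On Postgres the equivalent check is EXPLAIN on the SQL that the queryset generates.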
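Note that prefetch_related fetches the related objects when the queryset is evaluated, so filtering the related manager afterwards (for example card.tags.filter(...)) ignores the prefetched results and runs a new query.
Use a Prefetch object, which accepts a custom queryset, to apply such filters as part of the prefetch: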
from django.db.models import Prefetch
Card.objects.prefetch_related(Prefetch('tags'))
Ref: https://docs.djangoproject.com/en/4.1/ref/models/querysets/
I'm trying to query for an object in my DB, and get data from another object that shares a Foreign Key relationship.
For example, if I have these models:
class Book(models.Model):
    language = models.ForeignKey('Language')
    ...

class Language(models.Model):
    name = models.CharField(max_length=255, unique=True)
I want to query these models and get a QuerySet of books, then return the books via an API.
In raw SQL I would do something akin to:
SELECT book, language.name
FROM book
JOIN ....
Is there any way to accomplish this with the Django ORM?
If you are using Django REST framework for the API, you can do this with a SerializerMethodField:
from rest_framework import serializers

class BookSerializer(serializers.ModelSerializer):
    language = serializers.SerializerMethodField()

    class Meta:
        model = Book
        fields = ['id', 'language']

    def get_language(self, obj):
        # Return the related Language's name instead of its primary key.
        return obj.language.name
Another way to do this is to add a property to the Book model. Give it a name that differs from the language field, otherwise it would shadow the foreign key:
@property
def language_name(self):
    return self.language.name
Now book_obj.language_name gives you the name from the Language model.
Model:
class Comment(MPTTModel):
    submitter = models.ForeignKey(User, blank=True, null=True)
    post = models.ForeignKey(Post, related_name="post_comments")
    parent = TreeForeignKey('self', blank=True, null=True, related_name="children")
    text = models.CharField("Text", max_length=1000)
    rank = models.FloatField(default=0.0)
    pub_date = models.DateTimeField(auto_now_add=True)
Iterating through nodes has the same effect (>1000 queries).
I had a similar issue with MPTT models. It was solved with select_related (also for the parent's foreign keys).
So, depending on your needs, a proper queryset can look like this:
Comment.objects.select_related('post', 'submitter', 'parent', 'parent__submitter', 'parent__post')
Also, if you need each comment's children in your loop as well, that can be optimized like this:
queryset.prefetch_related('children')
Or even like this:
queryset.prefetch_related(
    Prefetch(
        'children',
        queryset=Comment.objects.select_related('post', 'etc.'),
        to_attr='children_with_posts'
    )
)
... and depending on the tree depth, you can use this:
queryset.select_related('parent', 'parent__parent', 'parent__parent__parent')
# you get the idea :)
Duplicated queries happen because each object in the iteration hits the database when you access one of its related objects.
Try using select_related in your view method.
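To make the difference concrete, here is a small sketch that counts SQL statements with the standard-library sqlite3 module rather than Django. The table names app_user and app_comment are hypothetical, and the per-row lookup only imitates what lazy access to a related object does in the ORM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_comment (id INTEGER PRIMARY KEY, submitter_id INTEGER, text TEXT);
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'ben');
    INSERT INTO app_comment VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 1, 'c');
""")
conn.commit()

executed = []  # every SQL statement SQLite runs from here on is recorded
conn.set_trace_callback(executed.append)

# Lazy pattern: one query for the comments, then one more per comment.
comments = conn.execute("SELECT id, submitter_id, text FROM app_comment").fetchall()
for _id, submitter_id, _text in comments:
    conn.execute("SELECT name FROM app_user WHERE id = ?", (submitter_id,)).fetchone()
lazy_queries = len(executed)

executed.clear()

# Joined pattern: the related row arrives together with each comment.
conn.execute("""
    SELECT app_comment.text, app_user.name
    FROM app_comment
    JOIN app_user ON app_user.id = app_comment.submitter_id
""").fetchall()
joined_queries = len(executed)

print(lazy_queries, joined_queries)
```

The lazy pattern grows with the number of rows, while the joined pattern stays at a single statement.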
Using Django's prefetch_related or select_related will probably resolve that; if it doesn't, you will need a raw query.
Have you read about optimizing Django queries? This page of the documentation explains a lot: https://docs.djangoproject.com/en/3.1/topics/db/optimization/
I have:
class MyUser(Model):
    today_ref_viewed_ips = ManyToManyField(
        UniqAddress,
        related_name='today_viewed_users',
        verbose_name="Adresses visited referal link today")
    ...
In a daily cron job I do:
for u in MyUser.objects.all():
    u.today_ref_viewed_ips.clear()
Can it be done on the DB server with an update?
MyUser.objects.all().update(...)
OK, I can't use update, thanks. But the only thing I need is to TRUNCATE the internal m2m table. Is it possible to do that from Django? How can I find out its name without running "SHOW TABLES" in the MySQL console?
If you only want to update the m2m relations and do not want to delete the related objects themselves, you can use the following:
# if you have a **list of pks** for the new m2m objects
today_ref_pk = [1, 2, 3]
u = MyUser.objects.get(pk=1)
u.today_ref_viewed_ips.clear()
u.today_ref_viewed_ips.add(*today_ref_pk)
For Django >= 1.11:
# if you have the **list of objects** for the new m2m relations and race
# conditions are not a concern, you can do the following:
today_ref_objs = [obj1, obj2, obj3]
u = MyUser.objects.get(pk=1)
u.today_ref_viewed_ips.set(today_ref_objs, clear=True)
Query-1:
No, you cannot use .update() method to update a ManyToManyField.
Django's .update() method does not support ManyToManyField.
As per the docs from the section on updating multiple objects at once:
You can only set non-relation fields and ForeignKey fields using this
method. To update a non-relation field, provide the new value as a
constant. To update ForeignKey fields, set the new value to be the new
model instance you want to point to.
Query-2:
If you want to delete all the rows of the m2m table, you can use the .delete() queryset method.
MyModel.objects.all().delete() # deletes all the objects
Another method is to execute the raw SQL directly. This method is faster than the previous one.
from django.db import connection
cursor = connection.cursor()
cursor.execute("TRUNCATE TABLE table_name")
Query-3:
To get the table name of a model, you can read db_table from the model's _meta options.
my_model_object._meta.db_table # gives the db table name
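For a ManyToManyField, the rows in question live in the intermediate table, which Django exposes as the field's through model, so here its name should be readable as MyUser.today_ref_viewed_ips.through._meta.db_table. The sketch below uses the standard-library sqlite3 module rather than Django to show the effect of emptying that table. The table names assume Django's default naming with a hypothetical app label app, and the ip column is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_myuser      (id INTEGER PRIMARY KEY);
    CREATE TABLE app_uniqaddress (id INTEGER PRIMARY KEY, ip TEXT);
    CREATE TABLE app_myuser_today_ref_viewed_ips (
        id INTEGER PRIMARY KEY,
        myuser_id INTEGER NOT NULL REFERENCES app_myuser(id),
        uniqaddress_id INTEGER NOT NULL REFERENCES app_uniqaddress(id)
    );
    INSERT INTO app_myuser VALUES (1), (2);
    INSERT INTO app_uniqaddress VALUES (1, '10.0.0.1'), (2, '10.0.0.2');
    INSERT INTO app_myuser_today_ref_viewed_ips (myuser_id, uniqaddress_id)
        VALUES (1, 1), (1, 2), (2, 1);
""")

def count(table):
    """Number of rows currently stored in `table`."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# SQLite has no TRUNCATE; an unfiltered DELETE plays the same role here.
conn.execute("DELETE FROM app_myuser_today_ref_viewed_ips")

links_left = count("app_myuser_today_ref_viewed_ips")
users_left = count("app_myuser")
addresses_left = count("app_uniqaddress")
print(links_left, users_left, addresses_left)
```

Only the links disappear; the users and addresses on both sides stay in place. Within the ORM, MyUser.today_ref_viewed_ips.through.objects.all().delete() should have the same effect without raw SQL.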
I have a model Page, which can have Posts on it. What I want to do is get every Page, plus the most recent Post on that page. If the Page has no Posts, I still want the page. (Sound familiar? This is a LEFT JOIN in SQL).
Here is what I currently have:
Page.objects.annotate(most_recent_post=Max('post__post_time'))
This only gets Pages, but it doesn't get Posts. How can I get the Posts as well?
Models:
class Page(models.Model):
    name = models.CharField(max_length=50)
    created = models.DateTimeField(auto_now_add=True)
    enabled = models.BooleanField(default=True)

class Post(models.Model):
    user = models.ForeignKey(User)
    page = models.ForeignKey(Page)
    post_time = models.DateTimeField(auto_now_add=True)
Depending on the relationship between the two models, you should be able to follow the relationships quite easily, and improve performance by using select_related.
Taking this:
class Page(models.Model):
    ...

class Post(models.Model):
    page = ForeignKey(Page, ...)
You can follow the forward relationship (i.e. get all the posts and their associated pages) efficiently using select_related:
Post.objects.select_related('page').all()
This will result in only one (larger) query where all the page objects are prefetched.
In the reverse situation (like yours), where you want to get all pages and their associated posts, select_related won't work. Other questions on this topic have more information about what you can do.
Probably your best bet is to use the techniques described in the django docs here: Following Links Backward.
After you do:
pages = Page.objects.annotate(most_recent_post=Max('post__post_time'))
posts = [page.post_set.filter(post_time=page.most_recent_post) for page in pages]
posts[0] should then hold the most recent post for pages[0], and so on. I don't know if this is the most efficient solution, but it was the one mentioned in another post about the lack of left joins in Django.
You can create a database view that contains all Page columns along with the necessary columns of the latest Post:
CREATE VIEW `testapp_pagewithrecentpost` AS
    SELECT testapp_page.*, testapp_post.* -- I suggest as few post columns as possible here
    FROM `testapp_page` LEFT JOIN `testapp_post`
    ON testapp_page.id = testapp_post.page_id
    AND testapp_post.post_time =
        ( SELECT MAX(testapp_post.post_time)
          FROM testapp_post WHERE testapp_page.id = testapp_post.page_id );
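A view of this shape can be tried out quickly with the standard-library sqlite3 module. This is only a sketch: SQLite stands in for the real database, the sample rows are made up, the view selects just three columns, and the subquery uses an alias p for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testapp_page (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE testapp_post (id INTEGER PRIMARY KEY,
                               page_id INTEGER REFERENCES testapp_page(id),
                               post_time TEXT);
    INSERT INTO testapp_page VALUES (1, 'news'), (2, 'empty');
    INSERT INTO testapp_post VALUES
        (1, 1, '2020-01-01 10:00:00'),
        (2, 1, '2020-01-02 10:00:00');

    CREATE VIEW testapp_pagewithrecentpost AS
    SELECT testapp_page.id, testapp_page.name, testapp_post.post_time
    FROM testapp_page
    LEFT JOIN testapp_post
      ON testapp_page.id = testapp_post.page_id
     AND testapp_post.post_time = (
            SELECT MAX(p.post_time) FROM testapp_post AS p
            WHERE p.page_id = testapp_page.id);
""")

# Every page appears once; a page without posts gets NULL (None) for post_time.
rows = conn.execute(
    "SELECT name, post_time FROM testapp_pagewithrecentpost ORDER BY id"
).fetchall()

for name, post_time in rows:
    print(name, post_time)
```

If two posts on the same page share the latest post_time, that page appears once per such post, so a tie-breaker may be needed in real data.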
Then you need to create a model with managed = False in its Meta (so that manage.py sync won't break). You can also inherit from an abstract model to avoid duplicating column definitions:
class PageWithRecentPost(models.Model):  # Or extend abstract BasePost ?
    # Page columns go here
    # Post columns go here
    # We use LEFT JOIN, so all columns from the
    # 'post' model will need blank=True, null=True
    class Meta:
        managed = False  # Django will not handle creation/reset automatically
By doing that you can do what you initially wanted: fetch from both tables in just one query:
pages_with_recent_post = PageWithRecentPost.objects.filter(...)
for page in pages_with_recent_post:
    print(page.name)  # Page column
    print(page.post_time)  # Post column
However, this approach is not free of drawbacks:
It's very DB engine-specific.
You'll need to add the VIEW creation SQL to your project.
If your models are complex, it's very likely that you'll need to resolve table column name clashes.
A model based on a database view will very likely be read-only (INSERT/UPDATE will fail).
It adds complexity to your project. Allowing multiple queries is definitely a simpler solution.
Changes in Page/Post will require re-creating the view.