Django DB API equivalent of a somewhat complex SQL query - sql

I'm new to Django and still have trouble with some simple queries.
Let's assume that I'm writing an email application. This is the Mail
model:
class Mail(models.Model):
    to = models.ForeignKey(User, related_name="to")
    sender = models.ForeignKey(User, related_name="sender")
    subject = models.CharField()
    conversation_id = models.IntegerField()
    read = models.BooleanField()
    message = models.TextField()
    sent_time = models.DateTimeField(auto_now_add=True)
Each mail has a conversation_id which identifies a set of email messages
that have been written and replied to. Now, for listing emails in the inbox, I
would like to show only the last email per conversation, as Gmail does.
I have the SQL equivalent which does the job, but how do I construct a native Django query for this?
select
    *
from
    main_intermail
where
    id in
        (select
            max(id)
        from
            main_intermail
        group by conversation_id);
Thank you in advance!

Does this work? It would require Django 1.1.
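from django.db.models import Max
# Highest id within each conversation
mail_list = Mail.objects.values('conversation_id').annotate(Max('id'))
# Despite its name, this holds the ids of the latest mails, not conversation ids
conversation_id_list = mail_list.values_list('id__max', flat=True)
conversation_list = Mail.objects.filter(id__in=conversation_id_list)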
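To sanity-check the logic without a Django project, here is a minimal sketch of the same two steps using only the standard-library sqlite3 module. The table name mail and the sample rows are made up for illustration; only the "max id per conversation, then fetch those ids" idea comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mail (id INTEGER PRIMARY KEY, conversation_id INTEGER, subject TEXT)"
)
# Hypothetical sample data: two conversations (10 and 20).
conn.executemany(
    "INSERT INTO mail (id, conversation_id, subject) VALUES (?, ?, ?)",
    [
        (1, 10, "Hello"),
        (2, 10, "Re: Hello"),
        (3, 20, "Lunch?"),
        (4, 10, "Re: Re: Hello"),
        (5, 20, "Re: Lunch?"),
    ],
)

# Step 1: highest id per conversation
# (the role of values('conversation_id').annotate(Max('id'))).
latest_ids = [
    row[0]
    for row in conn.execute("SELECT MAX(id) FROM mail GROUP BY conversation_id")
]

# Step 2: fetch exactly those rows (the role of filter(id__in=...)).
placeholders = ", ".join("?" for _ in latest_ids)
latest_mails = conn.execute(
    "SELECT id, conversation_id, subject FROM mail "
    f"WHERE id IN ({placeholders}) ORDER BY id",
    latest_ids,
).fetchall()

for mail in latest_mails:
    print(mail)
```

Each conversation contributes exactly one row, the one with the highest id.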

So, given a conversation_id you want to retrieve the related record which has the highest id. To do this use order_by to sort the results in descending order (because you want the highest id to come first), and then use array syntax to get the first item, which will be the item with the highest id.
# Get latest message for conversation #42
Mail.objects.filter(conversation_id__exact=42).order_by('-id')[0]
However, this differs from your SQL query. Your query appears to provide the latest message from every conversation. This provides the latest message from one specific conversation. You could always do one query to get the list of conversations for that user, and then follow up with multiple queries to get the latest message from each conversation.

Related

Rails Select one random listings for premium users

In my Rails app, I have Users and Listings. Each Listing belongs to a User: Listing has a user_id column that is filled with the id of the user who created the listing.
A user can be a premium user, gold user or silver user.
What I want is, for each premium user, to select one random listing to show in the premium listings.
I can do it in O(n**2) time, or with N+1 queries, as follows:
users_id = User.where(:role => "premium").pluck(:id)
final_array = Array.new
users_id.each do |id|
  final_array << Listing.where(:user_id => id).sample(1)
end
final_array
Is there a better way of doing this?
You could try this:
listings = Listing.select(
  <<~SQL
    DISTINCT ON (users.id) users.id,
    listings.*,
    row_number() OVER (PARTITION BY users.id ORDER BY random())
  SQL
)
.joins(:user)
.includes(:user)
.where(users: { role: :premium })
It gives a random Listing for every premium user.
It issues a single request to the database and doesn't make an extra request to load each listing's user, so you are free to do something like this:
listings.each do |listing|
  p listing.user
end
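For anyone who wants to try the "one random row per user in a single query" idea outside Rails and PostgreSQL, here is a rough sketch in Python with the standard-library sqlite3 module. It is not the query above: DISTINCT ON is PostgreSQL-only, so this variant keeps the row whose row_number() is 1 instead. The schema and rows are invented for illustration, and it assumes an SQLite build with window functions (3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT);
    CREATE TABLE listings (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    -- Hypothetical data: users 1 and 3 are premium, user 2 is not.
    INSERT INTO users VALUES (1, 'premium'), (2, 'gold'), (3, 'premium');
    INSERT INTO listings VALUES
        (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'),
        (4, 3, 'd'), (5, 3, 'e'), (6, 3, 'f');
    """
)

# Number each premium user's listings in random order, then keep number 1.
rows = conn.execute(
    """
    SELECT id, user_id, title
    FROM (
        SELECT listings.*,
               row_number() OVER (
                   PARTITION BY listings.user_id ORDER BY random()
               ) AS rn
        FROM listings
        JOIN users ON users.id = listings.user_id
        WHERE users.role = 'premium'
    ) AS ranked
    WHERE rn = 1
    ORDER BY user_id
    """
).fetchall()

for listing_id, user_id, title in rows:
    print(user_id, listing_id, title)
```

Every premium user shows up exactly once, and which of their listings is picked changes from run to run.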
random_user_listings = []
User.includes(:listings).where(role: "premium").find_each do |user|
  random_user_listings << user.listings.sample(1)
end
random_user_listings
To avoid the N+1 query, you need to combine the steps and perform the query once, like this:
list = Listing.includes(:user).where(:role => "premium").sample(1)
From here on, work with list instead of Listing, because list is a plain variable holding loaded records, not a query.
ids = list.pluck(:user_id).uniq
Get the array of ids as above and carry out the remaining steps as you did (but with list, not Listing).
Note that whenever you call a method on the model you are issuing a query, so avoid doing that inside a loop.

Selecting related model: Left join, prefetch_related or select_related?

Considering I have the following relationships:
class House(Model):
    name = ...
class User(Model):
    """The standard auth model"""
    pass
class Alert(Model):
    user = ForeignKey(User)
    house = ForeignKey(House)
    somevalue = IntegerField()

    class Meta:
        unique_together = (('user', 'house'),)
In one query, I would like to get the list of houses, and whether the current user has any alert for any of them.
In SQL I would do it like this:
SELECT *
FROM house h
LEFT JOIN alert a
ON h.id = a.house_id
WHERE a.user_id = ?
OR a.user_id IS NULL
And I've found that I could use prefetch_related to achieve something like this:
p = Prefetch('alert_set', queryset=Alert.objects.filter(user=self.request.user), to_attr='user_alert')
houses = House.objects.order_by('name').prefetch_related(p)
The above example works, but each house's user_alert is a list, not an Alert object. I only have one alert per user per house, so what is the best way for me to get this information?
select_related didn't seem to work. I know I can manage this with multiple queries, but I'd really like to do it in one, and in the 'Django way'.
Thanks in advance!
The solution is clearer if you start with the multiple query approach, and then try to optimise it. To get the user_alerts for every house, you could do the following:
houses = House.objects.order_by('name')
for house in houses:
    user_alerts = house.alert_set.filter(user=self.request.user)
The user_alerts queryset will cause an extra query for every house in the queryset. You can avoid this with prefetch_related.
from django.db.models import Prefetch
alerts_queryset = Alert.objects.filter(user=self.request.user)
houses = House.objects.order_by('name').prefetch_related(
    # to_attr stores the prefetched alerts as a plain list on each house
    Prefetch('alert_set', queryset=alerts_queryset, to_attr='user_alerts'),
)
for house in houses:
    user_alerts = house.user_alerts
This will take two queries, one for houses and one for the alerts. I don't think you require select_related here to fetch the user, since you already have access to the user with self.request.user. If you want, you could add select_related to the alerts_queryset:
alerts_queryset = Alert.objects.filter(user=self.request.user).select_related('user')
In your case, user_alerts will be an empty list or a list with one item, because of your unique_together constraint. If you can't handle the list, you could loop through the queryset once, and set house.user_alert:
for house in houses:
    house.user_alert = house.user_alerts[0] if house.user_alerts else None
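For comparison, this is roughly what the "every house, plus the current user's alert or None" result looks like in plain SQL. It is a sketch with the standard-library sqlite3 module; the tables and rows are invented, and user ids 7 and 8 are hypothetical. It also shows one thing worth knowing about the SQL in the question: filtering in the WHERE clause drops houses that only have alerts from other users, whereas putting the user condition in the ON clause keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE house (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE alert (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        house_id INTEGER,
        somevalue INTEGER,
        UNIQUE (user_id, house_id)
    );
    INSERT INTO house VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- User 7 has an alert on Alpha; user 8 has alerts on Alpha and Beta.
    INSERT INTO alert VALUES (1, 7, 1, 100), (2, 8, 1, 200), (3, 8, 2, 300);
    """
)
current_user_id = 7  # stand-in for self.request.user

# User condition in the ON clause: one row per house, alert value or None.
rows = conn.execute(
    """
    SELECT h.name, a.somevalue
    FROM house h
    LEFT JOIN alert a ON a.house_id = h.id AND a.user_id = ?
    ORDER BY h.name
    """,
    (current_user_id,),
).fetchall()

# User condition in the WHERE clause, as in the question.
where_rows = conn.execute(
    """
    SELECT h.name, a.somevalue
    FROM house h
    LEFT JOIN alert a ON h.id = a.house_id
    WHERE a.user_id = ? OR a.user_id IS NULL
    ORDER BY h.name
    """,
    (current_user_id,),
).fetchall()

print(rows)
print(where_rows)
```

The prefetch_related approach above behaves like the first query: every house is returned, and houses without an alert for the current user end up with an empty user_alerts list.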

OrientDB messages unread count

I currently have the following graph in my OrientDB database:
[Image: graph diagram]
It has the following properties:
[Image: property list]
Basically a User can be part of a so called Thread, this is set by the IsMember edge. If they are a member they are able to send a Message to a Thread.
Inside the IsMember edge there is also a last_read property of type DateTime, which records when the user last opened the Thread. So if we get all the Messages with a newer created_at, we get all the unread Messages. A query to accomplish this could look like this (cluster 12 = users, 14 = threads):
SELECT * FROM Message
let $LR = (select lastRead.asLong() from IsMember where in = #12:1320782 AND out = #14:705856)
WHERE in = #14:705856 AND out = #12:1320782 AND created_at.asLong() > $LR[0].lastRead
This is great and all, but I would like to show an unread counter for all the Threads. Running this query for every Thread a User is a member of would in some cases take 200-300 queries.
So basically I am looking for a single query that gets the unread Messages of all the Threads a User is a member of.
Extra useful queries:
A query to get all the subscribed Threads of a User would look something like this:
select expand(out) from (
    select * from IsMember where in = #12:1320782
)
Query to get the lastRead property from a given User and Thread
select lastRead.asLong() from IsMember where in = #12:1320782 AND out = #14:705856
Try this query
select in.nick as user ,out.title as thread ,$a.size() as count from IsMember
let $a=(select created_at from Message where out.nick=$parent.current.in.nick and in.title=$parent.current.out.title and created_at > $parent.current.last_read)
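To make the expected result concrete, here is a small plain-Python sketch of the counting rule itself (it is not OrientDB code). The dictionaries and tuples are hypothetical stand-ins for one user's IsMember edges and for the Message records: a message counts as unread when its created_at is later than the user's last_read for that thread.

```python
from datetime import datetime

# Hypothetical stand-in for one user's IsMember edges:
# thread id -> when this user last read the thread.
memberships = {
    "t1": datetime(2020, 1, 10),
    "t2": datetime(2020, 1, 20),
}

# Hypothetical stand-in for Message records: (thread id, created_at).
messages = [
    ("t1", datetime(2020, 1, 9)),
    ("t1", datetime(2020, 1, 11)),
    ("t1", datetime(2020, 1, 12)),
    ("t2", datetime(2020, 1, 19)),
    ("t3", datetime(2020, 1, 30)),  # the user is not a member of t3
]

# One pass over the messages instead of one lookup per thread.
unread = {thread: 0 for thread in memberships}
for thread, created_at in messages:
    last_read = memberships.get(thread)
    if last_read is not None and created_at > last_read:
        unread[thread] += 1

print(unread)
```

A single query that returns one row per membership with such a count is what replaces the 200-300 per-thread queries.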

Django sql order by

I'm really struggling with this one.
I need to be able to sort my users by the number of positive votes received on their comments.
I have a userprofile table, a comment table and a likeComment table.
The comment table has a foreign key to the user who created it, and the likeComment table has a foreign key to the liked comment.
To get the number of positive votes a user received I do:
LikeComment.objects.filter(Q(type=1), Q(comment__user=user)).count()
Now I want to get all the users, sorted so that the ones with the most positive votes come first. How do I do that? I tried to use extra and JOIN, but this didn't go anywhere.
Thank you
It sounds like you want to perform a filter on an annotation:
class User(models.Model):
    pass
class Comment(models.Model):
    user = models.ForeignKey(User, related_name="comments")
class Like(models.Model):
    comment = models.ForeignKey(Comment, related_name="likes")
    type = models.IntegerField()

# Parentheses let the chained calls span several lines
users = (
    User.objects
    .all()
    .extra(select={
        "positive_likes": """
            SELECT COUNT(*) FROM app_like
            JOIN app_comment ON app_like.comment_id = app_comment.id
            WHERE app_comment.user_id = app_user.id AND app_like.type = 1"""})
    # The leading "-" sorts descending, so the most-liked users come first
    .order_by("-positive_likes")
)
models.py
class UserProfile(models.Model):
    .........
    def like_count(self):
        # Number of positive votes on this user's comments
        return LikeComment.objects.filter(comment__user=self.user, type=1).count()
views.py
def getRanking(anObject):
    return anObject.like_count()
def myview(request):
    users = list(UserProfile.objects.filter())
    # One COUNT query per user, then sort in Python
    users.sort(key=getRanking, reverse=True)
    return render(request, 'page.html', {'users': users})
Timmy's suggestion to use a subquery is probably the simplest way to solve this kind of problem, but subqueries almost never perform as well as joins, so if you have a lot of users you may find that you need better performance.
So, re-using Timmy's models:
class User(models.Model):
    pass
class Comment(models.Model):
    user = models.ForeignKey(User, related_name="comments")
class Like(models.Model):
    comment = models.ForeignKey(Comment, related_name="likes")
    type = models.IntegerField()
the query you want looks like this in SQL:
SELECT app_user.id, COUNT(app_like.id) AS total_likes
FROM app_user
LEFT OUTER JOIN app_comment
    ON app_user.id = app_comment.user_id
LEFT OUTER JOIN app_like
    ON app_comment.id = app_like.comment_id AND app_like.type = 1
GROUP BY app_user.id
ORDER BY total_likes DESC
(If your actual User model has more fields than just id, then you'll need to include them all in the SELECT and GROUP BY clauses.)
Django's object-relational mapping system doesn't provide a way to express this query. (As far as I know—and I'd be very happy to be told otherwise!—it only supports aggregation across one join, not across two joins as here.) But when the ORM isn't quite up to the job, you can always run a raw SQL query, like this:
sql = '''
    SELECT app_user.id, COUNT(app_like.id) AS total_likes
    # etc (as above)
'''
for user in User.objects.raw(sql):
    print(user.id, user.total_likes)
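If you want to see what the join query returns before wiring it into Django, here is a minimal sketch that runs the same SQL against an in-memory SQLite database using the standard-library sqlite3 module. The rows are invented for illustration; the table and column names follow the models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (id INTEGER PRIMARY KEY);
    CREATE TABLE app_comment (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE app_like (
        id INTEGER PRIMARY KEY, comment_id INTEGER, type INTEGER
    );
    -- Hypothetical data: user 3 has no comments at all.
    INSERT INTO app_user VALUES (1), (2), (3);
    INSERT INTO app_comment VALUES (1, 1), (2, 1), (3, 2);
    -- type 1 is a positive vote; other types are excluded by the join.
    INSERT INTO app_like VALUES
        (1, 1, 1), (2, 2, 1), (3, 2, 0), (4, 3, 1), (5, 3, 0);
    """
)

ranking = conn.execute(
    """
    SELECT app_user.id, COUNT(app_like.id) AS total_likes
    FROM app_user
    LEFT OUTER JOIN app_comment
        ON app_user.id = app_comment.user_id
    LEFT OUTER JOIN app_like
        ON app_comment.id = app_like.comment_id AND app_like.type = 1
    GROUP BY app_user.id
    ORDER BY total_likes DESC
    """
).fetchall()

for user_id, total_likes in ranking:
    print(user_id, total_likes)
```

Because both joins are outer joins and COUNT ignores NULLs, users without any positive votes still appear, with a count of 0. On Django 2.0 and later, the filter argument of aggregates, for example Count('comments__likes', filter=Q(comments__likes__type=1)), may let the ORM express the same conditional count; that is untested here.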
I believe this can be achieved with Django's queryset:
from django.db.models import Count
User.objects.filter(comments__likes__type=1)\
    .annotate(lks=Count('comments__likes'))\
    .order_by('-lks')
The only problem here is that this query will miss users with 0 likes. The code from @gareth-rees, @timmy-omahony and @Catherine also includes users with 0 likes.

Django ORM version of SQL COUNT(DISTINCT <column>)

I need to fill in a template with a summary of user activity in a simple messaging system. For each message sender, I want the number of messages sent and the number of distinct recipients.
Here's a simplified version of the model:
class Message(models.Model):
    sender = models.ForeignKey(User, related_name='messages_from')
    recipient = models.ForeignKey(User, related_name='messages_to')
    timestamp = models.DateTimeField(auto_now_add=True)
Here's how I'd do it in SQL:
SELECT sender_id, COUNT(id), COUNT(DISTINCT recipient_id)
FROM myapp_messages
GROUP BY sender_id;
I've been reading through the documentation on aggregation in ORM queries, and although annotate() can handle the first COUNT column, I don't see a way to get the COUNT(DISTINCT) result (even extra(select={}) hasn't been working, although it seems like it should). Can this be translated into a Django ORM query or should I just stick with raw SQL?
You can indeed use distinct and count together, as seen on this answer: https://stackoverflow.com/a/13145407/237091
In your case:
SELECT sender_id, COUNT(id), COUNT(DISTINCT recipient_id)
FROM myapp_messages
GROUP BY sender_id;
would become:
Message.objects.values('sender').annotate(
    message_count=Count('sender'),
    recipient_count=Count('recipient', distinct=True))
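As a quick check of what the two counts should look like, here is a sketch that runs the SQL from the question against an in-memory SQLite database with the standard-library sqlite3 module. The sample rows are made up; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myapp_messages ("
    "id INTEGER PRIMARY KEY, sender_id INTEGER, recipient_id INTEGER)"
)
# Hypothetical data: sender 1 wrote twice to user 2 and once to user 3;
# sender 2 wrote twice to user 1.
conn.executemany(
    "INSERT INTO myapp_messages (sender_id, recipient_id) VALUES (?, ?)",
    [(1, 2), (1, 2), (1, 3), (2, 1), (2, 1)],
)

summary = conn.execute(
    """
    SELECT sender_id, COUNT(id), COUNT(DISTINCT recipient_id)
    FROM myapp_messages
    GROUP BY sender_id
    ORDER BY sender_id
    """
).fetchall()

for sender_id, message_count, recipient_count in summary:
    print(sender_id, message_count, recipient_count)
```

The annotate() call above should produce the same numbers, as message_count and recipient_count in each returned dictionary.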
from django.db.models import Count
messages = Message.objects.values('sender').annotate(message_count=Count('sender'))
for m in messages:
    m['recipient_count'] = len(Message.objects.filter(sender=m['sender']).\
        values_list('recipient', flat=True).distinct())