How to translate raw SQL to a Django queryset

I am struggling to convert a raw SQL query into a Django queryset.
I am new to Django, and any help would be appreciated.
There are simple models:
Winemaker - target model
Wine
Post
Winemaker has 1+ Wines
Wine has 1+ Posts
I know that it should be done with annotations but have no idea how to implement it.
select w2.*,
    (select count(wp.id)
     from web_winemaker www
     inner join web_wine ww on www.id = ww.winemaker_id
     inner join web_post wp on ww.id = wp.wine_id
     where ww.status = 20
       and wp.status = 20
       and www.id = w2.id
    ) as wineposts_count,
    (select count(w.id)
     from web_winemaker www1
     inner join web_wine w on www1.id = w.winemaker_id
     where w.status = 20
       and www1.id = w2.id
    ) as wines_count
from web_winemaker w2;

You should be able to accomplish this with Count aggregation expressions inside annotate(). I took a guess at the related_name values on your relationship fields (wines and posts), so the following code may not plug in directly, but it should give you an idea of how to do what you want.
from django.db.models import Count, Q
wine_makers = Winemaker.objects.annotate(
    posts_count=Count(
        'wines__posts__id',
        filter=Q(wines__status=20, wines__posts__status=20),
    ),
    wines_count=Count(
        'wines__id',
        filter=Q(wines__status=20),
        distinct=True,  # each wine row repeats once per joined post
    ),
)
Both counts are computed over the same joins, so without distinct=True on wines_count a wine would be counted once for each of its posts. Note that the filter argument to Count requires Django 2.0 or later.
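To illustrate why distinct=True can matter when two counts share the same joins, here is a minimal sketch in plain SQL using Python's built-in sqlite3 module. The table names are simplified stand-ins for web_winemaker, web_wine and web_post, the sample rows are made up, and the SQL only approximates the shape of query the ORM might generate; it is not the ORM's actual output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE winemaker (id INTEGER PRIMARY KEY);
    CREATE TABLE wine (id INTEGER PRIMARY KEY, winemaker_id INTEGER, status INTEGER);
    CREATE TABLE post (id INTEGER PRIMARY KEY, wine_id INTEGER, status INTEGER);
    -- One winemaker with two wines; wine 1 has three posts, wine 2 has none.
    INSERT INTO winemaker VALUES (1);
    INSERT INTO wine VALUES (1, 1, 20), (2, 1, 20);
    INSERT INTO post VALUES (1, 1, 20), (2, 1, 20), (3, 1, 20);
""")

# Both aggregates run over the same joined rows, so wine 1 appears three times.
row = conn.execute("""
    SELECT
        COUNT(CASE WHEN w.status = 20 AND p.status = 20 THEN p.id END) AS posts_count,
        COUNT(CASE WHEN w.status = 20 THEN w.id END) AS wines_naive,
        COUNT(DISTINCT CASE WHEN w.status = 20 THEN w.id END) AS wines_distinct
    FROM winemaker m
    LEFT JOIN wine w ON w.winemaker_id = m.id
    LEFT JOIN post p ON p.wine_id = w.id
    GROUP BY m.id
""").fetchone()

posts_count, wines_naive, wines_distinct = row
print(posts_count, wines_naive, wines_distinct)
```

On this data the naive wine count comes out higher than the real number of wines, while the DISTINCT variant matches it.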

Related

How to filter all objects that contain multiple elements in a many-to-many relationship using Django's ORM?

I have two classes in Django linked through a ManyToManyField (the User class is the built-in User model):
from django.contrib.auth.models import User
from django.db import models
class Activity(models.Model):
    participants = models.ManyToManyField(User, related_name='activity_participants')
I want to find all the activities in which two users are simultaneously participating.
I managed to solve my problem using a raw query (my app name is "core", therefore the "core" prefix in the table names):
SELECT ca.id FROM core_activity_participants AS cap, core_activity AS ca
INNER JOIN core_activity_participants AS cap2 ON cap.activity_id
WHERE cap.user_id == 1 AND cap2.user_id == 2
AND cap.activity_id == cap2.activity_id
AND ca.id == cap.activity_id
However, if possible, I'd like to avoid using raw queries, since they break uniformity with the rest of my app. How could I write this query, or one equivalent to it, using Django's ORM?
If you're using Django 1.11 or later, the intersection() queryset method will give you the records you want.
# u1 and u2 are User instances
u1_activities = Activity.objects.filter(participants=u1)
u2_activities = Activity.objects.filter(participants=u2)
common_activities = u1_activities.intersection(u2_activities)
This will produce a query something like this:
SELECT "core_activity"."id"
FROM "core_activity"
INNER JOIN "core_activity_participants"
ON ("core_activity"."id" = "core_activity_participants"."activity_id")
WHERE "core_activity_participants"."user_id" = 1
INTERSECT
SELECT "core_activity"."id"
FROM "core_activity"
INNER JOIN "core_activity_participants"
ON ("core_activity"."id" = "core_activity_participants"."activity_id")
WHERE "core_activity_participants"."user_id" = 2
You can also add extra querysets to the intersection if you want to check for activity overlap between more than 2 users.
Update:
Another approach, which works with older Django versions, would be
u1_activities = u1.activity_participants.values_list('pk', flat=True)
common_activities = u2.activity_participants.filter(pk__in=u1_activities)
This produces a query like:
SELECT "core_activity"."id"
FROM "core_activity"
INNER JOIN "core_activity_participants"
ON ("core_activity"."id" = "core_activity_participants"."activity_id")
WHERE (
"core_activity_participants"."user_id" = 2
AND "core_activity"."id" IN (
SELECT U0."id"
FROM "core_activity" U0
INNER JOIN "core_activity_participants" U1
ON (U0."id" = U1."activity_id")
WHERE U1."user_id" = 1
)
)
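As a rough check that the two query shapes above select the same activities, here is a small sketch using Python's built-in sqlite3 module. The table names (activity, activity_participants) are shortened stand-ins for the core_ tables, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER PRIMARY KEY);
    CREATE TABLE activity_participants (activity_id INTEGER, user_id INTEGER);
    INSERT INTO activity VALUES (10), (11), (12);
    -- User 1 joins activities 10 and 11; user 2 joins 11 and 12.
    INSERT INTO activity_participants VALUES (10, 1), (11, 1), (11, 2), (12, 2);
""")

# Activities that a given user participates in.
by_user = """
    SELECT a.id FROM activity a
    JOIN activity_participants ap ON ap.activity_id = a.id
    WHERE ap.user_id = ?
"""

# Shape 1: INTERSECT of the two per-user queries.
intersect_ids = sorted(
    r[0] for r in conn.execute(by_user + " INTERSECT " + by_user, (1, 2))
)

# Shape 2: user 2's activities restricted to ids found for user 1.
subquery_ids = sorted(
    r[0] for r in conn.execute(
        by_user
        + " AND a.id IN (SELECT activity_id FROM activity_participants"
        + " WHERE user_id = ?)",
        (2, 1),
    )
)

print(intersect_ids, subquery_ids)
```

Both lists contain only the activity shared by the two users.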

How to get Django QuerySet 'exclude' to work right?

I have a database that contains schemas for skus, kits, kit_contents, and checklists. Here is a query for "Give me all the SKUs defined for kitcontent records defined for kit records defined in checklist 1":
SELECT DISTINCT s.* FROM skus s
JOIN kit_contents kc ON kc.sku_id = s.id
JOIN kits k ON k.id = kc.kit_id
JOIN checklists c ON k.checklist_id = 1;
I'm using Django, and I mostly really like the ORM because I can express that query by:
skus = SKU.objects.filter(kitcontent__kit__checklist_id=1).distinct()
which is such a slick way to navigate all those foreign keys. Django's ORM produces basically the same as the SQL written above. The trouble is that it's not clear to me how to get all the SKUs not defined for checklist 1. In the SQL query above, I'd do this by replacing the "=" with "!=". But Django's models don't have a not equals operator. You're supposed to use the exclude() method, which one might guess would look like this:
skus = SKU.objects.filter().exclude(kitcontent__kit__checklist_id=1).distinct()
but Django produces this query, which isn't the same thing:
SELECT distinct s.* FROM skus s
WHERE NOT ((skus.id IN
(SELECT kc.sku_id FROM kit_contents kc
INNER JOIN kits k ON (kc.kit_id = k.id)
WHERE (k.checklist_id = 1 AND kc.sku_id IS NOT NULL))
AND skus.id IS NOT NULL))
(I've cleaned up the query for easier reading and comparison.)
I'm a beginner to the Django ORM, and I'd like to use it when possible. Is there a way to get what I want here?
EDIT:
karthikr gave an answer that doesn't work for the same reason the original ORM .exclude() solution doesn't work: a SKU can be in kit_contents in kits that exist on both checklist_id=1 and checklist_id=2. Using the by-hand query I opened my post with, using "checklist_id = 1" produces 34 results, using "checklist_id = 2" produces 53 results, and the following query produces 26 results:
SELECT DISTINCT s.* FROM skus s
JOIN kit_contents kc ON kc.sku_id = s.id
JOIN kits k ON k.id = kc.kit_id
JOIN checklists c ON k.checklist_id = 1
JOIN kit_contents kc2 ON kc2.sku_id = s.id
JOIN kits k2 ON k2.id = kc2.kit_id
JOIN checklists c2 ON k2.checklist_id = 2;
I think this is one reason why people don't seem to find the .exclude() solution a reasonable replacement for some kind of not_equals filter -- the latter allows you to say, succinctly, exactly what you mean. Presumably the former could also allow the query to be expressed, but I increasingly despair of such a solution being simple.
You could get all the SKUs for checklist 1 and then exclude them from the complete list:
skus = SKU.objects.filter(kitcontent__kit__checklist_id=1)
sku_ids = skus.values_list('pk', flat=True)
non_checklist_1 = SKU.objects.exclude(pk__in=sku_ids).distinct()
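The two readings of "not on checklist 1" discussed in this thread can be compared on toy data. The sketch below uses Python's built-in sqlite3 module with made-up rows; the NOT IN query only mirrors the general shape of the exclude() SQL quoted in the question, not Django's exact output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skus (id INTEGER PRIMARY KEY);
    CREATE TABLE kits (id INTEGER PRIMARY KEY, checklist_id INTEGER);
    CREATE TABLE kit_contents (kit_id INTEGER, sku_id INTEGER);
    INSERT INTO skus VALUES (1), (2), (3);
    INSERT INTO kits VALUES (100, 1), (200, 2);
    -- SKU 1: checklist 1 only; SKU 2: both checklists; SKU 3: checklist 2 only.
    INSERT INTO kit_contents VALUES (100, 1), (100, 2), (200, 2), (200, 3);
""")

# "!=" on the join: SKUs with at least one kit on some other checklist.
some_other = sorted(r[0] for r in conn.execute("""
    SELECT DISTINCT s.id FROM skus s
    JOIN kit_contents kc ON kc.sku_id = s.id
    JOIN kits k ON k.id = kc.kit_id
    WHERE k.checklist_id != 1
"""))

# NOT IN subquery: SKUs with no kit at all on checklist 1.
none_on_1 = sorted(r[0] for r in conn.execute("""
    SELECT s.id FROM skus s
    WHERE s.id NOT IN (
        SELECT kc.sku_id FROM kit_contents kc
        JOIN kits k ON k.id = kc.kit_id
        WHERE k.checklist_id = 1
    )
"""))

print(some_other, none_on_1)
```

SKU 2, which sits on both checklists, is the row on which the two queries disagree.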

Joining tables, counting, and group to return a Model

So I've got a SQL query I'd like to duplicate in rails:
select g.*
from gamebox_favorites f
inner join gameboxes g on f.gamebox_id = g.id
group by f.gamebox_id
order by count(f.gamebox_id) desc;
I've been reading over the rails Active Record Query Interface site, but can't quite seem to put this together. I'd like the query to return a collection of Gamebox records, sorted by the number of 'favorites' a gamebox has. What is the cleanest way to do this in rails?
I believe this will work (it works on a similarly structured database locally), though I'm not sure I have the proper models in the proper spots for what you're trying to do, so you might need to move a couple of things around:
Gamebox.joins(:gamebox_favorites).
  group('"gamebox_favorites"."gamebox_id"').
  order('count("gamebox_favorites"."gamebox_id") desc')
On the console, this should compile to (in the case of PostgreSQL on the back end):
SELECT "gameboxes".* FROM "gameboxes"
INNER JOIN "gamebox_favorites"
ON "gamebox_favorites"."gamebox_id" = "gameboxes"."id"
GROUP BY "gamebox_favorites"."gamebox_id"
ORDER BY count("gamebox_favorites"."gamebox_id") desc
...and I'm guessing that you don't want to just wrap it in a find_by_sql call, such as:
Gamebox.find_by_sql("select g.* from gamebox_favorites f
inner join gameboxes g
on f.gamebox_id = g.id
group by f.gamebox_id
order by count(f.gamebox_id) desc")
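To see what the SQL from the question returns, here is a small sketch that runs it against an in-memory SQLite database using Python's built-in sqlite3 module. The name column and the sample rows are assumptions added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gameboxes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE gamebox_favorites (id INTEGER PRIMARY KEY, gamebox_id INTEGER);
    INSERT INTO gameboxes VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    -- 'beta' has three favorites, 'alpha' has one, 'gamma' has none.
    INSERT INTO gamebox_favorites (gamebox_id) VALUES (2), (1), (2), (2);
""")

# Same join, grouping and ordering as the query in the question.
names = [r[0] for r in conn.execute("""
    SELECT g.name
    FROM gamebox_favorites f
    INNER JOIN gameboxes g ON f.gamebox_id = g.id
    GROUP BY f.gamebox_id
    ORDER BY COUNT(f.gamebox_id) DESC
""")]

print(names)
```

Because the join is an inner join, a gamebox with no favorites does not appear in the result at all.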

Is it possible to do this in NHibernate without using CreateSQLQuery?

Is it possible to do this in NHibernate without using CreateSQLQuery, preferably with LINQ to NHibernate? The biggest question is how to do joins that are not on a primary key.
SELECT DISTINCT calEvent.* From CalendarEvent as calEvent
LEFT JOIN UserChannelInteraction as channelInteraction on channelInteraction.ChannelToFollow_id = calEvent.Channel_id
LEFT JOIN UserCalendarEventInteraction as eventInteraction on eventInteraction.CalendarEvent_id = calEvent.Id
LEFT JOIN UserChannelInteraction as eventInteractionEvent on eventInteractionEvent.UserToFollow_id = eventInteraction.User_id
WHERE (calEvent.Channel_id = #intMainChannelID
OR channelInteraction.User_id = #intUserID
OR eventInteraction.User_id = #intUserID
OR (eventInteractionEvent.User_id = #intUserID AND eventInteraction.Status = 'Accepted'))
AND calEvent.StartDateTime >= #dtStartDate
AND calEvent.StartDateTime <= #dtEndDate
ORDER BY calEvent.StartDateTime asc
Hmmm... maybe you need to try to leverage subqueries?
Check this out: http://devlicio.us/blogs/derik_whittaker/archive/2009/04/06/simple-example-of-using-a-subquery-in-nhibernate-when-using-icriteria.aspx
You can do arbitrary joins by using Theta joins. A theta join is the Cartesian product, so it results in all possible combinations, which then can be filtered.
In NHibernate you can perform a theta style join like this (HQL):
from Book b, Review r where b.Isbn = r.Isbn
You can then add any filtering conditions you want, order the results, and so on:
from Book b, Review r where b.Isbn = r.Isbn and (b.Title = 'My Title' or r.Name = 'John Doe') order by b.Author asc
Here is an article about theta joins in Hibernate (not NHibernate, but it's still relevant).
However, since the theta join is a Cartesian product, you might want to think twice and do some performance testing before you use that approach to do a three-join query.
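The idea of a theta join as "Cartesian product, then filter" can be sketched in a few lines of plain Python. This is a conceptual model only: the sample books and reviews are invented, and a real database is free to evaluate the join differently.

```python
from itertools import product

books = [
    {"isbn": "111", "title": "My Title", "author": "Smith"},
    {"isbn": "222", "title": "Other Title", "author": "Adams"},
]
reviews = [
    {"isbn": "111", "name": "Jane Roe"},
    {"isbn": "222", "name": "John Doe"},
    {"isbn": "333", "name": "John Doe"},
]

# Step 1: Cartesian product, every (book, review) combination.
all_pairs = list(product(books, reviews))

# Step 2: keep only pairs satisfying the join condition b.Isbn = r.Isbn.
joined = [(b, r) for b, r in all_pairs if b["isbn"] == r["isbn"]]

# Step 3: extra filtering and ordering by author.
result = sorted(
    ((b, r) for b, r in joined
     if b["title"] == "My Title" or r["name"] == "John Doe"),
    key=lambda pair: pair[0]["author"],
)

print(len(all_pairs), len(joined), [b["isbn"] for b, _ in result])
```

The size of all_pairs grows with the product of the input sizes, which is why the performance caveat above applies when several tables are combined this way.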

fetching single child row based on a max value using Django ORM

I have a model, "Market" that has a one-to-many relation to another model, "Contract":
class Market(models.Model):
    name = ...
    ...
class Contract(models.Model):
    name = ...
    market = models.ForeignKey(Market, ...)
    current_price = ...
I'd like to fetch Market objects along with the contract with the maximum price of each. This is how I'd do it via raw SQL:
SELECT M.id AS market_id, M.name AS market_name, C.name AS contract_name, C.price AS price
FROM pm_core_market M
INNER JOIN (
    SELECT market_id, id, name, MAX(current_price) AS price
    FROM pm_core_contract
    GROUP BY market_id
) AS C ON M.id = C.market_id
Is there a way to implement this without using SQL? If there is, which one should be preferred in terms of performance?
Django 1.1 (currently beta) adds aggregation support to the database API. Your query can be done like this:
from django.db.models import Max, F
contracts = (
    Contract.objects
    .annotate(max_price=Max('market__contract__current_price'))
    .filter(current_price=F('max_price'))
    .select_related()
)
This generates the following SQL query:
SELECT contract.id, contract.name, contract.market_id, contract.current_price, MAX(T3.current_price) AS max_price, market.id, market.name
FROM contract LEFT OUTER JOIN market ON (contract.market_id = market.id) LEFT OUTER JOIN contract T3 ON (market.id = T3.market_id)
GROUP BY contract.id, contract.name, contract.market_id, contract.current_price, market.id, market.name
HAVING contract.current_price = MAX(T3.current_price)
The API uses an extra join instead of a subquery (like your query does). It is difficult to tell which query is faster, especially without knowing the database system. I suggest that you do some benchmarks and decide.
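For a feel of how the join-plus-HAVING query behaves, here is a small sketch that runs a trimmed version of it with Python's built-in sqlite3 module. The table names follow the generated SQL above, while the sample markets, contracts and prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE market (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract (
        id INTEGER PRIMARY KEY, name TEXT, market_id INTEGER, current_price REAL
    );
    INSERT INTO market VALUES (1, 'M1'), (2, 'M2');
    INSERT INTO contract VALUES
        (1, 'a', 1, 10.0), (2, 'b', 1, 30.0), (3, 'c', 2, 5.0);
""")

# Each contract is joined to all contracts (T3) in its market; HAVING keeps
# only the contracts whose own price equals the maximum over that market.
rows = conn.execute("""
    SELECT market.name, contract.name, contract.current_price
    FROM contract
    LEFT OUTER JOIN market ON contract.market_id = market.id
    LEFT OUTER JOIN contract T3 ON market.id = T3.market_id
    GROUP BY contract.id, contract.name, contract.market_id,
             contract.current_price, market.id, market.name
    HAVING contract.current_price = MAX(T3.current_price)
    ORDER BY market.name
""").fetchall()

print(rows)
```

If two contracts in the same market tie for the highest price, this form of the query returns both of them.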