How to model complex left join Django - sql

I have two Django models whose relationship cannot be modelled with a foreign key:
class PositionUnadjusted(models.Model):
    identifier = models.CharField(max_length=256)
    timestamp = models.DateTimeField()
    quantity = models.IntegerField()
class Adjustment(models.Model):
    identifier = models.CharField(max_length=256)
    start = models.DateTimeField()
    end = models.DateTimeField()
    quantity_delta = models.IntegerField()
I want to create the notion of an adjusted position, where the quantity is modified by the sum of the quantity_delta values of all adjustments for which adj.start <= pos.timestamp < adj.end. In SQL this would be:
SELECT pos_unadjusted.id,
       pos_unadjusted.timestamp,
       pos_unadjusted.identifier,
       CASE
           WHEN SUM(adjustments.quantity_delta) IS NOT NULL
               THEN pos_unadjusted.quantity + SUM(adjustments.quantity_delta)
           ELSE pos_unadjusted.quantity
       END AS quantity
FROM myapp_positionunadjusted AS pos_unadjusted
LEFT JOIN myapp_adjustment AS adjustments
       ON pos_unadjusted.identifier = adjustments.identifier
      AND pos_unadjusted.timestamp >= adjustments.start
      AND pos_unadjusted.timestamp < adjustments."end"
GROUP BY pos_unadjusted.id,
         pos_unadjusted.timestamp,
         pos_unadjusted.identifier
Is there some way to get this result without using raw SQL? I use this query as a base for many other queries, so I don't want to use raw SQL.
I've looked into QuerySet and extra() but can't seem to coerce them into expressing this precise relationship. I'd love for Position and PositionUnadjusted to share the same model and the same API, because right now keeping them in sync takes a lot of copy-pasting.
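The join semantics described above can be checked outside Django with a minimal sketch using the standard-library sqlite3 module. The table names, sample rows, and the use of COALESCE in place of the CASE expression are assumptions made for illustration; timestamps are stored as ISO date strings so that plain string comparison orders them correctly.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring the two models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE position_unadjusted (
        id INTEGER PRIMARY KEY, identifier TEXT, timestamp TEXT, quantity INTEGER);
    CREATE TABLE adjustment (
        id INTEGER PRIMARY KEY, identifier TEXT, "start" TEXT, "end" TEXT,
        quantity_delta INTEGER);
    INSERT INTO position_unadjusted VALUES
        (1, 'ABC', '2020-01-10', 100),
        (2, 'ABC', '2020-03-10', 100),
        (3, 'XYZ', '2020-01-10', 50);
    INSERT INTO adjustment VALUES
        (1, 'ABC', '2020-01-01', '2020-02-01', 5),
        (2, 'ABC', '2020-01-05', '2020-01-20', -2),
        (3, 'ABC', '2020-03-10', '2020-04-01', 7);
""")

# LEFT JOIN on identifier plus the half-open interval start <= timestamp < end.
# COALESCE turns the NULL sum of an unmatched position into 0.
adjusted = conn.execute("""
    SELECT p.id, p.quantity + COALESCE(SUM(a.quantity_delta), 0) AS quantity
    FROM position_unadjusted AS p
    LEFT JOIN adjustment AS a
           ON p.identifier = a.identifier
          AND p.timestamp >= a."start"
          AND p.timestamp < a."end"
    GROUP BY p.id, p.quantity
    ORDER BY p.id
""").fetchall()

print(adjusted)
```

Position 1 falls inside two adjustment windows, position 2 sits exactly on the inclusive start of a third, and position 3 has no adjustments at all, so it keeps its original quantity.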

Related

Select Related With Multiple Conditions

Using the Django ORM, is it possible to perform a select_related (left join) with conditions in addition to the default table1.id = table2.fk?
Using the example models:
class Author(models.Model):
    name = models.TextField()
    age = models.IntegerField()
class Book(models.Model):
    title = models.TextField()
    author = models.ForeignKey(Author, on_delete=models.CASCADE)  # implied by author_id in the SQL below
and the raw SQL:
SELECT "Book".*, "Author"."name"
FROM "Book"
LEFT JOIN
"Author"
ON "Author"."id" = "Book"."author_id"
AND "Author"."age" > 18;  -- this condition is what I'd like to express via the ORM
I understand that in this simple example you can perform the filtering after the join, but that hasn't worked in my specific case, as I am doing sums across multiple left joins that require filters.
# Gets all books whose author is older than 18
books = Book.objects.filter(author__age__gt=18)
This returns a queryset. You can then loop through the queryset to access specific values and print them:
for b in books:
    print(b.title, b.author.name, b.author.age)
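Note that a plain filter() puts the age condition in the WHERE clause, which is not the same as putting it in the ON clause of a left join. The difference can be seen with a small sqlite3 sketch; the table layout and sample rows are assumptions for illustration. On the ORM side, Django 2.0 and later provide FilteredRelation, which is documented as producing an ON clause condition when used with annotate(), so it may be the closer match for the original question.

```python
import sqlite3

# Hypothetical tables standing in for the Author and Book models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann', 40), (2, 'Bob', 16);
    INSERT INTO book VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Orphan', NULL);
""")

# Condition in ON: every book is kept, the author is NULL when it does not match.
on_rows = conn.execute("""
    SELECT book.title, author.name
    FROM book
    LEFT JOIN author ON author.id = book.author_id AND author.age > 18
    ORDER BY book.id
""").fetchall()

# Condition in WHERE: rows without a matching author are filtered out.
where_rows = conn.execute("""
    SELECT book.title, author.name
    FROM book
    LEFT JOIN author ON author.id = book.author_id
    WHERE author.age > 18
    ORDER BY book.id
""").fetchall()

print(on_rows)
print(where_rows)
```

This matters when sums or counts are taken across the join, because rows removed by WHERE never reach the aggregate.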

Rails, joining two tables with where clauses on each table

I'm new to web development and Rails, and I'm trying to construct a query object for the first time. I have a table Players and a table DefensiveStats, which has a foreign key player_id, so each row in this table belongs to a player. Players have a field api_player_number, which is an id used by a third party that I'm referencing. A DefensiveStats object has two fields that are relevant for this query: a season_number integer and a week_number integer. What I'd like to do is build a single query that takes three parameters (an api_player_number, a season_number, and a week_number) and returns the DefensiveStats object with the corresponding season and week numbers that belongs to the player whose api_player_number matches the one passed in.
Here is what I have attempted:
class DefensiveStatsWeekInSeasonQuery
  def initialize(season_number, week_number, api_player_number)
    @season_number = season_number
    @week_number = week_number
    @api_player_number = api_player_number
  end
  # data method always returns an object or list of objects, not a relation
  def data
    defensive_stats = Player.where(api_player_number: @api_player_number)
                            .joins(:defensive_stats)
                            .where(season_number: @season_number, week_number: @week_number)
    if defensive_stats.nil?
      defensive_stats = DefensiveStats.new
    end
    defensive_stats
  end
end
However, this does not work, as it performs the second where clause on the Player class, and not the DefensiveStats class -> specifically, "SQLite3::SQLException: no such column: players.season_number"
How can I construct this query? Thank you!!!
Player.joins(:defensive_stats).where(players: {api_player_number: @api_player_number}, defensive_stats: {season_number: @season_number, week_number: @week_number})
OR
Player.joins(:defensive_stats).where("players.api_player_number = ? and defensive_stats.season_number = ? and defensive_stats.week_number = ?", @api_player_number, @season_number, @week_number)

Translating query with JOIN expressions and a generic relation to Django ORM

class Business(models.Model):
    manager = models.ForeignKey(User, on_delete=models.CASCADE)
    #...
class Event(models.Model):
    business = models.ForeignKey(Business, on_delete=models.CASCADE)
    text = models.TextField()
    when = models.DateTimeField()
    likes = GenericRelation('Like')
class Like(models.Model):
    person = models.ForeignKey(User, on_delete=models.CASCADE)
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')
    date = models.DateTimeField(auto_now=True)
So I have this structure in models.py. Here's an explanation of the important models:
The Event model has a "business" field which links to a certain Business object, which in turn has a "manager" field. The Event model also has a "when" field which describes the date when an event will occur.
On the other side, the Like model has a generic foreign key field which can link to a certain Event object, and also "person" and "date" fields which describe who liked that event and when.
The goal now is to display on the user page all events liked by the targeted user, and all events whose manager is that user. This can be done simply with this SQL command:
SELECT event.*
FROM event
INNER JOIN business
    ON (event.business_id = business.id)
LEFT JOIN "like"
    ON (event.id = object_id AND content_type_id = 17)
WHERE ("like".person_id = 1 OR business.manager_id = 1);
But now the results have to be sorted by the already mentioned "date" in Like and "when" in Event. The sorting behavior should be as follows: if the Event object comes from a Like object, it should be sorted by "date" in that Like object; otherwise it should be sorted by "when" in Event. This is where things change. Here is what the final raw query looks like:
SELECT event.*
FROM event
INNER JOIN business
    ON (event.business_id = business.id AND business.manager_id = 1)
LEFT JOIN "like"
    ON (event.id = object_id AND content_type_id = 17 AND person_id = 1)
ORDER BY COALESCE("like".date, event."when") DESC;
I now have to translate that last query to the Django ORM, but I'm completely lost on that part. Can anyone help me? Thanks in advance!
After another day of struggling, I've finally solved it. Although it does not produce the same query as above and is not as efficient (because it selects all likes on the queried event), it seems to be the only good way of doing it in the ORM:
from django.db.models import Case, F, Q, When
Event.objects.filter(
    Q(business__manager=person) | Q(likes__person=person)
).order_by(
    Case(
        When(likes__person=person, then=F('likes__date')),
        default=F('when'),
    ).desc()
)
Here's what SQL it produces:
SELECT event.*
FROM event
INNER JOIN business
    ON (event.business_id = business.id)
LEFT OUTER JOIN "like"
    ON (event.id = object_id AND content_type_id = 17)
WHERE (business.manager_id = 2 OR "like".person_id = 2)
ORDER BY CASE
    WHEN "like".person_id = 2 THEN "like".date
    ELSE event."when"
END DESC;
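One observable consequence of joining all likes rather than only the person's own likes is that an event can appear once per like row. The sqlite3 sketch below compares the ORM-shaped query with a variant that moves the person filter into the ON clause and sorts with COALESCE. The variant, the schema, and the sample rows are assumptions for illustration, not the author's exact query.

```python
import sqlite3

# Hypothetical schema: person 1 manages business 1; event 1 is liked by two
# other people, event 2 (managed by someone else) is liked by person 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (id INTEGER PRIMARY KEY, manager_id INTEGER);
    CREATE TABLE event (id INTEGER PRIMARY KEY, business_id INTEGER, "when" TEXT);
    CREATE TABLE "like" (id INTEGER PRIMARY KEY, person_id INTEGER,
                         content_type_id INTEGER, object_id INTEGER, date TEXT);
    INSERT INTO business VALUES (1, 1), (2, 9);
    INSERT INTO event VALUES (1, 1, '2020-01-01'), (2, 2, '2020-02-01');
    INSERT INTO "like" VALUES
        (1, 2, 17, 1, '2020-01-05'),
        (2, 3, 17, 1, '2020-01-06'),
        (3, 1, 17, 2, '2020-03-01');
""")

# Shape of the ORM-generated query: all likes are joined, then filtered.
orm_shape = [row[0] for row in conn.execute("""
    SELECT event.id FROM event
    INNER JOIN business ON event.business_id = business.id
    LEFT OUTER JOIN "like"
        ON event.id = "like".object_id AND "like".content_type_id = 17
    WHERE business.manager_id = 1 OR "like".person_id = 1
    ORDER BY CASE WHEN "like".person_id = 1 THEN "like".date
                  ELSE event."when" END DESC
""")]

# Variant with the person filter inside ON: at most one like row per event here.
filtered_join = [row[0] for row in conn.execute("""
    SELECT event.id FROM event
    INNER JOIN business ON event.business_id = business.id
    LEFT OUTER JOIN "like"
        ON event.id = "like".object_id AND "like".content_type_id = 17
       AND "like".person_id = 1
    WHERE business.manager_id = 1 OR "like".person_id = 1
    ORDER BY COALESCE("like".date, event."when") DESC
""")]

print(orm_shape)
print(filtered_join)
```

With this data the first query lists event 1 twice, once for each like by another person, while the second lists each event once.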

Django - Count a subset of related models - Need to annotate count of active Coupons for each Item

I have a Coupon model that has some fields to define if it is active, and a custom manager which returns only live coupons. Coupon has an FK to Item.
In a query on Item, I'm trying to annotate the number of active coupons available. However, the Count aggregate seems to be counting all coupons, not just the active ones.
# models.py
class LiveCouponManager(models.Manager):
    """
    Returns only coupons which are active, and the current
    date is after the active_date (if specified) but before the valid_until
    date (if specified).
    """
    def get_query_set(self):
        today = datetime.date.today()
        passed_active_date = models.Q(active_date__lte=today) | models.Q(active_date=None)
        not_expired = models.Q(valid_until__gte=today) | models.Q(valid_until=None)
        return super(LiveCouponManager, self).get_query_set().filter(is_active=True).filter(passed_active_date, not_expired)
class Item(models.Model):
    # irrelevant fields
class Coupon(models.Model):
    item = models.ForeignKey(Item)
    is_active = models.BooleanField(default=True)
    active_date = models.DateField(blank=True, null=True)
    valid_until = models.DateField(blank=True, null=True)
    # more fields
    live = LiveCouponManager()  # defined first, should be default manager
# views.py
# this is the part that isn't working right
data = Item.objects.filter(q).distinct().annotate(num_coupons=Count('coupon', distinct=True))
The .distinct() and distinct=True bits are there for other reasons - the query is such that it will return duplicates. That all works fine, just mentioning it here for completeness.
The problem is that Count is including inactive coupons that are filtered out by the custom manager.
Is there any way I can specify that Count should use the live manager?
EDIT
The following SQL query does exactly what I need:
SELECT data_item.title, COUNT(data_coupon.id)
FROM data_item
LEFT OUTER JOIN data_coupon ON (data_item.id = data_coupon.item_id)
WHERE (
    (is_active = '1') AND
    (active_date <= current_timestamp OR active_date IS NULL) AND
    (valid_until >= current_timestamp OR valid_until IS NULL)
)
GROUP BY data_item.title
At least on sqlite. Any SQL guru feedback would be greatly appreciated - I feel like I'm programming by accident here. Or, even better, a translation back to Django ORM syntax would be awesome.
In case anyone else has the same problem, here's how I've gotten it to work:
Items = Item.objects.filter(q).distinct().extra(
    select={"num_coupons":
        """
        SELECT COUNT(data_coupon.id) FROM data_coupon
        WHERE (
            (data_coupon.is_active='1') AND
            (data_coupon.active_date <= current_timestamp OR data_coupon.active_date IS NULL) AND
            (data_coupon.valid_until >= current_timestamp OR data_coupon.valid_until IS NULL) AND
            (data_coupon.item_id = data_item.id)
        )
        """
    },).order_by(order_by)
I don't know that I consider this a 'correct' answer - it completely duplicates my custom manager in a possibly non portable way (I'm not sure how portable current_timestamp is), but it does work.
Are you sure your custom manager actually gets called? You set your manager as Model.live, but you query the normal manager at Model.objects.
Have you tried the following?
data = Data.live.filter(q)...
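For comparison, here is a minimal sqlite3 sketch of counting only live coupons per item while still keeping items that have none. It uses a conditional COUNT(CASE ...) instead of a WHERE filter; the schema, the sample rows, and the fixed "today" value are assumptions for illustration. In Django 2.0 and later, aggregates accept a filter argument, for example Count('coupon', filter=Q(...)), which is intended for this kind of conditional aggregation.

```python
import sqlite3

# Hypothetical tables standing in for Item and Coupon (active_date omitted).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE coupon (id INTEGER PRIMARY KEY, item_id INTEGER,
                         is_active INTEGER, valid_until TEXT);
    INSERT INTO item VALUES (1, 'kettle'), (2, 'toaster'), (3, 'lamp');
    INSERT INTO coupon VALUES
        (1, 1, 1, NULL),
        (2, 1, 1, '2999-01-01'),
        (3, 1, 0, NULL),
        (4, 2, 1, '2000-01-01');
""")
params = {"today": "2024-01-01"}  # fixed date so the result is reproducible

# Filtering in WHERE drops items whose coupons are all inactive or missing.
where_counts = conn.execute("""
    SELECT item.title, COUNT(coupon.id)
    FROM item
    LEFT OUTER JOIN coupon ON item.id = coupon.item_id
    WHERE coupon.is_active = 1
      AND (coupon.valid_until >= :today OR coupon.valid_until IS NULL)
    GROUP BY item.id, item.title
    ORDER BY item.id
""", params).fetchall()

# Conditional count: the CASE yields NULL for non-live coupons, and COUNT
# ignores NULLs, so every item is kept with a count that may be zero.
conditional_counts = conn.execute("""
    SELECT item.title,
           COUNT(CASE WHEN coupon.is_active = 1
                       AND (coupon.valid_until >= :today
                            OR coupon.valid_until IS NULL)
                      THEN coupon.id END)
    FROM item
    LEFT OUTER JOIN coupon ON item.id = coupon.item_id
    GROUP BY item.id, item.title
    ORDER BY item.id
""", params).fetchall()

print(where_counts)
print(conditional_counts)
```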

Django ORM: Getting rows based on max value of a column

I have a class Marketorders which contains information about single market orders and they are gathered in snapshots of the market (represented by class Snapshot). Each order can appear in more than one snapshot with the latest row of course being the relevant one.
class Marketorders(models.Model):
    id = models.AutoField(primary_key=True)
    snapid = models.IntegerField()
    orderid = models.IntegerField()
    reportedtime = models.DateTimeField(null=True, blank=True)
    ...
class Snapshot(models.Model):
    id = models.IntegerField(primary_key=True)
    ...
What I'm doing is getting all of the orders across several snapshots for processing, but I want to include only the most recent row for each order. In SQL I would simply do:
SELECT m1.* FROM marketorders m1 WHERE reportedtime = (SELECT max(reportedtime)
FROM marketorders m2 WHERE m2.orderid=m1.orderid);
or better yet with a join:
SELECT m1.* FROM marketorders m1 LEFT JOIN marketorders m2 ON
m1.orderid=m2.orderid AND m1.reportedtime < m2.reportedtime
WHERE m2.orderid IS NULL;
However, I just can't figure out how to do this with Django ORM. Is there any way to accomplish this without raw SQL?
EDIT: Just to clarify the problem. Let's say we have the following marketorders (leaving out everything unimportant and using only orderid, reportedtime):
1, 09:00:00
1, 10:00:00
1, 12:00:00
2, 09:00:00
2, 10:00:00
How do I get the following set with the ORM?
1, 12:00:00
2, 10:00:00
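Both SQL formulations above can be checked against this sample data with a short sqlite3 sketch. The table is a stripped-down, assumed version of Marketorders holding only orderid and reportedtime, with times stored as strings.

```python
import sqlite3

# Sample rows from the question: (orderid, reportedtime).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marketorders (
        id INTEGER PRIMARY KEY, orderid INTEGER, reportedtime TEXT);
    INSERT INTO marketorders (orderid, reportedtime) VALUES
        (1, '09:00:00'), (1, '10:00:00'), (1, '12:00:00'),
        (2, '09:00:00'), (2, '10:00:00');
""")

# Correlated subquery: keep rows whose time equals the maximum for their order.
subquery_rows = conn.execute("""
    SELECT m1.orderid, m1.reportedtime
    FROM marketorders m1
    WHERE m1.reportedtime = (SELECT MAX(m2.reportedtime)
                             FROM marketorders m2
                             WHERE m2.orderid = m1.orderid)
    ORDER BY m1.orderid
""").fetchall()

# Anti-join: keep rows for which no later row of the same order exists.
join_rows = conn.execute("""
    SELECT m1.orderid, m1.reportedtime
    FROM marketorders m1
    LEFT JOIN marketorders m2
           ON m1.orderid = m2.orderid AND m1.reportedtime < m2.reportedtime
    WHERE m2.orderid IS NULL
    ORDER BY m1.orderid
""").fetchall()

print(subquery_rows)
print(join_rows)
```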
If I understood correctly, you need a list of Marketorders objects containing, for each orderid, the row with the highest reportedtime.
Something like this should work (disclaimer: I didn't test it directly):
m_orders = Marketorders.objects.filter(id__in=(
    Marketorders.objects
    .values('orderid')
    .annotate(Max('reportedtime'))
    .values_list('id', flat=True)
))
For documentation check:
http://docs.djangoproject.com/en/dev/topics/db/aggregation/
Edit:
This should get a single Marketorder with highest reportedtime for a specific orderid
order = (
    Marketorders.objects
    .filter(orderid=the_orderid)
    .filter(reportedtime=(
        Marketorders.objects
        .filter(orderid=the_orderid)
        .aggregate(Max('reportedtime'))
        ['reportedtime__max']
    ))
)
Do you have a good reason not to use ForeignKey or (better in your case) ManyToManyField? These fields represent the relational structure of your models.
Furthermore, it is not necessary to declare a primary-key field id; if no primary key is defined, Django adds id automatically.
The code below allows ORM queries like this:
m1 = Marketorder()
m1.save() # important: primary key will be added in db
s1 = Snapshot()
s2 = Snapshot()
s1.save()
s2.save()
m1.snapshots.add(s1)
m1.snapshots.add(s2)
m1.snapshots.all()
m1.snapshots.order_by("id") # snapshots in relations with m1
# ordered by id, that is added automatically
s1.marketorder_set.all() # reverse
so for your query:
snapshot = Snapshot.objects.order_by('-id')[0] # order desc, pick first
marketorders = snapshot.marketorder_set.all() # all marketorders in snapshot
or compact:
marketorders = Snapshot.objects.order_by('-id')[0].marketorder_set.all()
models.py:
class Snapshot(models.Model):
    name = models.CharField(max_length=100)
class Marketorder(models.Model):
    snapshots = models.ManyToManyField(Snapshot)
    reportedtime = models.DateTimeField(auto_now=True)
By convention, model class names are singular. Django pluralizes them automatically in various places.
See the Django documentation on queries (filtering, sorting, complex lookups); it is a must-read.