Django retrieve results by date operation query - sql

I have a model for my catalog, like this:
class Product(models.Model):
    item = models.CharField(max_length=10)
    description = models.CharField(max_length=140)
    unit = models.CharField(max_length=5)
    expiration = models.IntegerField(default=365)  # Days after admission
    expiration_warning = models.IntegerField(default=30)  # Days before expiration
    ...
and I have my Inventory like this:
class Inventory(models.Model):
    product = models.ForeignKey(Product)
    quantity = models.DecimalField(default=0, decimal_places=2, max_digits=10)
    admission = models.DateTimeField(auto_now_add=True)
    ...
Now I want to retrieve all Inventory objects whose expiration date is approaching or has already been reached.
With a raw SQL query, it would be something like:
SELECT product,quantity,admission,expiration,expiration_warning
FROM Inventory as I JOIN Product as P on I.product=P.id
WHERE DATEADD(day,(expiration_warning*-1),DATEADD(day,expiration,admission))<=GETDATE()
I haven't tried this query yet; it is just an example.
I want to achieve this using Django query syntax. Can you help me, please?

Maybe something like this can achieve what you need:
from datetime import datetime, timedelta
expiring_inventory = []
expired_inventory = []
now = datetime.now()  # use django.utils.timezone.now() if USE_TZ is enabled
for inventory in Inventory.objects.select_related('product').all():  # 1 query
    p = inventory.product
    expire_datetime = inventory.admission + timedelta(days=p.expiration)
    expiring_datetime = expire_datetime - timedelta(days=p.expiration_warning)
    if expiring_datetime <= now < expire_datetime:  # not quite expired
        expiring_inventory.append(inventory)
    elif expire_datetime <= now:  # expired
        expired_inventory.append(inventory)
You can read more about QuerySets in the Django docs.
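The date arithmetic in the loop above can be checked on its own, without a database. The following is a minimal sketch in plain Python; the function name expiry_status and the sample dates are made up for illustration and are not part of the original answer.

```python
from datetime import datetime, timedelta


def expiry_status(admission, expiration_days, warning_days, now):
    """Return 'expired', 'expiring' or 'ok' for one inventory row."""
    expire_at = admission + timedelta(days=expiration_days)
    warn_at = expire_at - timedelta(days=warning_days)
    if now >= expire_at:
        return "expired"
    if now >= warn_at:
        return "expiring"
    return "ok"


# Admitted on 2024-01-01 with the model defaults (365 days, 30-day warning):
# the item expires on 2024-12-31 and the warning window opens on 2024-12-01.
admitted = datetime(2024, 1, 1)
print(expiry_status(admitted, 365, 30, datetime(2024, 6, 1)))    # ok
print(expiry_status(admitted, 365, 30, datetime(2024, 12, 15)))  # expiring
print(expiry_status(admitted, 365, 30, datetime(2025, 1, 5)))    # expired
```

Passing the current time in as an argument keeps the function easy to test and makes sure every row is compared against the same instant.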

Related

How to model complex left join Django

I have two Django models that have a relationship that cannot be modelled with a foreign key
class PositionUnadjusted(models.Model):
    identifier = models.CharField(max_length=256)
    timestamp = models.DateTimeField()
    quantity = models.IntegerField()
class Adjustment(models.Model):
    identifier = models.CharField(max_length=256)
    start = models.DateTimeField()
    end = models.DateTimeField()
    quantity_delta = models.IntegerField()
I want to create the notion of an adjusted position, where the quantity is modified by the sum of the quantity_delta values of all adjustments where adj.start <= pos.timestamp < adj.end. In SQL this would be:
SELECT pos_unadjusted.id,
       pos_unadjusted.timestamp,
       pos_unadjusted.identifier,
       CASE
           WHEN SUM(adjustments.quantity_delta) IS NOT NULL
               THEN pos_unadjusted.quantity + SUM(adjustments.quantity_delta)
           ELSE pos_unadjusted.quantity
       END AS quantity
FROM myapp_positionunadjusted AS pos_unadjusted
LEFT JOIN myapp_adjustment AS adjustments
       ON pos_unadjusted.identifier = adjustments.identifier
      AND pos_unadjusted.timestamp >= adjustments.start
      AND pos_unadjusted.timestamp < adjustments."end"
GROUP BY pos_unadjusted.id,
         pos_unadjusted.timestamp,
         pos_unadjusted.identifier
Is there some way to get this result without using raw sql? I use this query as a base for many other queries so I don't want to use raw sql.
I've looked into QuerySet and extra() but can't seem to coerce them into expressing this precise relationship. I'd love for Position and PositionUnadjusted to share the same model and the same API, since right now updating them involves a lot of copy-pasting.
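To pin down the intended semantics of the adjusted position, here is a small self-contained sketch that runs the same kind of range join against an in-memory SQLite database. It does not answer the ORM question; the table names pos and adjustment and the sample rows are invented for illustration, and COALESCE is used as a shorter equivalent of the CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pos (id INTEGER PRIMARY KEY, identifier TEXT, timestamp TEXT, quantity INTEGER);
    CREATE TABLE adjustment (identifier TEXT, start TEXT, "end" TEXT, quantity_delta INTEGER);
    INSERT INTO pos VALUES
        (1, 'ABC', '2024-01-10', 100),
        (2, 'ABC', '2024-02-10', 100),
        (3, 'XYZ', '2024-01-10', 50);
    INSERT INTO adjustment VALUES
        ('ABC', '2024-01-01', '2024-02-01', 5),
        ('ABC', '2024-01-05', '2024-01-15', -20);
""")

# Each position picks up every adjustment with the same identifier whose
# [start, end) interval contains the position's timestamp.
rows = conn.execute("""
    SELECT p.id, p.quantity + COALESCE(SUM(a.quantity_delta), 0) AS quantity
    FROM pos AS p
    LEFT JOIN adjustment AS a
           ON p.identifier = a.identifier
          AND p.timestamp >= a.start
          AND p.timestamp < a."end"
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()
print(rows)  # [(1, 85), (2, 100), (3, 50)]
```

Position 1 falls inside both adjustment windows (100 + 5 - 20), while positions 2 and 3 match no adjustment and keep their original quantity.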

Conditional bulk update in Django using grouping

Suppose I have a list of transactions with the following model definition:
class Transaction(models.Model):
    amount = models.FloatField()
    client = models.ForeignKey(Client)
    date = models.DateField()
    description = models.CharField()
    invoice = models.ForeignKey(Invoice, null=True)
Now I want to create invoices for each client at the end of the month. The invoice model looks like this:
class Invoice(models.Model):
    client = models.ForeignKey(Client)
    invoice_date = models.DateField()
    invoice_number = models.CharField(unique=True)
    def amount_due(self):
        return self.transaction_set.aggregate(Sum('amount'))
def create_invoices(invoice_date):
    for client in Client.objects.all():
        transactions = Transaction.objects.filter(client=client)
        if transactions.exists():
            invoice = Invoice(client=client, invoice_number=get_invoice_number(), invoice_date=invoice_date)
            invoice.save()
            transactions.update(invoice=invoice)
I know I can create all the invoices in one query with bulk_create, but I would still have to set the invoice field on each transaction individually.
Is it possible to set the invoice field of all the Transaction models with a single query after I've created all the invoices? Preferably in using the ORM but happy to use raw SQL if required.
I know I can also use group by client on the transaction list to get the total per client, but then the individual entries are not linked to the invoice.
You could try constructing a conditional update query if you are able to generate a mapping from clients to invoices beforehand:
from django.db.models import Case, Value, When
# generate this after creating the invoices
client_invoice_mapping = {
    # client: invoice
}
cases = [When(client_id=client.pk, then=Value(invoice.pk))
         for client, invoice in client_invoice_mapping.items()]
Transaction.objects.update(invoice_id=Case(*cases))
Note that Conditional Queries are available since Django 1.8. Otherwise you may look into constructing something similar using raw SQL.
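For readers who end up going the raw SQL route, the statement that such a conditional update boils down to looks roughly like the sketch below, run here against an in-memory SQLite database. The table txn, the sample rows, and the client_invoice mapping are all made up for illustration. The WHERE clause is an addition that is not in the answer: a CASE with no ELSE branch evaluates to NULL, so without it the rows of clients missing from the mapping would have their invoice reset to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (id INTEGER PRIMARY KEY, client_id INTEGER, invoice_id INTEGER);
    INSERT INTO txn (client_id) VALUES (1), (1), (2), (3);
""")

# hypothetical mapping: client id -> invoice id
client_invoice = {1: 101, 2: 102}

# One "WHEN client_id = ? THEN ?" branch per client, with bound parameters.
whens = " ".join("WHEN client_id = ? THEN ?" for _ in client_invoice)
params = [value for pair in client_invoice.items() for value in pair]
placeholders = ", ".join("?" for _ in client_invoice)

conn.execute(
    f"UPDATE txn SET invoice_id = CASE {whens} END "
    f"WHERE client_id IN ({placeholders})",
    params + list(client_invoice),
)

rows = conn.execute("SELECT client_id, invoice_id FROM txn ORDER BY id").fetchall()
print(rows)  # [(1, 101), (1, 101), (2, 102), (3, None)]
```

In ORM terms, the same guard would correspond to calling filter(client_id__in=...) on the Transaction queryset before update().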
To complement @Bernhard Vallant's answer: you can do it with only 3 queries.
from django.db.models import Case, Value, When
def create_invoices(invoice_date):
    # Maybe use an Exists clause here instead of a subquery;
    # do some tests for your case if the query is slow
    clients_with_transactions = Client.objects.filter(
        id__in=Transaction.objects.values('client')
    )
    invoices = [
        Invoice(client=client, invoice_number=get_invoice_number(), invoice_date=invoice_date)
        for client in clients_with_transactions
    ]
    # With PostgreSQL, Django can populate the ids here
    invoices = Invoice.objects.bulk_create(invoices)
    # And now use a conditional update
    cases = [
        When(client_id=invoice.client_id, then=Value(invoice.pk))
        for invoice in invoices
    ]
    Transaction.objects.update(invoice_id=Case(*cases))

Django complex filter and order

I have four models like this:
class Site(models.Model):
    name = models.CharField(max_length=200)
    def get_lowest_price(self, mm_date):
        '''This method returns the lowest product price on a site at a particular date'''
class Category(models.Model):
    name = models.CharField(max_length=200)
    site = models.ForeignKey(Site)
class Product(models.Model):
    name = models.CharField(max_length=200)
    category = models.ForeignKey(Category)
class Price(models.Model):
    date = models.DateField()
    price = models.IntegerField()
    product = models.ForeignKey(Product)
Here every site has many categories, and every category has many products. A product's price can change every day, so the Price model holds each product's price and its date.
My problem is that I want a list of sites filtered by price range. The price range depends on the get_lowest_price method, and the list should be sortable from min to max and from max to min. I've already used a lambda expression to do that, but I don't think it's appropriate:
sorted(Site.objects.all(), key=lambda x: x.get_lowest_price(the_date))
I can also get all sites within a price range by running a loop, but that is not a good idea either. Could someone please help me write the query the right way?
If you still need a clearer view of the question, please see the first comment from "Ishtiaque Khan"; his assumption is 100% right.
*In these models, write frequency is low and read frequency is high.
1. Using a query
If you just want to query for a specific date, here is how:
from django.db.models import Min
q = Site.objects.filter(category__product__price__date=mm_date) \
    .annotate(min_price=Min('category__product__price__price')) \
    .filter(min_price__gte=min_price, min_price__lte=max_price)
This returns the sites whose lowest price on mm_date falls within the range min_price to max_price. You can also query for multiple dates, like so:
q = Site.objects.values('name', 'category__product__price__date') \
    .annotate(min_price=Min('category__product__price__price')) \
    .filter(min_price__gte=min_price, min_price__lte=max_price)
2. Eager pre-calculation using the post_save signal. Since the write frequency is low, this will not be expensive.
Create another table to hold the lowest price per site and date, like this:
class LowestPrice(models.Model):
    date = models.DateField()
    site = models.ForeignKey(Site)
    lowest_price = models.IntegerField(default=0)
Use the post_save signal to recalculate and update this table every time a Price is saved. Sample code (not tested):
from django.db.models.signals import post_save
from django.dispatch import receiver
@receiver(post_save, sender=Price)
def update_price(sender, instance, **kwargs):
    site = instance.product.category.site
    lowest = LowestPrice.objects.filter(site=site, date=instance.date).first()
    if lowest is None:
        # first price recorded for this site and date
        LowestPrice.objects.create(site=site, date=instance.date, lowest_price=instance.price)
    elif instance.price < lowest.lowest_price:
        # update price only if needed
        lowest.lowest_price = instance.price
        lowest.save()
Then just query directly from this table when needed:
LowestPrice.objects.filter(date=mm_date, lowest_price__gte=min_price, lowest_price__lte=max_price)
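Both approaches above ultimately ask the database for "the lowest price per site on a given date, kept only if it falls inside a range". As a rough, database-only illustration of that grouping, here is a sketch using an in-memory SQLite table. The flattened price table (with the site name stored directly on each row), the helper sites_in_range, and the sample data are simplifications invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE price (site TEXT, product TEXT, date TEXT, price INTEGER);
    INSERT INTO price VALUES
        ('alpha', 'pen',  '2024-05-01', 40),
        ('alpha', 'book', '2024-05-01', 25),
        ('beta',  'pen',  '2024-05-01', 90),
        ('gamma', 'pen',  '2024-05-01', 10),
        ('gamma', 'pen',  '2024-05-02', 30);
""")


def sites_in_range(date, low, high, descending=False):
    """Sites whose lowest price on `date` lies in [low, high], sorted by that price."""
    order = "DESC" if descending else "ASC"
    return conn.execute(
        "SELECT site, MIN(price) AS min_price FROM price "
        "WHERE date = ? GROUP BY site "
        "HAVING MIN(price) BETWEEN ? AND ? "
        f"ORDER BY min_price {order}",
        (date, low, high),
    ).fetchall()


print(sites_in_range('2024-05-01', 20, 100))                   # [('alpha', 25), ('beta', 90)]
print(sites_in_range('2024-05-01', 20, 100, descending=True))  # [('beta', 90), ('alpha', 25)]
```

The WHERE clause restricts the rows before grouping and the HAVING clause filters on the aggregate, which mirrors the filter, annotate, filter chain in the first approach.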
Solution:
from django.db.models import Min
# filter before annotate so that Min() only considers prices on the_date
Site.objects.filter(
    categories__products__prices__date=the_date,
).annotate(
    price_min=Min('categories__products__prices__price')
).order_by('price_min')  # prefix '-' for descending order
For this to work, you need to modify the models by adding a related_name attribute to the ForeignKey fields.
Like this -
class Category(models.Model):
    # rest of the fields
    site = models.ForeignKey(Site, related_name='categories')
Similarly, for the Product and Price models, add related_name='products' and related_name='prices' to the ForeignKey fields.
Explanation:
Starting with related_name, it describes the reverse relation from one model to another.
After the reverse relationship is setup, you can use them to inner join the tables.
You can use the reverse relationships to get the price of a product of a category on a site and annotate the min price, filtered by the_date. I have used the annotated value to order by min price of the product, in ascending order. You can use '-' as a prefix character to do in descending order.
Do it with django queryset operations
Price.objects.all().order_by('price') #add [0] for only the first object
or
Price.objects.all().order_by('-price') #add [0] for only the first object
or
Price.objects.filter(date= ... ).order_by('price') #add [0] for only the first object
or
Price.objects.filter(date= ... ).order_by('-price') #add [0] for only the first object
or
Price.objects.filter(date= ... , price__gte=lower_limit, price__lte=upper_limit ).order_by('price') #add [0] for only the first object
or
Price.objects.filter(date= ... , price__gte=lower_limit, price__lte=upper_limit ).order_by('-price') #add [0] for only the first object
I think this ORM query could do the job:
from django.db.models import Min
sites = Site.objects.filter(category__product__price__date=mm_date) \
    .annotate(price_min=Min('category__product__price__price')).order_by('price_min')
or, for reversing the order:
sites = Site.objects.filter(category__product__price__date=mm_date) \
    .annotate(price_min=Min('category__product__price__price')).order_by('-price_min')

Query that joins just a single row from a ForeignKey relationship

I have the following models (simplified):
class Category(models.Model):
    # ...
class Product(models.Model):
    # ...
class ProductCategory(models.Model):
    product = models.ForeignKey(Product)
    category = models.ForeignKey(Category)
    # ...
class ProductImage(models.Model):
    product = models.ForeignKey(Product)
    image = models.ImageField(upload_to=product_image_path)
    sort_order = models.PositiveIntegerField(default=100)
    # ...
I want to construct a query that will get all the products associated with a particular category. I want to include just one of the many associated images--the image with the lowest sort_order--in the queryset so that a single query gets all of the data needed to show all products within a category.
In raw SQL I might use a GROUP BY, something like this:
SELECT * FROM catalog_product p
    LEFT JOIN catalog_productcategory c ON (p.id = c.product_id)
    LEFT JOIN catalog_productimage i ON (p.id = i.product_id)
WHERE c.category_id=2
GROUP BY p.id HAVING i.sort_order = MIN(sort_order)
Can this be done without using a raw query?
Edit - I should have noted what I've tried...
# inside Category model...
products = Product.objects.filter(productcategory__category=self) \
    .annotate(Min('productimage__sort_order'))
While this query does GROUP BY, I do not see any way to get the right ProductImage.image into the QuerySet, which is what the HAVING clause does in the SQL above. I'm effectively trying to dynamically add a field to the Product instance (or the QuerySet) from a specific ProductImage instance. This may not be the way to do it with Django.
It isn't quite a raw query, but it isn't quite public api either.
You can add a group by clause to the queryset before it is evaluated:
qs = Product.objects.filter(some__foreign__key__join=something)
qs.query.group_by = ['some_field']  # private API; may change between Django versions
results = list(qs)
Word of caution, though: this behaves differently depending on the db backend.
category = Category.objects.get(pk=...)  # look up your category here
qs = Product.objects.annotate(Min('productimage__sort_order')).filter(productcategory__category=category)
The Product queryset is lazy, so it runs as a single query when it is evaluated; the Category lookup with get() is a separate query.
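Another way to express "one image per product, the one with the lowest sort_order" in SQL is a correlated subquery instead of GROUP BY ... HAVING. The sketch below demonstrates the idea on an in-memory SQLite database; the table contents are invented and the category filter is left out to keep it short. Later Django versions provide Subquery and OuterRef expressions for this kind of query, but the snippet deliberately stays at the SQL level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE productimage (
        id INTEGER PRIMARY KEY, product_id INTEGER, image TEXT, sort_order INTEGER
    );
    INSERT INTO product VALUES (1, 'lamp'), (2, 'chair'), (3, 'desk');
    INSERT INTO productimage (product_id, image, sort_order) VALUES
        (1, 'lamp_side.jpg', 20), (1, 'lamp_front.jpg', 10), (2, 'chair.jpg', 100);
""")

# For each product, the subquery picks the image with the lowest sort_order
# (ties broken by id); products without images get NULL.
rows = conn.execute("""
    SELECT p.name,
           (SELECT i.image FROM productimage AS i
             WHERE i.product_id = p.id
             ORDER BY i.sort_order, i.id
             LIMIT 1) AS first_image
    FROM product AS p
    ORDER BY p.id
""").fetchall()
print(rows)  # [('lamp', 'lamp_front.jpg'), ('chair', 'chair.jpg'), ('desk', None)]
```

Because the subquery returns at most one row per product, the outer query never multiplies product rows, so no GROUP BY is needed.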

Django - Count a subset of related models - Need to annotate count of active Coupons for each Item

I have a Coupon model that has some fields to define if it is active, and a custom manager which returns only live coupons. Coupon has an FK to Item.
In a query on Item, I'm trying to annotate the number of active coupons available. However, the Count aggregate seems to be counting all coupons, not just the active ones.
# models.py
class LiveCouponManager(models.Manager):
    """
    Returns only coupons which are active, and the current
    date is after the active_date (if specified) but before the valid_until
    date (if specified).
    """
    def get_query_set(self):
        today = datetime.date.today()
        passed_active_date = models.Q(active_date__lte=today) | models.Q(active_date=None)
        not_expired = models.Q(valid_until__gte=today) | models.Q(valid_until=None)
        return super(LiveCouponManager, self).get_query_set().filter(is_active=True).filter(passed_active_date, not_expired)
class Item(models.Model):
    # irrelevant fields
class Coupon(models.Model):
    item = models.ForeignKey(Item)
    is_active = models.BooleanField(default=True)
    active_date = models.DateField(blank=True, null=True)
    valid_until = models.DateField(blank=True, null=True)
    # more fields
    live = LiveCouponManager()  # defined first, should be default manager
# views.py
# this is the part that isn't working right
data = Item.objects.filter(q).distinct().annotate(num_coupons=Count('coupon', distinct=True))
The .distinct() and distinct=True bits are there for other reasons - the query is such that it will return duplicates. That all works fine, just mentioning it here for completeness.
The problem is that Count is including inactive coupons that are filtered out by the custom manager.
Is there any way I can specify that Count should use the live manager?
EDIT
The following SQL query does exactly what I need:
SELECT data_item.title, COUNT(data_coupon.id)
FROM data_item
LEFT OUTER JOIN data_coupon ON (data_item.id = data_coupon.item_id)
WHERE (
    (is_active = '1') AND
    (active_date <= current_timestamp OR active_date IS NULL) AND
    (valid_until >= current_timestamp OR valid_until IS NULL)
)
GROUP BY data_item.title
At least on sqlite. Any SQL guru feedback would be greatly appreciated - I feel like I'm programming by accident here. Or, even better, a translation back to Django ORM syntax would be awesome.
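One thing worth checking with a query of this shape: conditions on the coupon table placed in WHERE are applied after the LEFT OUTER JOIN, so items that have no live coupon drop out of the result instead of showing a count of 0. If those items should be kept, the conditions can go into the ON clause. The sketch below shows the difference on an in-memory SQLite database; the sample rows are invented and the two date conditions are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_item (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE data_coupon (id INTEGER PRIMARY KEY, item_id INTEGER, is_active INTEGER);
    INSERT INTO data_item VALUES (1, 'with live coupon'), (2, 'only inactive'), (3, 'no coupons');
    INSERT INTO data_coupon (item_id, is_active) VALUES (1, 1), (1, 0), (2, 0);
""")

# Condition in WHERE: rows without a matching live coupon are filtered out.
in_where = conn.execute("""
    SELECT i.title, COUNT(c.id) FROM data_item AS i
    LEFT OUTER JOIN data_coupon AS c ON i.id = c.item_id
    WHERE c.is_active = 1
    GROUP BY i.id ORDER BY i.id
""").fetchall()

# Condition in ON: every item is kept and non-matching coupons count as 0.
in_on = conn.execute("""
    SELECT i.title, COUNT(c.id) FROM data_item AS i
    LEFT OUTER JOIN data_coupon AS c ON i.id = c.item_id AND c.is_active = 1
    GROUP BY i.id ORDER BY i.id
""").fetchall()

print(in_where)  # [('with live coupon', 1)]
print(in_on)     # [('with live coupon', 1), ('only inactive', 0), ('no coupons', 0)]
```

Which of the two results is wanted depends on whether items without live coupons should appear in the listing at all.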
In case anyone else has the same problem, here's how I've gotten it to work:
Items = Item.objects.filter(q).distinct().extra(
    select={"num_coupons":
        """
        SELECT COUNT(data_coupon.id) FROM data_coupon
        WHERE (
            (data_coupon.is_active='1') AND
            (data_coupon.active_date <= current_timestamp OR data_coupon.active_date IS NULL) AND
            (data_coupon.valid_until >= current_timestamp OR data_coupon.valid_until IS NULL) AND
            (data_coupon.data_id = data_item.id)
        )
        """
    },).order_by(order_by)
I don't know that I consider this a 'correct' answer - it completely duplicates my custom manager in a possibly non portable way (I'm not sure how portable current_timestamp is), but it does work.
Are you sure your custom manager actually gets called? You set your manager as Model.live, but you query the normal manager at Model.objects.
Have you tried the following?
data = Data.live.filter(q)...