I have the following three models where Budget and Sale both contain a foreign key to Customer:
class Customer(models.Model):
name = models.CharField(max_length=45)
# ...
class Budget(models.Model):
customer = models.ForeignKey(Customer, on_delete=models.PROTECT)
# ...
class Sale(models.Model):
customer = models.ForeignKey(Customer, on_delete=models.PROTECT)
# ...
I want to get a queryset of all Customer objects for which both a Budget and Sale exists. I initially tried getting the intersection of the customer field of all Budget and Sale objects:
customers = {
budget.customer for budget in Budget.objects.all()
} & {
sale.customer for sale in Sale.objects.all()
}
This returns the correct objects, but becomes horribly inefficient as the size of my database grows.
How can I retrieve these objects in a more efficient way? Thanks for any help!
You can filter with:
Customer.objects.filter(
budget__isnull=False,
sale__isnull=False
).distinct()
Django can follow ForeignKeys in reverse. It uses the related_query_name=… parameter [Django-doc] for the name of relation. If that is not specified, it falls back on the related_name=… parameter [Django-doc] parameter, and if that is not specified, it will use the name of the model in lowercase, so budget and sale. We here make LEFT OUTER JOINs on the Budget and Sale table, and check if for both there is a non-null row. Likely the Django ORM will optimize this to INNER JOINs.
Related
Imagine we have the following models:
type Company struct {
gorm.Model
Name string
Addresses []Address
}
type Address struct {
gorm.Model
CompanyID uint64
Street string
City string
Country string
}
I want to take all the companies(and their addresses) which have address in a specific location. Something like this:
SELECT company.*, address.* FROM company
INNER JOIN address ON address.company_id = company.id AND address.country = 'Bulgaria'
So if a company does not have address at the specific location, I will not get it as a result at all. I was trying something like that:
db.Joins("Addresses", "addresses.country = ?", "Bulgaria").Find(&companies)
However, it doesn't work, because GORM doesn't take the second argument of Joins(when preloading join used), so I should check the generated query and make something like that:
db.Where(`"Address".country = ?`, "Bulgaria").Joins("Addresses").Find(&companies)
Is there a better way/not hacky way? Have in mind all of the above code is mock of the real problem, I didn't want to expose the original models/queries.
You can use Preload to load Addresses into the Company object.
Based on your described conditions, where you don't want to load companies that don't match your filter, you should use an INNER JOIN
Two options:
First, if your table is named company, then your query should look like this:
db.Table("company").
Preload("Addresses").
Joins("INNER JOIN addresses a ON a.company_id = company.id").
Where("a.country = ?", "Bulgaria").
Find(&companies)
Second, if your table is named companies, then you should try this:
db.Preload("Addresses").
Joins("INNER JOIN addresses a ON a.company_id = companies.id").
Where("a.country = ?", "Bulgaria").
Find(&companies)
If you are using Gorm v2 you perform CRUD operations on has-one and one-to-many via associations:
var companies []Company
var addresses []Address
countries := []string{"Bulgaria"}
db.Model(&Address).Where("country IN (?)", countries).Find(&addresses)
// The key is using the previously fetched variable as the model for the association query
db.Model(&addresses).Association("CompanyID").Find(&companies)
This also works for many-to-many relations.
When comparing Gorm VS raw SQL queries (db.Raw(SQLquery)) you will typically see a performance hit. On the upside, you will not have to deal with the added complexity of raw sql string building in Go, which is pretty nice. I personally use a combination of both raw and gorm-built queries in my projects :)
Suppose I have 3 hypothetical models;
class State(models.Model):
name = models.CharField(max_length=20)
class Company(models.Model):
name = models.CharField(max_length=60)
state = models.ForeignField(State)
class Person(models.Model):
name = models.CharField(max_length=60)
state = models.ForeignField(State)
I want to be able to return results in a Django app, where the results, if using SQL directly, would be based on a query such as this:
SELECT a.name as 'personName',b.name as 'companyName', b.state as 'State'
FROM Person a, Company b
WHERE a.state=b.state
I have tried using the select_related() method as suggested here, but I don't think this is quite what I am after, since I am trying to join two tables that have a common foreign-key, but have no key-relationships amongst themselves.
Any suggestions?
Since a Person can have multiple Companys in the same state. It is not a good idea to do the JOIN at the database level. That would mean that the database will (likely) return the same Company multiple times, making the output quite large.
We can prefetch the related companies, with:
qs = Person.objects.select_related('state').prefetch_related('state__company')
Then we can query the Companys in the same state with:
for person in qs:
print(person.state.company_set.all())
You can use a Prefetch-object [Django-doc] to prefetch the list of related companies in an attribute of the Person, for example:
from django.db.models import Prefetch
qs = Person.objects.prefetch_related(
Prefetch('state__company', Company.objects.all(), to_attr='same_state_companies')
)
Then you can print the companies with:
for person in qs:
print(person.same_state_companies)
I have 4 model like this
class Site(models.Model):
name = models.CharField(max_length=200)
def get_lowest_price(self, mm_date):
'''This method returns lowest product price on a site at a particular date'''
class Category(models.Model):
name = models.CharField(max_length=200)
site = models.ForeignKey(Site)
class Product(models.Model):
name = models.CharField(max_length=200)
category = models.ForeignKey(Category)
class Price(models.Model):
date = models.DateField()
price = models.IntegerField()
product = models.ForeignKey(Product)
Here every have many category, every category have many product. Now product price can change every day so price model will hold the product price and date.
My problem is I want list of site filter by price range. This price range will depends on the get_lowest_price method and can be sort Min to Max and Max to Min. Already I've used lambda expression to do that but I think it's not appropriate
sorted(Site.objects.all(), key=lambda x: x.get_lowest_price(the_date))
Also I can get all site within a price range by running a loop but this is also not a good idea. Please help my someone to do the query in right manner.
If you still need more clear view of the question please see the first comment from "Ishtiaque Khan", his assumption is 100% right.
*In these models writing frequency is low and reading frequency is high.
1. Using query
If you just wanna query using a specific date. Here is how:
q = Site.objects.filter(category__product__price__date=mm_date) \
.annotate(min_price=Min('category__product__price__price')) \
.filter(min_price__gte=min_price, min_price__lte=max_price)
It will return a list of Site with lowest price on mm_date fall within range of min_price - max_price. You can also query for multiple date using query like so:
q = Site.objects.values('name', 'category__product__price__date') \
.annotate(min_price=Min('category__product__price__price')) \
.filter(min_price__gte=min_price, min_price__lte=max_price)
2. Eager/pre-calculation, you can use post_save signal. Since the write frequency is low this will not be expensive
Create another Table to hold lowest prices per date. Like this:
class LowestPrice(models.Model):
date = models.DateField()
site = models.ForeignKey(Site)
lowest_price = models.IntegerField(default=0)
Use post_save signal to calculate and update this every time there. Sample code (not tested)
from django.db.models.signals import post_save
from django.dispatch import receiver
#receiver(post_save, sender=Price)
def update_price(sender, instance, **kwargs):
cur_price = LowestPrice.objects.filter(site=instance.product.category.site, date=instance.date).first()
if not cur_price:
new_price = LowestPrice()
new_price.site = instance.product.category.site
new_price.date = instance.date
else:
new_price = cur_price
# update price only if needed
if instance.price<new_price.lowest_price:
new_price.lowest_price = instance.price
new_price.save()
Then just query directly from this table when needed:
LowestPrice.objects.filter(date=mm_date, lowest_price__gte=min_price, lowest_price__lte=max_price)
Solution:
from django.db.models import Min
Site.objects.annotate(
price_min=Min('categories__products__prices__price')
).filter(
categories__products__prices__date=the_date,
).distinct().order_by('price_min') # prefix '-' for descending order
For this to work, you need to modify the models by adding a related_name attribute to the ForeignKey fields.
Like this -
class Category(models.Model):
# rest of the fields
site = models.ForeignKey(Site, related_name='categories')
Similary, for Product and Price models, add related_name as products and prices in the ForeignKey fields.
Explanation:
Starting with related_name, it describes the reverse relation from one model to another.
After the reverse relationship is setup, you can use them to inner join the tables.
You can use the reverse relationships to get the price of a product of a category on a site and annotate the min price, filtered by the_date. I have used the annotated value to order by min price of the product, in ascending order. You can use '-' as a prefix character to do in descending order.
Do it with django queryset operations
Price.objects.all().order_by('price') #add [0] for only the first object
or
Price.objects.all().order_by('-price') #add [0] for only the first object
or
Price.objects.filter(date= ... ).order_by('price') #add [0] for only the first object
or
Price.objects.filter(date= ... ).order_by('-price') #add [0] for only the first object
or
Price.objects.filter(date= ... , price__gte=lower_limit, price__lte=upper_limit ).order_by('price') #add [0] for only the first object
or
Price.objects.filter(date= ... , price__gte=lower_limit, price__lte=upper_limit ).order_by('-price') #add [0] for only the first object
I think this ORM query could do the job ...
from django.db.models import Min
sites = Site.objects.annotate(price_min= Min('category__product__price'))
.filter(category__product__price=mm_date).unique().order_by('price_min')
or /and for reversing the order :
sites = Site.objects.annotate(price_min= Min('category__product__price'))
.filter(category__product__price=mm_date).unique().order_by('-price_min')
I have the following models (simplified):
class Category(models.model):
# ...
class Product(models.model):
# ...
class ProductCategory(models.Model):
product = models.ForeignKey(Product)
category = models.ForeignKey(Category)
# ...
class ProductImage(models.Model):
product = models.ForeignKey(Product)
image = models.ImageField(upload_to=product_image_path)
sort_order = models.PositiveIntegerField(default=100)
# ...
I want to construct a query that will get all the products associated with a particular category. I want to include just one of the many associated images--the image with the lowest sort_order--in the queryset so that a single query gets all of the data needed to show all products within a category.
In raw SQL I would might use a GROUP BY something like this:
SELECT * FROM catalog_product p
LEFT JOIN catalog_productcategory c ON (p.id = c.product_id)
LEFT JOIN catalog_productimage i ON (p.id = i.product_id)
WHERE c.category_id=2
GROUP BY p.id HAVING i.sort_order = MIN(sort_order)
Can this be done without using a raw query?
Edit - I should have noted what I've tried...
# inside Category model...
products = Product.objects.filter(productcategory__category=self) \
.annotate(Min('productimage__sort_order'))
While this query does GROUP BY, I do not see any way to (a) get the right ProductImage.image into the QuerySet eg. HAVING clause. I'm effectively trying to dynamically add a field to the Product instance (or the QuerySet) from a specific ProductImage instance. This may not be the way to do it with Django.
It isn't quite a raw query, but it isn't quite public api either.
You can add a group by clause to the queryset before it is evaluated:
qs = Product.objects.filter(some__foreign__key__join=something)
qs.group_by = 'some_field'
results = list(qs)
Word of caution, though: this behaves differently depending on the db backend.
catagory = Catagory.objects.get(get_your_catagory)
qs = Product.objects.annotate(Min('productimage__sortorder').filter(productcategory__category = catagory)
This should hit the DB only once, because querysets are lazy.
I'm working on developing a category roll-up report for a Magento (1.6) store.
To that end, I want to get an Order Item collection for a subset of products - those product whose unique category id (that's a Magento product attribute that I created) match a particular value.
I can get the relevant result set by basing the collection on catalog/product.
$collection = Mage::getModel('catalog/product')
->getCollection()
->addAttributeToFilter('unique_category_id', '75')
->joinTable('sales/order_item', 'product_id=entity_id', array('price'=>'price','qty_ordered' => 'qty_ordered'));
Magento doesn't like it, since there are duplicate entries for the same product id.
How do I craft the code to get this result set based on Order Items? Joining in the product collection filtered by an attribute is eluding me. This code isn't doing the trick, since it assumes that attribute is on the Order Item, and not the Product.
$collection = Mage::getModel('sales/order_item')
->getCollection()
->join('catalog/product', 'entity_id=product_id')
->addAttributeToFilter('unique_category_id', '75');
Any help is appreciated.
The only way to make cross entity selects work cleanly and efficiently is by building the SQL with the collections select object.
$attributeCode = 'unique_category_id';
$alias = $attributeCode.'_table';
$attribute = Mage::getSingleton('eav/config')
->getAttribute(Mage_Catalog_Model_Product::ENTITY, $attributeCode);
$collection = Mage::getResourceModel('sales/order_item_collection');
$select = $collection->getSelect()->join(
array($alias => $attribute->getBackendTable()),
"main_table.product_id = $alias.entity_id AND $alias.attribute_id={$attribute->getId()}",
array($attributeCode => 'value')
)
->where("$alias.value=?", 75);
This works quite well for me. I tend to skip going the full way of joining the eav_entity_type table, then eav_attribute, then the value table etc for performance reasons. Since the attribute_id is entity specific, that is all that is needed.
Depending on the scope of your attribute you might need to add in the store id, too.