Django ORM: Get first instance for each foreignkey - sql

I have the following models:
class Author(models.Model):
    name = models.CharField(max_length=100)
class Book(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    number = models.IntegerField()
Some Authors might have no Book. Is there a way to get a list or set containing exactly one Book per Author who wrote at least one Book? I'm looking for a solution in a single SQL transaction.
For example, if I have the following entries:
Authors:
Albert Camus
Friedrich Nietzsche
Sigmund Freud
Books:
Thus Spoke Zarathustra by Nietzsche
The Myth of Sisyphus by Camus
The Rebel by Camus
I want a query which returns [Thus Spoke Zarathustra, The Myth of Sisyphus], or [Thus Spoke Zarathustra, The Rebel].
Bonus points if the query returns, for each author, the book with the lowest number.

You should be able to achieve this using a Subquery, for example:
from django.db.models import OuterRef, Subquery
# Books of the outer author, lowest number first
books = Book.objects.filter(author_id=OuterRef('pk')).order_by('number')
# [:1] limits the subquery to one row, which an annotation requires
authors = Author.objects.annotate(book_title=Subquery(books.values('title')[:1]))
for author in authors:
    print(author.name, author.book_title)

You'll have to use raw queries. Something like this:
query='''
SELECT b1.id, b1.title, b1.author_id, b1.number from Books_book b1, (
SELECT author_id, min(number) as min_number
from Books_book
GROUP BY author_id
) as b2
WHERE b1.author_id=b2.author_id AND b1.number = b2.min_number
'''
books_list = Book.objects.raw(query)[:]
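Now books_list contains one book for each author (the one with the lowest number), as required. Note that if two books by the same author share the lowest number, both are returned.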
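To see what the grouped-minimum join returns without setting up a Django project, here is a minimal sketch using Python's built-in sqlite3 module. The table names (author, book), the helper first_book_per_author and the sample number values are assumptions made for illustration, not the real tables Django would generate.

```python
import sqlite3

def first_book_per_author(conn):
    """Return (title, author_id, number) for the lowest-numbered book of each author."""
    query = """
        SELECT b1.title, b1.author_id, b1.number
        FROM book b1
        JOIN (
            SELECT author_id, MIN(number) AS min_number
            FROM book
            GROUP BY author_id
        ) b2 ON b1.author_id = b2.author_id AND b1.number = b2.min_number
        ORDER BY b1.author_id
    """
    return conn.execute(query).fetchall()

# Hypothetical sample data mirroring the question; Freud (id 3) has no book.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER, number INTEGER)"
)
conn.executemany(
    "INSERT INTO author VALUES (?, ?)",
    [(1, "Albert Camus"), (2, "Friedrich Nietzsche"), (3, "Sigmund Freud")],
)
conn.executemany(
    "INSERT INTO book (title, author_id, number) VALUES (?, ?, ?)",
    [
        ("Thus Spoke Zarathustra", 2, 1),
        ("The Myth of Sisyphus", 1, 1),
        ("The Rebel", 1, 2),
    ],
)

rows = first_book_per_author(conn)
print(rows)
```

Authors without books simply do not appear, because the inner grouping only sees rows of the book table.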

Related

Select Related With Multiple Conditions

Using the Django ORM, is it possible to perform a select_related (left join) with conditions in addition to the default table1.id = table2.fk?
Using the example models:
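class Author(models.Model):
    name = models.TextField()
    age = models.IntegerField()
class Book(models.Model):
    title = models.TextField()
    # foreign key implied by Book.author_id in the SQL below
    author = models.ForeignKey(Author, on_delete=models.CASCADE)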
and the raw sql
SELECT "Book".*, "Author"."name"
FROM "Book"
LEFT JOIN "Author"
ON "Author"."id" = "Book"."author_id"
AND "Author"."age" > 18;  -- this extra condition is what I'd like to express via the ORM
I understand that in this simple example you can perform the filtering after the join, but that hasn't worked in my specific case, as I am doing sums across multiple left joins that require filters.
# gets all books whose author is older than 18
books = Book.objects.filter(author__age__gt=18)
This returns a queryset.
You can then loop through the queryset to access specific values and print them:
for b in books:
    print(b.title, b.author.name, b.author.age)
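Note that filtering on author__age__gt keeps only books whose author passes the condition, which is not quite the same as the LEFT JOIN in the question: that join keeps every book and leaves the author columns empty when the extra condition fails. Here is a minimal sketch of the difference using Python's sqlite3, with made-up table names and rows (all assumptions for illustration). In Django itself, FilteredRelation may be worth a look for putting a condition inside the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)")
# Hypothetical rows: one author above the age limit, one below it.
conn.executemany("INSERT INTO author VALUES (?, ?, ?)", [(1, "Adult", 40), (2, "Minor", 16)])
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [(1, "Book A", 1), (2, "Book B", 2)])

# Condition inside ON: every book is kept; the author is NULL when age <= 18.
on_rows = conn.execute("""
    SELECT book.title, author.name
    FROM book
    LEFT JOIN author ON author.id = book.author_id AND author.age > 18
    ORDER BY book.id
""").fetchall()

# Condition in WHERE: books whose author fails the condition disappear.
where_rows = conn.execute("""
    SELECT book.title, author.name
    FROM book
    LEFT JOIN author ON author.id = book.author_id
    WHERE author.age > 18
    ORDER BY book.id
""").fetchall()

print(on_rows)
print(where_rows)
```

This matters for sums across several left joins, where dropping a row in WHERE also drops its contribution from the other joined tables.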

Return results from more than one database table in Django

Suppose I have 3 hypothetical models;
class State(models.Model):
    name = models.CharField(max_length=20)
class Company(models.Model):
    name = models.CharField(max_length=60)
    state = models.ForeignKey(State, on_delete=models.CASCADE)
class Person(models.Model):
    name = models.CharField(max_length=60)
    state = models.ForeignKey(State, on_delete=models.CASCADE)
I want to be able to return results in a Django app, where the results, if using SQL directly, would be based on a query such as this:
SELECT a.name as 'personName',b.name as 'companyName', b.state as 'State'
FROM Person a, Company b
WHERE a.state=b.state
I have tried using the select_related() method as suggested here, but I don't think this is quite what I am after, since I am trying to join two tables that have a common foreign-key, but have no key-relationships amongst themselves.
Any suggestions?
Since a Person can have multiple Companys in the same state, it is not a good idea to do the JOIN at the database level. That would mean that the database will (likely) return the same Company multiple times, making the output quite large.
We can prefetch the related companies, with:
qs = Person.objects.select_related('state').prefetch_related('state__company_set')
Then we can query the Companys in the same state with:
for person in qs:
    print(person.state.company_set.all())
You can use a Prefetch object to store the list of related companies in a custom attribute. With a lookup that spans relations, to_attr is set on the object at the end of the path (here the State), for example:
from django.db.models import Prefetch
qs = Person.objects.prefetch_related(
    Prefetch('state__company_set', Company.objects.all(), to_attr='same_state_companies')
)
Then you can print the companies with:
for person in qs:
    print(person.state.same_state_companies)
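For comparison, the database-level join from the question can be sketched directly in SQL. The snippet below uses Python's sqlite3 with assumed table and column names (person, company, state_id) and invented rows; it also shows how every company is repeated once per person in the same state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, state_id INTEGER)")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, state_id INTEGER)")
# Hypothetical data: two people and two companies share Ohio, one company is in Utah.
conn.executemany("INSERT INTO state VALUES (?, ?)", [(1, "Ohio"), (2, "Utah")])
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [(1, "Ann", 1), (2, "Bob", 1)])
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?)",
    [(1, "Acme", 1), (2, "Bolt", 1), (3, "Core", 2)],
)

# Join person and company on their common state; there is no direct key between them.
rows = conn.execute("""
    SELECT p.name, c.name, s.name
    FROM person p
    JOIN company c ON c.state_id = p.state_id
    JOIN state s ON s.id = p.state_id
    ORDER BY p.id, c.id
""").fetchall()

for row in rows:
    print(row)
```

With two people and two companies in the same state the join already yields four rows, which is the growth the prefetch approach avoids.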

Django annotate code

I just stumbled upon someone's code.
He has models like this:
class Country(models.Model):
    name = models.CharField(max_length=100)
class TourDate(models.Model):
    artist = models.ForeignKey("Artist")
    date = models.DateField()
    country = models.ForeignKey("Country")
And he is querying like this:
ireland = Country.objects.get(name="Ireland")
artists = Artist.objects.all().extra(select = {
    "tourdate_count" : """
        SELECT COUNT(*)
        FROM sandbox_tourdate
        JOIN sandbox_country on sandbox_tourdate.country_id = sandbox_country.id
        WHERE sandbox_tourdate.artist_id = sandbox_artist.id
        AND sandbox_tourdate.country_id = %d """ % ireland.pk,
}).order_by("-tourdate_count",)
My question is: why does he use names with underscores like sandbox_tourdate when there is no such field in the models?
Is that created automatically, like some sort of pseudo-field?
sandbox_tourdate isn't the name of the field, it's the name of the table. Django's naming convention is to use appname_modelname as the table name, although this can be overridden. In this case, I guess the app is called 'sandbox'.
I don't really know why that person has used a raw query though, that is quite easily expressed in Django's ORM syntax.
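The naming convention can be sketched in a couple of lines. default_table_name is a hypothetical helper written for this illustration, not a Django function; it assumes no custom db_table is set in the model's Meta and ignores any length truncation a database backend may apply.

```python
def default_table_name(app_label: str, model_name: str) -> str:
    """Mimic the default '<app label>_<lowercased model class name>' convention."""
    return f"{app_label}_{model_name.lower()}"

# The three tables referenced in the raw SQL of the question.
for model in ("TourDate", "Country", "Artist"):
    print(default_table_name("sandbox", model))
```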

Filtering model with HABTM relationship

I have 2 models - Restaurant and Feature. They are connected via a has_and_belongs_to_many relationship. The gist of it is that you have restaurants with many features like delivery, pizza, sandwiches, salad bar, vegetarian option,… So now when the user wants to filter the restaurants and, let's say, checks pizza and delivery, I want to display all the restaurants that have both features; pizza, delivery and maybe some more, but each one HAS TO HAVE pizza AND delivery.
If I do a simple .where('features IN (?)', params[:features]) I (of course) get the restaurants that have either - so pizza or delivery or both - which is not at all what I want.
My SQL/Rails knowledge is kinda limited since I'm new to this but I asked a friend and now I have this huuuge SQL that gets the job done:
Restaurant.find_by_sql(['SELECT restaurant_id FROM (
SELECT features_restaurants.*, ROW_NUMBER() OVER(PARTITION BY restaurants.id ORDER BY features.id) AS rn FROM restaurants
JOIN features_restaurants ON restaurants.id = features_restaurants.restaurant_id
JOIN features ON features_restaurants.feature_id = features.id
WHERE features.id in (?)
) t
WHERE rn = ?', params[:features], params[:features].count])
So my question is: is there a better - more Rails even - way of doing this? How would you do it?
Oh BTW I'm using Rails 4 on Heroku so it's a Postgres DB.
This is an example of a set-within-sets query. I advocate solving these with group by and having, because this provides a general framework.
Here is how this works in your case:
select fr.restaurant_id
from features_restaurants fr join
features f
on fr.feature_id = f.feature_id
group by fr.restaurant_id
having sum(case when f.feature_name = 'pizza' then 1 else 0 end) > 0 and
sum(case when f.feature_name = 'delivery' then 1 else 0 end) > 0
Each condition in the having clause is counting for the presence of one of the features -- "pizza" and "delivery". If both features are present, then you get the restaurant_id.
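The same group by / having pattern can be generated for any number of requested features. Below is a minimal sketch using Python's sqlite3; the helper restaurants_with_all, the sample rows and the feature_id / feature_name column names (taken from the query above) are assumptions for illustration.

```python
import sqlite3

def restaurants_with_all(conn, feature_names):
    """Return ids of restaurants that have every feature in feature_names."""
    if not feature_names:
        raise ValueError("at least one feature name is required")
    # One HAVING condition per requested feature, each with a bound parameter.
    conditions = " AND ".join(
        "SUM(CASE WHEN f.feature_name = ? THEN 1 ELSE 0 END) > 0"
        for _ in feature_names
    )
    query = f"""
        SELECT fr.restaurant_id
        FROM features_restaurants fr
        JOIN features f ON fr.feature_id = f.feature_id
        GROUP BY fr.restaurant_id
        HAVING {conditions}
        ORDER BY fr.restaurant_id
    """
    return [row[0] for row in conn.execute(query, list(feature_names))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (feature_id INTEGER PRIMARY KEY, feature_name TEXT)")
conn.execute("CREATE TABLE features_restaurants (restaurant_id INTEGER, feature_id INTEGER)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?)",
    [(1, "pizza"), (2, "delivery"), (3, "salad bar")],
)
# Hypothetical data: restaurants 1 and 4 have both pizza and delivery.
conn.executemany(
    "INSERT INTO features_restaurants VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (3, 3), (4, 1), (4, 2)],
)

both = restaurants_with_all(conn, ["pizza", "delivery"])
print(both)
```

Only the placeholders are interpolated into the SQL text; the feature names themselves are passed as bound parameters.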
How much data is in your features table? Is it just a table of ids and names?
If so, and you're willing to do a little denormalization, you can do this much more easily by encoding the features as a text array on restaurant.
With this scheme your queries boil down to
select * from restaurants where restaurants.features @> ARRAY['pizza', 'delivery']
If you want to maintain your features table because it contains useful data, you can store the array of feature ids on the restaurant and do a query like this:
select * from restaurants where restaurants.feature_ids @> ARRAY[5, 17]
If you don't know the ids up front, and want it all in one query, you should be able to do something along these lines:
select * from restaurants where restaurants.feature_ids @> ARRAY(
select id from features where name in ('pizza', 'delivery')
)
That last query might need some more consideration...
Anyways, I've actually got a pretty detailed article written up about Tagging in Postgres and ActiveRecord if you want some more details.
This is not a "copy and paste" solution, but if you follow these steps you will have a fast working query:
- index the feature_name column (I'm assuming that column feature_id is indexed on both tables)
- place each feature_name param in its own exists(), correlated on the restaurant:
select fr.restaurant_id
from
features_restaurants fr
where
exists(select true from features_restaurants fr2 join features f on fr2.feature_id = f.feature_id
       where fr2.restaurant_id = fr.restaurant_id and f.feature_name = 'pizza')
and
exists(select true from features_restaurants fr2 join features f on fr2.feature_id = f.feature_id
       where fr2.restaurant_id = fr.restaurant_id and f.feature_name = 'delivery')
group by
fr.restaurant_id
Maybe you're looking at it backwards?
Maybe try merging the restaurants returned by each feature.
Simplified:
pizza_restaurants = Feature.find_by_name('pizza').restaurants
delivery_restaurants = Feature.find_by_name('delivery').restaurants
pizza_delivery_restaurants = pizza_restaurants & delivery_restaurants
Obviously, this is a single instance solution. But it illustrates the idea.
UPDATE
Here's a dynamic method to pull in all filters without writing SQL (i.e. the "Railsy" way)
def get_restaurants_by_feature_names(features)
  # accepts an array of feature names
  restaurants = Restaurant.all
  features.each do |f|
    feature_restaurants = Feature.find_by_name(f).restaurants
    restaurants = feature_restaurants & restaurants
  end
  return restaurants
end
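The idea behind this method is plain set intersection, independent of Rails. Here is a minimal Python sketch of the same logic; the restaurants_by_feature dictionary, its contents and the function name are hypothetical stand-ins for the Feature#restaurants association.

```python
def restaurants_with_all_features(restaurants_by_feature, feature_names):
    """Intersect the restaurant sets of every requested feature."""
    # An unknown feature contributes an empty set, so nothing can match it.
    sets = [restaurants_by_feature.get(name, set()) for name in feature_names]
    if not sets:
        return set()
    return set.intersection(*sets)

# Hypothetical data: which restaurants offer which feature.
restaurants_by_feature = {
    "pizza": {"Luigi's", "Slice"},
    "delivery": {"Slice", "Wok"},
}

result = restaurants_with_all_features(restaurants_by_feature, ["pizza", "delivery"])
print(result)
```

Like the Ruby version, this does the intersection in application memory after loading each feature's restaurants, so it trades one larger SQL query for several small ones.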
Since it's an AND condition (the OR conditions get dicey with AREL), I reread your stated problem, ignoring the SQL. I think this is what you want.
# in Restaurant
has_many :features
# in Feature
has_many :restaurants
# this is a contrived example. you may be doing something like
# where(name: 'pizza'). I'm just making this condition up. You
# could also make this more DRY by just passing in the name if
# that's what you're doing.
def self.pizza
  where(pizza: true)
end
def self.delivery
  where(delivery: true)
end
# query
Restaurant.features.pizza.delivery
Basically you call the association with ".features" and then you use the self methods defined on features. Hopefully I didn't misunderstand the original problem.
Cheers!
Restaurant
.joins(:features)
.where(features: {name: ['pizza','delivery']})
.group(:id)
.having('count(features.name) = ?', 2)
This seems to work for me. I tried it with SQLite though.

Relational Algebra check for error

Hi, could someone please verify my work? I'm not sure if I'm doing any of this correctly and would greatly appreciate any help. I am not allowed to use the bow tie (join) operator. Thank you.
Question:
Books (ISBN, Title, Authors, Publisher, Ed, Year, Genre)
Patron (MemberNumber, FirstName, LastName, AddressLn1, AddressLn2, City, State, Zipcode)
Loan (MemberNumber,ISBN,DateLoaned,DateDue, DateReturned)
Business Logic
• You may assume that the library only has one copy of each book.
• Each book may have many authors. If a particular book has multiple authors, they are listed as a comma separated string. You may assume that the same author always uses the same exact name and no two authors will have the same name.
• Year is stored as an integer.
• DateLoaned, DateDue, and DateReturned are stored as a date.
• When a book is initially lent out, DateReturned is set to be NULL, upon its return, the value is updated.
1.1. Find all books that were loaned out after 12/22/2012. Show the ISBN, Title, and DateDue.
1.2. Find all library patrons who have borrowed a book titled "Database Systems". Show their FirstName, LastName, and DateLoaned.
1.3. Find all books that were ever loaned out. Display the ISBN.
1.4. Find all books returned before 12/22/2012. Display the ISBN.
1.5. Find all books returned on or after 12/22/2012. Display the ISBN.
1.6. Find all books returned either (before 12/22/2012) or (on or after 12/22/2012) Display the ISBN.
1.7. In 1 sentence explain the difference between 1.3 and 1.6.
1.8. Find all patrons who have never borrowed a book.
1.9. Find all books with Genre "Mystery" that have NEVER been loaned out.
1.10. Create a new attribute ImportantDates. A date is important if it is in the Loan relation either as a DateLoaned or a DateDue. Display ImportantDates.
1.11. Find all library patrons who have borrowed a book with an author "James Stewart". You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
1.12. Find all library patrons who have never borrowed a book with an author "James Stewart". You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
1.13. Find all library patrons who have only borrowed a book with an author "James Stewart". If they have ever borrowed a book without the author "James Stewart" they should be excluded. You may use the expression LIKE "%James Stewart%" in your Relational Algebra. 
Answer:
1.1) π_{ISBN, TITLE, DATEDUE}(σ_{DATELOANED > 12222012 AND BOOKS.ISBN = LOAN.ISBN}(LOAN X BOOKS))
1.2) π_{FIRSTNAME, LASTNAME, DATELOANED}(σ_{TITLE = "DATABASE SYSTEMS" AND BOOKS.ISBN = LOAN.ISBN AND PATRON.MEMBERNUMBER = LOAN.MEMBERNUMBER}(BOOKS X PATRON X LOAN))
1.3) π_{ISBN}(σ_{DATELOANED <> "NULL" AND BOOK.ISBN = LOAN.ISBN}(LOAN X BOOKS))
1.4) π_{ISBN}(σ_{DATELOANED < 12222012 AND BOOK.ISBN = LOAN.ISBN}(LOAN X BOOKS))
1.5) π_{ISBN}(σ_{DATELOANED >= 12222012 AND BOOK.ISBN = LOAN.ISBN}(LOAN X BOOKS))
1.6) π_{ISBN}(σ_{DATELOANED >= 12222012 OR DATELOANED < 12222012 AND BOOK.ISBN = LOAN.ISBN}(LOAN X BOOKS))
1.7) 1.3 AND 1.6 are the same as they both find books that have been loaned.
1.8) σ_{DATELOANED = "NULL" AND PATRON.MEMBERNUMBER = LOAN.MEMBERNUMBER}(LOAN X PATRON)
1.9) σ_{GENRE = "MYSTERY" AND BOOKS.ISBN = LOAN.ISBN AND DATELOANED = "NULL"}
1.10) LOAN(DATELOANED, DATEDUE) -> IMPORTANTDATE
Could you please give me an example of either 1.11, 1.12, or 1.13? I have no clue how to use the LIKE expression.
The professor told you how to do it:
You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
It has been about a year since I have had to use relational algebra, but it will be whatever your pre-conditions are (this is a task for you) followed by the line:
LIKE "%James Stewart%"
The SQL statement would look something like this:
SELECT p.* FROM Patron p, Loan l, Books b
WHERE p.MemberNumber = l.MemberNumber AND l.ISBN = b.ISBN AND b.Authors LIKE '%James Stewart%'
You will find in your studies that relational algebra does not deal with the functions of SQL; it just looks at the purely mathematical side of things.
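As a sketch of how LIKE can sit inside a selection, here is one possible expression for 1.11. It uses the schema from the question and cross products instead of the join operator; treat it as an illustration rather than the expected model answer.

```latex
\pi_{\text{FirstName},\ \text{LastName}}\Big(
  \sigma_{\text{Authors LIKE "\%James Stewart\%"}
    \ \wedge\ \text{Books.ISBN} = \text{Loan.ISBN}
    \ \wedge\ \text{Patron.MemberNumber} = \text{Loan.MemberNumber}}
  \big(\text{Books} \times \text{Patron} \times \text{Loan}\big)
\Big)
```

The LIKE test is just one more condition in the selection, next to the two equalities that stand in for the joins. For 1.12, one approach would be to project MemberNumber instead and subtract that result from the MemberNumbers of all patrons.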