Django SQL query count - sql

I have the following models in Django:
class Author(models.Model):
    name = models.CharField(max_length=120)
    country = models.CharField(max_length=100)
class Book(models.Model):
    title = models.CharField(max_length=1024)
    publisher = models.CharField(max_length=255)
    published_date = models.DateField()
    author = models.ForeignKey(Author)
There are 9 records in the Author table and 4 in the Book table.
How many SQL queries would be issued when Book.objects.select_related().all() is evaluated?
My guess was 4, because there are 4 rows in the Book table, so 1 query each to search for all the authors related to each book. Why is my answer wrong?
The possible choices are 5, 4, 10 and 1.

select_related(*fields)
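Returns a QuerySet that will “follow” foreign-key relationships, selecting additional related-object data when it executes its query. This is a performance booster which results in a single more complex query but means later use of foreign-key relationships won’t require database queries.
So the answer is 1: with select_related(), Django fetches every book together with its author in one JOIN query, no matter how many rows either table holds.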
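To make the difference concrete, here is a rough sketch that mimics both access patterns with Python's built-in sqlite3 module instead of Django. The table layout, the sample rows and the `run` helper are assumptions made up for illustration; the SQL Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER NOT NULL REFERENCES author(id)
    );
""")
# 9 authors and 4 books, matching the row counts in the question.
conn.executemany(
    "INSERT INTO author (id, name, country) VALUES (?, ?, ?)",
    [(i, f"Author {i}", "N/A") for i in range(1, 10)],
)
conn.executemany(
    "INSERT INTO book (id, title, author_id) VALUES (?, ?, ?)",
    [(i, f"Book {i}", i) for i in range(1, 5)],
)

executed = []  # every SELECT issued through run() is recorded here


def run(sql, params=()):
    """Hypothetical helper: execute one SELECT and record that it happened."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# Lazy access (no select_related): one query for the books,
# then one more query per book to fetch its author.
executed.clear()
for book_id, title, author_id in run("SELECT id, title, author_id FROM book"):
    run("SELECT name FROM author WHERE id = ?", (author_id,))
lazy_queries = len(executed)

# select_related style: a single JOIN returns each book with its author.
executed.clear()
rows = run("""
    SELECT book.id, book.title, author.id, author.name
    FROM book
    INNER JOIN author ON book.author_id = author.id
""")
joined_queries = len(executed)

print(lazy_queries, joined_queries, len(rows))
```

The lazy pattern issues one query plus one per book, while the joined pattern issues exactly one, whatever the table sizes.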

Related

Relations Has Many Through with 3 models - TypeORM

I have three models: User, Category and Feed.
The first one is User and it has a One to Many relationship with the second model which is Category. Category has a userId column and a Many to One relationship with User.
Category has a One To Many relationship with the third and last model: Feed. Similarly, Feed has a column categoryId and a Many To One relationship with Category.
I want to access the Feeds of a certain categoryId (for example of categoryId = 2) but only where the userId on this category is a certain value too (for example of userId = 1).
This relation is the has_many_through for Ruby on Rails programmers...
How can I build this query using TypeORM ?
Alternatively, if you have an idea on how to write it in pure SQL I'll take it too.
I'm also thinking about creating a column userId directly through Feeds to have a One To Many relationship between User and Feed. Do you think it'll be more optimized to do so ?
Many thanks.
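This is how you would achieve that in pure SQL; I'm not much help on the TypeORM front, unfortunately. Because the WHERE clause filters on g.userId, the join effectively behaves as an inner join, so it is written as one here.
SELECT f.*
FROM Feeds f
INNER JOIN Category g
    ON f.categoryId = g.categoryId
WHERE f.categoryId = 2
    AND g.userId = 1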
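For anyone who wants to try the join before wiring it into an ORM, here is a small sketch using Python's built-in sqlite3 module. The schema is an assumption: it names Category's key column `id` (the SQL above writes it as `categoryId`; use whichever your schema has), and the sample rows and the `feeds_for` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, userId INTEGER NOT NULL);
    CREATE TABLE feed (
        id INTEGER PRIMARY KEY,
        categoryId INTEGER NOT NULL REFERENCES category(id),
        title TEXT
    );
    INSERT INTO category (id, userId) VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO feed (id, categoryId, title) VALUES
        (10, 1, 'feed in category 1'),
        (11, 2, 'first feed in category 2'),
        (12, 2, 'second feed in category 2'),
        (13, 3, 'feed owned by user 2');
""")


def feeds_for(category_id, user_id):
    """Return feeds of the category, but only if that category belongs to the user."""
    return conn.execute(
        """
        SELECT f.id, f.title
        FROM feed AS f
        INNER JOIN category AS c ON f.categoryId = c.id
        WHERE f.categoryId = ? AND c.userId = ?
        ORDER BY f.id
        """,
        (category_id, user_id),
    ).fetchall()


print(feeds_for(2, 1))  # category 2 belongs to user 1
print(feeds_for(3, 1))  # category 3 belongs to user 2, so nothing comes back
```

Since the user check goes through the category row, no extra userId column on Feed is needed for this query to work.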

Django - join between different models without foreign key

Imagine I have two simple models (it's not really what I have but this will do):
class Person(models.Model):
    person_id = models.TextField()
    name = models.TextField()
    #...some other fields
class Pet(models.Model):
    person_id = models.TextField()
    pet_name = models.TextField()
    species = models.TextField()
    #...even more fields
Here's the key difference between this example and some other questions I read about: my models don't enforce a foreign key, so I can't use select_related().
I need to create a view that shows a join between the two querysets. For example, let's imagine I want a view with all owners named John who have a dog.
# a first filter
person_query = Person.objects.filter(name__startswith="John")
# a second filter
pet_query = Pet.objects.filter(species="Dog")
# the sum of the two
magic_join_that_i_cant_find_and_possibly_doesnt_exist = join(person_query.person_id, pet_query.person_id)
Now, can I join those two very very simple querysets with any function?
Or should I use raw?
SELECT p.person_id, p.name, a.pet_name, a.species
FROM person p
LEFT JOIN pet a ON
    p.person_id = a.person_id AND
    a.species = 'Dog' AND
    p.name LIKE 'John%'
Is this query ok? Damn, I'm not sure anymore... that's my issue with queries. Everything is all at once. But consecutive queries seem so simple...
If I reference in my model class a "foreign key" (for select_related() use), will it be enforced in the database after the migration? (I need that it DOESN'T happen)
Make a models.ForeignKey but use db_constraint=False.
See https://docs.djangoproject.com/en/3.0/ref/models/fields/#django.db.models.ForeignKey.db_constraint
Also, if this model is managed=False, ie it is a legacy db table and you're not using Django migrations, the constraint won't ever be made in the first place and it's fine.
If you create a FK in the model, Django will create a constraint on migration, so you want to avoid that in your case.
I don't think there is a way to join in the database in Django if you don't declare the joining field as a foreign key. The only thing you can do is perform the join in Python, which might or might not be OK. Note that prefetch_related does essentially this.
The code would be something like:
import itertools
person_query = Person.objects.filter(name__startswith="John")
# Pet.person_id matches Person.person_id, not the automatic primary key
person_ids = [person.person_id for person in person_query]
pet_query = Pet.objects.filter(species="Dog", person_id__in=person_ids).order_by('person_id')
# groupby yields one-shot iterators, so materialize each group as a list
pets_by_person_id = {person_id: list(pet_group) for person_id, pet_group in itertools.groupby(pet_query, lambda pet: pet.person_id)}
# Now every time you need the pets for a certain person
pets_by_person_id.get(person.person_id, [])
# You can also set it on all objects for easy retrieval
for person in person_query:
    person.pets = pets_by_person_id.get(person.person_id, [])
The code might not be 100% accurate, but you get the idea I hope.
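On the raw SQL in the question: because the name and species filters sit in the ON clause of a LEFT JOIN, every person comes back, and those without a matching dog just get NULL pet columns. If the goal is only the Johns who own a dog, an inner join with the filters in WHERE is probably what is wanted. The sketch below compares the two using Python's built-in sqlite3 module; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id TEXT, name TEXT);
    CREATE TABLE pet (person_id TEXT, pet_name TEXT, species TEXT);
    INSERT INTO person VALUES
        ('p1', 'John Smith'), ('p2', 'John Doe'), ('p3', 'Mary Jones');
    INSERT INTO pet VALUES
        ('p1', 'Rex', 'Dog'), ('p2', 'Tom', 'Cat'), ('p3', 'Fido', 'Dog');
""")

# The query from the question: all filters inside the LEFT JOIN's ON clause.
left_join_rows = conn.execute("""
    SELECT p.person_id, p.name, a.pet_name, a.species
    FROM person p
    LEFT JOIN pet a ON
        p.person_id = a.person_id AND
        a.species = 'Dog' AND
        p.name LIKE 'John%'
    ORDER BY p.person_id
""").fetchall()

# Inner join with the filters in WHERE: only Johns who own a dog.
inner_join_rows = conn.execute("""
    SELECT p.person_id, p.name, a.pet_name, a.species
    FROM person p
    INNER JOIN pet a ON p.person_id = a.person_id
    WHERE a.species = 'Dog' AND p.name LIKE 'John%'
    ORDER BY p.person_id
""").fetchall()

for row in left_join_rows:
    print("left :", row)
for row in inner_join_rows:
    print("inner:", row)
```

The LEFT JOIN version keeps Mary Jones and John Doe with empty pet columns, while the inner join returns only John Smith and Rex.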

Django ORM - understanding foreign key queries

I'm in a process of optimizing my queries. Assume I have these models:
class Author(models.Model):
    name = models.CharField(max_length=20)
class Book(models.Model):
    name = models.CharField(max_length=20)
    author = models.ForeignKey(Author)
A simple task here would be to get all the books of a given author, assume I have the author-ID.
In standard SQL I would only need to query the books table.
But in Django code I do:
# given authorID
author = Author.objects.get(pk=authorID)
books = Book.objects.filter(author=author)
Which would take two queries. How can I avoid the first query ?
Try something like:
Book.objects.filter(author_id=authorID)
This will return all the books where author's foreign key is authorID.

How to write query of queries in HQL (hibernate)

I'm trying to SUM some data via a query of queries. It's a fairly complicated SQL query over complicated relationships that I'd like to translate into HQL.
I'll use a simplified version of the data relationships to make discussion easier.
So how could I translate this into HQL? Is query of queries even possible in HQL?
Example:
Suppose we have a Movie Critic that reviews movies online, and we want to return totals for the number of movies he's reviewed, the number of movies he loved, & the number of movies he hates.
Tables:
Critic
Movie
Review (link table between Critic & Movie with LoveFlag, if LoveFlag is false he hates the movie)
SQL Query:
(This is a made-up scenario; the solution I'm working on is for facility management. I wrote this query on Stack Overflow, so there could very well be flaws in it.)
SELECT criticSummary.id
    , COUNT(criticSummary.reviewId) as totalReviews
    , SUM(criticSummary.isLoved) as totalLoved
    , SUM(criticSummary.isHated) as totalHated
FROM (
    SELECT DISTINCT critic.id AS id,
        review.id AS reviewId,
        review.isLoved AS isLoved,
        CASE WHEN review.isLoved = 1 THEN 0 ELSE 1 END AS isHated
    FROM [critic] critic
    INNER JOIN [review] review
        ON (
            review.criticId = critic.id
            AND review.active = 1
        )
    WHERE critic.active = 1
) AS criticSummary
GROUP BY criticSummary.id
Do you really need to use HQL? You have some options to simplify things:
1 Hibernate Formula Fields in Critic Entity
// table and column names are illustrative; adjust them to your schema
@Formula("(select count(*) from reviews r where r.criticId = id and r.loved = 1)")
public int totalMoviesLoved;
@Formula("(select count(*) from reviews r where r.criticId = id and r.loved = 0)")
public int totalMoviesHated;
public int getTotalMoviesReviewed() {
    return totalMoviesHated + totalMoviesLoved;
}
2 Create a Database View
Create a Database view, say critic_summary_data
Create an entity mapped to this view (works just the same as a table)
Map this in Critic Entity
@OneToOne
private CriticSummaryData summaryData;
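One more observation on the SQL in the question: since review.id is already unique per review, the derived table may not be needed at all. The same totals can likely be produced by a single grouped query with conditional sums, a shape that should map onto HQL more directly than a query of queries. The sketch below checks that the two forms agree, using Python's built-in sqlite3 module; the schema follows the column names in the question and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE critic (id INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE review (
        id INTEGER PRIMARY KEY,
        criticId INTEGER REFERENCES critic(id),
        isLoved INTEGER,
        active INTEGER
    );
    INSERT INTO critic VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO review VALUES
        (1, 1, 1, 1), (2, 1, 0, 1), (3, 1, 1, 1), (4, 1, 1, 0),
        (5, 2, 0, 1), (6, 3, 1, 1);
""")

# The query from the question, with its derived table.
nested = conn.execute("""
    SELECT criticSummary.id,
           COUNT(criticSummary.reviewId),
           SUM(criticSummary.isLoved),
           SUM(criticSummary.isHated)
    FROM (
        SELECT DISTINCT critic.id AS id,
               review.id AS reviewId,
               review.isLoved AS isLoved,
               CASE WHEN review.isLoved = 1 THEN 0 ELSE 1 END AS isHated
        FROM critic
        INNER JOIN review ON review.criticId = critic.id AND review.active = 1
        WHERE critic.active = 1
    ) AS criticSummary
    GROUP BY criticSummary.id
    ORDER BY criticSummary.id
""").fetchall()

# Flattened form: one join, one GROUP BY, conditional sums.
flat = conn.execute("""
    SELECT critic.id,
           COUNT(review.id),
           SUM(CASE WHEN review.isLoved = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN review.isLoved = 1 THEN 0 ELSE 1 END)
    FROM critic
    INNER JOIN review ON review.criticId = critic.id AND review.active = 1
    WHERE critic.active = 1
    GROUP BY critic.id
    ORDER BY critic.id
""").fetchall()

print(nested)
print(flat)
```

Both queries skip inactive critics and inactive reviews, so on this data they return identical rows.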

Best way to fetch tree of objects stored in an RDBMS

This question is intended to be software / platform agnostic. I am just looking for generic SQL code.
Consider the following (very simple for example's sake) tables:
Table: Authors
id | name
1 | Tyson
2 | Gordon
3 | Tony
etc
Table: Books
id | author | title
1 | 1 | Tyson's First Book
2 | 2 | Gordon's Book
3 | 1 | Tyson's Second Book
4 | 3 | Tony's Book
etc
Table: Stores
id | name
1 | Books Overflow
2 | Books Exchange
etc
Table: Stores_Books
id | store | book
1 | 1 | 1
2 | 2 | 4
3 | 1 | 3
4 | 2 | 2
As you can see, there is a one-to-many relationship between Books and Authors, and a many-to-many relationship between Books and Stores.
Question one: What is the best query to eager load one author and their books (and where the books are sold) into an object-oriented program where each row is representative of an object instance?
Question two: What is the best query to eager load the entire object tree into an object-oriented program where each row is representative of an object instance?
Both of these situations are easy to imagine with lazy loading. In either situation you would fetch the author with one query and then as soon as you need their books (and what stores the books are sold at) you would use another query to get that information.
Is lazy loading the best way to do this or should I use a join and parse the result when creating the object tree (in an attempt to eager load the data)? In this situation what would be the optimal join / target output from the database in order to make parsing as simple as possible?
As far as I can tell, with eager loading, I would need to manage a dictionary or index of some sort of all the objects while I am parsing the data. Is this actually the case or is there a better way?
That's a tough question to answer. I've done this before by writing a query that returns everything as a flat table and then looping through the results, creating objects or structures as the most-significant columns change. I think that works better than multiple database calls because there's a lot of overhead involved in each call, though depending on how many smaller entities there are to each big entity that might not be best.
The following might apply to both your questions 1 and 2.
SELECT a.id, a.name, b.id, b.title FROM authors a LEFT JOIN books b ON a.id=b.author ORDER BY a.id
(pseudocode, in your program that makes the db call)
while (%row=fetchrow) {
    if ($row{a.id} != currentauthor.id) {
        currentauthor=new author($row{a.id}, $row{a.name});
        push authorlist, currentauthor;
    }
    if ($row{b.id} is not null) {
        currentbook=new book($row{b.id}, $row{b.title});
        push currentauthor.booklist, currentbook;
    }
}
[edit] I just realized I didn't answer the second part of your question. Depending on the size of the data for stores and what I intended doing with it, I would either
Before looping through books/authors as above, slurp the whole stores table into a structure in my program, much like the book/author structure above but indexed by the storeid, and then do a lookup in that structure every time I read a book record and store a reference to the store table
or, if there are many stores,
Join the stores onto the books and have an additional nested loop to add stores objects within the part of the code that adds a book.
Here's a relevant Wikipedia article: http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
I hope that helps!
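The loop-over-a-flat-result idea above can be sketched in Python roughly as follows. The `Author` and `Book` dataclasses and the `build_authors` function are hypothetical names, and the rows are hard-coded to what the LEFT JOIN would return for the sample tables, ordered by author id.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    id: int
    title: str


@dataclass
class Author:
    id: int
    name: str
    books: list = field(default_factory=list)


# Flat rows as the LEFT JOIN would return them, ordered by author id:
# (author id, author name, book id, book title)
rows = [
    (1, "Tyson", 1, "Tyson's First Book"),
    (1, "Tyson", 3, "Tyson's Second Book"),
    (2, "Gordon", 2, "Gordon's Book"),
    (3, "Tony", 4, "Tony's Book"),
]


def build_authors(rows):
    """Start a new Author whenever the author id changes, then attach books."""
    authors = []
    current = None
    for author_id, author_name, book_id, book_title in rows:
        if current is None or current.id != author_id:
            current = Author(author_id, author_name)
            authors.append(current)
        # A LEFT JOIN yields NULL book columns for authors without books.
        if book_id is not None:
            current.books.append(Book(book_id, book_title))
    return authors


authors = build_authors(rows)
for author in authors:
    print(author.name, [book.title for book in author.books])
```

This only works if the rows arrive sorted by author; otherwise the same author would be created more than once.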
In an OO program you don't write SQL yourself; instead you let it be done invisibly by your persistence mechanism. To explain:
If you have an object-oriented program then you want an object model that naturally represents the concepts of Author, Book and Store. You then have an "Object/Relational mapping" problem: somehow you want to get data from the database using SQL and yet work naturally with your objects.
In the Java world we do that with the Java Persistence API (JPA). You don't actually write the SQL; instead you just "annotate" the Java class to say "This class corresponds to that table, this attribute to that column". JPA then takes care of the JOINs, and you can choose either lazy or eager loading as it makes sense.
So you might end up with an Author class (I'm making attributes public here for brevity; in real life we have private attributes with getters and setters).
@Entity
public class Author {
    @Id
    public int id;
    public String name;
    // more in a minute
}
That class is annotated as an entity, so JPA will match up the attributes in the objects with the columns in the corresponding table. The annotations have more capabilities, so you can specify mappings between names of attributes and columns that don't exactly match; mappings such as
PUBLISHED_AUTHOR => Author,
FULL_NAME => name
Now what about JOINS and relationships? The author class has a collection of Books
@Entity
public class Author {
    @Id
    public int id;
    public String name;
    @OneToMany(mappedBy = "author")
    public List<Book> books;
}
and the Book class has an attribute that is its author
@Entity
public class Book {
    @Id
    public int id;
    public String title;
    @ManyToOne
    public Author author;
}
The JPA Entity Manager class fetches an instance of Book using a find method (I'll not go into detail here)
int primaryKey = 1;
Book aBook = em.find(Book.class, primaryKey); // approximately
Now your code can just go
aBook.author.name
You never see the fact that SQL was used to fetch the data for Book, and by the time you ask for the author attribute, the author data has been fetched as well. A SQL JOIN may well have been used; you don't need to know. You can control whether the fetch is eager or lazy with more annotations.
Similarly
int primaryKey = 2;
Author author = em.find(Author.class, primaryKey);
author.books.size(); // how many books did the author write?
we get a list of all the books as well as the author's other data. SQL happened; we didn't see it.
Here is some T-SQL to get you started:
1.
select a.name, b.title from Authors a join Books b on a.id = b.author
2.
select a.name, b.title, s.name
from Authors a
join Books b on a.id = b.author
join Stores_Books sb on sb.book = b.id
join Stores s on s.id = sb.store
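To the question of whether eager loading needs a dictionary or index of objects: with a single joined result like query 2, some kind of id-to-object index does seem to be the simplest way to avoid creating the same author, book or store twice. Below is a minimal sketch with Python's built-in sqlite3 module, loaded with the sample data from the question. The id columns are added to the select list so rows can be matched to objects, and plain dicts stand in for real classes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Books (id INTEGER PRIMARY KEY, author INTEGER, title TEXT);
    CREATE TABLE Stores (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Stores_Books (id INTEGER PRIMARY KEY, store INTEGER, book INTEGER);
    INSERT INTO Authors VALUES (1, 'Tyson'), (2, 'Gordon'), (3, 'Tony');
    INSERT INTO Books VALUES
        (1, 1, 'Tyson''s First Book'), (2, 2, 'Gordon''s Book'),
        (3, 1, 'Tyson''s Second Book'), (4, 3, 'Tony''s Book');
    INSERT INTO Stores VALUES (1, 'Books Overflow'), (2, 'Books Exchange');
    INSERT INTO Stores_Books VALUES (1, 1, 1), (2, 2, 4), (3, 1, 3), (4, 2, 2);
""")

rows = conn.execute("""
    SELECT a.id, a.name, b.id, b.title, s.id, s.name
    FROM Authors a
    JOIN Books b ON a.id = b.author
    JOIN Stores_Books sb ON sb.book = b.id
    JOIN Stores s ON s.id = sb.store
    ORDER BY a.id, b.id, s.id
""").fetchall()

# One index per entity type, keyed by primary key, so each row of the flat
# result reuses objects that were already created for earlier rows.
authors, books, stores = {}, {}, {}
for a_id, a_name, b_id, b_title, s_id, s_name in rows:
    author = authors.setdefault(a_id, {"name": a_name, "books": []})
    store = stores.setdefault(s_id, {"name": s_name})
    if b_id not in books:
        books[b_id] = {"title": b_title, "stores": []}
        author["books"].append(books[b_id])
    books[b_id]["stores"].append(store)

for author in authors.values():
    for book in author["books"]:
        print(author["name"], "|", book["title"], "|",
              [store["name"] for store in book["stores"]])
```

Because stores are looked up in the index, both of Tyson's books end up pointing at the same store object instead of two copies. Note that the inner joins drop authors without books and books not stocked anywhere; LEFT JOINs would be needed to keep them.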