Django 1.6 ORM Joins - sql

I'm trying to get all of the assets in a particular portfolio to display on a page.
I need to know: A) how to get that portfolio's primary key, B) how to write a join in code, and C) whether I'm even going about this the right way (would another CBV or FBV be more appropriate, or is the function get_assets() fine?).
Database setup:
class Portfolios(models.Model):
    #code
class PortfoliosAssets(models.Model):
    portfolio = models.ForeignKey(Portfolios)
    asset = models.ForeignKey(Assets)
class Assets(models.Model):
    #code
SQL I want to write with the ORM:
SELECT A.ticker
FROM assets A
INNER JOIN portfolios_assets PA ON PA.asset = A.id
WHERE PA.portfolio = --portfolio_pk
Code:
class ShowPortfolios(DetailView):
    model = Portfolios
    template_name = 'show_portfolios.html'
    def get_assets(self):
        #obviously not how to get the portfolios pk or columns from the ASSETS table.
        assets = PortfoliosAssets.objects.get(portfolio=portfolio_pk)
        for asset in assets:
            #run some query to get each asset's info but this seems obviously wrong.

The relationship between Portfolio and Asset is many-to-many. You should define that explicitly and remove the PortfoliosAssets model completely (Django will create an equivalent m2m join table for you).
class Portfolio(models.Model):
    assets = models.ManyToManyField("Asset")
(Note the convention is to use singular names for models, not plurals.)
Once that's done, you don't need an extra method at all: you can simply access the assets from the portfolio via portfolio.assets.all(). Or, in the template:
{% for asset in portfolio.assets.all %}
{{ asset.ticker }}
{% endfor %}
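For reference, the join from the question can be reproduced outside Django. The following is a minimal sketch using Python's built-in sqlite3 module. The table and column names (portfolio, asset, portfolio_assets, portfolio_id, asset_id) and the sample rows are assumptions modelled on the kind of join table a ManyToManyField produces, not Django's exact schema. It runs the same kind of INNER JOIN that portfolio.assets.all() boils down to.

```python
import sqlite3

# In-memory database with assumed table names; not Django's generated schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE portfolio (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE asset (id INTEGER PRIMARY KEY, ticker TEXT);
    CREATE TABLE portfolio_assets (
        portfolio_id INTEGER REFERENCES portfolio(id),
        asset_id INTEGER REFERENCES asset(id)
    );
    INSERT INTO portfolio VALUES (1, 'Tech'), (2, 'Energy');
    INSERT INTO asset VALUES (1, 'AAPL'), (2, 'MSFT'), (3, 'XOM');
    INSERT INTO portfolio_assets VALUES (1, 1), (1, 2), (2, 3);
""")


def tickers_for_portfolio(portfolio_pk):
    """Return the tickers of all assets linked to one portfolio."""
    rows = conn.execute(
        """
        SELECT a.ticker
        FROM asset a
        INNER JOIN portfolio_assets pa ON pa.asset_id = a.id
        WHERE pa.portfolio_id = ?
        ORDER BY a.ticker
        """,
        (portfolio_pk,),
    ).fetchall()
    return [ticker for (ticker,) in rows]
```

Inside a DetailView, the primary key captured by the URL pattern is typically available as self.kwargs['pk'], and the loaded instance as self.object. If you would rather keep the original models unchanged, a lookup along the lines of Assets.objects.filter(portfoliosassets__portfolio=portfolio_pk) should produce a comparable join.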

How to access model metadata from custom materialization

I am currently writing a custom dbt materialization and I would like to know what is the best way / pattern to access the "current" model metadata from the materialization itself.
Background
My model consists of two files:
sample_model.yaml (with the model metadata)
version: 2
models:
  - name: sample_model
    description: This is a test view to test a materialization
    config:
      schema: temp
      materialized: custom_view
    columns:
      - name: custom_view_column_a
        description: This is a test column A in a view
      - name: custom_view_column_b
        description: This is a test column B in a view
sample_model.sql (with the "actual" model)
SELECT
    1 AS custom_view_column_a,
    2 AS custom_view_column_b
My solution
In my custom materialization (custom_view) I would like, for example, to access the columns defined in the model metadata (sample_model.yaml). For the moment I could access them using the graph variable, in this way:
{% set models = [] %}
{% for node in graph.nodes.values() | selectattr("resource_type", "equalto", "model") | selectattr("name", "equalto", this.identifier) %}
    {% do models.append(node) %}
{% endfor %}
{% set model_metadata = models|first %}
{% set model_columns = model_metadata.get("columns") %}
Possible improvements
This approach works quite well; however, it "feels" a bit like (ab)using a sort of "global variable". Also, the graph could become very large, considering it stores both the metadata and the SQL of all the models in the project!
Is there any other (local) object / variable I can access from the materialization that stores only the metadata of the model currently being materialized?
{{ model }} gives you the data from the graph for the current model node. I think it should work inside a materialization:
{% set model_metadata = model %}
You may want to gate it with execute -- I'm not really sure if the first parsing pass templates the materialization code:
{% set model_metadata = model if execute else {} %}
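To make the difference between the two lookups concrete, here is a plain-Python sketch of what each one does. The graph_nodes dict is a hypothetical stand-in for graph.nodes (keyed by unique id, with columns keyed by column name), and find_model and column_names are illustrative helpers, not dbt functions.

```python
# Hypothetical stand-in for dbt's `graph.nodes`: a dict keyed by unique id.
graph_nodes = {
    "model.my_project.sample_model": {
        "resource_type": "model",
        "name": "sample_model",
        "columns": {
            "custom_view_column_a": {"description": "This is a test column A in a view"},
            "custom_view_column_b": {"description": "This is a test column B in a view"},
        },
    },
    "seed.my_project.countries": {
        "resource_type": "seed",
        "name": "countries",
        "columns": {},
    },
}


def find_model(nodes, identifier):
    """Scan every node and return the first model with a matching name, or None."""
    return next(
        (
            node
            for node in nodes.values()
            if node["resource_type"] == "model" and node["name"] == identifier
        ),
        None,
    )


def column_names(node, execute=True):
    """Mirror `model if execute else {}` followed by `.get("columns")`."""
    metadata = node if execute else {}
    return sorted(metadata.get("columns", {}))
```

The scan in find_model corresponds to the selectattr loop in the question; having the current node handed to you directly, as the model variable does, removes that scan.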

Where to write predefined queries in django?

I am working with a team of engineers, and this is my first Django project.
Since I have done SQL before, I chose to write the predefined queries that the front-end developers are supposed to use to build this page (result set paging, simple find etc.).
I have just learned about Django's QuerySet API and I am ready to use it, but I do not know in which file/class to write the queries.
Should I write them as methods inside each class in models.py? The Django documentation simply runs them in the shell, and I haven't found where it says to put them.
Generally, the Django pattern is to write your queries in your views, in the views.py file. There you take each of your predefined queries for a given URL and return a response that either renders a template (which presumably your front-end team will build with you) or returns JSON (for example through Django REST Framework for an SPA front end).
The tutorial is strong on this, so it may be a better guide to where things go than the reference docs themselves.
Queries can be run anywhere, but Django is built to receive requests through the URL schema and return a response. This is typically done in views.py, and each view is generally called by a line in the urls.py file.
If you're particularly interested in following the fat models approach and putting the queries there, then you might be interested in Manager objects, which define the querysets you get through, for example, MyModel.objects.all().
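The idea behind putting queries on a manager is to give each predefined query a name and keep them all in one place. As a framework-free sketch of that idea (this is not Django's Manager API; GameQueries, find_by_team, page and the game table are all hypothetical), using sqlite3:

```python
import sqlite3


class GameQueries:
    """Groups the predefined queries for one table behind named methods."""

    def __init__(self, conn):
        self.conn = conn

    def find_by_team(self, team):
        """Simple find: ids of games in which the team plays home or away."""
        rows = self.conn.execute(
            "SELECT id FROM game WHERE home = ? OR away = ? ORDER BY id",
            (team, team),
        )
        return [row[0] for row in rows]

    def page(self, number, per_page=2):
        """Result set paging: page numbers start at 1."""
        offset = (number - 1) * per_page
        rows = self.conn.execute(
            "SELECT id FROM game ORDER BY id LIMIT ? OFFSET ?",
            (per_page, offset),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE game (id INTEGER PRIMARY KEY, home TEXT, away TEXT)")
conn.executemany(
    "INSERT INTO game (home, away) VALUES (?, ?)",
    [
        ("Ajax", "PSV"),
        ("PSV", "Feyenoord"),
        ("Ajax", "Feyenoord"),
        ("Utrecht", "Ajax"),
        ("PSV", "Utrecht"),
    ],
)
games = GameQueries(conn)
```

In Django itself, slicing a queryset (for example MyModel.objects.all()[0:2]) is translated into LIMIT/OFFSET for you, so a paging method on a custom manager can usually be a thin wrapper around a slice.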
My example view (a class-based view that provides information about a list of matches):
class MatchList(generics.ListCreateAPIView):
    """
    List all matches, or create a new Match.
    """
    queryset = Match.objects.all()
    serializer_class = MatchSerialiser
That queryset could be anything, though.
A function based view with a different queryset would be:
def event(request, event_slug):
    from .models import Event, Comment, Profile
    event = Event.objects.get(event_url=event_slug)
    future_events = Event.objects.filter(date__gt=event.date)
    comments = Comment.objects.select_related('user').filter(event=event)
    final_comments = []
    return render(request, 'core/event.html', {"event": event, "future_events": future_events})
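edit: That second example is quite old, and the query would be better refactored to:
future_events = Event.objects.filter(date__gt=event.date).prefetch_related('comment_set')
(select_related only follows forward foreign keys and one-to-one relations; for the reverse relation from Event to Comment, prefetch_related is the tool. comment_set assumes the default related name.)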
Edit edit: It's worth pointing out that QuerySet isn't a query language in the way you're describing it. It's Django's API for the object-relational mapper (ORM) that sits on top of the database, much as SQLAlchemy does; in fact, you could use SQLAlchemy instead of the Django ORM if you really wanted. Mostly you'll hear people talk about the Django ORM. :)
If you have some model SomeModel and you wanted to access its objects via a raw SQL query you would do: SomeModel.objects.raw(raw_query).
For example: SomeModel.objects.raw('SELECT * FROM myapp_somemodel')
https://docs.djangoproject.com/en/1.11/topics/db/sql/#performing-raw-queries
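If the raw query takes user input, pass the values as parameters rather than formatting them into the SQL string. Django's raw() accepts a params argument for this (with %s placeholders). The sketch below shows the same principle with the standard sqlite3 module, which uses ? placeholders; the myapp_somemodel table, its name column and the find_by_name helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myapp_somemodel (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO myapp_somemodel (name) VALUES (?)",
    [("alpha",), ("beta",)],
)


def find_by_name(name):
    """Look rows up by name using a bound parameter."""
    # The driver binds `name` as data, so quotes in it cannot alter the SQL.
    return conn.execute(
        "SELECT id, name FROM myapp_somemodel WHERE name = ?",
        (name,),
    ).fetchall()
```

A value such as "x' OR '1'='1" is then compared literally against the name column and matches nothing.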
Django file structure:
app/
    models.py
    views.py
    urls.py
    templates/
        app/
            my_template.html
In models.py:
class MyModel(models.Model):
    #field definition and relations
In views.py:
from django.shortcuts import render
from .models import MyModel
def my_view(request):
    my_model = MyModel.objects.all() #here you use the querysets
    return render(request, 'app/my_template.html', {'my_model': my_model}) #pass the object to the template
In urls.py:
from django.conf.urls import url
from .views import my_view
urlpatterns = [
    url(r'^myurl/$', my_view, name='my_view'), # here you write the url that points to your view
]
And finally in my_template.html:
{# display the data using django template #}
{% for obj in my_model %}
<p>{{ obj }}</p>
{% endfor %}

Rails: Class method scoping on the properties of an associated model

This is a somewhat more complicated version of the question I asked previously.
Background:
What I need is to display a list of articles. An article belongs to a media outlet. A media outlet is located in a particular country and publishes articles in a particular language. So the data structure is as follows:
Article belongs to Media; Media has many Articles
Media belongs to a Country; Country has many Media
Media belongs to a Language; Language has many Media
Now, if I wanted to filter articles by media, I could use the following class method (I prefer class methods over scopes, because I am passing a parameter and am using a conditional statement inside the method):
def self.filter_by_media(parameter)
  if parameter == "all"
    all
  else
    where(media_id: parameter)
  end
end
Question:
How do I write a class method that filters Articles by properties of the associated model, Media? For example, I want to get a list of articles published by media located in a certain country or in several countries (there is also a default country for when the user does not make any choice). Here’s what I tried:
# parameter can be either string 'default' or an array of id’s
def self.filter_by_country(parameter)
  if parameter == "default"
    joins(:media).where(media: [country_id: 1])
  else
    joins(:media).where(media: [country_id: parameter])
  end
end
But that doesn’t work, and I am not conversant enough with SQL to figure out how to make this work. Could you please help?
Update:
I’m trying out #carlosramireziii's suggestion. I changed arrays into hashes (don't know what possessed me to use arrays in the first place), but I’m getting the following error in the Rails console (to avoid confusion, in my database, media is called agency):
def self.filter_by_country(parameter)
  if parameter == "default"
    joins(:agency).where(agency: {country_id: 1})
  else
    joins(:agency).where(agency: {country_id: parameter})
  end
end
in Rails console:
> Article.filter_by_country('default')
=> Article Load (1.9ms) SELECT "articles".* FROM "articles" INNER JOIN "agencies" ON "agencies"."id" = "articles"."agency_id" WHERE "agency"."country_id" = 1
PG::UndefinedTable: ERROR: missing FROM-clause entry for table "agency"
LINE 1: ...ON "agencies"."id" = "articles"."agency_id" WHERE "agency"."...
^
: SELECT "articles".* FROM "articles" INNER JOIN "agencies" ON "agencies"."id" = "articles"."agency_id" WHERE "agency"."country_id" = 1
Update 2
My mistake in the Update section above is that I did not pluralize agency in the where clause. The part where(agency: {country_id: 1}) should have read where(agencies: {country_id: 1}). The pluralized word agencies here refers to the name of the table that is being joined.
You are very close; you just need to use a nested hash instead of an array.
Try this:
def self.filter_by_country(parameter)
  if parameter == "default"
    joins(:media).where(media: { country_id: 1 })
  else
    joins(:media).where(media: { country_id: parameter })
  end
end
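The table-name point from Update 2 can be reproduced at the SQL level. Below is a minimal Python/sqlite3 sketch of the query that the corrected class method generates; the sample rows, DEFAULT_COUNTRY_ID and the singular_name_fails helper are assumptions for illustration. It accepts either "default" or a list of country ids, and it shows that qualifying the column with the singular name agency is rejected because only agencies appears in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agencies (id INTEGER PRIMARY KEY, country_id INTEGER);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, agency_id INTEGER);
    INSERT INTO agencies VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO articles VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

DEFAULT_COUNTRY_ID = 1  # assumption: mirrors the hard-coded default in the question


def filter_by_country(parameter, table="agencies"):
    """Return article ids for "default" or for a list of country ids."""
    country_ids = [DEFAULT_COUNTRY_ID] if parameter == "default" else list(parameter)
    placeholders = ", ".join("?" for _ in country_ids)
    # `table` is interpolated only to demonstrate the naming error;
    # the country ids themselves are always bound as parameters.
    sql = (
        "SELECT articles.id FROM articles "
        "INNER JOIN agencies ON agencies.id = articles.agency_id "
        f"WHERE {table}.country_id IN ({placeholders}) "
        "ORDER BY articles.id"
    )
    return [row[0] for row in conn.execute(sql, country_ids)]


def singular_name_fails():
    """True if qualifying the column with the singular name is rejected."""
    try:
        filter_by_country("default", table="agency")
    except sqlite3.OperationalError:
        return True
    return False
```

Passing an array to where in Rails produces the same kind of IN (...) clause, which is why a single branch can serve both the default and the multi-country case.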

Django: How to add counts of non-related object?

I have two indirectly related tables - Posts and Follower_to_followee
models.py:
class Post(models.Model):
    auth_user = models.ForeignKey(User, null=True, blank=True, verbose_name='Author', help_text="Author")
    title = models.CharField(blank=True, max_length=255, help_text="Post Title")
    post_content = models.TextField(help_text="Post Content")
class Follower_to_followee(models.Model):
    follower = models.ForeignKey(User, related_name='user_followers', null=True, blank=True, help_text="Follower")
    followee = models.ForeignKey(User, related_name='user_followees', null=True, blank=True, help_text="Followee")
The followee is only indirectly related to the post's auth_user (the post author). It is, however, directly related to the Django user table, and the user table is directly related to the post table.
How can I select all followees for a specific follower and include post counts for each followee in the result of the query without involving the user table? Actually, at this point I am not even clear how to do that involving the user table. Please help.
It's possible to write a query that generates a single SQL statement. Try something like:
from django.db import models
qs = User.objects.filter(user_followees__follower=specific_follower).annotate(
    post_count=models.Count('post'))
for u in qs:
    print u, u.post_count
Check the second part of https://stackoverflow.com/a/13293460/165603 (things work similarly except the extra M2M manager)
When used inside User.objects.filter, both user_followees__follower=foo and user_followers__followee=foo cause a join on the table of the Follower_to_followee model and a where condition checking for follower=foo or followee=foo.
(Note that user_followees__followee=foo or user_followers__follower=foo works differently from the above; the Django ORM simplifies them smartly and would generate something like User.objects.filter(pk=foo.pk).)
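To illustrate the shape of the single statement that such an annotate call aims for, here is a sketch with sqlite3. The table and column names (auth_user, post, follower_to_followee, auth_user_id and so on) and the sample rows are assumptions, not the names Django would generate for these models. The LEFT OUTER JOIN on post keeps followees that have no posts, with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, auth_user_id INTEGER, title TEXT);
    CREATE TABLE follower_to_followee (
        id INTEGER PRIMARY KEY, follower_id INTEGER, followee_id INTEGER
    );
    INSERT INTO auth_user VALUES (1, 'ann'), (2, 'bob'), (3, 'cid'), (4, 'dee');
    INSERT INTO post (auth_user_id, title) VALUES (2, 'a'), (2, 'b'), (4, 'c');
    INSERT INTO follower_to_followee (follower_id, followee_id)
        VALUES (1, 2), (1, 3), (4, 2);
""")


def followee_post_counts(follower_id):
    """Return (username, post_count) for every followee of one follower."""
    return conn.execute(
        """
        SELECT u.username, COUNT(p.id) AS post_count
        FROM auth_user u
        INNER JOIN follower_to_followee f ON f.followee_id = u.id
        LEFT OUTER JOIN post p ON p.auth_user_id = u.id
        WHERE f.follower_id = ?
        GROUP BY u.id, u.username
        ORDER BY u.username
        """,
        (follower_id,),
    ).fetchall()
```

If only user ids are needed, the auth_user table could be dropped from this query by joining follower_to_followee straight to post on followee_id = auth_user_id.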
I'm not entirely sure I understand the question, but here is a simple solution. Note that this could be written more succinctly, but I broke it up so you can see each step.
How can I select all followees for a specific follower?
# First grab all the follower_to_followee entries for a given
# follower called: the_follower
follows = Follower_to_followee.objects.filter(follower=the_follower)
followee_counts = []
# Next, let's iterate through those objects and pick out
# the followees and their posts
for follow in follows:
    followee = follow.followee
    # posts for each followee (a lazy queryset, no query is run yet)
    followee_posts = Post.objects.filter(auth_user=followee)
    # Count number of posts in the queryset
    count = followee_posts.count()
    # Add followee/post_counts to our list of followee_counts
    followee_counts.append((followee, count))
# followee_counts is now a list of followee/post_count tuples
To get the post counts you can use this:
#get follower
follower = User.objects.get(username='username_of_follower')
#get all followees for a specific follower
for element in Follower_to_followee.objects.filter(follower=follower):
    element.followee.post_set.all().count()
views.py
def view_name(request):
    # rows in which the current user is the followee, i.e. the user's followers
    followers = Follower_to_followee.objects.filter(followee=request.user)
    .......
html
{{user}}<br/>
My followers:<br/>
{% for follow in followers %}
<p>{{follow.follower}} - {{follow.follower.post_set.count}}</p>
{% endfor %}

Django aggregate query

I have a model Page, which can have Posts on it. What I want to do is get every Page, plus the most recent Post on that page. If the Page has no Posts, I still want the page. (Sound familiar? This is a LEFT JOIN in SQL).
Here is what I currently have:
Page.objects.annotate(most_recent_post=Max('post__post_time'))
This only gets Pages, but it doesn't get Posts. How can I get the Posts as well?
Models:
class Page(models.Model):
    name = models.CharField(max_length=50)
    created = models.DateTimeField(auto_now_add = True)
    enabled = models.BooleanField(default = True)
class Post(models.Model):
    user = models.ForeignKey(User)
    page = models.ForeignKey(Page)
    post_time = models.DateTimeField(auto_now_add = True)
Depending on the relationship between the two, you should be able to follow the relationships quite easily, and improve performance by using select_related.
Taking this:
class Page(models.Model):
    ...
class Post(models.Model):
    page = ForeignKey(Page, ...)
You can follow the forward relationship (i.e. get all the posts and their associated pages) efficiently using select_related:
Post.objects.select_related('page').all()
This will result in only one (larger) query where all the page objects are prefetched.
In the reverse situation (like the one you have), where you want to get all pages and their associated posts, select_related won't work. See this, this and this question for more information about what you can do.
Probably your best bet is to use the techniques described in the django docs here: Following Links Backward.
After you do:
pages = Page.objects.annotate(most_recent_post=Max('post__post_time'))
posts = [page.post_set.filter(post_time=page.most_recent_post) for page in pages]
Then posts[0] is a queryset holding the most recent post for pages[0] (empty if that page has no posts), and so on. Note that this runs one extra query per page. I don't know if this is the most efficient solution, but it was the solution mentioned in another post about the lack of left joins in Django.
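Since the list comprehension above issues one query per page, here is a sketch of an alternative that uses two queries regardless of the number of pages and groups the posts in Python. It is written against sqlite3 rather than the ORM; the page and post tables, their columns and the sample rows are assumptions. Pages without posts are kept, with None in place of a post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, page_id INTEGER, post_time TEXT);
    INSERT INTO page VALUES (1, 'news'), (2, 'blog'), (3, 'empty');
    INSERT INTO post VALUES
        (1, 1, '2013-01-01 10:00:00'),
        (2, 1, '2013-02-01 10:00:00'),
        (3, 2, '2013-01-15 10:00:00');
""")


def pages_with_latest_post():
    """Return (page_name, latest_post_id or None) for every page."""
    pages = conn.execute("SELECT id, name FROM page ORDER BY id").fetchall()
    latest = {}
    # Ascending order means a later post overwrites an earlier one per page.
    for post_id, page_id in conn.execute(
        "SELECT id, page_id FROM post ORDER BY post_time"
    ):
        latest[page_id] = post_id
    return [(name, latest.get(page_id)) for page_id, name in pages]
```

The same grouping idea should carry over to the ORM by fetching the posts once, ordered by post_time, and building the dictionary from that queryset.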
You can create a database view that contains all the Page columns alongside the necessary columns of the latest Post:
CREATE VIEW `testapp_pagewithrecentpost` AS
SELECT testapp_page.*, testapp_post.* -- I suggest as few post columns as possible here
FROM `testapp_page` LEFT JOIN `testapp_post`
    ON testapp_page.id = testapp_post.page_id
    AND testapp_post.post_time =
        ( SELECT MAX(testapp_post.post_time)
          FROM testapp_post WHERE testapp_page.id = testapp_post.page_id );
Then you need to create a model with the flag managed = False (so that manage.py syncdb won't break). You can also inherit from an abstract Model to avoid column duplication:
class PageWithRecentPost(models.Model): # Or extend abstract BasePost ?
    # Page columns go here
    # Post columns go here
    # We use LEFT JOIN, so all columns from the
    # 'post' model will need blank=True, null=True
    class Meta:
        managed = False # Django will not handle creation/reset automatically
By doing that you can do what you initially wanted, that is, fetch from both tables in just one query:
pages_with_recent_post = PageWithRecentPost.objects.filter(...)
for page in pages_with_recent_post:
    print page.name # Page column
    print page.post_time # Post column
However, this approach is not free of drawbacks:
It's very DB engine-specific
You'll need to add VIEW creation SQL to your project
If your models are complex it's very likely that you'll need to resolve table column name clashes.
Model based on a database view will very likely be read-only (INSERT/UPDATE will fail).
It adds complexity to your project. Allowing multiple queries is definitely a simpler solution.
Changes in Page/Post will require re-creating the view.
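The view approach described above can be tried out without a Django project. The following sketch uses sqlite3 with assumed columns and sample rows; it selects explicitly aliased columns instead of testapp_page.* and testapp_post.*, which is one way to deal with the column name clash mentioned in the list (both tables have an id column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testapp_page (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE testapp_post (
        id INTEGER PRIMARY KEY, page_id INTEGER, post_time TEXT
    );
    INSERT INTO testapp_page VALUES (1, 'news'), (2, 'empty');
    INSERT INTO testapp_post VALUES
        (1, 1, '2013-01-01 10:00:00'),
        (2, 1, '2013-02-01 10:00:00');

    -- Explicit aliases avoid the id/id clash between the two tables.
    CREATE VIEW testapp_pagewithrecentpost AS
    SELECT p.id AS id, p.name AS name,
           po.id AS post_id, po.post_time AS post_time
    FROM testapp_page p
    LEFT JOIN testapp_post po
        ON po.page_id = p.id
       AND po.post_time = (
            SELECT MAX(x.post_time) FROM testapp_post x WHERE x.page_id = p.id
       );
""")

# One query returns every page, with NULLs where a page has no posts.
rows = conn.execute(
    "SELECT name, post_id, post_time FROM testapp_pagewithrecentpost ORDER BY id"
).fetchall()
```

An unmanaged model mapped onto such a view would then declare id, name, post_id and post_time as its fields, with the post columns nullable.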