Query data from two associated tables - sql

For my app:
user has_many images, image belongs_to user
image has_one location, location belongs_to image
Perhaps the location's fields should just be part of the image. But regardless, I'm trying to write this query in Rails:
SELECT image.caption, location.latitude, location.longitude
FROM image, location
WHERE location.image_id = image.id
AND image.user_id = 5
or alternatively, if it's easier:
SELECT image.*, location.*
FROM image, location
WHERE location.image_id = image.id
AND image.user_id = 5
How would I write this as an ActiveRecord query?

I think you want to read about Eager Loading Associations.
@images = Image.includes(:location).where("images.user_id = ?", 5)
This will find Image instances where user_id = 5. It then runs a second query that loads the associated Location records and attaches them to the images (that's what .includes(:location) does for you).
This more closely matches your alternative query, as it does select all columns from images and location tables.
You can build an Array based on this containing a hash with only the keys you're interested in through something like this.
@hash_object = @images.collect { |i| { caption: i.caption, latitude: i.location.latitude, longitude: i.location.longitude } }
If you want to build this with only a single query, you can use .joins(:location) in combination with .includes(:location)
Image.joins(:location).includes(:location).where("images.user_id = ?", 5)
Important: This will omit Image instances that have no associated Location. You can modify the joins() a bit to help with this, but the above will have this omission.
If you really want only specific columns to be selected, read up on Selecting Specific Columns, though there are warnings about its use:
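If you stay with plain .includes so that images without a location are kept, i.location is nil for those images and the collect shown earlier would raise NoMethodError. Here is a minimal nil-safe sketch, assuming Ruby 2.3+ for the &. operator. The Struct classes and the image_summaries helper are hypothetical stand-ins for the real models, so the snippet runs without a database:

```ruby
# Plain-Ruby stand-ins for the Image and Location models (hypothetical,
# not ActiveRecord classes), so the sketch runs without a database.
Location = Struct.new(:latitude, :longitude)
Image = Struct.new(:caption, :location)

# Build one hash per image; &. yields nil when the image has no location.
def image_summaries(images)
  images.map do |image|
    {
      caption: image.caption,
      latitude: image.location&.latitude,
      longitude: image.location&.longitude
    }
  end
end

images = [
  Image.new("Beach", Location.new(36.6, -121.9)),
  Image.new("No GPS data", nil)
]
p image_summaries(images)
```

With real ActiveRecord objects the same block body should work unchanged inside @images.collect.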
If the select method is used, all the returning objects will be read only.
and
Be careful because this also means you’re initializing a model object with only the fields that you’ve selected.
In Rails master (not out in 3.2.11) you can pass multiple columns to .pluck() but this appears to only be restricted to a single table (you wouldn't be able to get the locations table's :latitude and :longitude when plucking from Image). It's good to know about though.

Related

Rails - Proper associations/data model for content display "cooldown"

I have a user model and a content model. When a user views a piece of content, I need to make sure the user does not see that content again for say, 48 hours.
What's the Rails way to model this out? I'd like to have a table with a user_id, content_id, and a timestamp recording when the view happened, then have a worker clear out entries with timestamps older than 2 days. This way when a user requests more content, I can filter out content that has an entry where user_id and content_id match.
Don't think it should matter, but I'm using MySQL with Rails 3.2.
I think you can do the following in your model.
class User < ActiveRecord::Base
...
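# Assumes contents has a last_viewed_at column; keeps content never viewed or last viewed over 48 hours ago.
# (The lambda scope form on has_many needs Rails 4 or later.)
has_many :contents, -> { where("last_viewed_at IS NULL OR last_viewed_at < ?", 48.hours.ago) }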
end
I added the OR condition for nil so that a newly created record, which has not been viewed yet, is still visible to the user.
I am not sure how you are using your worker.
Let me know if I am missing anything. This is not meant to be a definitive answer, just a sketch of what is possible.
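For the separate table described in the question (user_id, content_id, and a view timestamp), the filtering logic can be sketched in plain Ruby. This is only an illustration of the cooldown rule, not an ActiveRecord implementation: ContentView, COOLDOWN_SECONDS, and visible_content_ids are hypothetical names, and in a real app the same filter would be pushed into SQL:

```ruby
# Hypothetical stand-in for a row of the proposed views table.
ContentView = Struct.new(:user_id, :content_id, :viewed_at)

COOLDOWN_SECONDS = 48 * 60 * 60

# Return the content ids the user may see: everything except content
# this user viewed within the cooldown window ending at `now`.
def visible_content_ids(content_ids, views, user_id, now = Time.now)
  cutoff = now - COOLDOWN_SECONDS
  recent_views = views.select { |v| v.user_id == user_id && v.viewed_at > cutoff }
  content_ids - recent_views.map(&:content_id)
end

now = Time.now
views = [
  ContentView.new(1, 10, now - 3600),      # viewed an hour ago: hidden
  ContentView.new(1, 11, now - 49 * 3600)  # viewed 49 hours ago: visible again
]
p visible_content_ids([10, 11, 12], views, 1, now)
```

Because the check compares timestamps at query time, the worker that deletes old rows only keeps the table small; the result does not depend on it having run.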

Matching nested model association attribute with includes

Suppose I have the following models:
class Post < ActiveRecord::Base
  has_many :authors
end
class Author < ActiveRecord::Base
  belongs_to :post
end
And suppose the Author model has an attribute, name.
I want to search for all posts with a given author "alice", by that author's name. Say there is another author "bob" who co-authored a post with alice.
If I search for the first result using includes and where:
post = Post.includes(:authors).where("authors.name" => "alice").first
You'll see that the post only has one author now, even if in fact there are more:
post.authors #=> [#<Author id: 1, name: "alice", ...>]
post.reload
post.authors #=> [#<Author id: 1, name: "alice", ...>, #<Author id: 2, name: "bob", ...>]
The problem seems to be the combination of includes and where, which limits the scope correctly to the desired post, but at the same time hides all associations except for the one that is matched.
I want to end up with an ActiveRecord::Relation for chaining, so the reload solution above is not really satisfactory. Replacing includes by joins solves this, but does not eager load the associations:
Post.joins(:authors).where("authors.name" => "alice").first.authors
#=> [#<Author id: 1, name: "alice", ...>, #<Author id: 2, name: "bob", ...>]
Post.joins(:authors).where("authors.name" => "alice").first.authors.loaded?
#=> false
Any suggestions? Thanks in advance, I've been banging my head over this problem for a while.
I see what you're doing as expected behaviour, at least that's how SQL works... You're restricting the join on authors to where authors.id = 1, so why would it load any others? ActiveRecord just takes the rows that the database returned, it has no way of knowing if there are others, without doing another query based on the posts.id.
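To make that concrete, here is a small plain-Ruby simulation with no database and no ActiveRecord. JOINED_ROWS and the helper names are made up for illustration; the rows stand in for what a posts/authors JOIN would return:

```ruby
# Hypothetical rows as a posts/authors JOIN would return them
# (one row per post/author pair).
JOINED_ROWS = [
  { post_id: 1, author: "alice" },
  { post_id: 1, author: "bob" },
  { post_id: 2, author: "carol" }
].freeze

# Group joined rows into { post_id => [author names] }.
def authors_by_post(rows)
  rows.group_by { |row| row[:post_id] }
      .transform_values { |group| group.map { |row| row[:author] } }
end

# One step: the WHERE is applied to the joined rows, so bob's row is dropped.
def filtered_in_join(rows, name)
  authors_by_post(rows.select { |row| row[:author] == name })
end

# Two steps: find the matching post ids first, then load all authors for them.
def filtered_then_loaded(rows, name)
  post_ids = rows.select { |row| row[:author] == name }.map { |row| row[:post_id] }.uniq
  authors_by_post(rows.select { |row| post_ids.include?(row[:post_id]) })
end

p filtered_in_join(JOINED_ROWS, "alice")     # post 1 with alice only
p filtered_then_loaded(JOINED_ROWS, "alice") # post 1 with alice and bob
```

The first helper mirrors what includes plus where does on the joined table; the second mirrors any approach that filters posts first and loads the association separately.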
Here's one possible solution with a subquery, this will work as a chainable relation, and executes in one query:
relation = Post.where(id: Author.where(id: 1).select(:post_id))
If you add the includes, you will see the queries happen one of two ways:
relation = relation.includes(:authors)
relation.first
# 1. Post Load SELECT DISTINCT `posts`.`id`...
# 2. SQL SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, ...
relation.all.first
# 1. SQL SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, ...
So depending on the scenario, ActiveRecord decides whether to look up the id with a simpler query before loading all the associated authors. Sometimes it makes more sense to run the query in 2 steps.
Coming back to this question after a long long time, I realized there is a better way to do this. The key is to do not one but two joins, one with includes and one with Arel using a table alias:
posts = Post.arel_table
authors = Author.arel_table.alias("matching_authors")
join = posts.join(authors, Arel::Nodes::InnerJoin).
on(authors[:post_id].eq(posts[:id])).join_sources
post = Post.includes(:authors).joins(join).
where(matching_authors: { name: "Alice" }).first
The SQL for this query is quite long since it has includes, but the key point is that it has not one but two joins, one (from includes) using a LEFT OUTER JOIN on the alias posts_authors, the other (from the Arel join) using an INNER JOIN on the alias matching_authors. The WHERE only applies to the latter alias, so results on the association in the returned results are not limited by this condition.
I ran into the same issue (which I describe as: where clause filters the associated model, rather than the primary model, when includes is used to prevent N+1 queries).
After flailing around trying various solutions, I found that using preload in conjunction with joins solves this for me. The Rails documentation is not super useful here. But apparently preload will explicitly use two separate queries, one to filter/select the primary model, and a second query to load the associated models. This blog post also has some insights that helped lead me to the solution.
Applying this to your models would be something like:
post = Post.preload(:authors).joins(:authors).where("authors.name" => "alice").first
I suspect that under the covers this is doing the same thing as your accepted answer, but at a nicer level of abstraction.
I wish the Rails docs were more explicit about how to do this. It's subtle enough that I wrote a bunch of tests around this precise situation in my code base.
Actually, it's because this code:
post = Post.includes(:authors).where("authors.name" => "alice").first
returns the first matched record because of the ".first". I think if you did this:
post = Post.includes(:authors).where("authors.name" => "alice")
you would get back all posts with "alice" and her other co-authors if I understand what you're asking correctly.

Django aggregate query

I have a model Page, which can have Posts on it. What I want to do is get every Page, plus the most recent Post on that page. If the Page has no Posts, I still want the page. (Sound familiar? This is a LEFT JOIN in SQL).
Here is what I currently have:
Page.objects.annotate(most_recent_post=Max('post__post_time'))
This only gets Pages, but it doesn't get Posts. How can I get the Posts as well?
Models:
from django.contrib.auth.models import User
from django.db import models

class Page(models.Model):
    name = models.CharField(max_length=50)
    created = models.DateTimeField(auto_now_add=True)
    enabled = models.BooleanField(default=True)

class Post(models.Model):
    user = models.ForeignKey(User)
    page = models.ForeignKey(Page)
    post_time = models.DateTimeField(auto_now_add=True)
Depending on the relationship between the two, you should be able to follow the relationships quite easily, and increase performance by using select_related.
Taking this:
class Page(models.Model):
    ...

class Post(models.Model):
    page = models.ForeignKey(Page, ...)
You can follow the forward relationship (i.e. get all the posts and their associated pages) efficiently using select_related:
Post.objects.select_related('page').all()
This will result in only one (larger) query where all the page objects are prefetched.
In the reverse situation (like you have) where you want to get all pages and their associated posts, select_related won't work. See this, this and this question for more information about what you can do.
Probably your best bet is to use the techniques described in the django docs here: Following Links Backward.
After you do:
pages = Page.objects.annotate(most_recent_post=Max('post__post_time'))
posts = [page.post_set.filter(post_time=page.most_recent_post) for page in pages]
And then posts[0] should have the most recent post for pages[0] etc. I don't know if this is the most efficient solution, but this was the solution mentioned in another post about the lack of left joins in django.
You can create a database view that contains all Page columns along with the necessary columns of the latest Post:
CREATE VIEW `testapp_pagewithrecentpost` AS
SELECT testapp_page.*, testapp_post.* -- I suggest as few post columns as possible here
FROM `testapp_page` LEFT JOIN `testapp_post`
  ON testapp_page.id = testapp_post.page_id
  AND testapp_post.post_time =
    ( SELECT MAX(p.post_time)
      FROM testapp_post AS p WHERE p.page_id = testapp_page.id );
Then you need to create a model with the flag managed = False (so that manage.py syncdb won't break). You can also inherit from an abstract Model to avoid column duplication:
class PageWithRecentPost(models.Model):  # Or extend abstract BasePost ?
    # Page columns go here
    # Post columns go here
    # We use LEFT JOIN, so all columns from the
    # 'post' model will need blank=True, null=True
    class Meta:
        managed = False  # Django will not handle creation/reset automatically
By doing that you can do what you initially wanted, so fetch from both tables in just one query:
pages_with_recent_post = PageWithRecentPost.objects.filter(...)
for page in pages_with_recent_post:
    print(page.name)       # Page column
    print(page.post_time)  # Post column
However this approach is not drawback free:
It's very DB engine-specific
You'll need to add VIEW creation SQL to your project
If your models are complex it's very likely that you'll need to resolve table column name clashes.
A model based on a database view will very likely be read-only (INSERT/UPDATE will fail).
It adds complexity to your project. Allowing for multiple queries is definitely a simpler solution.
Changes in Page/Post will require re-creating the view.

Mapped two arrays... now... can i map three?

This mapping worked:
@fbc = FbComments.where("reviewee_id = ?", current_user.id)
@users = User.order("last_name")
@fb_comments = @fbc.map! { |fb| [fb, @users.find_by_id(fb.user_id)] }
So two arrays are mapped: one with comments and one with the user data of the person who made the comments. But I also need the user's profile picture data. Do I change the original mapping method to include a third array somehow (e.g. @fbc + @users + @pictures), or do I have to map another array onto the result of mapping the first two (e.g. @fb_comments + @pictures)?
Profile pictures, like comments, have a user_id that is matched to the id of the user who made the comments.
Thanks.
I'm not sure why you're doing this the way you are. Why not use eager loading (.includes) to get everything in one go?
@fbc = FbComments.where("reviewee_id = ?", current_user.id).includes(:user => :picture)
@fbc.first.user # => The first user in the results
@fbc.first.user.picture # => The first user's picture
(I'm assuming here that profile picture data is its own model called Picture. Change it to fit your app if necessary.)
Take a look at the documentation and scroll down to "Eager loading of associations."
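If you do end up combining three already-loaded arrays by hand, one way is to index users and pictures by id first, so each comment needs only two hash lookups instead of a search per comment. A plain-Ruby sketch, where the Struct classes and the combine helper are hypothetical stand-ins for your models:

```ruby
# Hypothetical stand-ins for the FbComments, User, and Picture models.
Comment = Struct.new(:id, :user_id, :body)
User = Struct.new(:id, :last_name)
Picture = Struct.new(:id, :user_id, :url)

# Return [comment, user, picture] triples; user or picture is nil when missing.
def combine(comments, users, pictures)
  users_by_id = users.each_with_object({}) { |user, index| index[user.id] = user }
  pictures_by_user_id = pictures.each_with_object({}) { |pic, index| index[pic.user_id] = pic }
  comments.map do |comment|
    [comment, users_by_id[comment.user_id], pictures_by_user_id[comment.user_id]]
  end
end

comments = [Comment.new(1, 10, "Great work"), Comment.new(2, 11, "Thanks")]
users = [User.new(10, "Smith"), User.new(11, "Jones")]
pictures = [Picture.new(100, 10, "smith.png")]

combine(comments, users, pictures).each do |comment, user, picture|
  puts "#{user.last_name}: #{comment.body} (#{picture ? picture.url : 'no picture'})"
end
```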

Rails3: Cascading Select Writer's Block

I have a big, flat table:
id
product_id
attribute1
attribute2
attribute3
attribute4
Here is how I want users to get to products:
See a list of unique values for attribute1.
Clicking one of those gets you a list of unique values for attribute2.
Clicking one of those gets you a list of unique values for attribute3.
Clicking one of those gets you a list of unique values for attribute4.
Clicking one of those shows you the relevant products.
I have been coding Rails for about 4 years now. I just can't unthink my current approach to this problem.
I have major writer's block. Seems like such an easy problem. But I either code it with 4 different "step" methods in my controller, or I try to write one "search" method that attempts to divine the last level you selected, and all the previous values that you selected.
Both are major YUCK and I keep deleting my work.
What is the most elegant way to do this?
Here is a solution that may be an option. Just off the top of my head and not tested (so there is probably a bit more elegant solution). You could use chained scopes in your model:
class Product < ActiveRecord::Base
  # One scope per attribute
  scope :with_capacity, lambda { |*args| args.first.nil? ? nil : where(:capacity => args.first) }
  scope :with_weight, lambda { |*args| args.first.nil? ? nil : where(:weight => args.first) }
  scope :with_color, lambda { |*args| args.first.nil? ? nil : where(:color => args.first) }
  scope :with_manufacturer, lambda { |*args| args.first.nil? ? nil : where(:manufacturer => args.first) }

  # Distinct values of the given attribute across an already-loaded set of products
  def self.available_attributes(products, attribute)
    products.collect { |product| product.send(attribute) }.uniq
  end
end
The code above will give you a scope for each attribute. If you pass a parameter to the scope, then it will give you the products with that attribute value. If the argument is nil, then the scope will return the full set (I think ;-). You could keep track of the attributes they are drilling down into in the session with 2 variables (page_attribute and page_attribute_value) in your controller. Then you call the entire chain to get your list of products (if you want to use them on the page). Next you can get the attribute values by passing in the set of products and the attribute name to Product.available_attributes. Note that this method (Product.available_attributes) is a total hack and would be inefficient for a large set of data, so you may want to make this another scope and use :select=>"DISTINCT(your_attribute)" or something more database efficient instead of iterating through the full set of products as I did in the hack method.
class ProductsController < ApplicationController
  def show
    session[params[:page_attribute].to_sym] = params[:page_attribute_value]
    # Chain the scopes directly on the model; in Rails 3, Product.all returns an Array, which has no scopes
    @products = Product.with_capacity(session[:capacity]).with_weight(session[:weight]).with_color(session[:color]).with_manufacturer(session[:manufacturer])
    @attr_values = Product.available_attributes(@products, params[:page_attribute])
  end
end
Again, I want to warn you that I did not test this code, so it's totally possible that some of the syntax is incorrect, but hopefully this will give you a starting point. Holla if you have any questions about my (pseudo) code.
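As an alternative to four step methods, the drill-down can be treated as a single function of "what has been selected so far". Below is a plain-Ruby sketch of that idea using hashes in place of table rows. LEVELS and drill_down are hypothetical names, and in Rails the select and uniq steps would become where and a DISTINCT query:

```ruby
# Attribute columns in drill-down order (names taken from the question's table).
LEVELS = [:attribute1, :attribute2, :attribute3, :attribute4].freeze

# Given all rows and the selections made so far, return either the next
# level with its distinct values, or the matching product ids when every
# level has been chosen.
def drill_down(rows, selections)
  matching = rows.select { |row| selections.all? { |attr, value| row[attr] == value } }
  next_level = LEVELS.find { |attr| !selections.key?(attr) }
  if next_level
    { level: next_level, values: matching.map { |row| row[next_level] }.uniq }
  else
    { products: matching.map { |row| row[:product_id] } }
  end
end

rows = [
  { product_id: 1, attribute1: "A", attribute2: "X", attribute3: "p", attribute4: "m" },
  { product_id: 2, attribute1: "A", attribute2: "Y", attribute3: "p", attribute4: "m" },
  { product_id: 3, attribute1: "B", attribute2: "X", attribute3: "q", attribute4: "n" }
]
p drill_down(rows, {})
p drill_down(rows, { attribute1: "A" })
```

A single controller action could then pass the selections it receives (for example from the query string) straight through, so the "which step am I on" logic lives in one place.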