Rails ActiveRecord finding questions by tag in named scope - sql

I want the equivalent of SO search by tag, so I need an exists query but I also still need to left join on all tags. I've tried a couple of approaches and I'm out of ideas.
The Question - Tag relationship is has_and_belongs_to_many in both directions (i.e. I have a QuestionTags join table).
e.g.
Question.joins(:tags).where('tags.name = ?', tag_name).includes(:tags)
I would expect this to do what I need but actually it just mashes up the includes with the join and I just end up with basically an inner join.
Question.includes(:tags)
.where("exists (
select 1 from questions_tags
where question_id = questions.id
and tag_id = (select id
from tags
where tags.name = ?))", tag_name)
This fetches the correct results but a) is really ugly and b) gives a deprecation warning as again it seems to confuse the includes with the join:
DEPRECATION WARNING: It looks like you are eager loading table(s) (one
of: questions, tags) that are referenced in a string SQL snippet. For
example:
Post.includes(:comments).where("comments.title = 'foo'")
Note I'm trying to write these as named scopes.
Let me know if the question isn't clear. Thanks in advance.

OK, got it. As far as I know there is no built-in syntax for this. An alternative I have used before is:
Question.includes(:tags).where("questions.id IN (
#{ Question.joins(:tags).where('tags.name = ?', tag_name).select('questions.id').to_sql})")
You can also join this subquery to your questions table instead of using IN. Alternatively, if you are not against adding gems and you are using Postgres, use this gem.
It provides really neat syntax for advanced queries.

Use preload instead of includes:
Question.preload(:tags).where("exists ....
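Why the filter has to stay separate from the eager load can be seen with a small in-memory model. This is a plain-Ruby sketch, not ActiveRecord: the QUESTIONS hash and both method names are made up for illustration. Filtering the joined rows keeps only the matching tag on each question, while picking the question ids first (roughly what the exists subquery does) and loading tags afterwards (roughly what preload does with its separate query) keeps every tag.

```ruby
# Hypothetical data: question id => tag names.
QUESTIONS = {
  1 => %w[ruby rails],
  2 => %w[sql],
  3 => %w[rails sql]
}.freeze

# Mimics JOIN + WHERE tags.name = ?: the filter is applied to joined rows,
# so only the matching tag survives for each question.
def tags_after_join_filter(tag_name)
  rows = QUESTIONS.flat_map { |id, tags| tags.map { |tag| [id, tag] } }
  rows.select { |_id, tag| tag == tag_name }
      .group_by(&:first)
      .transform_values { |pairs| pairs.map(&:last) }
end

# Mimics WHERE EXISTS (...) followed by a separate tag load:
# pick the question ids first, then fetch all tags for those ids.
def tags_after_exists_filter(tag_name)
  ids = QUESTIONS.select { |_id, tags| tags.include?(tag_name) }.keys
  QUESTIONS.slice(*ids)
end

# Questions 1 and 3 match in both cases, but only the second keeps all their tags.
p tags_after_join_filter("rails")
p tags_after_exists_filter("rails")
```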

Related

Many to many relation, multi where clause on the same column and hibernate

Sorry for the bad question title, I couldn't think of anything better.
Anyway, my tables are Tags - Poststags - Posts. Poststags is a junction table for a many-to-many relation. I need to select all posts with the given tags, and I don't know how many tags the user will choose to search for. One way I found to do this is in the code below; however, I would need to loop over all the tags given by the user and construct the query string from there, since the number of tags is unknown. That seems like a pretty bad solution to me.
Another solution would be to store all tags in one column in the Posts table as a plain string, but I don't want to do that because of other application requirements.
I have a working SQL query, since I was trying pure SQL before trying to implement it in Hibernate, but I don't like selecting all posts containing each tag and then joining the queries. Is there a way to specify the same column multiple times in the WHERE clause? Something along the lines of WHERE pt.tag_id = x AND pt.tag_id = y? (I know this won't work.) The IN operator won't work either, since it will give me posts that contain any of the supplied tags and not just the posts containing ALL of the supplied tags.
Also, how would I implement such a query in HQL (if subqueries like this are even supported)? Or can I somehow manage this via Criteria? Or do I have to resort to using the createSQLQuery method of a Hibernate session?
SELECT * FROM
( SELECT * FROM posts p
inner join poststags pt on pt.post_id = p.id
WHERE pt.tag_id = 1 ) AS A
INNER JOIN
( SELECT * FROM posts p
inner join poststags pt on pt.post_id = p.id
WHERE pt.tag_id = 2 ) AS B ON A.id = B.id
And yes, I know this query is not returning the Post entity itself, but I can handle that later.
Don't use Hibernate or an ORM for this kind of complex select; it may work, but badly.
Your use case should be solved by full-text search, which means each Post will need to have its own tags.
I don't see much value in making Tag an entity. It's just a string.
Full-text search could be heavy for the database. A better way is to use Elasticsearch. Spring integrates with it through spring-data-elasticsearch, and it's not difficult to use. Elasticsearch is very powerful for free-text search.
Here is a solution that 'should' work using Criteria queries in Hibernate.
Assuming that you have an entity for Post and an entity for PostTag and PostTag has reference to Post (which I think it should given the example query that you provided), I believe that something like this should do what you want:
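import java.util.List;
import org.hibernate.Criteria;
import org.hibernate.criterion.DetachedCriteria;
import org.hibernate.criterion.Projections;
import org.hibernate.criterion.Property;
import org.hibernate.criterion.Restrictions;

static DetachedCriteria getPostTagCriteria(String tagString, int index)
{
    // Give each subquery its own root alias (the intent of the "uniqueName_" prefix).
    DetachedCriteria criteria = DetachedCriteria.forClass(PostTag.class, "uniqueName_" + index);
    criteria.createAlias("tag", "tag");
    criteria.add(Restrictions.eq("tag.tagString", tagString));
    criteria.setProjection(Projections.property("postId"));
    return criteria;
}
static List<Post> getPosts(List<String> tagStrings)
{
    Criteria criteria = getCurrentSession().createCriteria(Post.class, "post");
    // One "post.id IN (subquery)" restriction per tag; the restrictions are ANDed together.
    for (int i = 0; i < tagStrings.size(); i++)
    {
        criteria.add(Property.forName("post.id").in(getPostTagCriteria(tagStrings.get(i), i)));
    }
    @SuppressWarnings("unchecked")
    List<Post> ret = criteria.list();
    return ret;
}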
This assumes that you have reasonable entities to represent Post, PostTag and Tag and that they all reference each other in obvious parent/child sort of ways that I have completely made up here.
But, the general idea of creating multiple detached criteria objects based on your input should solve your problem. This solution also comes with the same caveats regarding SQL complexity mentioned above. You will be creating a sub-query for each tag passed in. So, depending on your indexes and table sizes, you may need to consider a different approach.

Inverse of IN in Rails

I feel foolish, but I cannot find the answer to this.
If I have a User with many attributes, given a list of attributes, I can ask rails something like this:
User.where("attributes.id IN (?)", list_of_attribute_ids)
With the appropriate joins or includes or whatever.
However, I have no idea how to find the inverse set of those users. That is, given 100 users, if the result returns 75 entries, I don't know how to find the other 25!
I thought
User.where("attributes.id NOT IN (?)", list_of_attribute_ids)
might work (similarly, User.where.not), but it doesn't! Instead, it finds those users where any of their attributes is not in the list, which is useful, but not what I want.
The only way I know how to do it is with something like:
User.where.not(id: User.where("attributes.id IN (?)", list_of_attribute_ids).pluck(:id))
Which is sort of like the SQL for select user where id not in (gather a list of ids).
But this is massively non-performant, and generally just can't cope with a database with more than a few (hundred) entries.
How do you do this?
I think you could use left outer joins, like @Vishal mentioned in the comments.
See the guides: http://guides.rubyonrails.org/active_record_querying.html#left-outer-joins
rails 4:
joins("LEFT OUTER JOIN <something>")
rails 5:
left_outer_joins(:something)
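The difference between the two readings of "NOT IN" is easier to see on a tiny in-memory data set. The sketch below is plain Ruby rather than ActiveRecord, and USER_ATTRIBUTES and the helper names are made up for illustration. naive_not_in filters joined rows, which is what attributes.id NOT IN (?) does; inverse_set removes every user that has at least one match, which is what an outer join that keeps only the unmatched users, or a NOT IN on users.id, is meant to compute.

```ruby
# Hypothetical data: user id => ids of the attributes that user has.
USER_ATTRIBUTES = {
  1 => [10, 11],
  2 => [12],
  3 => []
}.freeze

# One [user_id, attribute_id] pair per joined row, like an INNER JOIN.
def joined_rows
  USER_ATTRIBUTES.flat_map { |user_id, ids| ids.map { |id| [user_id, id] } }
end

# Users with at least one attribute in the list (attributes.id IN (...)).
def users_with_any(list)
  joined_rows.select { |_user_id, id| list.include?(id) }.map(&:first).uniq
end

# Row-level NOT IN: users with at least one attribute outside the list.
def naive_not_in(list)
  joined_rows.reject { |_user_id, id| list.include?(id) }.map(&:first).uniq
end

# Set-level inverse: every user that is not returned by users_with_any.
def inverse_set(list)
  USER_ATTRIBUTES.keys - users_with_any(list)
end

p users_with_any([10]) # user 1
p naive_not_in([10])   # users 1 and 2: user 1 still qualifies through attribute 11
p inverse_set([10])    # users 2 and 3, including the user with no attributes
```

In ActiveRecord, the set-level version can usually stay in the database by passing a relation instead of plucked ids, for example User.where.not(id: matching.select(:id)), where matching is a hypothetical relation for the IN query; that should produce a single NOT IN (SELECT ...) statement rather than loading ids into Ruby.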

Rails Activerecord query selective include

I am having trouble optimizing a large activerecord query. I need to include an associated model in my request but due to the size of the return set I only want to include a couple of the associated columns. For example I have:
Post.includes(:user).large_set
While I am looking for something like:
Post.includes(:user.name, :user.profile_pic).large_set
I need to actually use the name and profile pic attributes so Post.joins(:user) is not an option as far as I understand.
select is what you are looking for:
Post.joins(:user).select("posts.*, users.name, users.profile_pic").large_set
http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
You'll have to use joins to accomplish what you want, as includes does not have this functionality. Or you could write your own includes method :-)

ActiveRecord: can't use `pluck` after `where` clause with eager-loaded associations

I have an app that has a number of Post models, each of which belongs_to a User model. When these posts are published, a PublishedPost model is created that belongs_to the relevant Post model.
I'm trying to build an ActiveRecord query to find published posts that match a user name, then get the ids of those published posts, but I'm getting an error when I try to use the pluck method after eager-loading my associations and searching them with the where method.
Here's (part of) my controller:
class PublishedPostsController < ApplicationController
  def index
    ar_query = PublishedPost.order("published_posts.created_at DESC")
    if params[:searchQuery].present?
      search_query = params[:searchQuery]
      ar_query = ar_query.includes(:post => :user)
                         .where("users.name like ?", "%#{search_query}%")
    end
    @found_ids = ar_query.pluck(:id)
    ...
  end
end
When the pluck method is called, I get this:
ActiveRecord::StatementInvalid: Mysql2::Error: Unknown column 'users.name' in 'where clause': SELECT id FROM `published_posts` WHERE (users.name like '%Andrew%') ORDER BY published_posts.created_at DESC
I can get the results I'm looking for with
@found_ids = ar_query.select(:id).map{|r| r.id}
but I'd rather use pluck as it seems like the cleaner way to go. I can't figure out why it's not working, though. Any ideas?
You need to and should do joins instead of includes here.
The two functions are pretty similar except that the data from joins is not returned in the result of the query whereas the data in an includes is.
In that respect, includes and pluck are kind of antithetical. One says to return me all the data you possibly can, whereas the other says to only give me only this one little bit.
Since you only want a small amount of the data, you want to do joins. (Strangely select which also seems somewhat antithetical still works, but you would need to remove the ambiguity over id in this case.)
Try it out in the console and you'll see that includes causes a query that looks kind of like this: SELECT "posts"."id" as t0_r0, "posts"."text" as t0_r1, "users"."id" as t1_r0, "users"."name" as t1_r1 ... When you tack on a pluck call, all those crazy tx_ry columns go away and are replaced by whatever you specified.
I hope that helps, but if not maybe this RailsCast can. It is explained around the 5 minute mark.
http://railscasts.com/episodes/181-include-vs-joins
If you got here by searching "rails pluck ambiguous column", you may want to know you can just replace query.pluck(:id) with:
query.pluck("table_name.id")
Your query wouldn't work as it is written, even without the pluck call.
The reason is that your WHERE clause contains literal SQL referencing the users table. Rails doesn't notice this, so it decides to use multiple queries and join in memory ( .preload() ) instead of joining at the database level ( .eager_load() ):
SELECT * from published_posts WHERE users.name like "pattern" ORDER BY published_posts.created_at DESC
SELECT * from posts WHERE id IN ( a_list_of_all_post_ids_in_published_posts_returned_above )
SELECT * from users WHERE id IN ( a_list_of_all_user_ids_in_posts_returned_above )
The first of the 3 queries fails, and that is the error you get.
To force Rails to use a JOIN here, you should either use the explicit .eager_load() instead of .includes(), or add a .references() clause.
Other than that, what @Geoff answered stands: you don't really need .includes() here, but rather .joins().

Rails double match from has_and_belongs_to_many

Say that I have a has_and_belongs_to_many relationship where I have posts and categories. It is simple to find all the posts in a category, or all the categories that a particular post is a member of. However, what if I want to find a list of posts that belong to multiple categories? For example, a list of posts that are on the topic of security in Rails, I might want the posts that belong to the categories "Security" and "Rails".
Is it possible to do this with the finder methods built into ActiveRecord, or will I need to use SQL? Can someone please explain how?
You can use includes or joins, like:
@result = Post.includes(:categories).where("categories.name = 'Security' OR categories.name = 'Rails'")
or
@result = Post.joins(:categories).where("categories.name = 'Security' OR categories.name = 'Rails'")
I also suggest to check this railscast to understand the difference between joins and includes, so you can decide what is better in your case.
I don't know anything about Rails, but I'm attempting a similar thing with some SQL. This may or may not work for either of us...
I have a table of articles and a look-up table of applied categories. To get an article that has both the 'security' category and the 'rails' category, I join the article table to the category table, of course, but then join the category table a second time. Each join of the category table uses a hint in its alias name (i.e. language or topic).
Pseudo code:
SELECT article.*,
category_language.category_id,
category_topic.category_id
FROM category category_language
INNER JOIN article ON category_language.articleID = article.articleID
INNER JOIN category category_topic ON article.articleID = category_topic.articleID
WHERE category_language.category_id in (420) /* rails */
and category_topic.category_id in (421) /* security */
This isn't completely ironed out, and I hope that if I am showing my ignorance here, someone will speak up.
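Note that the OR condition in the first answer matches posts in either category, not posts in both. A common way to require all categories without one join per category is to filter with IN, group by post, and keep only the groups whose distinct category count equals the number of names requested. The helper below is a plain-Ruby sketch that only builds such a query and its bind values; the method name is made up, and the table and column names (posts, categories, categories_posts) assume the default Rails naming for this has_and_belongs_to_many pair.

```ruby
# Builds a parameterised query for "ids of posts that have ALL the given categories".
# Returns [sql, binds]; nothing is executed here.
def posts_in_all_categories_sql(names)
  unique = names.uniq
  raise ArgumentError, "need at least one category name" if unique.empty?

  placeholders = (["?"] * unique.size).join(", ")
  sql = <<~SQL
    SELECT posts.id
    FROM posts
    INNER JOIN categories_posts ON categories_posts.post_id = posts.id
    INNER JOIN categories ON categories.id = categories_posts.category_id
    WHERE categories.name IN (#{placeholders})
    GROUP BY posts.id
    HAVING COUNT(DISTINCT categories.id) = #{unique.size}
  SQL
  [sql, unique]
end

sql, binds = posts_in_all_categories_sql(%w[Security Rails])
puts sql
p binds
```

Under those naming assumptions, the result could be used as a subquery, for example Post.where("posts.id IN (#{sql})", *binds). The same grouping can also be written with ActiveRecord's joins, where, group and having methods instead of a hand-built string.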