Rails Activerecord query selective include - sql

I am having trouble optimizing a large ActiveRecord query. I need to include an associated model in my request, but due to the size of the result set I only want to include a couple of the associated columns. For example I have:
While I am looking for something like:
Post.includes(:user.name, :user.profile_pic).large_set
I need to actually use the name and profile pic attributes so Post.joins(:user) is not an option as far as I understand.

select, combined with a join on the association, is what you are looking for:
Post.joins(:user).select("posts.*, users.name, users.profile_pic").large_set

You'll have to use joins to accomplish what you want, as includes does not have this functionality. Or you could write your own includes method :-)


issues related find_by_sql in rails

I am using Rails 4 and I need to use find_by_sql in my Active Record model. Now I am facing two serious problems. The first is that it does not give me the actual data; instead it gives me #<Employee:0x0000000b2a1718> as the result. My model name is Employee and the table name is employees. I am using pg.
Please tell me if there is any solution.
The second problem is how to use a Ruby variable inside the SQL query passed to find_by_sql. For example, I want to execute find_by_sql("select firstname from employee where id=@var"), where @var is a Ruby variable.
The actual query I need to execute with find_by_sql is select firstname from employee where comapact_string like %@var%.
There are degrees of customization when making a query. The simplest form is where you can use the built-in finders:
Employee.where(id: @var).pluck(:first_name)
That will do a direct match and, if a row is found, give you the first_name column value. No model is produced with pluck.
If you want to do an approximate match with LIKE, you write out the WHERE clause more formally:
Employee.where('id LIKE ?', "%#{@var}%").pluck(:first_name)
It's rare you need to write out an entire query with find_by_sql, but if you do you must be extremely cautious about what data you put in the query. It's strongly recommended to use placeholder values whenever possible, and if you absolutely must bypass this, escape everything no matter the source.
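To make the placeholder advice a bit more concrete, here is a minimal plain-Ruby sketch of building a LIKE condition where the user-supplied value is both bound through a placeholder and has its wildcards escaped. The helper names escape_like and like_condition are made up for this example and are not Rails API, and the column name is the one from the question.

```ruby
# Hypothetical helper (not part of Rails): escape LIKE wildcards so that
# user input is matched literally instead of being treated as a pattern.
def escape_like(value)
  value.to_s.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Build the [sql_fragment, bind_value] pair. The column name is assumed to be
# trusted (hard-coded by the developer); only the value comes from outside.
def like_condition(column, value)
  ["#{column} LIKE ?", "%#{escape_like(value)}%"]
end

sql, pattern = like_condition("comapact_string", "50%_off")
puts sql      # comapact_string LIKE ?
puts pattern  # %50\%\_off%
```

With Active Record the pair would be passed along as Employee.where(sql, pattern), or to find_by_sql in its array form, e.g. Employee.find_by_sql(["select firstname from employees where comapact_string like ?", pattern]). Note that find_by_sql returns model instances, which is why the console shows #<Employee:0x...>; you still read the column from each record (for example with .firstname). Rails also ships a sanitize_sql_like helper that does the same kind of escaping, so check whether your version exposes it before rolling your own.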

Inverse of IN in Rails

I feel foolish, but I cannot find the answer to this.
If I have a User with many attributes, given a list of attributes, I can ask rails something like this:
User.where("attributes.id IN (?)", list_of_attribute_ids)
With the appropriate joins or includes or whatever.
However, I have no idea how to find the inverse set of those users. That is, given 100 users, if the query returns 75 entries, I don't know how to find the other 25!
I thought
User.where("attributes.id NOT IN (?)", list_of_attribute_ids)
might work (similarly, User.where.not), but it doesn't! Instead, it looks for those users where any of their attributes are not one of the list, which is useful, but not what I want.
The only way I know how to do it, is with something like:
User.where.not(id: User.where("attributes.id IN (?)", list_of_attribute_ids).pluck(:id))
Which is sort of like the SQL for select user where id not in (gather a list of ids).
But this is massively non-performant, and generally just can't cope with a database with more than a few (hundred) entries.
How do you do this?
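To make the difference between the two result sets concrete, here is a small in-memory sketch in plain Ruby. The rows and ids are invented for illustration and stand in for the joined users/attributes table; no database is involved.

```ruby
# Hypothetical stand-in for the rows produced by joining users to attributes.
JOINED_ROWS = [
  { user_id: 1, attribute_id: 10 },
  { user_id: 1, attribute_id: 20 },
  { user_id: 2, attribute_id: 20 },
  { user_id: 3, attribute_id: 30 },
].freeze
ALL_USER_IDS = [1, 2, 3, 4].freeze # user 4 has no attributes at all

# What a row-level "NOT IN" does: keep every joined row whose attribute is
# outside the list, so a user with at least one other attribute still shows up.
def users_with_any_attribute_outside(list)
  JOINED_ROWS.reject { |row| list.include?(row[:attribute_id]) }
             .map { |row| row[:user_id] }
             .uniq
end

# The inverse set the question asks for: users with no attribute in the list.
def users_without_any_attribute_in(list)
  matching = JOINED_ROWS.select { |row| list.include?(row[:attribute_id]) }
                        .map { |row| row[:user_id] }
  ALL_USER_IDS - matching
end

p users_with_any_attribute_outside([10]) # [1, 2, 3]
p users_without_any_attribute_in([10])   # [2, 3, 4]
```

User 1 appears in the first result even though it has attribute 10, and user 4 is missing from it because it has no joined rows. The second shape is what an anti-join expresses in SQL (a left outer join that keeps only the rows with no match, or a NOT EXISTS subquery), so the database can compute it without plucking ids into Ruby first.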
I think you could use left outer joins, like @Vishal mentioned in the comments.
See the guides: http://guides.rubyonrails.org/active_record_querying.html#left-outer-joins
rails 4:
joins("LEFT OUTER JOIN <something>")
rails 5:

How to simulate ActiveRecord Model.count.to_sql

I want to display the SQL used in a count. However, Model.count.to_sql will not work because count returns a Fixnum, which doesn't have a to_sql method. I think the simplest solution is to do this:
Model.where(nil).to_sql.sub(/SELECT.*FROM/, "SELECT COUNT(*) FROM")
This creates the same SQL as is used in Model.count, but is it going to cause a problem further down the line? For example, if I add a complicated where clause and some joins.
Is there a better way of doing this?
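On the "further down the line" worry: the regex is greedy, so as soon as the generated SQL contains a second FROM (a subquery in the where clause, for instance) the substitution eats everything up to the last one. A plain-Ruby sketch, with hand-written SQL strings standing in for what to_sql might return and a made-up helper name:

```ruby
# Hypothetical helper wrapping the substitution from the question.
def naive_count_sql(sql)
  sql.sub(/SELECT.*FROM/, "SELECT COUNT(*) FROM")
end

simple = 'SELECT "models".* FROM "models"'
nested = 'SELECT "models".* FROM "models" WHERE "models"."id" IN ' \
         '(SELECT "child_models"."model_id" FROM "child_models")'

puts naive_count_sql(simple)
# SELECT COUNT(*) FROM "models"
puts naive_count_sql(nested)
# SELECT COUNT(*) FROM "child_models")
```

The second result is no longer valid SQL. A non-greedy pattern would survive this particular case but still break when the select list itself contains a subquery, so string surgery on the generated SQL is fragile in general.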
You can try
Model.select("count(*) as model_count").to_sql
You may want to dip into Arel:
I find I often want to find sub counts, so I embed the count(*) into another query:
child_counts = ChildModel.select(Arel.star.count)
which then gives you all the child counts for each of the models:
FROM "child_models"
WHERE "models"."id" = "child_models"."model_id"
) child_count
FROM "models"
ORDER BY "models"."id" ASC
Best of luck
I'm not sure whether you are trying to solve this in a generic way, or what kind of scopes you are using on your Model.
We have a method that automatically runs a count for a query that is passed to the UI layer. I found count(:all) to be more stable than the plain count, though it sounds like that does not overlap with your use case. Maybe you can improve your solution with the except call that we use:
scope.except(:select, :includes, :references, :offset, :limit, :order)
The where clause and the joins it needs work just fine for us. We want to keep both, since they need to be part of the count. You definitely want to remove the includes (which Rails should remove automatically, in my opinion). The references are much trickier, especially when they point at a has_many and require a distinct, and that is where things start to go wrong. If you need references, you may be able to convert them to left_joins.
You may want to double check the parameters that these "join" methods take. Some of them take table names and others take relation names. Later Rails versions have gotten better and take relation names, so be sure you are looking at the docs for the right version of Rails.
Also, in our case we spend more time on sub selects with more complicated relationships, where we have to do some munging. It looks like we are not dealing with where clauses as much.

sql count filtering - rails way

Suppose I have Posts and their Comments. I want to find all the Posts that have more than 10 comments. I began writing something like Posts.includes(:comments).group("post.id").count("comments.id") to obtain a hash of posts and their counts, from which I can extract the information, but I want a straightforward one-line way to do that.
Sure, I could use raw SQL statements, but I want to do it the Rails way. Any ideas?
Assuming the models are named in the more typical singular form of Post and Comment and have the usual association relationship, then the following should work:
Post.joins(:comments).group('posts.id').having('count(comments.id) > 10')

Any way to merge two queries in solr?

In my project, we use Solr to index many different kinds of documents, for example Books and Persons, with some common fields (like the name) and some type-specific fields (like the category, or the group people belong to).
We would like to run queries that can find both books and persons, with some filters applied per document type. Something like:
find all Books and Persons with "Jean" in the name and/or content
but only Books from category "fiction" and "fantasy"
and only Persons from the group "pangolin"
everything sorted by score
A very simple way to do that would be:
q = name:jean content:jean
(type:book AND category:(fiction fantasy))
(type:person AND group:pangolin)
But alas, since fq are cached, I'd prefer something that lets me use simpler, and therefore more reusable, fq like:
fq=category:(fiction fantasy),
Is there a way to tell Solr to merge or combine several queries? Something like 'grouping' fq together.
I read a bit about nested queries with _query_, but the scarce documentation about it makes me think it's not the solution I'm looking for.
As Geert-Jan mentioned in his answer, the ability to OR several fq together is a requested Solr feature, but with very little support so far: https://issues.apache.org/jira/browse/SOLR-1223
So I managed to simulate what I want in a simple way:
for each field a document type can have, we always have to define a value (so if, in my own example, Books can have no category, at index time we still have to define something like category=noCategoryCode)
when using a filter on one of these fields in a query over multiple types, we add a "field not present" condition to the filter, so fq=category:fiction becomes fq=category:fiction (*:* AND -category:*)
This way, all other types (like Person) pass through the filter, and the filter stays quite atomic and frequently used, so caching is still useful.
So, my full example becomes:
q = name:jean content:jean
fq= type:(book person)
fq= category:(fiction fantasy) (*:* AND -category:*)
fq= group:(pangolin) (*:* AND -group:*)
Still, can't wait for SOLR-1223 to be patched :)
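For what it's worth, the filters above are easy to generate on the client side. A minimal plain-Ruby sketch, where the helper names are made up for this example and only the standard uri library is used; how the resulting string is sent to Solr is left out:

```ruby
require "uri"

# Hypothetical helper: restrict `field` to `values`, but let documents that
# do not have the field at all pass through the filter.
def optional_field_filter(field, values)
  "#{field}:(#{values.join(' ')}) (*:* AND -#{field}:*)"
end

# Hypothetical helper: one q parameter plus one fq parameter per filter,
# so each filter stays a separate (and separately cacheable) fq.
def solr_query_string(q, filters)
  URI.encode_www_form([["q", q]] + filters.map { |fq| ["fq", fq] })
end

filters = [
  "type:(book person)",
  optional_field_filter("category", %w[fiction fantasy]),
  optional_field_filter("group", %w[pangolin]),
]

puts filters
puts solr_query_string("name:jean content:jean", filters)
```

Keeping each fq as its own parameter, with the same text every time, is what lets the filter cache be reused across queries.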
You can apply multiple filter queries at the same time
q=name:jean content:jean&fq=type:book&fq=type:person&fq=category:(fiction fantasy)&fq=group:pangolin
Perhaps I am not understanding your issue, but the only difference between a query and a filter is that the filter is cached. If you don't care about the caching, just modify their query:
real query +((type:book category:fiction) (type:person group:pangolin))