Problem with active-record and sql

I have a little problem: I can't compose an SQL query inside AR.
So, I have Project and Task models, Project has_many Tasks. Task has an aasm field (i.e. "status"; but it doesn't matter, it could be a simple int or string field).
On my projects index page I want to list all (latest) projects, and for every project I want to count its active, pending and resolved (for example) tasks.
Like this:
First project (1 active, 2 pending, 10 resolved)
Second project (4 active, 2 pending, 2 resolved)
So, sure, I can do it with @projects = Project.all and then in the view:
- @projects.each do |project|
  = project.title
  = project.tasks.count(:conditions => {:status => "active"}) # sure, this belongs in the model; just an example
  = project.tasks.count(:conditions => {:status => "pending"})
  -# ...
This works, but makes 1 + N*3 queries (for 3 task statuses); I want 1. The question is simple: how?

You could do a find with grouping and counting. Something like:
status_counts = project.tasks.find(:all,
  :group  => 'status',
  :select => 'status, count(*) as how_many')
This will return you a list of Task-like objects with status and how_many attributes which you can then use to give your summary. E.g.
<%= status_counts.map { |sc| "#{sc.how_many} #{sc.status}" }.to_sentence %>
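On Rails 3 you could also get the same numbers as a plain hash via the newer query interface; a rough (untested) equivalent:
# Returns something like {"active" => 1, "pending" => 2, "resolved" => 10}
status_counts = project.tasks.group(:status).count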

Maybe you could, in your Project controller:
Fetch all your projects: Project.all
Fetch all your tasks: Task.all
Then, create a hash with something like
@statuses = Hash.new { |hash, key| hash[key] = Hash.new(0) }
@tasks.each do |t|
  @statuses[t.project_id][t.status] += 1
end
And then use it in your view:
First project (<%= @statuses[@project.id]["active"] %> active)
This is not the perfect solution, but it is easy to implement and only uses two (big) queries. Of course, it re-creates the hash on every request, so you might want to look into database indexes or caching.
Also, named scopes would be interesting, like Task.active.

I'd suggest using a counter cache in your project model to prevent needing to recount all tasks on each display of the index page - have an active_count, pending_count and resolved_count, and update them whenever the task changes state.
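A minimal sketch of that counter-cache approach (untested; the active_count/pending_count/resolved_count column names and the recount-on-save strategy are just assumptions, not something from the question):
class Task < ActiveRecord::Base
  belongs_to :project

  after_save :refresh_project_counts, :if => :status_changed?

  private

  # Recount rather than increment/decrement so the cached values stay self-healing.
  def refresh_project_counts
    project.update_attributes(
      :active_count   => project.tasks.count(:conditions => {:status => "active"}),
      :pending_count  => project.tasks.count(:conditions => {:status => "pending"}),
      :resolved_count => project.tasks.count(:conditions => {:status => "resolved"})
    )
  end
end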
If you just want to modify your existing code, try:
project.tasks.count(:conditions => "status = 'active'")
You could also add a scope to your task model that would enable you to do something like:
project.tasks.active.count
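The scope itself is a one-liner per status (Rails 3 syntax shown; use named_scope with :conditions in Rails 2):
class Task < ActiveRecord::Base
  scope :active,   where(:status => "active")
  scope :pending,  where(:status => "pending")
  scope :resolved, where(:status => "resolved")
end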
EDIT
Ok so I'm half asleep - got the wrong impression from your question :/
Yep, you can do it in one query - use find_by_sql to get your projects along with the grouped counts for the tasks. You'll be able to access the group counts in the resulting array of projects.

So, the right answer is:
Project.all(:joins => :tasks,
  :select => 'projects.*,
    sum(tasks.status = "pending")  as pending_count,
    sum(tasks.status = "accepted") as accepted_count,
    sum(tasks.status = "rejected") as rejected_count',
  :group => 'projects.id')
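The aggregate columns come back as extra (string) attributes on each project, so the view from the question becomes something like this (note the sum(...) trick relies on MySQL treating the boolean comparison as 0/1):
- @projects.each do |project|
  = project.title
  = "(#{project.pending_count} pending, #{project.accepted_count} accepted, #{project.rejected_count} rejected)"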

Related

Rails: complex search on 3 models, return only newest - how to do this?

I'm trying to add an advanced search option to my app in which the user can search for certain links based on attributes from 3 different models.
My app is set up so that a User has_many :websites, Website has_many :links, and Link has_many :stats
I know how to create SQL queries with joins or includes etc. in Rails, but I'm getting stuck since I only want to retrieve the latest stat for each link and not all of them, and I don't know the most efficient way to do this.
So for example, let's say a user has 2 websites, each with 10 links, and each link has 100 stats, that's 2,022 objects total, but I only want to search through 42 objects (only 1 stat per link).
Once I get only those 42 objects in a database query I can add .where("attribute like ?", user_input) and return the correct links.
Update
I've tried adding the following to my Link model:
has_many :stats, dependent: :destroy
has_many :one_stat, class_name: "Stat", order: "id ASC", limit: 1
But this doesn't seem to work, for example if I do:
@links = Link.includes(:one_stat).all
@links.each do |l|
  puts l.one_stat.size
end
Instead of getting 1, 1, 1... I get the number of all the stats: 125, 40, 76....
Can I use the limit option to get the results I want or does it not work that way?
2nd Update
I've updated my code according to Erez's advice, but still not working properly:
has_one :latest_stat, class_name: "Stat", order: "id ASC"
@links = Link.includes(:latest_stat)
@links.each do |l|
  puts l.latest_stat.indexed
end
=> true
=> true
=> true
=> false
=> true
=> true
=> true
Link.includes(:latest_stat).where("stats.indexed = ?", false).count
=> 6
Link.includes(:latest_stat).where("stats.indexed = ?", true).count
=> 7
It should return 1 and 6, but it's still checking all the stats rather than the latest only.
Sometimes, you gotta break through the AR abstraction and get your SQL on. Just a tiny bit.
Let's assume you have really simple relationships: Website has_many :links, Link belongs_to :website and has_many :stats, and Stat belongs_to :link. No denormalization anywhere. Now, you want to build a query that finds all websites, all of their links, and, for each link, the latest stat, but only for stats with some property (or it could be websites with some property, or links with some property).
Untested, but something like:
Website
  .includes(:links => :stats)
  .where("stats.indexed" => true)
  .where("stats.id = (select max(stats2.id)
          from stats stats2 where stats2.link_id = links.id)")
That last bit subselects stats that are part of each link and finds the max id. It then filters out stats (from the join at the top) that don't match that max id. The query returns websites, which each have some number of links, and each link has just one stat in its stats collection.
Some extra info
I originally wrote this answer in terms of window functions, which turned out to be overkill, but I think I should cover it here anyway, since, well, fun. You'll note that the aggregate function trick we used above only works because we're determining which stat to use based on its ID, which is exactly the property we need to filter the stats from the join by. But let's say you wanted only the first stat as ranked by some criterion other than ID, such as, say, number_of_clicks; that trick won't work anymore because the aggregation loses track of the IDs. That's where window functions come in.
Again, totally untested:
Website
  .includes(:links => :stats)
  .where("stats.indexed" => true)
  .where(
    "(stats.id, 1) in (
       select id, row_number()
         over (partition by stats2.link_id order by stats2.number_of_clicks DESC)
       from stats stats2 where stats2.link_id = links.id
     )"
  )
That last where subselects the stats that match each link, ranks them by number_of_clicks descending, and the in part matches the top-ranked one to a stat from the join. Note that window functions aren't available on every database platform. You could also use this technique to solve the original problem you posed (just order by stats2.id instead of stats2.number_of_clicks); it could conceivably perform better, and is advocated by this blog post.
I'd try this:
has_one :latest_stat, class_name: "Stat", order: "id ASC"
@links = Link.includes(:latest_stat)
@links.each do |l|
  puts l.latest_stat
end
Note you can't print latest_stat.size since it is the stat object itself and not a relation.
Is this what you're looking for?
@user.websites.map { |site| site.links.map { |link| link.stats.last } }.flatten
For a given user, this will return an array containing the last stat for each link on that user's websites.
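If you go this route, it's worth eager loading the associations so the inner map doesn't fire a query per link; a rough (untested) variant:
# Load links and stats up front, then pick the last stat per link in Ruby.
@user.websites.includes(:links => :stats).map do |site|
  site.links.map { |link| link.stats.last }
end.flatten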

Rails Order Find By Boolean

I have a simple find in rails 3 that gathers users accounts.
Account.where(:user_id => @user)
The Account model has a 'default' boolean field. As a user adds many accounts I would like the default account to always be first in the loop. Order doesn't seem to work with a boolean field.
Account.where(:user_id => @user, :order => "default DESC")
Is there a way to order the query to handle this or should I just split the queries and find the default account in a separate find?
Try Account.where(:user_id => @user).order("default DESC") - putting :order in your where() clause isn't going to sort the result set.
A cleaner solution might be to add a scope, though.
scope :default_first, order(arel_table[:default].desc)
Then you could just call (assuming your relations are set up properly):
@user.accounts.default_first.all

An efficient way to track user login dates and IPs history

I'm trying to track user login history for stat purposes, but it's not clear to me what the best way to go about it would be. I could have a separate table that records users and their login stats with a date, but that table could get REALLY big. Or I could track some history on the User model/object itself in a parseable field and just update it: some delimited string format, e.g. split on ':', take the last item, add a new item (date + count) if the included date code isn't today, otherwise increment, then save it back. At least with this second approach it would be easy to remove old items (e.g. only keep 30 days of daily logins, or IPs), whereas a separate table would require a task to delete old records.
I'm a big fan of instant changes. Tasks are useful, but can complicate things for maintenance reasons.
Anyone have any suggestions? I don't have an external data caching solution up or anything yet. Any pointers are also welcome! (I've been hunting for similar questions and answers)
Thanks!
If you're using Devise's :trackable module, I found this the easiest way. In the User model (or whichever model you're authenticating with):
def update_tracked_fields!(request)
  old_signin = self.last_sign_in_at
  super
  if self.last_sign_in_at != old_signin
    Audit.create :user => self, :action => "login", :ip => self.last_sign_in_ip
  end
end
(Inspired by https://github.com/plataformatec/devise/wiki/How-To:-Turn-off-trackable-for-admin-users)
There is a nice way to do that through Devise.
Warden sets up a hook called after_set_user that runs after setting a user. So, supposing you have a Login model with an ip field, a logged_in_at field and a user_id field, you can simply create the record using those fields.
Warden::Manager.after_set_user :except => :fetch do |record, warden, options|
  Login.create!(:ip => warden.request.ip, :logged_in_at => Time.now, :user_id => record.id)
end
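For completeness, a minimal sketch of the migration behind that assumed Login model (the table and column names just match the fields used above):
class CreateLogins < ActiveRecord::Migration
  def change
    create_table :logins do |t|
      t.integer  :user_id
      t.string   :ip
      t.datetime :logged_in_at
    end
    add_index :logins, :user_id
  end
end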
Building upon @user208769's answer, the core Devise::Models::Trackable#update_tracked_fields! method now calls a helper method named update_tracked_fields prior to saving. That means you can use ActiveModel::Dirty helpers to make it a little simpler:
def update_tracked_fields(request)
  super
  if last_sign_in_at_changed?
    Audit.create(user: self, action: 'login', ip: last_sign_in_ip)
  end
end
This can be simplified even further (and be more reliable given validations) if audits is a relationship on your model:
def update_tracked_fields(request)
  super
  audits.build(action: 'login', ip: last_sign_in_ip) if last_sign_in_at_changed?
end
Devise supports tracking the last signed-in date and the last signed-in IP address with its :trackable module. Add this module to your user model, and then add the corresponding fields to your database:
:sign_in_count, :type => Integer, :default => 0
:current_sign_in_at, :type => Time
:last_sign_in_at, :type => Time
:current_sign_in_ip, :type => String
:last_sign_in_ip, :type => String
You could then override Devise::SessionsController and its create action to save the :last_sign_in_at and :last_sign_in_ip to a separate table in a before_create callback. You would then be able to keep them for as long as you like.
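A rough (untested) sketch of that kind of override, assuming a User scope (so current_user exists) and a SigninEvent model; both the model and its columns are assumptions, not part of Devise:
class SessionsController < Devise::SessionsController
  def create
    super
    # After super, Devise's trackable columns on the user are up to date.
    if current_user
      SigninEvent.create(
        :user_id      => current_user.id,
        :signed_in_at => current_user.last_sign_in_at,
        :ip           => current_user.last_sign_in_ip
      )
    end
  end
end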
Here's an example (scribd_analytics)
create_table 'page_views' do |t|
  t.column 'user_id',     :integer
  t.column 'request_url', :string, :limit => 200
  t.column 'session',     :string, :limit => 32
  t.column 'ip_address',  :string, :limit => 16
  t.column 'referer',     :string, :limit => 200
  t.column 'user_agent',  :string, :limit => 200
  t.column 'created_at',  :timestamp
end
Add a whole bunch of indexes, depending on your queries.
Create a PageView on every request.
We used a hand-built SQL query to take out the ActiveRecord overhead on this.
You might try MySQL's 'insert delayed'.
Analytics queries are usually hand-coded SQL.
Use 'explain select' to make sure MySQL is using the indexes you expect.
Scales pretty well.
BUT analytics queries are expensive and can clog up the main DB server.
Our solution: use two DB servers in a master/slave setup and move all the analytics queries to the slave.
http://www.scribd.com/doc/49575/Scaling-Rails-Presentation-From-Scribd-Launch
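A rough sketch of the "create a PageView on every request" step as a controller callback; plain ActiveRecord is shown for clarity (the hand-built INSERT / 'insert delayed' variants from the presentation are optimizations of this), and current_user is assumed to come from your auth layer:
class ApplicationController < ActionController::Base
  after_filter :log_page_view

  private

  def log_page_view
    PageView.create(
      :user_id     => current_user.try(:id),
      :request_url => request.url,
      :session     => request.session_options[:id].to_s,
      :ip_address  => request.remote_ip,
      :referer     => request.referer,
      :user_agent  => request.user_agent
    )
  end
end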
Another option to check is Gattica with Google Analytics
I hate answering my own questions, especially given that you both gave helpful answers. I think answering my question with the approach I took might help others, in combination with your answers.
I've been playing with the Impressionist gem (the only useful page-view gem since the abandoned RailStat) with good results so far. After setting up the basic migration, I found that the expected usage follows Rails' MVC design very closely. If you add "impressionist" to a Controller, it will go looking for the Model when logging the page view to the database. You can modify this behaviour, or just call impressionist yourself in your Controller (or anywhere, really) if you're like me and happen to be testing it out on a Controller that doesn't have a Model.
Anyway, I got it working with Devise to track successful logins by overriding Devise::SessionsController and just calling the impressionist method with @current_member (don't forget to check whether it's nil on a failed login!):
class TestSessionController < Devise::SessionsController
  def create
    unless @current_member.nil?
      impressionist(@current_member)
    end
    super
  end
end
Adding it to other site parts later for some limited analytics is easy to do. The only other thing I had to do was update my routes to use the new TestSessionController for the Devise login route:
post 'login' => 'test_session#create', :as => :member_session
Devise works like normal without having to modify Devise in any way, and my impressionist DB table is indexed and logging logins. I'll just need a rake task later to trim it weekly or so.
Now I just need to work out how to chart daily logins without having to write a bunch of looping, dirty queries...
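For what it's worth, both of those follow-ups can be done without loops; a rough sketch, assuming the Impressionist gem's default Impression model and impressions table:
# Daily counts keyed by date, e.g. {"2013-01-01" => 12, "2013-01-02" => 9, ...}
Impression.group("date(created_at)").count

# Rake task to trim impressions older than 30 days
task :trim_impressions => :environment do
  Impression.where("created_at < ?", 30.days.ago).delete_all
end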
There is also the 'paper_trail' gem, which lets you track model changes.

Rails 3: ActiveRecord query with :includes only returns results that have related values in included table?

I've got a model (a Feature) that can have many Assets. These Assets each have an issue_date. I'm struggling with what seems like a simple ActiveRecord query to find all Features and their Assets with an issue_date of tomorrow, regardless of whether there are Assets or not, preferably with one query.
Here's my query right now.
Feature.includes(:assets).where(:assets => { :issue_date => Date.tomorrow })
Unfortunately, this returns only the Features that have Assets with an issue_date of tomorrow. Even stranger, the generated SQL looks like this (tomorrow's obviously the 19th).
SELECT `features`.* FROM `features` WHERE `assets`.`issue_date` = '2011-08-19'
Shouldn't this have a LEFT JOIN in there somewhere? That's the sort of thing I'm going for. Using joins instead of includes does an INNER JOIN, but that's not what I want. Strangely enough, it seems like I'm getting INNER JOIN-type behavior. When I run that includes query above, the actual SQL that's spit out looks something like this...
SELECT `features`.`id` AS t0_r0, `features`.`property_id` AS t0_r1,
// every other column from features truncated for sanity
`assets`.`feature_id` AS t1_r1, `assets`.`asset_type` AS t1_r2,
// all other asset columns truncated for sanity
FROM `features`
LEFT OUTER JOIN `assets` ON `assets`.`feature_id` = `features`.`id`
WHERE `assets`.`issue_date` = '2011-08-19'
Which looks like it should work right but it doesn't. I get only the Features that have Assets with an issue_date of tomorrow. Any idea what I'm doing wrong?
I've tried the older, Rails v2 way of doing it…
Feature.find(:all,
  :include => :assets,
  :conditions => ['assets.issue_date = ?', Date.tomorrow])
Which gives me the same results. There's one Feature I know that doesn't have any Assets for tomorrow, and it's not in that list.
I've also poked around and found similar questions, but I couldn't seem to find one that explained this opposite behavior I'm seeing.
Edit: I'm so close. This gets me all the Feature objects.
Feature.joins("LEFT OUTER JOIN assets on assets.feature_id = feature.id AND asset.issue_date = #{Date.tomorrow}")
It does not, however, get me the matching Assets bundled into the object. With feature as a returned item in the query, feature.assets makes another call to the database, which I don't want. I want feature.assets to return only those I've specified in that LEFT OUTER JOIN call. What else do I need to do to my query?
I thought this would get me what I needed, but it doesn't. Calling feature.assets (with feature as an item returned in my query) does another query to look for all assets related to that feature.
Feature.joins("LEFT OUTER JOIN assets on assets.feature_id = feature.id AND asset.issue_date = #{Date.tomorrow}")
So here's what does work. Seems a little cleaner, too. My Feature model already has a has_many :assets set on it. I've set up another association with has_many :tomorrows_assets that points to Assets, but with a condition on it. Then, when I ask for Feature.all or Feature.name_of_scope, I can specify .includes(:tomorrows_assets). Winner winner, chicken dinner.
has_many :tomorrows_assets,
  :class_name => "Asset",
  :readonly   => true,
  :conditions => "issue_date = '#{Date.tomorrow.to_s}'"
I can successfully query Features and get just what I need included with it, only if it matches the specified criteria (and I've set :readonly because I know I'll never want to edit Assets like this). Here's an IRB session that shows the magic.
features = Feature.includes(:tomorrows_assets)
feature1 = features.find_all{ |f| f.name == 'This Feature Has Assets' }.first
feature1.tomorrows_assets
=> [#<Asset id:1>, #<Asset id:2>]
feature2 = features.find_all{ |f| f.name == 'This One Does Not' }.first
feature2.tomorrows_assets
=> []
And all in only two SQL queries.
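One caveat worth noting: with a string :conditions like this, the "#{Date.tomorrow}" interpolation happens when the class is loaded, so the date can go stale in a long-running process. In Rails 4+ the association takes a lambda instead, which is re-evaluated on each use; a sketch of that form:
has_many :tomorrows_assets,
         -> { where(:issue_date => Date.tomorrow).readonly },
         :class_name => "Asset"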
I had a very similar problem and managed to solve it using the following query;
Feature.includes(:assets).where('assets.id IS NULL OR assets.issue_date = ?', Date.tomorrow)
This will load all features, regardless of whether they have any assets. Calling feature.assets will return an array of assets, if available, without running another query.
Hope that helps someone!
You have to specify the SQL for outer joins yourself, the joins method only uses inner joins.
Feature.joins("LEFT OUTER JOIN assets ON assets.feature_id = features.id").
where(:assets => {:issue_date => Date.tomorrow})
Have you tried:
Feature.joins(:assets).where(:assets => { :issue_date => Date.tomorrow })
The guide here suggests the includes method is used to reduce the number of queries on a secondary table, rather than to join the two tables in the way you're attempting.
http://guides.rubyonrails.org/active_record_querying.html

Find all Products by Tag in Rails with ActiveRecord

This is probably something very simple but I'm looking for the optimal way of retrieving all Products by Tag, so to speak. This is with Spree, so I should stick to the way they have modeled their data. It's actually Product and Taxon (like category, brand, etc.)
So if Product has_and_belongs_to_many :taxons and Taxon has_and_belongs_to_many :products, what's the best way to find all products by a Taxon?
Something like:
@taxon = Taxon.find_by_permalink('categories/')
@products = Product.find_by_taxon(@taxon)
... but I'm not sure what goes into that last method (just made up the name).
If there's only one Taxon, you can probably just say:
@products = @taxon.products
If there are multiple, we need a slightly different approach. But even then you could just:
@products = @taxons.inject([]) { |products, taxon| products + taxon.products }
@taxon = Taxon.find_by_permalink('categories', :include => :products)
This will eager-load the products so you can access them through
@taxon.products
without it hitting the database again. This is the more efficient form of just using .products that avoids N+1 query problems.
Won't Taxon.find_by_permalink('categories/').products suffice?
EDIT: Oh and for multiple taxons you could try something like this:
Product.find(:all, :include => :products_taxons, :conditions => { :products_taxons => {:taxon_id => [1,2,3]} }) # will find products with taxons with id 1, 2 or 3
I was able to get this working in Spree 2.1.0.beta with the following customizations:
Based on the answer here: Finding records with two specific records in another table
I added a new product scope in /app/models/spree/product_decorator.rb
Spree::Product.class_eval do
  add_search_scope :in_all_taxons do |*taxons|
    taxons = get_taxons(taxons)
    id = arel_table[:id]
    joins(:taxons).where(spree_taxons: { id: taxons }).group(id).having(id.count.eq(taxons.size))
  end
end
Then used the new scope by adding it to /app/models/spree/base_decorator.rb
Spree::Core::Search::Base.class_eval do
  def get_base_scope
    base_scope = Spree::Product.active
    base_scope = base_scope.in_all_taxons(taxon) unless taxon.blank?
    base_scope = get_products_conditions_for(base_scope, keywords)
    base_scope = add_search_scopes(base_scope)
    base_scope
  end
end
Now I can use the standard search helper to retrieve products (which means I can still supply keywords, etc along with the multiple taxons):
# taxon_ids is an array of taxon ids
@searcher = build_searcher(params.merge(:taxon => taxon_ids))
@products = @searcher.retrieve_products
This works for me and felt pretty painless. However, I'm open to better options.
If you want to find a product by its tags you can use tagged_with
Example
Spree::Product.tagged_with("example")
Will return the products with the tag "example"
Source: https://github.com/mbleigh/acts-as-taggable-on