Paginating joined results with calculated columns - ruby-on-rails-3

We are calculating statistics for our client. Statistics are calculated for each SpecialtyLevel, and each statistic can have a number of error flags (not to be confused with validation errors). Here are the relationships (all the classes below are nested inside multiple modules, which I have omitted here for simplicity):
class SpecialtyLevel < ActiveRecord::Base
  has_many :stats,
    :class_name => "Specialties::Aggregate::Stat",
    :foreign_key => "specialty_level_id"
  .......
end
class Stat < Surveys::Stat
  belongs_to :specialty_level
  has_many :stat_flags,
    :class_name => "Surveys::PhysicianCompensation::Stats::Specialties::Aggregate::StatFlag",
    :foreign_key => "stat_id"
  ......
end
class StatFlag < Surveys::Stats::StatFlag
  belongs_to :stat, :class_name => "Surveys::PhysicianCompensation::Stats::Specialties::Aggregate::Stat"
  ......
end
In the view, we display one row for each SpecialtyLevel, with one column for each Stat and another column indicating whether or not there are any error flags for that SpecialtyLevel. The client wants to be able to sort the table by the number of error flags. To achieve this, I've created a scope in the SpecialtyLevel class:
scope :with_flag_counts,
  select("#{self.columns_with_table_name.join(', ')}, count(stat_flags.id) as stat_flags_count").
  joins("INNER JOIN #{Specialties::Aggregate::Stat.table_name} stats on stats.specialty_level_id = #{self.table_name}.id
    LEFT OUTER JOIN #{Specialties::Aggregate::StatFlag.table_name} stat_flags on stat_flags.stat_id = stats.id"
  ).
  group(self.columns_with_table_name.join(', '))
Now each row returned from the database will have a stat_flags_count field that I can sort by. This works fine, but I run into a problem when I try to paginate using this code:
def always_show_results_count_will_paginate objects, options = {}
  if objects.total_entries <= objects.per_page
    content_tag(:div, content_tag(:span, "Showing 0-#{objects.total_entries} of #{objects.total_entries}", :class => 'info-text'))
  else
    # Pass the caller's options through; writing "options = {}" here would discard them.
    sc_will_paginate objects, options
  end
end
For some reason, objects.total_entries returns 1. It seems that something in my scope causes Rails to do some really funky stuff with the result set that it gives me.
The question is, is there another method I can use to return the correct value? Or is there a way that I can adjust my scope to prevent this meddling from occurring?

The group statement makes me suspicious. You may want to fire up a debugger and step through the code and see what's actually getting returned.
Is there a special reason you're using a scope and not just an attribute on the SpecialtyLevel model? Couldn't you just add a def on SpecialtyLevel that would function as a "virtual attribute" that just returns the length of the list of StatFlags?

The answer here is to calculate total_entries separately and pass that into the paginate method, for example:
count = SpecialtyLevel.for_participant(@participant).count
@models = SpecialtyLevel.
  with_flag_counts.
  for_participant(@participant).
  paginate(:per_page => 10, :page => page, :total_entries => count)
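To see why the total matters so much, here is a plain-Ruby sketch of the arithmetic a paginator performs once it has total_entries. The page_window helper is hypothetical and is not will_paginate's implementation; it only illustrates how a wrong total of 1 collapses everything into a single page.

```ruby
# Hypothetical helper (not part of will_paginate): the arithmetic a
# paginator can derive from total_entries, per_page and the current page.
def page_window(total_entries, per_page, page)
  total_pages = (total_entries.to_f / per_page).ceil
  offset      = (page - 1) * per_page
  first       = total_entries.zero? ? 0 : offset + 1
  last        = [offset + per_page, total_entries].min
  { :total_pages => total_pages, :offset => offset, :first => first, :last => last }
end

# A bogus total of 1 yields a single page, so no pagination links are needed.
p page_window(1, 10, 1)

# With a real count (37 is a made-up figure) the last page is page 4.
p page_window(37, 10, 4)
```

Passing the separately computed count as :total_entries means this arithmetic never depends on a count taken over the grouped query.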

Related

Add Arbitrary Attribute to SQL Query from Joins Record without WHERE clause (Active record)

I'm trying to create an attribute in my select statement that depends on whether or not an association exists. I'm not sure if it's possible with a single query, and the goal is to not have to iterate a list afterward.
Here is the structure.
class Project < ApplicationRecord
has_many :subscriptions
has_many :users, through: :subscriptions
end
class User < ApplicationRecord
has_many :subscriptions
has_many :projects, through: :subscriptions
end
class Subscription < ApplicationRecord
belongs_to :project
belongs_to :user
end
Knowing a project, the goal of the query is to return ALL users and include on each of them a new attribute called subscribed, denoting whether or not they are subscribed.
non-working code (pseudo code):
project = Project.find_by(name: 'has_subscribers')
query = 'users.*, (subscriptions.project_id = ?) AS subscribed'
users = User.includes(:subscriptions).select(query, project.id)
users.first.subscribed
# => true or false
I'm open to a better way of going about this. The constraints are:
You know the project record.
You query a list of ALL users.
Each user record has a subscribed attribute, denoting whether it's subscribed to the given project.
Solution:
I was able to figure out a straightforward solution using the bool_or aggregate function. COALESCE ensures that the value returned is false instead of nil when no subscriptions exist.
query = "users.*, COALESCE(bool_or(subscriptions.project_id = '#{project_id}'::uuid), false) as subscribed"
User.left_outer_joins(:subscriptions)
.select(query)
.group('users.id')
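As a rough illustration of what that query computes per user, here is a plain-Ruby sketch over in-memory rows. SubscriptionRow, subscribed_flags and the sample data are made up for the example; nothing here touches the database.

```ruby
# Hypothetical in-memory stand-in for rows of the subscriptions table.
SubscriptionRow = Struct.new(:user_id, :project_id)

# Returns { user_id => true/false }, mirroring the subscribed column.
def subscribed_flags(user_ids, subscriptions, project_id)
  by_user = subscriptions.group_by(&:user_id)   # plays the role of GROUP BY users.id
  user_ids.each_with_object({}) do |user_id, flags|
    # LEFT OUTER JOIN keeps users that have no subscription rows at all.
    rows = by_user.fetch(user_id, [])
    # bool_or is true if any row matches. With no matching rows SQL gives
    # NULL, which COALESCE turns into false; any? on [] is already false.
    flags[user_id] = rows.any? { |row| row.project_id == project_id }
  end
end

subscriptions = [
  SubscriptionRow.new(1, "a"),
  SubscriptionRow.new(1, "b"),
  SubscriptionRow.new(2, "b")
]

subscribed_flags([1, 2, 3], subscriptions, "a").each do |user_id, subscribed|
  puts "user #{user_id}: #{subscribed}"
end
```

User 1 is subscribed to project "a", user 2 is subscribed only to another project, and user 3 has no subscriptions, so only user 1 comes back as true.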
Yep, you can do this:
User.joins(:projects).select(Arel.star, Subscription.arel_table[:project_id])
Which will result in a SQL query like this:
SELECT *, "subscriptions"."project_id" FROM "users" INNER JOIN "subscriptions" ON "subscriptions"."user_id" = "users"."id";
If you want to specify a specific project (i.e. use an expression), you can do it with Arel like this:
User.joins(:projects).select(Arel.star, Subscription.arel_table[:project_id].eq(42))
Unfortunately, you won't have a column name alias, and you can't call as on an Arel::Nodes::Equality instance. I don't know enough about the internals of Arel to have a way out of that box. But you can do this if you want the composability of Arel (e.g. if this is going to be something that needs to work with multiple models or columns):
User.joins(:projects).select(Arel.star, Subscription.arel_table[:project_id].eq(42).to_sql + " as has_project")
This is a bit clunky, but it works and provides a user.has_project method that returns a boolean. You can pretty it up like so:
class User
  # Braces rather than do...end: a do block here would bind to scope, not to lambda.
  scope :with_project_status, lambda { |project_id|
    has_project =
      Subscription.arel_table[:project_id].
        eq(project_id).to_sql + " as has_project"
    joins(:projects).select(Arel.star, has_project)
  }
end
User.with_project_status(42).where(active: true)

Using includes with AREL functions

I have the following classes in my application:
class Prompt
has_many :entries
end
class Entry
belongs_to :prompt
belongs_to :user
def self.approved
where("is_approved")
end
end
class User
has_many :entries
end
And I want to display a table of all "approved" entries for a given prompt and the users that they belong_to. To generate this list I do the following query:
prompt = Prompt.find(prompt_id, :include => {:entries => :user})
But when I run the following loop, it makes a query for each user rather than using the prefetched users
prompt.entries.approved.each do |entry|
puts entry.user.id
end
How do I rewrite this so that it doesn't do a query for each iteration of the loop?
It does a query for each user because prompt.entries.approved builds and runs a new query (where('is_approved')) instead of using the entries that your find call preloaded, so the users loaded through :include are never used. What you need is a single query that selects the approved entries and eager-loads their users, followed by a loop that prints the ids.
Try something like:
@entries = Entry.where(prompt_id: prompt_id, is_approved: true).includes(:user)
Then:
@entries.each do |entry|
  puts entry.user.id
end
Hope that helps.

looping through a rails data object from the controller

I have called a function from a class to find all the items related to a particular ID in a many to many HABTM relationship.
Procedures -> Tasks with a join table: procedures_tasks
I call the information like @example = Procedure.get_tasks(1,1)
I would like to be able to iterate through the data returned so that I can create an instance of each task_id related to the procedure in question.
def self.get_tasks(model_id, operating_system_id)
  find(:first, :select => 'tasks.id, procedures.id', :conditions => ["model_id = ? AND operating_system_id = ?", model_id, operating_system_id], :include => [:tasks])
end
I tried rendering the data as I normally would and then using .each do |f| in the view layer, but I get:
undefined method `each' for #<Procedure:0x2b879be1db30>
Original Question:
I am creating a rails application to track processes we perform. When a new instance of a process is created I want to automatically create rows for all the tasks that will need to be performed.
tables:
decommissions
models
operating_systems
procedures
tasks
procedures_tasks
host_tasks
procedures -> tasks is many to many through the procedures_tasks join table.
When you start a new decommissioning process you specify a model and OS. The model and OS determine which procedure you follow, and each procedure has a list of tasks available in the join table. I want to create an entry in host_tasks for each task relevant to the procedure for the decommission being created.
I've done my head in over this for days, any suggestions?
class Procedure < ActiveRecord::Base
  has_and_belongs_to_many :tasks
  #has_many :tasks, :through => :procedures_tasks
  # has_many :procedures_tasks
  belongs_to :model
  belongs_to :operating_system
  validates_presence_of :name
  validates_presence_of :operating_system_id
  validates_presence_of :model_id
  def self.get_tasks(model_id, operating_system_id)
    find(:first, :select => 'tasks.id, procedures.id', :conditions => ["model_id = ? AND operating_system_id = ?", model_id, operating_system_id], :include => [:tasks])
  end
end
The get_tasks method will retrieve the tasks associated with the procedure, but I don't know how to manipulate the data pulled from the database in Rails. I haven't been able to access the attributes of the returned object through the controller because they haven't been rendered yet?
Ideally I would like to format this data so that I have just an array of the task IDs, which I can then loop through to create new rows in the appropriate table.
It wasn't looping through because I was using the :first option when finding the data. I changed it to :all which allowed me to .each do |f| etc.
Not the best option, but there will only ever be one option anyway, so it won't cause a problem.
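For the "array of task IDs" part of the question, a minimal plain-Ruby sketch of the shape of that loop might look like the following. The Struct classes stand in for the ActiveRecord models, and the host_tasks columns (decommission_id, task_id) are assumptions, since the question does not show that table's schema.

```ruby
# Hypothetical stand-ins for the ActiveRecord models in the question.
TaskRecord      = Struct.new(:id)
ProcedureRecord = Struct.new(:id, :tasks)
# Assumed columns for host_tasks; the real schema is not shown in the question.
HostTaskRow     = Struct.new(:decommission_id, :task_id)

# Builds one host_tasks row per task of a single procedure.
def build_host_tasks(procedure, decommission_id)
  return [] if procedure.nil?            # a :first lookup yields nil when nothing matches
  task_ids = procedure.tasks.map(&:id)   # plain array of task ids
  task_ids.map { |task_id| HostTaskRow.new(decommission_id, task_id) }
end

procedure = ProcedureRecord.new(7, [TaskRecord.new(1), TaskRecord.new(2)])
build_host_tasks(procedure, 42).each do |row|
  puts "decommission #{row.decommission_id} -> task #{row.task_id}"
end
```

The point is that a single record is not iterated directly; the loop runs over its tasks collection, which is why calling each on the Procedure itself failed.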

Ruby-on-Rails: How to pull out most recent entries from a limited subset of a database table

Imagine something like a model User who has many Friends, each of whom has many Comments, where I'm trying to display to the user the latest 100 comments by his friends.
Is it possible to draw out the latest 100 in a single SQL query, or am I going to have to use Ruby application logic to parse a bigger list or make multiple queries?
I see two ways of going about this:
starting at User.find and use some complex combination of :join and :limit. This method seems promising, but unfortunately, would return me users and not comments, and once I get those back, I'd have lots of models taking up memory (for each Friend and the User), lots of unnecessary fields being transferred (everything for the User, and everything about the name row for the Friends), and I'd still have to step through somehow to collect and sort all the comments in application logic.
starting at the Comments and using some sort of find_by_sql, but I just can't seem to figure out what I'd need to put in. I don't know how you could have the necessary information to pass in with this to limit it to only looking at comments made by friends.
Edit: I'm having some difficulty getting EmFi's solution to work, and would appreciate any insight anyone can provide.
Friends are a cyclic association through a join table.
has_many :friendships
has_many :friends,
:through => :friendships,
:conditions => "status = #{Friendship::FULL}"
This is the error I'm getting in relevant part:
ERROR: column users.user_id does not exist
: SELECT "comments".* FROM "comments" INNER JOIN "users" ON "comments".user_id = "users".id WHERE (("users".user_id = 1) AND ((status = 2)))
When I just enter user.friends, and it works, this is the query it executes:
: SELECT "users".* FROM "users" INNER JOIN "friendships" ON "users".id = "friendships".friend_id WHERE (("friendships".user_id = 1) AND ((status = 2)))
So it seems like it's mangling the :through to have two :through's in one query.
Given the following relationships:
class User < ActiveRecord::Base
has_many :friends
has_many :comments
has_many :friends_comments, :through => :friends, :source => :comments
end
This statement will execute a single SQL statement. Associations essentially create named scopes for you that aren't evaluated until the end of the chain.
@user.friends_comments.find(:all, :limit => 100, :order => 'created_at DESC')
If this is a common query, the find can be simplified into its own scope.
class Comment < ActiveRecord::Base
  belongs_to :user
  # named_scope was renamed to scope in Rails 3. If you're working
  # in an earlier version, uncomment the following line.
  # named_scope :recent, :limit => 100, :order => 'created_at DESC'
  scope :recent, :limit => 100, :order => 'created_at DESC'
end
So now you can do:
@user.friends_comments.recent
N.B.: The friends association on user may be a cyclical one through a join table, but that's not important to this solution. As long as friends is a working association on User, the preceding will work.

In Rails, how can I return a set of records based on a count of items in a relation OR criteria about the relation?

I'm writing a Rails app in which I have two models: a Machine model and a MachineUpdate model. The Machine model has many MachineUpdates. The MachineUpdate has a date/time field. I'm trying to retrieve all Machine records that have the following criteria:
The Machine model has not had a MachineUpdate within the last 2 weeks, OR
The Machine model has never had any updates.
Currently, I'm accomplishing #1 with a named scope:
named_scope :needs_updates,
:include => :machine_updates,
:conditions => ['machine_updates.date < ?', UPDATE_THRESHOLD.days.ago]
However, this only gets Machine models that have had at least one update. I also wanted to retrieve Machine models that have had no updates. How can I modify needs_updates so the items it returns fulfills that criteria as well?
One solution is to introduce a counter_cache:
# add a machine_updates_count integer column (with default 0) to the machines table
# and enable the counter cache on the belongs_to side, in your MachineUpdate model:
belongs_to :machine, :counter_cache => true
and then add OR machine_updates_count = 0 to your SQL conditions.
However, you can also solve the problem without a counter cache by using a LEFT JOIN:
named_scope :needs_updates, lambda { {
  :select => "machines.*, MAX(machine_updates.date) as last_update",
  :joins => "LEFT JOIN machine_updates ON machine_updates.machine_id = machines.id",
  :group => "machines.id",
  :having => ["last_update IS NULL OR last_update < ?", UPDATE_THRESHOLD.days.ago]
} }
The LEFT JOIN is necessary so that machines with no updates at all are kept in the result (their last_update is NULL), and MAX(machine_updates.date) makes sure you are comparing against the most recent MachineUpdate.
Note also that you have to put your condition in a lambda so it is evaluated every time the query is run. Otherwise it will be evaluated only once (when your model is loaded on application boot-up), and you will not be able to find Machines that have come to need updates since your app started.
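The difference between a value computed once and one wrapped in a lambda can be shown in plain Ruby. FakeClock and THRESHOLD below are hypothetical stand-ins for the system clock and UPDATE_THRESHOLD, used only so the effect is visible without waiting.

```ruby
# Stand-in for the system clock so the effect is visible without waiting.
class FakeClock
  attr_reader :now

  def initialize(now)
    @now = now
  end

  def advance(seconds)
    @now += seconds
  end
end

THRESHOLD = 10 # hypothetical, plays the role of UPDATE_THRESHOLD

clock = FakeClock.new(100)

eager_cutoff = clock.now - THRESHOLD            # evaluated once, right here
lazy_cutoff  = lambda { clock.now - THRESHOLD } # evaluated on every call

clock.advance(50)

puts eager_cutoff     # still derived from the old time
puts lazy_cutoff.call # derived from the current time
```

A scope whose conditions are built when the class is loaded behaves like eager_cutoff; one whose options are returned from a lambda behaves like lazy_cutoff.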
UPDATE:
This solution works in MySQL and SQLite, but not PostgreSQL. Postgres does not allow naming of columns in the SELECT clause that are not used in the GROUP BY clause (see discussion). I'm very unfamiliar with PostgreSQL, but I did get this to work as expected:
named_scope :needs_updates, lambda{
cols = Machine.column_names.collect{ |c| "\"machines\".\"#{c}\"" }.join(",")
{
:select => cols,
:group => cols,
:joins => 'LEFT JOIN "machine_updates" ON "machine_updates"."machine_id" = "machines"."id"',
:having => ['MAX("machine_updates"."date") IS NULL OR MAX("machine_updates"."date") < ?', UPDATE_THRESHOLD.days.ago]
}
}
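The selection rule behind needs_updates (no updates at all, or the most recent update older than the threshold) can be sketched in plain Ruby over in-memory rows. UpdateRow, the needs_updates helper and the dates are made up for illustration; this is not a replacement for the SQL.

```ruby
require "date"

# Hypothetical in-memory stand-in for rows of machine_updates.
UpdateRow = Struct.new(:machine_id, :date)

# Returns the ids of machines with no updates, or whose latest update
# is older than the threshold.
def needs_updates(machine_ids, updates, threshold)
  by_machine = updates.group_by(&:machine_id)                 # GROUP BY machine
  machine_ids.select do |id|
    last_update = by_machine.fetch(id, []).map(&:date).max    # MAX(date), nil if none
    last_update.nil? || last_update < threshold               # the HAVING condition
  end
end

today     = Date.new(2024, 1, 31)
threshold = today - 14

updates = [
  UpdateRow.new(1, today - 30), # old update ...
  UpdateRow.new(1, today - 2),  # ... but machine 1 also has a recent one
  UpdateRow.new(2, today - 20)  # machine 2 only has an old update
]                               # machine 3 has no updates at all

p needs_updates([1, 2, 3], updates, threshold)
```

Machine 1 is excluded even though it has one old update, because only its most recent date is compared; machines 2 and 3 are selected.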
If you can make changes in the table, then you can use the :touch option of the belongs_to association.
For instance, add a datetime column to Machine named last_machine_update. Thereafter in the belongs_to of MachineUpdate, add :touch => :last_machine_update. This will cause that field to become updated with the last time you either added or modified a MachineUpdate connected to that Machine, thus removing the need for the named scope.
Otherwise I would probably do it like Alex proposes.
I just ran into a similar problem. It's actually pretty simple:
Machine.all(
  :include => :machine_updates,
  :conditions => ["machine_updates.machine_id IS NULL OR machine_updates.date < ?", UPDATE_THRESHOLD.days.ago])
If you were doing a named scope, just use a lambda to ensure that the date is recalculated every time the named scope is called:
named_scope :needs_updates, lambda { {
  :include => :machine_updates,
  :conditions => ["machine_updates.machine_id IS NULL OR machine_updates.date < ?", UPDATE_THRESHOLD.days.ago]
} }
If you want to avoid returning all of the MachineUpdate records in your query, then you need to use the :select option to return only the columns you want. Without :include the join has to be spelled out explicitly:
named_scope :needs_updates, lambda { {
  :select => "machines.*",
  :joins => "LEFT JOIN machine_updates ON machine_updates.machine_id = machines.id",
  :conditions => ["machine_updates.machine_id IS NULL OR machine_updates.date < ?", UPDATE_THRESHOLD.days.ago]
} }